R/datadoc_cd-info-long.R
cd_info_long.Rd
cd_info_long provides a "long" version of the yearly cd_info_20** datasets.
cd_info_long
cd_info_long is a data frame with 13050 rows,
covering the maps of 9 election years
(2008, 2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024) for each of the 435 congressional districts.
lines
The year corresponding to the geography (district lines). For example, lines = 2008 and cd = "AL-01"
indicates that the row represents AL-01's geography as used in the 2008 election.
cd
The congressional district (CD) as drawn in the year given by lines. Note that redistricting can change districts drastically;
a state's "first congressional district" under one value of lines can cover a different area
than the same-numbered district under another.
elec
The year of the presidential election whose results follow.
party, candidate
The party and name of the presidential candidate in the elec election
(elec may not be the same year as lines). For example, lines = 2012 and cd = "AL-01" combined with
elec = 2008 represents the 2008 election results in the newly redistricted (2012) AL-01 geography.
pct
The candidate's two-party vote share.
presvotes_total
The total number of votes for President in that CD.
presvotes_DR
The total number of Democratic and Republican votes for President in that CD.
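As a rough illustration of how pct, presvotes_DR, and presvotes_total relate, the sketch below rebuilds the two AK-01 rows (2008 lines, 2008 election) printed in the examples and derives approximate vote counts from them. It assumes that pct is the candidate's share of presvotes_DR, as the descriptions suggest. The column names approx_votes and other_votes are hypothetical and are not part of the dataset.

```r
library(dplyr)

# Two rows copied by hand from the printed example output
# (AK-01, lines = 2008, elec = 2008); this is not the full dataset.
ak <- data.frame(
  lines = 2008, cd = "AK-01", elec = 2008,
  party = c("D", "R"),
  candidate = c("obama", "mccain"),
  pct = c(0.389, 0.611),
  presvotes_DR = 317435,
  presvotes_total = 326197
)

ak_votes <- ak |>
  mutate(
    # Approximate only: pct is rounded to three digits in the printed output
    approx_votes = round(pct * presvotes_DR),
    # Votes cast for candidates other than the Democrat and the Republican
    other_votes = presvotes_total - presvotes_DR
  )

ak_votes
```

The same mutate() call should work on the full cd_info_long, where pct is stored at full precision.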
See cd_info for versions with one row per district and for documentation of sources.
library(dplyr)
# get only data for proximate years
cd_info_long |> filter((elec == lines) | (elec + 2 == lines))
#> # A tibble: 6,960 × 8
#> lines cd elec party candidate pct presvotes_DR presvotes_total
#> <dbl> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2008 AK-01 2008 D obama 0.389 317435 326197
#> 2 2008 AK-01 2008 R mccain 0.611 317435 326197
#> 3 2008 AL-01 2008 D obama 0.386 302483 304618
#> 4 2008 AL-01 2008 R mccain 0.614 302483 304618
#> 5 2008 AL-02 2008 D obama 0.371 290287 292038
#> 6 2008 AL-02 2008 R mccain 0.629 290287 292038
#> 7 2008 AL-03 2008 D obama 0.437 286127 288595
#> 8 2008 AL-03 2008 R mccain 0.563 286127 288595
#> 9 2008 AL-04 2008 D obama 0.227 265569 269450
#> 10 2008 AL-04 2008 R mccain 0.773 265569 269450
#> # ℹ 6,950 more rows
# this subset returns exactly 435 districts per party in each cycle (2 * 435 rows):
cd_info_long |> filter((elec == lines) | (elec + 2 == lines)) |> count(lines, party)
#> # A tibble: 16 × 3
#> lines party n
#> <dbl> <chr> <int>
#> 1 2008 D 435
#> 2 2008 R 435
#> 3 2010 D 435
#> 4 2010 R 435
#> 5 2012 D 435
#> 6 2012 R 435
#> 7 2014 D 435
#> 8 2014 R 435
#> 9 2016 D 435
#> 10 2016 R 435
#> 11 2018 D 435
#> 12 2018 R 435
#> 13 2020 D 435
#> 14 2020 R 435
#> 15 2022 D 435
#> 16 2022 R 435
# this will show where the district lines changed between 2022 and 2024
# (same election, same candidate, different map)
cd_info_long |>
  filter(lines %in% c(2022, 2024), elec == 2020, candidate == "biden") |>
  arrange(cd, lines)
#> # A tibble: 870 × 8
#> lines cd elec party candidate pct presvotes_DR presvotes_total
#> <dbl> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2022 AK-01 2020 D biden 0.447 343729 357569
#> 2 2024 AK-01 2020 D biden 0.447 343729 357569
#> 3 2022 AL-01 2020 D biden 0.357 323682. 327039.
#> 4 2024 AL-01 2020 D biden 0.245 322370 325715
#> 5 2022 AL-02 2020 D biden 0.351 316699. 319817.
#> 6 2024 AL-02 2020 D biden 0.563 309384 312225
#> 7 2022 AL-03 2020 D biden 0.328 313569. 316695.
#> 8 2024 AL-03 2020 D biden 0.293 318717 322031
#> 9 2022 AL-04 2020 D biden 0.188 323091. 326261.
#> 10 2024 AL-04 2020 D biden 0.186 322594 325713
#> # ℹ 860 more rows
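One way to turn the last comparison into an explicit list of changed districts is to put the two maps side by side and take the difference. The sketch below does this on a handful of rows copied from the output above, so it runs without the package data. The names biden, changed, and shift, and the 0.001 tolerance, are illustrative assumptions and are not part of the dataset.

```r
library(dplyr)
library(tidyr)

# Rows copied by hand from the printed output above:
# Biden's 2020 two-party share under the 2022 and 2024 lines.
biden <- data.frame(
  lines = rep(c(2022, 2024), times = 3),
  cd = rep(c("AK-01", "AL-01", "AL-02"), each = 2),
  elec = 2020,
  pct = c(0.447, 0.447, 0.357, 0.245, 0.351, 0.563)
)

changed <- biden |>
  # One row per district, with one pct column per map
  pivot_wider(names_from = lines, values_from = pct, names_prefix = "pct_") |>
  mutate(shift = pct_2024 - pct_2022) |>
  # Arbitrary tolerance chosen for this sketch
  filter(abs(shift) > 0.001) |>
  arrange(desc(abs(shift)))

changed
```

On these rows only AL-01 and AL-02 remain, with AL-02 showing the larger shift. An identical pct under both maps does not by itself prove that the lines were unchanged, so this is a screening step and not a definitive test.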