R/datadoc_cd-info-long.R
cd_info_long.Rd
cd_info_long provides a "long" version of the yearly cd_info_20** datasets.
cd_info_long
cd_info_long is a data frame with 13050 rows, covering the maps of 9 election years (2008, 2010, 2012, 2014, 2016, 2018, 2020, 2022, 2024) for each of the 435 congressional districts.
lines
The year corresponding to the geography (district lines). For example, lines = 2008 and cd = "AL-01" indicates that the row represents AL-01's geography as used in the 2008 election.
cd
The congressional district (CD) as drawn in the lines year. Note that districts can change drastically through redistricting; a state's "first congressional district" under one value of lines can cover a different area than the same-numbered district under another.
elec
The year of the presidential election whose results are reported in the columns that follow.
party, candidate
Identify the presidential candidate for the elec year (which may not be the same as lines). For example, lines = 2012 and cd = "AL-01" combined with elec = 2008 represents the 2008 election results in the newly redistricted (2012) AL-01 geography.
pct
The candidate's two-party vote share, i.e. their share of the combined Democratic and Republican presidential vote.
presvotes_total
The total number of votes for President in that CD
presvotes_DR
The total number of Democratic + Republican votes for President in that CD
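Because pct is a two-party share, the last three columns can be combined to recover approximate vote counts and the share of votes cast for other candidates. The sketch below is illustrative only: it rebuilds two rows by hand from the printed example output (the toy object is not the package data, and approx_votes and other_share are hypothetical column names), and it assumes pct equals the candidate's votes divided by presvotes_DR.

```r
library(dplyr)

# Toy rows typed in from the printed output for AL-01 (lines = 2008, elec = 2008).
# pct is rounded to three digits there, so derived counts are approximate.
toy <- tibble(
  lines = c(2008, 2008),
  cd = c("AL-01", "AL-01"),
  elec = c(2008, 2008),
  party = c("D", "R"),
  candidate = c("obama", "mccain"),
  pct = c(0.386, 0.614),
  presvotes_DR = c(302483, 302483),
  presvotes_total = c(304618, 304618)
)

toy_derived <- toy |>
  mutate(
    # Assumption: pct = candidate votes / presvotes_DR
    approx_votes = pct * presvotes_DR,
    # Share of all presidential votes that went to neither major party
    other_share = 1 - presvotes_DR / presvotes_total
  )

toy_derived
```

Within a given lines, cd and elec combination, the two pct values should sum to 1, so the two approx_votes values sum back to presvotes_DR.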
See cd_info for versions with one row per district, and for documentation of sources.
library(dplyr)
# get only data for proximate years
cd_info_long |> filter((elec == lines) | (elec + 2 == lines))
#> # A tibble: 6,960 × 8
#> lines cd elec party candidate pct presvotes_DR presvotes_total
#> <dbl> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2008 AK-01 2008 D obama 0.389 317435 326197
#> 2 2008 AK-01 2008 R mccain 0.611 317435 326197
#> 3 2008 AL-01 2008 D obama 0.386 302483 304618
#> 4 2008 AL-01 2008 R mccain 0.614 302483 304618
#> 5 2008 AL-02 2008 D obama 0.371 290287 292038
#> 6 2008 AL-02 2008 R mccain 0.629 290287 292038
#> 7 2008 AL-03 2008 D obama 0.437 286127 288595
#> 8 2008 AL-03 2008 R mccain 0.563 286127 288595
#> 9 2008 AL-04 2008 D obama 0.227 265569 269450
#> 10 2008 AL-04 2008 R mccain 0.773 265569 269450
#> # ℹ 6,950 more rows
# this subset returns exactly 435 districts for each of the 2 parties per cycle:
cd_info_long |> filter((elec == lines) | (elec + 2 == lines)) |> count(lines, party)
#> # A tibble: 16 × 3
#> lines party n
#> <dbl> <chr> <int>
#> 1 2008 D 435
#> 2 2008 R 435
#> 3 2010 D 435
#> 4 2010 R 435
#> 5 2012 D 435
#> 6 2012 R 435
#> 7 2014 D 435
#> 8 2014 R 435
#> 9 2016 D 435
#> 10 2016 R 435
#> 11 2018 D 435
#> 12 2018 R 435
#> 13 2020 D 435
#> 14 2020 R 435
#> 15 2022 D 435
#> 16 2022 R 435
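The filter used above keeps, for each map, the presidential election held in the same year as the map or two years before it. A minimal sketch of that logic on a hypothetical grid of lines and elec values follows; the grid is made up for illustration, assumes elec only takes the presidential years 2008 through 2020, and is not the package data.

```r
library(dplyr)

# Hypothetical grid: every map year crossed with every presidential year.
grid <- expand.grid(
  lines = seq(2008, 2024, by = 2),
  elec = c(2008, 2012, 2016, 2020)
)

# Keep pairs where the election falls in the map year or two years before it.
proximate <- grid |>
  filter((elec == lines) | (elec + 2 == lines)) |>
  arrange(lines)

proximate
```

Under these assumptions no pair survives for lines = 2024, which would be consistent with the count above stopping at 2022.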
# this will show where the district lines changed between 2022 and 2024
# (same election, same candidate, different map)
cd_info_long |>
filter(lines %in% c(2022, 2024), elec == 2020, candidate == "biden") |>
arrange(cd, lines)
#> # A tibble: 870 × 8
#> lines cd elec party candidate pct presvotes_DR presvotes_total
#> <dbl> <chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 2022 AK-01 2020 D biden 0.447 343729 357569
#> 2 2024 AK-01 2020 D biden 0.447 343729 357569
#> 3 2022 AL-01 2020 D biden 0.357 323682. 327039.
#> 4 2024 AL-01 2020 D biden 0.245 322370 325715
#> 5 2022 AL-02 2020 D biden 0.351 316699. 319817.
#> 6 2024 AL-02 2020 D biden 0.563 309384 312225
#> 7 2022 AL-03 2020 D biden 0.328 313569. 316695.
#> 8 2024 AL-03 2020 D biden 0.293 318717 322031
#> 9 2022 AL-04 2020 D biden 0.188 323091. 326261.
#> 10 2024 AL-04 2020 D biden 0.186 322594 325713
#> # ℹ 860 more rows
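To quantify how much each district shifted, the two map versions can be compared per district. The sketch below uses a few rows typed in from the output above (pct rounded as printed) rather than the full dataset; toy, shift and pct_change are hypothetical names chosen for illustration.

```r
library(dplyr)

# Rows copied from the printed output: Biden's 2020 share under each map.
toy <- tibble(
  lines = rep(c(2022, 2024), times = 3),
  cd = rep(c("AK-01", "AL-01", "AL-02"), each = 2),
  pct = c(0.447, 0.447, 0.357, 0.245, 0.351, 0.563)
)

# One row per district: 2024-map share minus 2022-map share,
# sorted so the largest absolute shifts come first.
shift <- toy |>
  group_by(cd) |>
  summarize(pct_change = pct[lines == 2024] - pct[lines == 2022]) |>
  arrange(desc(abs(pct_change)))

shift
```

A pct_change of zero, as for AK-01 in this toy subset, suggests the district's lines did not change between the two maps, while large values flag districts that were substantially redrawn.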