Get one column of outcome questions for each model

get_cces_question(
  qcode,
  year,
  qID,
  dataframe = NULL,
  y_named_as = "response",
  data_dir = "data/input/cces",
  verbose = TRUE
)

Arguments

qcode

A string, the variable name of the outcome variable of interest, exactly as it appears in the CCES dataset. Together with year, it uniquely identifies the outcome question data.

year

A string for the year the question was asked, which determines the dataset. If using a dataset from the cumulative file, set to "cumulative". Together with qcode, it uniquely identifies the outcome question data.

qID

A string, the user's unique name for the question. For example, we use the syntax in our question table.

dataframe

An optional argument to pass the entire CCES dataframe in-environment. If provided, this dataframe is used in place of the file lookup based on year and data_dir. If left empty, the dataset is read from data_dir instead.

y_named_as

What to name the response variable. Defaults to "response", which is also the default assumption for cces_join_slim.

data_dir

A path to the directory where flat files for the CCES common content are stored. Currently each file must be named in the form cces_{year}.rds (e.g. "cces_2016.rds") and must exist in data_dir. We recommend looping over get_cces_dataverse to download the data first.

verbose

Whether to print a message. Defaults to TRUE.
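The file-naming rule described under data_dir can be checked before calling the function. The following is a minimal sketch in base R; the helper cces_file_path is hypothetical and not part of the package, and it simply assumes the documented cces_{year}.rds pattern.

```r
# Hypothetical helper (not part of the package): build the path implied by
# the documented naming pattern cces_{year}.rds inside data_dir.
cces_file_path <- function(year, data_dir = "data/input/cces") {
  file.path(data_dir, paste0("cces_", year, ".rds"))
}

path <- cces_file_path("2018")

# Warn early if the file that the function would need is missing.
if (!file.exists(path)) {
  message("Expected file not found: ", path)
}
```

Running a check like this for each year of interest makes a missing download visible before any question is extracted.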

Value

A four-column dataframe with the columns

year

The year

case_id

The respondent identifier within year

qID

The question ID

response

The outcome question, of class factor

The object also has an attribute called question, which stores the question identifier qID.
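To illustrate the shape of the return value, the sketch below builds a toy object by hand that mirrors the documented columns and the question attribute. The rows are made up for illustration and do not come from the CCES. In practice the object would come from get_cces_question itself.

```r
# Toy stand-in for the documented return value; values are invented.
out <- data.frame(
  year = c(2018L, 2018L),
  case_id = c("1", "2"),
  qID = "TCJA",
  response = factor(c("Support", "Oppose")),
  stringsAsFactors = FALSE
)

# The documentation says the question identifier is saved as an attribute.
attr(out, "question") <- "TCJA"

# Read the identifier back and confirm the response column is a factor.
attr(out, "question")
is.factor(out$response)
```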

Details

This transformation currently only supports Yes/No questions. For some common ordinal questions that can, with manual recodes, be turned into a Yes/No question, the recode is hard-coded inside the function. In the future, this should be defined outside of the function in a taxonomy.
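As a rough illustration of what recoding an ordinal item to Yes/No involves, here is a minimal sketch in base R. The response labels, the cut point, and the helper recode_to_yes_no are all assumptions made for this example. They are not the package's actual recode rules.

```r
# Hypothetical four-point item; labels are assumptions for illustration.
lvls <- c("Strongly support", "Somewhat support",
          "Somewhat oppose", "Strongly oppose")
ordinal <- factor(lvls, levels = lvls)

# Hypothetical helper: map the chosen levels to "Yes", the rest to "No",
# and keep missing responses missing.
recode_to_yes_no <- function(x, yes_levels) {
  out <- ifelse(as.character(x) %in% yes_levels, "Yes", "No")
  out[is.na(x)] <- NA
  factor(out, levels = c("Yes", "No"))
}

binary <- recode_to_yes_no(
  ordinal,
  yes_levels = c("Strongly support", "Somewhat support")
)
table(binary)
```

Keeping the mapping as an argument, rather than inside the function body, is one way to move toward the externally defined taxonomy mentioned above.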

Examples


 # with sample data
 get_cces_question("pid3", dataframe = cc18_samp, year = 2018, qID = "pid3")
#> Using the dataframe provided 
#> # A tibble: 1,000 × 4
#>     year case_id   qID   response   
#>    <int> <chr>     <chr> <fct>      
#>  1  2018 415395741 pid3  Independent
#>  2  2018 414164923 pid3  Republican 
#>  3  2018 412379892 pid3  Democrat   
#>  4  2018 414203529 pid3  Republican 
#>  5  2018 412148048 pid3  Independent
#>  6  2018 412329835 pid3  Independent
#>  7  2018 417352072 pid3  Democrat   
#>  8  2018 414614677 pid3  Democrat   
#>  9  2018 416797006 pid3  Republican 
#> 10  2018 412962561 pid3  Independent
#> # ℹ 990 more rows

 # need data/input/cces/cces_2018.rds to run this
 if (FALSE) {
  get_cces_question(qcode = "CC18_326", year = "2018", qID = "TCJA")

  # A tibble: 60,000 x 4
  # year   case_id qID   response
  # <int>     <int> <chr> <fct>
  # 1  2018 123464282 TCJA  Support
  # 2  2018 170169205 TCJA  Support
  # 3  2018 175996005 TCJA  Support
  # 4  2018 176818556 TCJA  Oppose
  # 5  2018 202120533 TCJA  Oppose
  # 6  2018 226449148 TCJA  Oppose
  # 7  2018 238205342 TCJA  Oppose
  # 8  2018 238806466 TCJA  Support
  # 9  2018 267564481 TCJA  Support
  # … with 59,991 more rows
 }