Get the data from dataverse into the current R environment. You must use the development version of IQSS/dataverse-client-r. The function also does

get_cces_dataverse(
  name = "cumulative",
  year_subset = 2006:2020,
  std_index = TRUE,
  dataverse_paths = ccesMRPprep::cces_dv_ids
)

Arguments

name

The name of the dataset as defined in data(cces_dv_ids).

year_subset

The year (or a vector of years) to subset to. If name is a year-specific dataset, this argument is redundant, but if name == "cumulative", the output will be the cumulative dataset subsetted to the given year or years. This is useful when using the cumulative dataset for its harmonized variables.

std_index

Whether to standardize the unique case identifier. The identifier has a different column name in different datasets, but setting this to TRUE (the default) renames all of them to "case_id" and also adds the year of the dataset. This way, every downloaded dataset has a unique identifier defined by the variables c("year", "case_id").
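The effect of std_index can be illustrated with a small base R sketch. This is not the package's implementation: standardize_index() is a hypothetical helper, and the toy data frames and their ID column names are made up for illustration. It only shows why the pair c("year", "case_id") works as a key once datasets are stacked.

```r
# Hypothetical helper, NOT the package's code: rename a dataset-specific
# ID column to "case_id" and attach the survey year.
standardize_index <- function(df, id_col, year) {
  names(df)[names(df) == id_col] <- "case_id"
  df$year <- as.integer(year)
  # Put the two key columns first, keep everything else in its original order
  key <- c("year", "case_id")
  df[, c(key, setdiff(names(df), key)), drop = FALSE]
}

# Two toy datasets whose ID columns have different (illustrative) names
d2016 <- data.frame(V101 = c(1, 2), vote = c("A", "B"))
d2018 <- data.frame(caseid = c(1, 2), vote = c("B", "B"))

stacked <- rbind(
  standardize_index(d2016, "V101", 2016),
  standardize_index(d2018, "caseid", 2018)
)

# case_id repeats across years, but the (year, case_id) pair does not
key_is_unique <- anyDuplicated(stacked[, c("year", "case_id")]) == 0
```

Because the key is the same in every dataset, rows from different years can be stacked or joined without first looking up each dataset's own ID column name.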

dataverse_paths

A data frame in which each row holds the metadata for one CCES dataset. The built-in data cces_dv_ids is the default and should not be changed.

Details

The current dataverse package downloads the raw data, so this function writes the raw binary to a temporary file and loads it into a tibble using the reader appropriate for the file type. We find it convenient to loop over this function for all values in cces_dv_ids and populate the MRP directory with all datasets (about 2GB in total). Each dataset has a slightly different format; using get_cces_question will standardize, for example, the name of the case ID.
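The tempfile step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the function's actual code: the raw bytes are produced locally from a toy data frame rather than downloaded from Dataverse, read_raw_as_df() is a hypothetical helper, and only the .rds file type is handled.

```r
# Stand-in for the raw bytes a Dataverse download would return
# (assumption: built locally from a toy data frame, no network access).
toy <- data.frame(year = 2018L, case_id = c(101L, 102L))
src <- tempfile(fileext = ".rds")
saveRDS(toy, src)
raw_bytes <- readBin(src, what = "raw", n = file.size(src))

# Hypothetical helper: write the raw binary to a temporary file, then read
# it back with a reader chosen by file extension (only ".rds" here).
read_raw_as_df <- function(raw_bytes, ext = ".rds") {
  tmp <- tempfile(fileext = ext)
  on.exit(unlink(tmp), add = TRUE) # remove the temporary file afterwards
  writeBin(raw_bytes, tmp)
  switch(ext,
    ".rds" = readRDS(tmp),
    stop("No reader defined for extension: ", ext)
  )
}

df <- read_raw_as_df(raw_bytes)
```

A real version would need one reader per file type found in cces_dv_ids; which types those are is not stated here, so the switch is left with a single branch.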

Examples


# Read the cumulative common content, subsetted to 2018, into the environment
if (FALSE) {
  ccc <- get_cces_dataverse("cumulative", year_subset = 2018)
}

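# Example code to read _and_ write a series of common content datasets
# in the directory "data/input/cces/"
if (FALSE) {
  library(fs)    # dir_create(), file_exists()
  library(glue)  # glue()
  library(readr) # write_rds()

  dir_create("data/input/cces") # the same directory that is written to below
  for (d in c("cumulative", "2018")) {
    path <- glue("data/input/cces/cces_{d}.rds")
    if (file_exists(path)) next # skip datasets that are already on disk
    write_rds(get_cces_dataverse(d), path) # takes a few minutes
  }
}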