R/get_dataverse.R
get_cces_dataverse.Rd
Get the data from dataverse into the current R environment. You must use the development version of IQSS/dataverse-client-r. The function also does
    get_cces_dataverse(
      name = "cumulative",
      year_subset = 2006:2019,
      std_index = TRUE,
      dataverse_paths = ccesMRPprep::cces_dv_ids
    )
Argument | Description |
---|---|
name | The name of the dataset as defined in |
year_subset | The year (or years, as a vector) to subset to. If |
std_index | Whether to standardize the unique case identifier. These have different column names in different datasets, but setting this to |
dataverse_paths | A data frame in which each row holds the metadata for one CCES dataset. The built-in data cces_dv_ids is the default and should not be changed. |
The current dataverse package downloads the raw data, so this function writes the raw binary to a tempfile and loads it into a tibble using the reader appropriate to the file type. We find it convenient to loop over this function for all values in cces_dv_ids and populate the MRP directory with all datasets (about 2GB in total). Each dataset has a slightly different format; using get_cces_question will standardize, for example, the name of the case ID.
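The tempfile step described above can be sketched as follows. This is a minimal illustration in base R, not the package's actual implementation: the helper name read_raw_bytes is hypothetical, the bytes are a hand-made stand-in for a Dataverse download, and only two file types are handled. The real function presumably dispatches to other readers (for example for Stata or SPSS files) and returns a tibble rather than a base data frame.

```r
# Hypothetical helper, not part of ccesMRPprep or the dataverse package.
# It writes a raw vector to a temporary file and reads it back with a
# reader chosen by file extension.
read_raw_bytes <- function(raw_bytes, ext = c("csv", "rds")) {
  ext <- match.arg(ext)

  # Temporary file with the right extension; removed when the function exits
  tmp <- tempfile(fileext = paste0(".", ext))
  on.exit(unlink(tmp), add = TRUE)

  # Dump the bytes to disk unchanged
  writeBin(raw_bytes, tmp)

  # Pick a reader based on the file type
  switch(ext,
    csv = utils::read.csv(tmp, stringsAsFactors = FALSE),
    rds = readRDS(tmp)
  )
}

# Stand-in for downloaded bytes (assumption: a tiny CSV, not real CCES data)
raw_csv <- charToRaw("case_id,year\n1,2018\n2,2018\n")
demo <- read_raw_bytes(raw_csv, "csv")
```

Cleaning up the tempfile with on.exit keeps repeated calls in a loop from filling the session's temporary directory, which matters when the datasets total around 2GB.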
    # Read in cumulative common content, subsetted to 2018, into the environment
    if (FALSE) {
      ccc <- get_cces_dataverse("cumulative", year_subset = 2018)
    }

    # Example code to read _and_ write a series of common content datasets
    # in a directory "data/input/cces/"
    if (FALSE) {
      library(fs)    # dir_create(), file_exists()
      library(glue)  # glue()
      library(readr) # write_rds()

      # Create the same directory that the files are written to below
      dir_create("data/input/cces")
      for (d in c("cumulative", "2018")) {
        # Skip datasets that have already been downloaded
        if (file_exists(glue("data/input/cces/cces_{d}.rds"))) next
        write_rds(get_cces_dataverse(d), glue("data/input/cces/cces_{d}.rds")) # takes a few minutes
      }
    }