R/get_dataverse.R
get_cces_dataverse.Rd
Get the data from dataverse into the current R environment. You must use the development version of IQSS/dataverse-client-r. The function also does
get_cces_dataverse(
name = "cumulative",
year_subset = NULL,
std_index = TRUE,
dataverse_paths = ccesMRPprep::cces_dv_ids
)
The name of the dataset as defined in data(cces_dv_ids).
The year (or years, as a vector) to subset to. If name is a year-specific dataset, this argument is redundant, but if name == "cumulative", then the output will be the cumulative dataset subsetted to that year. This is useful when using the cumulative dataset for its harmonized variables.
Whether to standardize the unique case identifier. These have different column names in different datasets, but setting this to TRUE (the default) will rename them all to "case_id" and also add the year of the dataset. This way, every dataset that gets downloaded will have the unique identifier defined by the variables c("year", "case_id").
A dataframe where one row represents metadata for one CCES dataset. Built-in data cces_dv_ids is used as a default and should not be changed.
The current dataverse package downloads the raw data, so this function writes the raw binary into a tempfile and loads it into a tibble with the appropriate file data type. We find it convenient to loop over this function for all values in cces_dv_ids and populate the MRP directory with all datasets (about 2GB in total). Each dataset has slightly different formats; using get_cces_question will standardize, for example, the name of the case ID.
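The tempfile step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the raw bytes are fabricated in memory instead of being downloaded, `read_by_ext` is a hypothetical helper, and only CSV is handled, whereas the real datasets come in several file types.

```r
# Hypothetical stand-in for the raw bytes a Dataverse download would return;
# a tiny CSV is fabricated in memory here instead of calling any real API.
raw_bytes <- charToRaw("caseid,year,state\n101,2018,OH\n102,2018,TX\n")

# Write the raw binary to a temporary file with the matching extension
tmp <- tempfile(fileext = ".csv")
writeBin(raw_bytes, tmp)

# Pick a reader based on the file extension (only CSV is handled in this sketch)
read_by_ext <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = utils::read.csv(path, stringsAsFactors = FALSE),
    stop("Unsupported file type in this sketch: ", ext)
  )
}

dat <- read_by_ext(tmp)
unlink(tmp) # clean up the temporary file
```

Dispatching on the extension keeps the download step independent of the file format, which matters when each year's dataset is stored in a slightly different format.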
# read in cumulative common content, subsetted to 2018, into environment
if (FALSE) { # \dontrun{
ccc <- get_cces_dataverse("cumulative", year_subset = 2018)
} # }
# Example code to read _and_ write a series of common content datasets
# in a directory "data/input/cces/"
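if (FALSE) { # \dontrun{
library(fs)    # dir_create(), file_exists()
library(glue)  # glue()
library(readr) # write_rds()

dir_create("data/input/cces") # same directory the files are written to below
for (d in c("cumulative", "2018")) {
  path <- glue("data/input/cces/cces_{d}.rds")
  if (file_exists(path))
    next
  write_rds(get_cces_dataverse(d), path) # takes a few minutes
}
} # }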
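To illustrate why std_index pairs "case_id" with "year", here is a small sketch using toy data. The helper `standardize_id`, the column names, and all values are assumptions made up for this example; the function's real renaming logic may differ.

```r
# Hypothetical helper, not part of ccesMRPprep: rename a dataset-specific ID
# column to "case_id" and attach the dataset's year.
standardize_id <- function(df, id_col, year) {
  names(df)[names(df) == id_col] <- "case_id"
  df$year <- year
  # Put the two identifier columns first
  df[, c("year", "case_id", setdiff(names(df), c("year", "case_id")))]
}

# Two toy datasets whose ID columns are named differently (made-up values)
d2016 <- data.frame(V101 = c(1, 2), vote = c("A", "B"))
d2018 <- data.frame(caseid = c(1, 2), vote = c("B", "B"))

stacked <- rbind(
  standardize_id(d2016, "V101", 2016),
  standardize_id(d2018, "caseid", 2018)
)

# case_id alone can repeat across years; the (year, case_id) pair does not
id_repeats   <- any(duplicated(stacked$case_id))
pair_repeats <- any(duplicated(stacked[, c("year", "case_id")]))
```

In this toy example the case IDs collide across years, so only the combination of the two columns identifies a row once the datasets are stacked.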