R/get_dataverse.R
get_cces_dataverse.Rd
Wrapper function to get CCES/CES data from dataverse into the current R environment using the dataverse
package.
get_cces_dataverse(
name = "cumulative",
year_subset = NULL,
std_index = TRUE,
dataverse_paths = ccesMRPprep::cces_dv_ids
)
The name of the dataset as defined in data(cces_dv_ids), e.g. "cumulative" or "2018".
The year (or a vector of years) to subset to. If name is a year-specific
dataset, this argument is redundant, but if name == "cumulative", then
the output will be the cumulative dataset subsetted to that year. This is useful
when using the cumulative dataset for its harmonized variables.
Whether to standardize the unique case identifier. These
have different column names in different datasets, but setting this to TRUE
(the default) will rename them all to "case_id" and also add the year of the dataset.
This way, every dataset that gets downloaded will have the unique identifier
defined by the variables c("year", "case_id").
A dataframe where one row represents metadata for one CCES dataset. Built-in data cces_dv_ids is used as a default and should not be changed.
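The effect of year_subset on the cumulative file can be pictured with a small base R sketch. This is only an illustration on toy data: subset_years() and cumulative_toy are hypothetical names, not part of ccesMRPprep, and the package's internal implementation may differ.

```r
# Toy stand-in for the cumulative file; the real data has many more columns.
cumulative_toy <- data.frame(
  year    = c(2016, 2016, 2018, 2018, 2020),
  case_id = c(1, 2, 1, 2, 1)
)

# Hypothetical helper: keep only rows whose year is in `year_subset`.
# NULL means "no subsetting", mirroring the documented default.
subset_years <- function(df, year_subset = NULL) {
  if (is.null(year_subset)) return(df)
  df[df$year %in% year_subset, , drop = FALSE]
}

only_2018 <- subset_years(cumulative_toy, 2018)           # a single year
two_years <- subset_years(cumulative_toy, c(2016, 2018))  # a vector of years
```

Passing a vector of years keeps rows from any of those years, which is the behavior the year_subset description suggests.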
This function is a simple wrapper around the dataverse package on CRAN.
It downloads the dataset from the dataverse, and loads it into a tibble with the appropriate
file data type. Using get_cces_question does some standardization across years, for example,
the name of the case ID variable, which makes downstream work easier.
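A rough sketch of what standardizing the case identifier amounts to is shown below, using toy data in base R. The column names "V101" and "caseid" and the helper standardize_id() are assumptions for illustration only; they are not taken from the package source.

```r
# Two toy year-specific datasets whose ID columns have different names.
# The names "V101" and "caseid" are illustrative assumptions.
toy_2012 <- data.frame(V101 = c(10, 11), vote = c("a", "b"))
toy_2018 <- data.frame(caseid = c(10, 12), vote = c("b", "a"))

# Hypothetical helper: rename the ID column to "case_id", add the year,
# and put the two key columns first.
standardize_id <- function(df, id_col, year) {
  names(df)[names(df) == id_col] <- "case_id"
  df$year <- year
  df[, c("year", "case_id", setdiff(names(df), c("year", "case_id")))]
}

stacked <- rbind(
  standardize_id(toy_2012, "V101", 2012),
  standardize_id(toy_2018, "caseid", 2018)
)

# case_id 10 appears in both years, but the pair (year, case_id) is unique.
key_is_unique <- !any(duplicated(stacked[, c("year", "case_id")]))
```

Once every dataset shares the same two key columns, datasets from different years can be stacked or joined without per-year special cases.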
As of v0.3.15, dataverse accepts a cache by default if you specify a dataverse version.
get_cces_dataverse does not specify a version and simply re-downloads whatever
is the latest version of the dataset on dataverse at the time, so it does not support caching.
You may be interested in customizing your download following https://cran.r-project.org/web/packages/dataverse/vignettes/C-download.html,
or downloading the feather version of the CCES cumulative, which reads much
faster than the default .dta file in this function.
# read in cumulative common content, subsetted to 2018, into environment
if (FALSE) { # \dontrun{
  ccc <- get_cces_dataverse("cumulative", year_subset = 2018)
} # }
# Example code to read _and_ write a series of common content datasets
# in the directory "data/input/cces/"
if (FALSE) { # \dontrun{
  library(fs)    # dir_create(), file_exists()
  library(glue)  # glue()
  library(readr) # write_rds()

  dir_create("data/input/cces")
  for (d in c("cumulative", "2018")) {
    path <- glue("data/input/cces/cces_{d}.rds")
    # skip datasets that are already saved locally
    if (file_exists(path))
      next
    write_rds(get_cces_dataverse(d), path) # takes a few minutes
  }
} # }