Collapse CCES data to be analyzed in a binomial model

Currently only compatible with questions of type "yesno".

build_counts(
  formula,
  data,
  keep_vars = NULL,
  name_ones_as = "yes",
  name_trls_as = "n_response",
  multiple_qIDs = FALSE,
  verbose = TRUE
)

Arguments

formula: The model formula used to fit the multilevel regression model. It should be of the form y ~ x1 + x2 + (1|x3), where y is a binary variable. Only categorical variables should be used in the random effects terms.
data: A cleaned CCES dataset, e.g. the output of ccc_std_demographics after it has been combined with outcome and contextual data in cces_join_slim. By default, the LHS outcome is expected to be named response, and the dataset is expected to contain that variable. This variable must be binary, or a character vector that yesno_to_binary can coerce into a binary variable.
keep_vars: Variables to keep as cell variables, regardless of whether they are specified in the formula. Supply as a character vector.
name_ones_as: What to name the variable that represents the number of successes in the binomial.
name_trls_as: What to name the variable that represents the number of trials in the binomial.
multiple_qIDs: Does the data contain multiple outcomes in long form and therefore require the counts to be built for each outcome? Defaults to FALSE.
verbose: Show warning messages? Defaults to TRUE.
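
As a rough illustration of the collapse these arguments describe, the base R sketch below counts successes and non-missing trials per cell. It is not the package's implementation: the toy data frame `toy` and the helper `collapse_cells()` are hypothetical names introduced only for this example, and the outcome is assumed to be coded 0/1 already.

```r
# Toy individual-level data (hypothetical); y is coded 0/1, NA means missing
toy <- data.frame(
  y    = c(1, 0, 1, NA, 1, 0),
  educ = c("HS", "HS", "College", "College", "HS", "College"),
  cd   = c("AL-01", "AL-01", "AL-01", "AL-01", "AL-02", "AL-02"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of ccesMRPrun; assumes a two-sided formula
collapse_cells <- function(formula, data,
                           name_ones_as = "yes",
                           name_trls_as = "n_response") {
  vars    <- all.vars(formula)   # "y", "educ", "cd" for y ~ educ + (1 | cd)
  outcome <- vars[1]             # LHS variable
  cells   <- vars[-1]            # every RHS variable defines the cells

  # Drop missing responses so they count toward neither successes nor trials
  obs <- data[!is.na(data[[outcome]]), , drop = FALSE]

  # Successes = sum of the 0/1 outcome; trials = number of rows in the cell
  ones <- aggregate(obs[[outcome]], by = obs[cells], FUN = sum)
  trls <- aggregate(obs[[outcome]], by = obs[cells], FUN = length)
  names(ones)[ncol(ones)] <- name_ones_as
  names(trls)[ncol(trls)] <- name_trls_as

  merge(trls, ones, by = cells)
}

toy_cells <- collapse_cells(y ~ educ + (1 | cd), toy)
toy_cells
```

Each row of `toy_cells` is one combination of the formula's right-hand-side variables, with the missing response excluded from both counts.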

Value

A data frame of cells. The following variables have fixed names that ccesMRPrun::fit_brms_binomial assumes:

yes: the number of successes observed in the cell
n_response: the number of non-missing responses, representing the number of trials.
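
To show why these two count columns are enough for a binomial model, the base R sketch below fits an ordinary logistic regression with stats::glm to a small cell table. It only illustrates the successes-out-of-trials layout; it is not how ccesMRPrun::fit_brms_binomial fits its multilevel model, and the `cells` data frame is made up for the example.

```r
# Hypothetical cell-level data in the layout described above
cells <- data.frame(
  educ       = c("HS or Less", "HS or Less", "Some College", "Some College"),
  gender     = c("Male", "Female", "Male", "Female"),
  n_response = c(10L, 12L, 8L, 15L),  # trials per cell
  yes        = c(4, 7, 5, 9)          # successes per cell
)

# Binomial GLM on aggregated data: the response is cbind(successes, failures)
fit <- glm(cbind(yes, n_response - yes) ~ educ + gender,
           family = binomial(), data = cells)

# Fitted probability of a "yes" in each cell
cells$p_hat <- fitted(fit)
cells
```

Fitting on cells rather than on individual rows gives the same likelihood for this kind of model while keeping the dataset much smaller.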

Examples


library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union

# Simulate a random outcome y, then standardize the demographic variables
ccc_samp_std <- ccc_samp %>%
  mutate(y = sample(c("For", "Against"), size = n(), replace = TRUE)) %>%
  ccc_std_demographics()
#> age variable modified to bins. Original age variable is now in age_orig. 

# Collapse to one row per age x gender x educ x cd cell
ccc_samp_out <- build_counts(y ~ age + gender + educ + (1|cd),
                             ccc_samp_std)

ccc_samp_out
#> # A tibble: 945 × 6
#>    age            gender educ         cd    n_response   yes
#>    <fct>          <fct>  <fct>        <chr>      <int> <dbl>
#>  1 18 to 24 years Male   HS or Less   CA-39          1     1
#>  2 18 to 24 years Male   HS or Less   FL-25          1     0
#>  3 18 to 24 years Male   HS or Less   GA-10          1     1
#>  4 18 to 24 years Male   HS or Less   IL-07          1     0
#>  5 18 to 24 years Male   HS or Less   LA-01          1     0
#>  6 18 to 24 years Male   HS or Less   MA-05          1     1
#>  7 18 to 24 years Male   HS or Less   NC-09          1     1
#>  8 18 to 24 years Male   HS or Less   NC-11          1     1
#>  9 18 to 24 years Male   HS or Less   PA-11          1     1
#> 10 18 to 24 years Male   Some College IA-02          1     1
#> # ℹ 935 more rows

# Alternative options: rename the success and trial count columns
build_counts(y ~ educ + (1|cd), ccc_samp_std,
             name_ones_as = "success", name_trls_as = "trials")
#> # A tibble: 716 × 4
#>    educ       cd    trials success
#>    <fct>      <chr>  <int>   <dbl>
#>  1 HS or Less AK-01      1       0
#>  2 HS or Less AL-02      3       2
#>  3 HS or Less AL-03      2       0
#>  4 HS or Less AL-05      1       0
#>  5 HS or Less AR-02      2       1
#>  6 HS or Less AR-03      2       2
#>  7 HS or Less AZ-04      1       1
#>  8 HS or Less AZ-06      1       0
#>  9 HS or Less AZ-08      1       0
#> 10 HS or Less CA-06      2       1
#> # ℹ 706 more rows

# Keep an extra variable (state) in the cells even though it is not in the formula
build_counts(y ~ educ + (1|cd), ccc_samp_std,
             keep_vars = "state")
#> # A tibble: 716 × 5
#>    educ       cd    state      n_response   yes
#>    <fct>      <chr> <chr>           <int> <dbl>
#>  1 HS or Less AK-01 Alaska              1     0
#>  2 HS or Less AL-02 Alabama             3     2
#>  3 HS or Less AL-03 Alabama             2     0
#>  4 HS or Less AL-05 Alabama             1     0
#>  5 HS or Less AR-02 Arkansas            2     1
#>  6 HS or Less AR-03 Arkansas            2     2
#>  7 HS or Less AZ-04 Arizona             1     1
#>  8 HS or Less AZ-06 Arizona             1     0
#>  9 HS or Less AZ-08 Arizona             1     0
#> 10 HS or Less CA-06 California          2     1
#> # ℹ 706 more rows