Currently only compatible with questions of type "yesno".

build_counts(
  formula,
  data,
  keep_vars = NULL,
  name_ones_as = "yes",
  name_trls_as = "n_response",
  multiple_qIDs = FALSE,
  verbose = TRUE
)

Arguments

formula

The model formula used to fit the multilevel regression model. It should be of the form y ~ x1 + x2 + (1|x3), where y is a binary variable and only categorical variables are used in the random-effects terms.

data

A cleaned CCES dataset, e.g. from ccc_std_demographics, which is then combined with outcome and contextual data in cces_join_slim. By default, the LHS outcome is expected to be named response and to be present in the dataset. This variable must be binary, or a character vector that yesno_to_binary can coerce into a binary variable.

keep_vars

Variables that will be kept as cell variables, regardless of whether they are specified in the formula. Input as a character vector.

name_ones_as

What to name the variable that represents the number of successes in the binomial.

name_trls_as

What to name the variable that represents the number of trials in the binomial.

multiple_qIDs

Does the data contain multiple outcomes in long form and therefore require the counts to be built for each outcome? Defaults to FALSE.

verbose

Show warning messages? Defaults to TRUE.
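For the multiple_qIDs argument above, the idea of building counts separately for each outcome can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, the column name qID, and the helper column trial are assumptions made only for illustration.

```r
# Toy long-form data: two questions (qID is an assumed column name),
# each answered by the same three respondents.
long <- data.frame(
  qID      = c("q1", "q1", "q1", "q2", "q2", "q2"),
  cd       = c("AL-02", "AL-02", "CA-06", "AL-02", "AL-02", "CA-06"),
  response = c(1, 0, 1, 0, 0, 1)
)

# Each row is one trial; summing `trial` within a cell counts responses.
long$trial <- 1

# Adding qID to the grouping variables yields one set of cells per question.
cells_by_q <- aggregate(cbind(response, trial) ~ qID + cd, data = long, FUN = sum)
names(cells_by_q)[names(cells_by_q) == "response"] <- "yes"
names(cells_by_q)[names(cells_by_q) == "trial"] <- "n_response"

print(cells_by_q)
```

Without the question identifier in the grouping, responses to different questions would be pooled into the same cell, which is presumably what the argument guards against.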

Value

A data frame of cells. With the default arguments, the following variables have fixed names that ccesMRPrun::fit_brms_binomial assumes:

  • yes: the number of successes observed in the cell

  • n_response: the number of non-missing responses, representing the number of trials.
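To make the two count columns concrete, here is a minimal base R sketch of how a formula can be mapped to cells. It is an illustration under stated assumptions, not build_counts itself: the toy data frame and the intermediate names (vars, cell_vars, answered) are hypothetical, and only base R functions (all.vars, aggregate) are used.

```r
# Toy individual-level data; the last respondent did not answer.
toy <- data.frame(
  educ     = c("HS or Less", "HS or Less", "HS or Less", "Some College", "Some College"),
  cd       = c("AL-02", "AL-02", "AL-02", "CA-06", "CA-06"),
  response = c(1, 1, 0, 1, NA)
)

# all.vars() lists every variable named in the formula, including the
# grouping variable inside the random-effects term.
f <- response ~ educ + (1 | cd)
vars      <- all.vars(f)
outcome   <- vars[1]    # "response"
cell_vars <- vars[-1]   # "educ", "cd"

# Only non-missing responses count as trials.
answered <- toy[!is.na(toy[[outcome]]), ]

# Successes: sum of the binary outcome within each cell.
yes_df <- aggregate(answered[outcome], by = answered[cell_vars], FUN = sum)
# Trials: number of non-missing responses within each cell.
n_df <- aggregate(answered[outcome], by = answered[cell_vars], FUN = length)

cells <- yes_df
names(cells)[names(cells) == outcome] <- "yes"
cells$n_response <- n_df[[outcome]]

print(cells)
```

Each row of cells is one unique combination of the formula's right-hand-side variables, which is the shape a binomial model with successes and trials expects.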

Examples

library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#>
#>     intersect, setdiff, setequal, union
ccc_samp_std <- ccc_samp %>%
  mutate(y = sample(c("For", "Against"), size = n(), replace = TRUE)) %>%
  ccc_std_demographics()
#> age variable modified to bins. Original age variable is now in age_orig.
ccc_samp_out <- build_counts(y ~ age + gender + educ + (1|cd), ccc_samp_std)
ccc_samp_out
#> # A tibble: 945 x 6
#>    age            gender educ         cd    n_response   yes
#>    <fct>          <fct>  <fct>        <chr>      <int> <dbl>
#>  1 18 to 24 years Male   HS or Less   CA-39          1     1
#>  2 18 to 24 years Male   HS or Less   FL-25          1     0
#>  3 18 to 24 years Male   HS or Less   GA-10          1     1
#>  4 18 to 24 years Male   HS or Less   IL-07          1     0
#>  5 18 to 24 years Male   HS or Less   LA-01          1     0
#>  6 18 to 24 years Male   HS or Less   MA-05          1     1
#>  7 18 to 24 years Male   HS or Less   NC-09          1     1
#>  8 18 to 24 years Male   HS or Less   NC-11          1     1
#>  9 18 to 24 years Male   HS or Less   PA-11          1     1
#> 10 18 to 24 years Male   Some College IA-02          1     1
#> # … with 935 more rows
# alternative options
build_counts(y ~ educ + (1|cd), ccc_samp_std, name_ones_as = "success", name_trls_as = "trials")
#> # A tibble: 716 x 4
#>    educ       cd    trials success
#>    <fct>      <chr>  <int>   <dbl>
#>  1 HS or Less AK-01      1       0
#>  2 HS or Less AL-02      3       2
#>  3 HS or Less AL-03      2       0
#>  4 HS or Less AL-05      1       0
#>  5 HS or Less AR-02      2       1
#>  6 HS or Less AR-03      2       2
#>  7 HS or Less AZ-04      1       1
#>  8 HS or Less AZ-06      1       0
#>  9 HS or Less AZ-08      1       0
#> 10 HS or Less CA-06      2       1
#> # … with 706 more rows
build_counts(y ~ educ + (1|cd), ccc_samp_std, keep_vars = "state")
#> # A tibble: 716 x 5
#>    educ       cd    state      n_response   yes
#>    <fct>      <chr> <chr>           <int> <dbl>
#>  1 HS or Less AK-01 Alaska              1     0
#>  2 HS or Less AL-02 Alabama             3     2
#>  3 HS or Less AL-03 Alabama             2     0
#>  4 HS or Less AL-05 Alabama             1     0
#>  5 HS or Less AR-02 Arkansas            2     1
#>  6 HS or Less AR-03 Arkansas            2     2
#>  7 HS or Less AZ-04 Arizona             1     1
#>  8 HS or Less AZ-06 Arizona             1     0
#>  9 HS or Less AZ-08 Arizona             1     0
#> 10 HS or Less CA-06 California          2     1
#> # … with 706 more rows