Estimates a joint distribution by assuming independence and multiplying proportions.

synth_prod(formula, poptable, newtable, area_var, count_var = "count")

Arguments

formula

A representation of the aggregate imputation or "outcome" model, of the form X_{K} ~ X_{1} + ... + X_{K - 1}

poptable

The population table, collapsed in terms of counts. Must contain all variables in the RHS of formula, as well as the variables specified in area_var and count_var below.

newtable

A dataset that contains marginal counts or proportions of the outcome X_{K} (the LHS of formula) by area. It is collapsed internally to get simple proportions.

area_var

A character vector naming the variable(s) that identify the area of interest, e.g. "cd".

count_var

A character string naming the variable in poptable that holds the count. Defaults to "count".

Details

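Concretely, we already know p(X_{1}, ..., X_{K - 1}, A) from poptable and a marginal p(X_{K}, A) from newtable, the additional distribution to weight to. Assuming X_{K} is independent of X_{1}, ..., X_{K - 1} within each area A, the joint is estimated as p(X_{1}, ..., X_{K - 1}, X_{K}, A) = p(X_{1}, ..., X_{K - 1}, A) x p(X_{K} | A), where p(X_{K} | A) is the within-area proportion computed from the marginal. As a result, the estimated counts for each population cell sum to that cell's original count.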
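
The independence product can be sketched in a few lines of base R. This is only an illustration under assumed toy data, not the package's actual implementation: the tables, their values, and the column names prop and count_est are made up for this sketch.

```r
# Toy population table: counts of (age x female) cells within each area.
# All names and numbers here are hypothetical, for illustration only.
poptable <- data.frame(
  cd     = c("A", "A", "B", "B"),
  age    = c("young", "old", "young", "old"),
  female = c(1, 0, 1, 0),
  count  = c(60, 40, 30, 70)
)

# Toy marginal table: counts of the outcome (race) within each area.
newtable <- data.frame(
  cd    = c("A", "A", "B", "B"),
  race  = c("White", "Black", "White", "Black"),
  count = c(80, 20, 50, 50)
)

# Collapse the marginal to within-area proportions, p(race | cd).
newtable$prop <- newtable$count / ave(newtable$count, newtable$cd, FUN = sum)

# Cross every population cell with every outcome level in the same area,
# then multiply: count_est = N(age, female, cd) * p(race | cd).
joint <- merge(poptable, newtable[, c("cd", "race", "prop")], by = "cd")
joint$count_est <- joint$count * joint$prop

joint[, c("cd", "age", "female", "race", "count_est")]
```

Because the proportions sum to one within each area, the estimated counts in this sketch add back up to each area's population total. The estimate is only as good as the independence assumption, which is why it can diverge from a known joint.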

See also

synth_mlogit() for a more nuanced model that uses survey data as the basis of the joint estimation.

Examples


library(dplyr)
library(ccesMRPprep)

# Suppose we know the distribution of (age x female) and we know the
# distribution of (race), by CD, but we don't know the joint of the two.

race_target <- count(acs_race_NY, cd, race, wt = count, name = "count")

pop_prod <- synth_prod(race ~ age + female,
                       poptable = acs_race_NY,
                       newtable = race_target,
                       area_var = "cd")

# In this example, we know the true joint. Does it match?
pop_val <- left_join(pop_prod,
                     count(acs_race_NY,  cd, age, female, race, wt = count, name = "count"),
                     by = c("cd", "age", "female", "race"),
                     suffix = c("_est", "_truth"))

# AOC's district in the Bronx
pop_val %>%
  filter(cd == "NY-14", age == "35 to 44 years", female == 0) %>%
  select(cd, race, count_est, count_truth)
#> # A tibble: 6 × 4
#>   cd    race            count_est count_truth
#>   <chr> <fct>               <dbl>       <dbl>
#> 1 NY-14 White              15269.       11312
#> 2 NY-14 Black               7444.        6578
#> 3 NY-14 Hispanic           30824.       34668
#> 4 NY-14 Asian              11730.       11373
#> 5 NY-14 Native American        0            0
#> 6 NY-14 All Other          16904.       18239