Estimates the joint distribution by assuming independence and multiplying proportions.

synth_prod(formula, poptable, newtable, area_var, count_var = "count")

Argument | Description |
---|---|
formula | A representation of the aggregate imputation or "outcome" model, of the form |
poptable | The population table, collapsed in terms of counts. Must contain all variables in the RHS of |
newtable | A dataset that contains marginal counts or proportions. Will be collapsed internally to get simple proportions. |
area_var | A character vector of the area of interest. |
count_var | A character variable that specifies which variable in |

That is, we already know `p(X_{1}, ..., X_{K - 1}, A)` from `poptable` and a marginal `p(X_{K}, A)` from the additional distribution to weight to. Then `p(X_{1}, ..., X_{K - 1}, X_{K}, A) = p(X_{1}, ..., X_{K - 1}, A) x p(X_{K}, A)`.
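The product rule can be sketched in base R on toy data. This is only an illustration of the independence assumption, not the internals of `synth_prod()`: the tables, the column names (`area`, `age`, `race`, `count`), and the choice to take the outcome proportions within each area are all assumptions made for this example.

```r
# Toy population table: counts by area and one known covariate (hypothetical data).
poptable <- data.frame(
  area  = c("A", "A", "B", "B"),
  age   = c("young", "old", "young", "old"),
  count = c(60, 40, 30, 70)
)

# Toy marginal table: counts of the outcome variable by area (hypothetical data).
newtable <- data.frame(
  area  = c("A", "A", "B", "B"),
  race  = c("x", "y", "x", "y"),
  count = c(25, 75, 50, 50)
)

# Collapse the marginal counts to within-area proportions (assumed reading
# of "simple proportions"): p(race | area).
newtable$prop <- newtable$count / ave(newtable$count, newtable$area, FUN = sum)

# Cross every population cell with every outcome level in the same area,
# then multiply: count_est = N(age, area) * p(race | area).
joint <- merge(poptable, newtable[, c("area", "race", "prop")], by = "area")
joint$count_est <- joint$count * joint$prop
joint <- joint[order(joint$area, joint$age, joint$race),
               c("area", "age", "race", "count_est")]
print(joint)
```

Under this reading, the estimates within each population cell sum back to that cell's count, so the known margins from `poptable` are preserved. Any real dependence between the covariates and the outcome within an area is ignored.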

See `synth_mlogit()` for a more nuanced model that uses survey data as the basis of the joint estimation.

```r
library(dplyr)

# Suppose we know the distribution of (age x female) and we know the
# distribution of (race), by CD, but we don't know the joint of the two.
race_target <- count(acs_race_NY, cd, race, wt = count, name = "count")

pop_prod <- synth_prod(race ~ age + female,
                       poptable = acs_race_NY,
                       newtable = race_target,
                       area_var = "cd")

# In this example, we know the true joint. Does it match?
pop_val <- left_join(pop_prod,
                     count(acs_race_NY, cd, age, female, race, wt = count, name = "count"),
                     by = c("cd", "age", "female", "race"),
                     suffix = c("_est", "_truth"))

# AOC's district in the Bronx
pop_val %>%
  filter(cd == "NY-14", age == "35 to 44 years", female == 0) %>%
  select(cd, race, count_est, count_truth)
#> # A tibble: 6 x 4
#>   cd    race            count_est count_truth
#>   <chr> <fct>               <dbl>       <dbl>
#> 1 NY-14 White              15269.       11312
#> 2 NY-14 Black               7444.        6578
#> 3 NY-14 Hispanic           30824.       34668
#> 4 NY-14 Asian              11730.       11373
#> 5 NY-14 Native American        0            0
#> 6 NY-14 All Other          16904.       18239
```