A kernel balancing approach for reducing specification assumptions in survey weighting


Response rates to surveys have declined precipitously. Some researchers have responded by relying more heavily on convenience-based internet samples. This leaves researchers asking not if, but how, to weight survey results to represent their target population. Though practitioners often call upon expert knowledge in constructing their auxiliary vector, X, to use in weighting methods, they face difficult, feasibility-constrained choices regarding which variables to choose, how to coarsen them, and what interactions or other functions of those variables to include in X. Most approaches seek weights on the sampled units that make X have the same mean in the sample as in the population. However, such weights ensure that an outcome variable of interest Y is correctly reweighted only if the expectation of Y is linear in X, an unrealistic assumption. We describe kernel balancing for population reweighting (KPop) to make samples more similar to populations on the distribution of X, beyond the first moment margin. This approach effectively replaces the design matrix X with a kernel matrix, K, that encodes high-order information about X via the “kernel trick”. We then reweight the sampled units so that their average row of K is approximately equal to that of the population, working through a spectral decomposition. This produces good calibration on a wide range of smooth functions of X, without relying on the user to select those functions. We describe the method and illustrate its use in reweighting political survey samples, including from the 2016 American presidential election.
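
To make the reweighting step concrete, the following is a minimal sketch of the idea, not the authors' implementation. It assumes a Gaussian kernel, a bandwidth `b` defaulting to twice the number of covariates, and a fixed number `r` of retained components; none of these choices is specified above. The function names `gaussian_kernel` and `kernel_balance_weights` are hypothetical. For simplicity, the weights are found with a minimum-norm linear solve rather than a constrained optimization, so they are not guaranteed to be non-negative.

```python
import numpy as np

def gaussian_kernel(A, B, b):
    """Gaussian kernel between rows of A and rows of B (bandwidth b is an assumption)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)  # squared distances
    return np.exp(-d2 / b)

def kernel_balance_weights(X_samp, X_pop, b=None, r=5):
    """Sketch: weights on sampled units that match the population mean of the
    top-r left singular vectors of the kernel matrix K.

    Returns (w, Z_s, target) so that the balance condition can be checked.
    """
    n_s = X_samp.shape[0]
    if b is None:
        b = 2.0 * X_samp.shape[1]  # illustrative default, not from the source

    # K has one row per unit (sample, then population) and one column per sampled unit.
    X_all = np.vstack([X_samp, X_pop])
    K = gaussian_kernel(X_all, X_samp, b)

    # Spectral step: keep the top-r left singular vectors of K.
    U, _, _ = np.linalg.svd(K, full_matrices=False)
    Z = U[:, :r]
    Z_s, Z_p = Z[:n_s], Z[n_s:]
    target = Z_p.mean(axis=0)  # population average of each component

    # Constraints: weights sum to 1, and the weighted sample mean of Z equals target.
    A = np.vstack([np.ones(n_s), Z_s.T])
    c = np.concatenate([[1.0], target])

    # Start from uniform weights and apply the smallest correction (in the
    # Euclidean norm) that satisfies the constraints. This simplification
    # can yield negative weights.
    w0 = np.full(n_s, 1.0 / n_s)
    delta, *_ = np.linalg.lstsq(A, c - A @ w0, rcond=None)
    return w0 + delta, Z_s, target
```

In use, one would pass the sample and population covariate matrices (ideally standardized) and then compute weighted outcome means with the returned weights. A more careful version would constrain the weights to be non-negative and choose `r` by examining the remaining imbalance in K.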

Erin Hartman
Assistant Professor of Political Science