When RCTs are infeasible, researchers must rely on statistical methods to estimate causal effects. However, such methods often require untenable and untestable assumptions about how units come to receive treatment. Falsification tests are statistical tests that researchers conduct to marshal evidence that their design is valid and their conclusions are sound. These tests are conducted on observable implications of the assumptions necessary to draw causal inferences.
Current practice in falsification testing does not allow researchers to provide statistical evidence that their assumptions are warranted. By treating a failure to detect a problem with the design assumptions as though it were evidence of a valid design, researchers conflate statistical insignificance with substantive homogeneity. My work "An equivalence approach to balance and placebo tests" (with F. Daniel Hidalgo) argues that falsification tests should be structured so that the responsibility lies with researchers to positively demonstrate that the data is consistent with their identification assumptions or theory.
This means that researchers should begin with the initial hypothesis that the data is inconsistent with a valid research design, and reject this hypothesis only if they provide sufficient statistical evidence that the data is consistent with a valid design. The conceptual distinction between beginning with a null hypothesis of no difference, as is standard in current practice, and beginning with a null hypothesis of a substantive difference, as we advocate, may seem small, but the practical implications are substantial. Statistical tests of equivalence are a natural way to conduct a falsification test, and they allow researchers to provide statistical evidence that their research design is sound and their conclusions are warranted. The importance of this shift cannot be overstated: these methods stand to improve the quality and reproducibility of observational findings across the social sciences.
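To make the reversed null hypothesis concrete, here is a minimal sketch of an equivalence test for covariate balance, using the generic two one-sided tests (TOST) procedure with a large-sample normal approximation. It is an illustration under stated assumptions, not necessarily the procedure used in the papers: the function name `tost_balance`, the toy data, and the equivalence margin of 0.2 are all hypothetical, and in practice the margin must be justified on substantive grounds.

```python
import math
from statistics import NormalDist, mean, variance


def tost_balance(treated, control, margin):
    """Two one-sided z-tests (TOST) for a difference in covariate means.

    Null hypothesis: |mean(treated) - mean(control)| >= margin,
    i.e. the groups differ by a substantively meaningful amount.
    Returns the TOST p-value (the larger of the two one-sided p-values).
    Uses a normal approximation, so it assumes reasonably large samples.
    """
    diff = mean(treated) - mean(control)
    se = math.sqrt(variance(treated) / len(treated)
                   + variance(control) / len(control))
    z = NormalDist()
    # H0a: diff <= -margin. Small p-value when diff sits well above -margin.
    p_lower = 1.0 - z.cdf((diff + margin) / se)
    # H0b: diff >= +margin. Small p-value when diff sits well below +margin.
    p_upper = z.cdf((diff - margin) / se)
    # Equivalence is claimed only if BOTH one-sided nulls are rejected.
    return max(p_lower, p_upper)


# Hypothetical pre-treatment covariate, identical in both groups.
large_treated = [0.0, 1.0] * 100
large_control = [0.0, 1.0] * 100
small_treated = [0.0, 1.0] * 2
small_control = [0.0, 1.0] * 2

margin = 0.2  # assumed equivalence margin, chosen only for illustration
print("large sample p-value:", tost_balance(large_treated, large_control, margin))
print("small sample p-value:", tost_balance(small_treated, small_control, margin))
```

In both toy samples the observed difference in means is exactly zero, so a conventional test of no difference would "pass" either way. The equivalence test rejects the null of a meaningful imbalance only in the large sample; in the small sample there is too little information to demonstrate balance, which is the behavior the argument above calls for.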
I have extended this work to the regression discontinuity setting in "Equivalence Testing for Regression Discontinuity Designs." See the full list of relevant publications below!