"Causality" Table of Contents
Thomas Coleman
Harris School of Public Policy, University of Chicago
tscoleman@uchicago.edu
Abstract
The story of John Snow's 1855 treatise On the Mode of Communication of Cholera is a rollicking good
tale - full of heroism, death, and
statistics. But more fundamentally Snow's work is a sustained effort to convince skeptics, through
argument and a wide variety of evidence, of the waterborne theory of cholera articulated in the 1849
essay of the same name. Snow's data and analysis provide a template for how to convincingly demonstrate a
causal effect, a template as applicable today as in 1855. I consider two strands of Snow's evidence -
the Broad Street outbreak and the south London "Grand Experiment" - as pedagogical examples of using
non-experimental data to support a causal effect. In doing so I discuss extensions to Snow's analysis using
modern techniques and tools, most importantly difference-in-differences regression and count (Poisson)
regression for error analysis in quasi-randomized control experiments. These extensions provide clear and
compelling examples of the modern methods, while confirming and strengthening Snow's original
conclusion on the causal effect of water supply on cholera mortality.
- Introduction
- Nineteenth-Century London and Cholera
- Overview of The 1854 Broad Street Outbreak and the South London "Grand Experiment"
- 3.1 1854 Broad Street Outbreak
- 3.2 South London "Grand Experiment"
- Causation versus Correlation
- Broad Street August 31 - September 18
- 5.1 Confronting the Waterborne Theory with Evidence
- 5.2 Weight of Evidence
- South London "Grand Experiment" - Snow (1855)
- 6.1 Comparison of Aggregate Groups Using Difference-in-Differences Regression
- 6.2 Quasi-Randomized Comparison: 1854 Cholera Mortality Within Joint-Supply Sub-Districts
- 6.3 Summary For South London "Grand Experiment" - Snow (1855)
- Extending South London "Grand Experiment" With Detailed Population Data - Snow (1856)
- 7.1 Review of Snow (1856)
- 7.2 Extending Difference-in-Differences and Quasi-Randomized Trials Using Detailed Population Data
- 7.2.1 1849 versus 1854 Difference-in-Differences
- 7.2.2 Seven weeks, by Sub-districts: Quasi-Randomized Trial
- 7.2.3 1854, Full Outbreak, by Registration District: Quasi-Randomized Trial
- 7.3 Error Analysis for Randomized Control Trial Incorporating Overdispersion
- Causal Assessment Procedure - (based on Katz and Singer (2007) - preliminary & incomplete)
- Conclusion
- Appendix
- 10.1 Statistical Framework for Count Data in Difference-in-Differences Analysis - Poisson and Negative Binomial Regression
- 10.2 Differences Between Snow's Tables VII & VIII (Deaths by Sub-District & Supplier for 4 weeks ending 6th August versus 7 weeks ending 26th August)
- 10.3 Error Analysis for Randomized Control Trial Incorporating Overdispersion - NEEDS TO BE REVISED
- 10.4 Explanation / List of Tables for Snow (1855)
- 10.5 Detailed Tables and Figures for South London "Grand Experiment"