The design versus the analysis of observational studies for causal effects: parallels with the design of randomized trials

Donald B Rubin

doi:10.1002/sim.2739

Stat Med. 2007 Jan 15;26(1):20-36. doi: 10.1002/sim.2739.

Author

Donald B Rubin¹

Affiliation

¹ Department of Statistics, Harvard University, 1 Oxford Street, 7th Floor, Cambridge, MA 02138, USA. rubin@stat.harvard.edu

PMID: 17072897
DOI: 10.1002/sim.2739

Abstract

For estimating causal effects of treatments, randomized experiments are generally considered the gold standard. Nevertheless, they are often infeasible to conduct for a variety of reasons, such as ethical concerns, excessive expense, or timeliness. Consequently, much of our knowledge of causal effects must come from non-randomized observational studies. This article will advocate the position that observational studies can and should be designed to approximate randomized experiments as closely as possible. In particular, observational studies should be designed using only background information to create subgroups of similar treated and control units, where 'similar' here refers to their distributions of background variables. Of great importance, this activity should be conducted without any access to any outcome data, thereby assuring the objectivity of the design. In many situations, this objective creation of subgroups of similar treated and control units, which are balanced with respect to covariates, can be accomplished using propensity score methods. The theoretical perspective underlying this position will be presented followed by a particular application in the context of the US tobacco litigation. This application uses propensity score methods to create subgroups of treated units (male current smokers) and control units (male never smokers) who are at least as similar with respect to their distributions of observed background characteristics as if they had been randomized. The collection of these subgroups then 'approximates' a randomized block experiment with respect to the observed covariates.
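
The design step described in the abstract can be illustrated with a minimal sketch, which is not the article's actual analysis. It assumes simulated data with two hypothetical standardized covariates, a hand-rolled logistic regression as the propensity score model, and five equal-sized subclasses. These choices and all function names (`fit_logistic`, `subclassify`, `mean_diff`) are illustrative assumptions. No outcome variable appears anywhere in the script, mirroring the requirement that the design be completed without access to outcome data.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, steps=300, lr=0.5):
    """Fit P(t = 1 | x) by gradient ascent on the mean log-likelihood.
    Returns weights with the intercept first."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = ti - sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def subclassify(scores, n_blocks=5):
    """Assign units to equal-sized blocks by rank of estimated propensity score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    block = [0] * len(scores)
    for rank, i in enumerate(order):
        block[i] = rank * n_blocks // len(scores)
    return block

def mean_diff(X, t, j, idx):
    """Treated-minus-control mean of covariate j over the units in idx,
    or None when one group is empty (no overlap in that block)."""
    tr = [X[i][j] for i in idx if t[i] == 1]
    co = [X[i][j] for i in idx if t[i] == 0]
    if not tr or not co:
        return None
    return sum(tr) / len(tr) - sum(co) / len(co)

# Simulated background data only (an assumption for illustration);
# treatment assignment depends on both covariates, so the raw groups differ.
random.seed(0)
n = 600
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
t = [1 if random.random() < sigmoid(1.0 * x1 + 0.5 * x2) else 0 for x1, x2 in X]

# Estimate propensity scores and form five subclasses from them.
w = fit_logistic(X, t)
scores = [sigmoid(w[0] + sum(a * b for a, b in zip(w[1:], xi))) for xi in X]
block = subclassify(scores)

# Balance check on covariate 0: raw absolute mean difference versus the
# size-weighted mean of absolute within-block differences.
raw = abs(mean_diff(X, t, 0, range(n)))
within, used = 0.0, 0
for b in range(5):
    idx = [i for i in range(n) if block[i] == b]
    d = mean_diff(X, t, 0, idx)
    if d is not None:
        within += len(idx) * abs(d)
        used += len(idx)
within /= used
print(f"raw imbalance {raw:.3f}, within-block imbalance {within:.3f}")
```

In a real design one would inspect balance on every background covariate, and revise the propensity model or the subclasses until balance is acceptable, before any outcome data are examined.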

Publication types

Historical Article

MeSH terms

Biometry / history
Causality*
Data Interpretation, Statistical
History, 20th Century
History, 21st Century
Humans
Jurisprudence
Male
Models, Statistical*
Randomized Controlled Trials as Topic / history
Randomized Controlled Trials as Topic / statistics & numerical data*
Smoking / adverse effects
United States