Journal of Causal Inference

Ed. by Imai, Kosuke / Pearl, Judea / Petersen, Maya Liv / Sekhon, Jasjeet / van der Laan, Mark J.

Propensity Score Weighting for Causal Inference with Clustered Data

Shu Yang (ORCID iD: http://orcid.org/0000-0001-7703-707X)
Published Online: 2018-08-24 | DOI: https://doi.org/10.1515/jci-2017-0027


Propensity score weighting is a tool for causal inference that adjusts for measured confounders in observational studies. In practice, data often have complex structures, such as clustering, which make propensity score modeling and estimation challenging. In addition, for clustered data, there may be unmeasured cluster-level covariates that are related to both the treatment assignment and the outcome. When such unmeasured cluster-specific confounders exist and are omitted from the propensity score model, the subsequent propensity score adjustment may be biased. In this article, we propose a calibration technique for propensity score estimation under the latent ignorable treatment assignment mechanism, i.e., the treatment-outcome relationship is unconfounded given the observed covariates and the latent cluster-specific confounders. We impose novel balance constraints which imply exact balance of the observed confounders and the unobserved cluster-level confounders between the treatment groups. We show that the proposed calibrated propensity score weighting estimator is doubly robust in that it is consistent for the average treatment effect if either the propensity score model is correctly specified or the outcome follows a linear mixed effects model. Moreover, the proposed weighting method can be combined with sampling weights, giving an integrated way to handle both confounding and sampling design in causal inference with clustered survey data. In simulation studies, the proposed estimator outperforms competing estimators. We apply the method to estimate the effect of School Body Mass Index Screening on the prevalence of overweight and obesity in elementary schools in Pennsylvania.

Keywords: Calibration; Inverse probability weighting; Survey sampling; Unmeasured confounding
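
To make the balance constraints described in the abstract concrete, the following is a sketch of one way they could be written. The notation is an assumption introduced here for illustration and does not appear in the abstract: clusters are indexed by i = 1, ..., m, units within cluster i by j = 1, ..., n_i, N is the total number of units, A_ij is the binary treatment indicator, X_ij the observed covariates, Y_ij the outcome, and w_ij the calibrated weight.

```latex
% Assumed notation; a sketch, not necessarily the article's exact formulation.
\begin{align*}
% (1) Balance of the observed covariates between the treatment groups:
\sum_{i=1}^{m}\sum_{j=1}^{n_i} A_{ij}\, w_{ij}\, X_{ij}
  &= \sum_{i=1}^{m}\sum_{j=1}^{n_i} X_{ij}
   = \sum_{i=1}^{m}\sum_{j=1}^{n_i} (1 - A_{ij})\, w_{ij}\, X_{ij}, \\
% (2) Balance within each cluster, which targets cluster-level confounders:
\sum_{j=1}^{n_i} A_{ij}\, w_{ij}
  &= n_i
   = \sum_{j=1}^{n_i} (1 - A_{ij})\, w_{ij},
   \qquad i = 1, \dots, m, \\
% (3) Weighting estimator of the average treatment effect, N = \sum_i n_i:
\hat{\tau}
  &= \frac{1}{N} \sum_{i=1}^{m}\sum_{j=1}^{n_i}
     w_{ij} \left\{ A_{ij}\, Y_{ij} - (1 - A_{ij})\, Y_{ij} \right\}.
\end{align*}
```

Under an outcome model of the assumed form Y_ij(a) = X_ij' beta_a + b_i + e_ij, where b_i is an unobserved cluster effect, constraint (1) removes imbalance in the observed covariates, and constraint (2) gives each b_i the same total weight n_i in both treatment groups, so the cluster-level term contributes equally to both weighted sums and cancels in the estimator even though it is never observed. This is the intuition behind the double robustness claim for linear mixed effects outcomes.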


About the article

Received: 2017-12-05

Revised: 2018-08-18

Accepted: 2018-08-19

Published Online: 2018-08-24

Published in Print: 2018-09-25

Funding Source: Division of Mathematical Sciences

Award identifier / Grant number: 1811245

The author acknowledges support in part from the Ralph E. Powe Junior Faculty Enhancement Award from Oak Ridge Associated Universities and from NSF grant DMS 1811245.

Citation Information: Journal of Causal Inference, Volume 6, Issue 2, 20170027, ISSN (Online) 2193-3685, DOI: https://doi.org/10.1515/jci-2017-0027.

© 2018 Walter de Gruyter GmbH, Berlin/Boston.