Published by De Gruyter, October 28, 2015

Unbiased Estimation of the Average Treatment Effect in Cluster-Randomized Experiments

Joel A. Middleton and Peter M. Aronow


Many estimators of the average treatment effect, including the difference-in-means, may be biased when clusters of units are allocated to treatment. This bias remains even when the number of units within each cluster grows asymptotically large. In this paper, we propose simple, unbiased, location-invariant, and covariate-adjusted estimators of the average treatment effect in experiments with random allocation of clusters, along with associated variance estimators. We then analyze a cluster-randomized field experiment on voter mobilization in the US, demonstrating that the proposed estimators have precision that is comparable, if not superior, to that of existing, biased estimators of the average treatment effect.

Corresponding author: Peter M. Aronow, Department of Political Science, Yale University, New Haven, CT, USA, e-mail:


The authors acknowledge support from the Yale University Faculty of Arts and Sciences High Performance Computing facility and staff. The authors would also like to thank Allison Carnegie, Adam Dynes, Don Green, Jennifer Hill, Mary McGrath, David Nickerson, Cyrus Samii, and two reviewers for their helpful comments. The authors thank Kyle Peyton for research assistance and manuscript preparation. Any errata are the sole responsibility of the authors.


A Proof of Non-invariance of the Horvitz-Thompson Estimator

To prove that the HT estimator is not invariant to location shifts, we need only replace $Y_j^T$ with its linear transformation:
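A sketch of the argument, using the notation of the surrounding equations (a reconstruction; $a$ denotes an arbitrary constant added to every unit's outcome, so each cluster total $Y_j^T$ becomes $Y_j^T + a\,n_j$, and $J_0$ is assumed to index control clusters). Substituting the shifted totals into the HT estimator gives

$$\frac{1}{N}\left(\frac{M}{m_t}\sum_{j\in J_1}\left(Y_j^T + a\,n_j\right) - \frac{M}{m_c}\sum_{j\in J_0}\left(Y_j^T + a\,n_j\right)\right) = \widehat{\Delta_{HT}} + \frac{aM}{N}\left(\frac{1}{m_t}\sum_{j\in J_1} n_j - \frac{1}{m_c}\sum_{j\in J_0} n_j\right),$$

which depends on the realized mean cluster sizes in each arm and so does not in general equal $\widehat{\Delta_{HT}}$, even though the estimand itself is unchanged by the shift.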


B Bias from Estimating k from Within-Sample Data

Consider the situation in which one wishes to improve upon the HT estimator by adjusting for cluster size; in other words, one wishes to estimate k in equations 20 and 21 from the data, approximating the optimal value of k with an estimator $\hat{k}$. In this scenario, taking the expected value of equation 20 yields

$$
\begin{aligned}
E\left[\widehat{Y^T_{1,R1}}\right]
&= E\left[\frac{M}{m_t}\sum_{j\in J_1}\left(Y_j^T - \hat{k}\left(n_j - N/M\right)\right)\right] \\
&= \frac{M}{m_t}\left(E\left[\sum_{j\in J_1} Y_j^T\right] - E\left[\sum_{j\in J_1}\hat{k}\,n_j\right] + E\left[\sum_{j\in J_1}\hat{k}\,N/M\right]\right) \\
&= \frac{M}{m_t}\left(E\left[m_t\,\overline{Y_{1j}^T}\right] - E\left[\hat{k}\,m_t\,\overline{n_{tj}}\right] + E\left[\hat{k}\,m_t\,N/M\right]\right) \\
&= Y_1^T - M\left(E\left[\hat{k}\,\overline{n_{tj}}\right] - E\left[\hat{k}\right]E\left[\overline{n_{tj}}\right]\right) \\
&= Y_1^T - M\,\mathrm{Cov}\left(\hat{k},\,\overline{n_{tj}}\right),
\end{aligned}
\tag{29}
$$

where $\overline{n_{tj}}$ is the mean value of $n_j$ for clusters in the treatment condition in a given randomization. In the third line of equation 29, $\hat{k}$ moves outside the summation operator because it is a constant for a given randomization. Likewise,

$$E\left[\widehat{Y^T_{0,R1}}\right] = Y_0^T - M\,\mathrm{Cov}\left(\hat{k},\,\overline{n_{cj}}\right), \tag{30}$$

where $\overline{n_{cj}}$ is the mean value of $n_j$ for clusters in the control condition in a given randomization. So the expected value of the estimator will be

$$E\left[\frac{\widehat{Y^T_{1,R1}} - \widehat{Y^T_{0,R1}}}{N}\right] = \Delta + \frac{M}{N}\left(\mathrm{Cov}\left(\hat{k},\,\overline{n_{cj}}\right) - \mathrm{Cov}\left(\hat{k},\,\overline{n_{tj}}\right)\right). \tag{31}$$

The second term on the right-hand side of equation 31 represents the bias. A special case in which the bias vanishes is when the sharp null hypothesis of no treatment effect holds and the treatment and control groups contain equal numbers of clusters. We refer the reader to Williams (1961), Freedman (2008a) and Freedman (2008b) for additional reading on the particular bias associated with the regression adjustment of random samples and experimental data.
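As a concrete illustration of this distinction, the following Python sketch (a hypothetical toy population, not the paper's data; the estimator and the plug-in $\hat{k}$ are stylized) enumerates every possible assignment of $m_t = 3$ of $M = 6$ clusters to treatment. With any fixed $k$, the size-adjusted estimator of the treatment-arm total $Y_1^T$ is exactly unbiased, while estimating $\hat{k}$ within sample introduces the covariance bias of equation 29.

```python
import itertools
import numpy as np

# Hypothetical toy population (illustrative only, not the paper's data):
# M = 6 clusters, m_t = 3 assigned to treatment.
M, m_t = 6, 3
n = np.array([2.0, 3.0, 5.0, 4.0, 6.0, 10.0])    # cluster sizes; N = 30
N = n.sum()
Y1 = np.array([3.0, 4.0, 9.0, 6.0, 11.0, 21.0])  # treatment-condition cluster totals

def adjusted_total(treated, k):
    """Size-adjusted HT-style estimator of the treatment-arm total Y_1^T."""
    t = np.array(treated)
    return (M / m_t) * (Y1[t] - k * (n[t] - N / M)).sum()

def k_hat(treated):
    """Within-sample plug-in: OLS slope of Y_j^T on n_j among treated clusters."""
    t = np.array(treated)
    return np.polyfit(n[t], Y1[t], 1)[0]

# Exact expectations: average over all C(6, 3) = 20 equiprobable assignments.
assignments = list(itertools.combinations(range(M), m_t))
fixed_k = [adjusted_total(a, k=2.0) for a in assignments]     # k fixed a priori
plug_in = [adjusted_total(a, k_hat(a)) for a in assignments]  # k estimated in-sample

print(np.isclose(np.mean(fixed_k), Y1.sum()))  # True: any fixed k leaves the estimator unbiased
print(np.mean(plug_in) - Y1.sum())             # bias = -M * Cov(k_hat, mean n_j among treated)
```

Because the design is fully enumerated rather than simulated, the averages above are exact expectations, so the unbiasedness of the fixed-$k$ estimator holds to machine precision rather than approximately.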

C Derivation of the Optimal Value of k

To identify a single optimal value of k, $k^*_{\mathrm{optim}}$, we refer to the first line of equation 17,

$$v\,V\left(\widehat{\Delta_{R1}}\right) = c\,\sigma^2\left(U_{j0}^T\right) + t\,\sigma^2\left(U_{j1}^T\right) + 2\,\sigma\left(U_{j0}^T,\,U_{j1}^T\right), \tag{32}$$

where $v=\frac{(M-1)N^2}{M^2}$, $c=\frac{M-m_c}{m_c}$, and $t=\frac{M-m_t}{m_t}$. Now note that the terms $\sigma^2\left(U_{j1}^T\right)$, $\sigma^2\left(U_{j0}^T\right)$, and $\sigma\left(U_{j0}^T,U_{j1}^T\right)$ in equation 32 can be written as follows:

$$\sigma^2\left(U_{j1}^T\right) = \sigma^2\left(Y_{j1}^T\right) + k^2\sigma^2(n_j) - 2k\,\sigma\left(Y_{j1}^T, n_j\right), \tag{33}$$
$$\sigma^2\left(U_{j0}^T\right) = \sigma^2\left(Y_{j0}^T\right) + k^2\sigma^2(n_j) - 2k\,\sigma\left(Y_{j0}^T, n_j\right), \tag{34}$$

and, defining $\delta_j = \left(n_j - N/M\right)$,

$$
\begin{aligned}
\sigma\left(U_{j0}^T,\,U_{j1}^T\right)
&= E\left[U_{j0}^T\,U_{j1}^T\right] - \overline{U_0^T}\;\overline{U_1^T} \\
&= E\left[\left(Y_{j0}^T - k\delta_j\right)\left(Y_{j1}^T - k\delta_j\right)\right] - \overline{Y_0^T}\;\overline{Y_1^T} \\
&= E\left[Y_{j0}^T Y_{j1}^T - Y_{j0}^T k\delta_j - Y_{j1}^T k\delta_j + k^2\delta_j^2\right] - \overline{Y_0^T}\;\overline{Y_1^T} \\
&= E\left[Y_{j0}^T Y_{j1}^T\right] - \overline{Y_0^T}\;\overline{Y_1^T} - E\left[Y_{j0}^T k\delta_j\right] - E\left[Y_{j1}^T k\delta_j\right] + E\left[k^2\delta_j^2\right] \\
&= \sigma\left(Y_{j0}^T, Y_{j1}^T\right) - k\left[\sigma\left(Y_{j0}^T, n_j\right) + E\left[Y_{j0}^T\right]E\left[\delta_j\right]\right] - k\left[\sigma\left(Y_{j1}^T, n_j\right) + E\left[Y_{j1}^T\right]E\left[\delta_j\right]\right] + k^2\sigma^2(n_j) \\
&= \sigma\left(Y_{j0}^T, Y_{j1}^T\right) - k\left[\sigma\left(Y_{j0}^T, n_j\right) + E\left[Y_{j0}^T\right]\cdot 0\right] - k\left[\sigma\left(Y_{j1}^T, n_j\right) + E\left[Y_{j1}^T\right]\cdot 0\right] + k^2\sigma^2(n_j) \\
&= \sigma\left(Y_{j0}^T, Y_{j1}^T\right) - k\,\sigma\left(Y_{j0}^T, n_j\right) - k\,\sigma\left(Y_{j1}^T, n_j\right) + k^2\sigma^2(n_j),
\end{aligned}
\tag{35}
$$

respectively. Substituting equations 33, 34, and 35 into equation 32,
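Carrying out the substitution and collecting terms in k, the omitted display can be sketched as follows (a reconstruction consistent with equations 32–35; only the k-dependent terms matter for the optimization):

$$v\,V\left(\widehat{\Delta_{R1}}\right) = c\,\sigma^2\left(Y_{j0}^T\right) + t\,\sigma^2\left(Y_{j1}^T\right) + 2\,\sigma\left(Y_{j0}^T, Y_{j1}^T\right) + (c+t+2)\,k^2\,\sigma^2(n_j) - 2k\left[(c+1)\,\sigma\left(Y_{j0}^T, n_j\right) + (t+1)\,\sigma\left(Y_{j1}^T, n_j\right)\right].$$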


Setting the first derivative with respect to k equal to zero,
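A sketch of the omitted steps (reconstructed from the substituted expression above; the final simplification assumes $c+1 = M/m_c$, $t+1 = M/m_t$, and $m_t + m_c = M$):

$$\frac{\partial}{\partial k}\,v\,V\left(\widehat{\Delta_{R1}}\right) = 2k\,(c+t+2)\,\sigma^2(n_j) - 2\left[(c+1)\,\sigma\left(Y_{j0}^T, n_j\right) + (t+1)\,\sigma\left(Y_{j1}^T, n_j\right)\right] = 0,$$

so that

$$k^*_{\mathrm{optim}} = \frac{(c+1)\,\sigma\left(Y_{j0}^T, n_j\right) + (t+1)\,\sigma\left(Y_{j1}^T, n_j\right)}{(c+t+2)\,\sigma^2(n_j)} = \frac{m_t\,\sigma\left(Y_{j0}^T, n_j\right) + m_c\,\sigma\left(Y_{j1}^T, n_j\right)}{M\,\sigma^2(n_j)}.$$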








The Des Raj estimator will be more efficient than the HT estimator when
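Because the variance in equation 32 is a convex quadratic in k minimized at $k^*_{\mathrm{optim}}$, and the HT estimator corresponds to $k = 0$, the omitted condition can be sketched as follows (a reconstruction): a Des Raj estimator with coefficient k has smaller variance than the HT estimator if and only if

$$(c+t+2)\,k^2\,\sigma^2(n_j) < 2k\left[(c+1)\,\sigma\left(Y_{j0}^T, n_j\right) + (t+1)\,\sigma\left(Y_{j1}^T, n_j\right)\right],$$

that is, whenever k lies strictly between $0$ and $2k^*_{\mathrm{optim}}$.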



Angrist, J. D. and J. Pischke (2009) Mostly Harmless Econometrics. Princeton: Princeton University Press.

Aronow, P. M., D. P. Green and D. K. K. Lee (2014) “Sharp Bounds on the Variance in Randomized Experiments,” Annals of Statistics, 42(3):850–871.

Bates, D. and M. Maechler (2010) lme4: Linear Mixed-Effects Models Using S4 Classes. R package, version 0.999375-37.

Brewer, K. R. W. (1979) “A Class of Robust Sampling Designs for Large-Scale Surveys,” Journal of the American Statistical Association, 74:911–915.

Chaudhuri, A. and H. Stenger (2005) Survey Sampling. Boca Raton: Chapman and Hall.

Cochran, W. G. (1977) Sampling Techniques, 3rd ed. New York: John Wiley.

Des Raj (1965) “On a Method of Using Multi-Auxiliary Information in Sample Surveys,” Journal of the American Statistical Association, 60:270–277.

Ding, P. (2014) “A Paradox from Randomization-Based Causal Inference,” arXiv preprint arXiv:1402.0142.

Donner, A. and N. Klar (2000) Design and Analysis of Cluster Randomization Trials in Health Research. New York: Oxford University Press.

Freedman, D. A. (2006) “On the So-Called ‘Huber Sandwich Estimator’ and ‘Robust’ Standard Errors,” American Statistician, 60:299–302.

Freedman, D. A. (2008a) “On Regression Adjustments to Experimental Data,” Advances in Applied Mathematics, 40:180–193.

Freedman, D. A. (2008b) “On Regression Adjustments in Experiments with Several Treatments,” Annals of Applied Statistics, 2:176–196.

Freedman, D. A., R. Pisani and R. A. Purves (1998) Statistics, 3rd ed. New York: W. W. Norton.

Green, D. P. and L. Vavreck (2008) “Analysis of Cluster-Randomized Experiments: A Comparison of Alternative Estimation Approaches,” Political Analysis, 16:138–152.

Hansen, B. and J. Bowers (2008) “Covariate Balance in Simple, Stratified and Clustered Comparative Studies,” Statistical Science, 23:219–236.

Hansen, B. and J. Bowers (2009) “Attributing Effects to a Cluster-Randomized Get-Out-the-Vote Campaign,” Journal of the American Statistical Association, 104:873–885.

Hartley, H. O. and A. Ross (1954) “Unbiased Ratio Estimators,” Nature, 174:270.

Hoffman, E. B., P. K. Sen and C. R. Weinberg (2001) “Within-Cluster Resampling,” Biometrika, 88:1121–1134.

Horvitz, D. G. and D. J. Thompson (1952) “A Generalization of Sampling Without Replacement From a Finite Universe,” Journal of the American Statistical Association, 47:663–684.

Humphreys, M. (2009) Bounds on Least Squares Estimates of Causal Effects in the Presence of Heterogeneous Assignment Probabilities. Working paper.

Imai, K., G. King and C. Nall (2009) “The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation,” Statistical Science, 24:29–53.

King, G. and M. Roberts (2014) “How Robust Standard Errors Expose Methodological Problems They Do Not Fix, and What to Do About It,” Political Analysis, 1–12.

Lachin, J. M. (1988) “Properties of Simple Randomization in Clinical Trials,” Controlled Clinical Trials, 9(4):312–326.

Lin, W. (2013) “Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman’s Critique,” Annals of Applied Statistics, 7(1):295–318.

Middleton, J. A. (2008) “Bias of the Regression Estimator for Experiments Using Clustered Random Assignment,” Statistics and Probability Letters, 78:2654–2659.

Miratrix, L., J. Sekhon and B. Yu (2013) “Adjusting Treatment Effect Estimates by Post-Stratification in Randomized Experiments,” Journal of the Royal Statistical Society, Series B (Methodological), 75(2):369–396.

Neyman, J. (1923) “On the Application of Probability Theory to Agricultural Experiments: Essay on Principles, Section 9,” Statistical Science, 5:465–480. (Translated in 1990).

Neyman, J. (1934) “On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection,” Journal of the Royal Statistical Society, 97(4):558–625.

R Development Core Team (2010) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0. Version 2.12.0.

Rosenbaum, P. R. (2002) Observational Studies, 2nd ed. New York: Springer.

Rubin, D. B. (1974) “Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies,” Journal of Educational Psychology, 66:688–701.

Rubin, D. B. (1978) “Bayesian Inference for Causal Effects: The Role of Randomization,” The Annals of Statistics, 6:34–58.

Rubin, D. B. (2005) “Causal Inference Using Potential Outcomes: Design, Modeling, Decisions,” Journal of the American Statistical Association, 100:322–331.

Samii, C. and P. M. Aronow (2012) “On Equivalencies Between Design-Based and Regression-Based Variance Estimators for Randomized Experiments,” Statistics and Probability Letters, 82:365–370.

Sarndal, C.-E. (1978) “Design-Based and Model-Based Inference in Survey Sampling,” Scandinavian Journal of Statistics, 5(1):27–52.

Williams, W. H. (1961) “Generating Unbiased Ratio and Regression Estimators,” Biometrics, 17:267–274.

Published Online: 2015-10-28
Published in Print: 2015-12-1

©2015 by De Gruyter