Published by De Gruyter July 9, 2015

Estimation and Inference in an Ecological Inference Model

Yanqin Fan, Robert Sherman and Matthew Shum

Abstract

We interpret an ecological inference model as a treatment effects model in which the outcomes of interest and the conditional covariates come from separate datasets. In this setting, the counterfactual distributions and policy parameters of interest are only partially identified under a selection on observables assumption. In this paper, we provide estimation and inference procedures for structural prediction and counterfactual analysis in such models. We also illustrate the procedures with an application to US presidential elections.

JEL Codes: C14; C31

Corresponding author: Matthew Shum, Caltech, HSS, 1200 East California Blvd., Pasadena, CA 91125, USA, E-mail:

Acknowledgments

We are grateful to Cheng Hsiao, Sergio Firpo, Chuck Manski, Kevin Song, and Jeff Wooldridge for valuable comments and discussions. We thank SangMok Lee for excellent research assistance, and seminar participants at Michigan State, USC, and the Canadian Econometrics Study Group meetings (2011, Toronto) for useful comments.

Appendix

Appendix: Technical Proofs

The proofs of Theorems 6.1 and 6.2 are similar; they rely heavily on the lemma below, which is adapted from the proof of Theorem 1 in Linton, Song, and Whang (2010). Closely related work includes Andrews (1994), Chen, Linton, and van Keilegom (2003), and Linton, Maasoumi, and Whang (2005).

Let

\[
X_i(\theta,\tau)=\varphi(Z_i;\theta,\tau),
\]

where φ(·; θ, τ) is a real valued function known up to the parameter (θ, τ)∈Θ×𝒯 with Θ a compact subset of a Euclidean space and 𝒯 an infinite dimensional space. For d=1, 0, let νn(·;d) be the stochastic process on 𝒳 with

\[
\nu_n(x;d)=\sqrt{n}\left[n^{-1}\sum_{i=1}^{n}I\{X_i(\hat\theta,\hat\tau)\le x\}\,I\{D_i=d\}-E\big(I\{X_i(\theta_0,\tau_0)\le x\}\,I\{D_i=d\}\big)\right],
\]

where x∈𝒳, (θ0, τ0)∈Θ×𝒯, and (θ^,τ^) are consistent estimators of (θ0, τ0).

Lemma A.1 below presents conditions under which the process {νn(·; d)} converges weakly to a Gaussian process.
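To make the process νn(·; d) concrete, the following Python sketch evaluates it on a grid for a toy design that is not taken from the paper: Zi is uniform on (0, 1), Di is an independent Bernoulli(p1) draw, Xi(θ)=Zi/θ with θ0=p1, and θ^ is the sample mean of Di. There is no infinite-dimensional τ in this toy case, and the population term is available in closed form. The function name `empirical_process` and all numerical values are illustrative assumptions.

```python
import math
import random

def empirical_process(z, d, x_grid, d_val, p1_true):
    """Evaluate nu_n(x; d) on x_grid for the toy design X_i(theta) = Z_i / theta."""
    n = len(z)
    theta_hat = sum(d) / n  # plug-in estimate of theta_0 = p1_true
    p_d = p1_true if d_val == 1 else 1.0 - p1_true
    values = []
    for x in x_grid:
        # Sample term: n^{-1} sum_i I{X_i(theta_hat) <= x} I{D_i = d}
        sample_term = sum(
            1 for zi, di in zip(z, d) if di == d_val and zi / theta_hat <= x
        ) / n
        # Population term: Z ~ U(0,1) independent of D (toy assumption), so
        # Pr(Z / p1 <= x, D = d) = min(p1 * x, 1) * Pr(D = d) for x >= 0
        population_term = min(max(p1_true * x, 0.0), 1.0) * p_d
        values.append(math.sqrt(n) * (sample_term - population_term))
    return values

rng = random.Random(0)
n, p1 = 2000, 0.4
z = [rng.random() for _ in range(n)]
d = [1 if rng.random() < p1 else 0 for _ in range(n)]
grid = [0.5, 1.0, 1.5, 2.0]
nu = empirical_process(z, d, grid, d_val=1, p1_true=p1)
```

Under the lemma's conditions the values in `nu` should stay stochastically bounded as n grows; the sketch only evaluates the process at one sample size and does not verify weak convergence.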

Let BΘ×𝒯(δ)={(θ, τ)∈Θ×𝒯: ‖θ−θ0‖+‖τ−τ0‖<δ} for δ>0 and let 𝒫 be the collection of all the potential distributions of (Zi, Di) that satisfy Assumptions 1−3 below.

Assumption 1 (i) {Zi,Di}i=1n is a random sample.

(ii) log N(ε, 𝒯, ‖·‖)≤d for some d∈(0, 1].

(iii) Let

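\[
F_{X|D}(x;\theta,\tau|d)=\Pr\big(X_i(\theta,\tau)\le x\mid D_i=d\big).
\]

For some δ>0, there exists a functional ΓF,P(x|d)[θ−θ0, τ−τ0] of (θ−θ0, τ−τ0), (θ, τ)∈BΘ×𝒯(δ), such that

\[
\left|F_{X|D}(x;\theta,\tau|d)-F_{X|D}(x;\theta_0,\tau_0|d)-\Gamma_{F,P}(x|d)[\theta-\theta_0,\tau-\tau_0]\right|\le C_1\|\theta-\theta_0\|^2+C_2\|\tau-\tau_0\|^2,
\]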

with constants C1 and C2 that do not depend on P.

Assumption 2 (i) Xi(θ0, τ0) is a continuous random variable with a bounded support 𝒳.

(ii) There exist δ, C>0 and a subvector Z1 of Z such that: (a) the conditional density of Z given Z1 is bounded uniformly over (θ, τ)∈BΘ×𝒯(δ) and over P∈𝒫, (b) for each (θ, τ)∈BΘ×𝒯(δ) and (θ′, τ′)∈BΘ×𝒯(δ), φ(Z; θ, τ)−φ(Z; θ′, τ′) is measurable with respect to the σ-field of Z1, and (c) for each (θ1, τ1)∈BΘ×𝒯(δ) and for each δ>0,

\[
\sup_{P\in\mathcal{P}}\sup_{z_1}E_P\left[\sup_{(\theta_2,\tau_2)\in B_{\Theta\times\mathcal{T}}(\delta)}\left|\varphi(Z;\theta_1,\tau_1)-\varphi(Z;\theta_2,\tau_2)\right|^2\,\Big|\,Z_1=z_1\right]\le C\delta^{2s},
\]

for some s∈(d, 1] with d in Assumption 1 (ii), where the supremum over z1 runs in the support of Z1.

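Assumption 3 (i) For each ε>0, \(\sup_{P\in\mathcal{P}}P(\|\hat\theta-\theta_0\|+\|\hat\tau-\tau_0\|>\varepsilon)=o(1)\) and \(\sup_{P\in\mathcal{P}}P(\hat\tau\in\mathcal{T})\to 1\) as n→∞ such that \(\|\hat\theta-\theta_0\|=o_P(n^{-1/4})\) and \(\|\hat\tau-\tau_0\|=o_P(n^{-1/4})\).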

(ii) For each ε>0,

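\[
\sup_{P\in\mathcal{P}}P\left(\left|\sqrt{n}\,\Gamma_{F,P}(x|d)[\hat\theta-\theta_0,\hat\tau-\tau_0]-\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,d)\right|>\varepsilon\right)\to 0,
\]

where ψx,F(Zi, Di; θ0, τ0, d) is such that, for some η>0 and all x∈𝒳, EP[ψx,F(Zi, Di; θ0, τ0, d)]=0 and

\[
\sup_{P\in\mathcal{P}}E_P\left[\sup_{x\in\mathcal{X}}\left|\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,d)\right|^{2+\eta}\right]<\infty.
\]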

(iii) There exist constants C>0 and s1∈(d/2,1] with d in Assumption 1 (ii) such that for each x1∈𝒳 and for each ε>0,

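\[
E\left[\sup_{x\in\mathcal{X}:|x-x_1|\le\varepsilon}\left|\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,d)-\psi_{x_1,F}(Z_i,D_i;\theta_0,\tau_0,d)\right|^2\right]\le C\varepsilon^{2s_1}.
\]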

Let ν (·; d) be a mean zero Gaussian process on 𝒳 with a covariance kernel given by

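\[
C(x_1,x_2;d)=\operatorname{Cov}\big(V_i(x_1;\theta_0,\tau_0,d),V_i(x_2;\theta_0,\tau_0,d)\big),
\]

where

\[
V_i(x;\theta_0,\tau_0,d)=I\{X_i(\theta_0,\tau_0)\le x\}\,I\{D_i=1\}+\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,d).
\]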

Lemma A.1 Suppose that Assumptions 1–3 hold. Then

\[
\nu_n(x;d)=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\big[V_i(x;\theta_0,\tau_0,d)-E\big(V_i(x;\theta_0,\tau_0,d)\big)\big]+o_P(1)
\]

uniformly in x∈𝒳 and P∈𝒫, and hence νn(·; d) converges weakly to ν(·; d) uniformly in P∈𝒫.

Proof of Theorem 6.1: We will show that for any constants c1, c2, c3, c4, the linear combination \(c_1\hat\mu_1^L+c_2\hat\mu_1^U+c_3\hat\mu_0^L+c_4\hat\mu_0^U\) is asymptotically normally distributed with variance (c1, c2, c3, c4)Σμ(c1, c2, c3, c4)′.

Assumption (s) ensures that GW|D(·|1) and GV|D(·|0) have compact supports and that the corresponding pdfs are bounded away from zero on their supports. As a result, the map ϕF1,F0: D(𝒲)×D(𝒱)→R defined as

\[
\phi_{F_1,F_0}=c_1\int_0^{1-P_{01}}F_1^{-1}(u)\,du+c_2\int_{P_{01}}^{1}F_1^{-1}(u)\,du+c_3\int_0^{1-P_{00}}F_0^{-1}(u)\,du+c_4\int_{P_{00}}^{1}F_0^{-1}(u)\,du
\]

is Hadamard-differentiable at (GW|D(·|1), GV|D(·|0)) tangentially to C(𝒲)×C(𝒱) with the derivative:

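\[
\begin{aligned}
\phi'_{F_1,F_0}(\alpha_W,\alpha_V)={}&c_1\int_0^{1-P_{01}}\frac{\alpha_W}{g_{W|D}}\circ G_{W|D}^{-1}(u|1)\,du+c_2\int_{P_{01}}^{1}\frac{\alpha_W}{g_{W|D}}\circ G_{W|D}^{-1}(u|1)\,du\\
&+c_3\int_0^{1-P_{00}}\frac{\alpha_V}{g_{V|D}}\circ G_{V|D}^{-1}(u|0)\,du+c_4\int_{P_{00}}^{1}\frac{\alpha_V}{g_{V|D}}\circ G_{V|D}^{-1}(u|0)\,du,
\end{aligned}
\]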

see van der Vaart and Wellner (1996).
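As an illustration of how such a map can be evaluated at empirical distribution functions, the following Python sketch computes integrals of an empirical quantile function over a sub-interval of [0, 1] exactly, as weighted sums of order statistics. It assumes that the integration limits run from 0 to 1−P and from P to 1, with P standing in for P01 or P00. The function names `quantile_integral` and `phi_plugin` and the sample values are hypothetical and do not come from the paper.

```python
def quantile_integral(sample, lo, hi):
    """Integral over [lo, hi] of the empirical quantile function of `sample`.

    The empirical quantile function equals the k-th order statistic on
    ((k-1)/n, k/n], so the integral is a weighted sum of order statistics.
    """
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for k, x in enumerate(xs, start=1):
        left, right = (k - 1) / n, k / n
        overlap = min(hi, right) - max(lo, left)  # length of the cell inside [lo, hi]
        if overlap > 0:
            total += x * overlap
    return total

def phi_plugin(sample1, sample0, p01, p00, c):
    """Plug-in value of the linear combination of the four trimmed integrals."""
    c1, c2, c3, c4 = c
    return (c1 * quantile_integral(sample1, 0.0, 1.0 - p01)
            + c2 * quantile_integral(sample1, p01, 1.0)
            + c3 * quantile_integral(sample0, 0.0, 1.0 - p00)
            + c4 * quantile_integral(sample0, p00, 1.0))

data = [1.0, 2.0, 3.0, 4.0]
full = quantile_integral(data, 0.0, 1.0)    # equals the sample mean
lower = quantile_integral(data, 0.0, 0.5)   # mass-weighted sum of the two smallest points
upper = quantile_integral(data, 0.5, 1.0)   # mass-weighted sum of the two largest points
combo = phi_plugin(data, data, 0.5, 0.5, (1.0, 1.0, 1.0, 1.0))
```

Integrating over the whole unit interval returns the sample mean, which gives a quick sanity check on the weights.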

We will complete the proof by establishing the weak convergence of the stochastic process:

\[
\left\{\sqrt{n}\left(\hat G_{W|D}(w|1)-G_{W|D}(w|1),\ \hat G_{V|D}(v|0)-G_{V|D}(v|0)\right):(w,v)\in\mathcal{W}\times\mathcal{V}\right\}
\]

and invoking the Functional Delta method.

Let

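\[
\nu_n^{G,W}(w)=\sqrt{n}\left[\hat G_{W|D}(w|1)-G_{W|D}(w|1)\right],\ w\in\mathcal{W},\qquad \nu_n^{G,V}(v)=\sqrt{n}\left[\hat G_{V|D}(v|0)-G_{V|D}(v|0)\right],\ v\in\mathcal{V}.
\]

Step 1 We show: \(\nu_n^{G,W}(w)=n^{-1/2}\sum_{j=1}^{n}V_{j,W}(w;\theta_0,\tau_0)+o_P(1)\) uniformly in w∈𝒲 and P∈𝒫.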

By the definition of G^W|D, we have:

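\[
\begin{aligned}
\nu_n^{G,W}(w)&=\sqrt{n}\left[1-n^{-1}\sum_{j=1}^{n}\frac{I\{p(Z_j;\hat\beta,\hat\tau)/\hat p_1\le 1/w\}\,I\{D_j=1\}}{\hat p_1}-G_{W|D}(w|1)\right]\\
&=-n^{-1/2}\sum_{j=1}^{n}\frac{I\{p(Z_j;\hat\beta,\hat\tau)/\hat p_1\le 1/w\}\,I\{D_j=1\}-p_1[1-G_{W|D}(w|1)]}{\hat p_1}+\frac{\sqrt{n}(\hat p_1-p_1)}{\hat p_1}[1-G_{W|D}(w|1)]\\
&=-n^{-1/2}\sum_{j=1}^{n}\frac{I\{p(Z_j;\hat\beta,\hat\tau)/\hat p_1\le 1/w\}\,I\{D_j=1\}-p_1[1-G_{W|D}(w|1)]}{p_1}+\frac{\sqrt{n}(\hat p_1-p_1)}{p_1}[1-G_{W|D}(w|1)]+o_P(1).
\end{aligned}
\]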

We apply Lemma A.1 to the first term on the right hand side of the last equation with Xj(θ, τ)=p(Zj; β, τ)/p1 and θ=(p1, β). We verify Assumptions 1–3 in Lemma A.1 under Assumptions (s), (p) and (b).
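For readers who want to see the estimator in Step 1 in computational form, the following Python sketch evaluates 1 − n⁻¹ Σj I{p^j/p^1 ≤ 1/w} I{Dj=1}/p^1, where p^j denotes a fitted value of p(Zj; β^, τ^) and p^1 is the sample share of Dj=1. The fitted values are taken as given inputs; how they are obtained is outside the sketch. The function name `g_hat_w` and the toy numbers are hypothetical.

```python
def g_hat_w(p_fitted, d, w):
    """Plug-in estimate 1 - n^{-1} sum_j I{p_j / p1_hat <= 1/w} I{D_j = 1} / p1_hat."""
    n = len(d)
    p1_hat = sum(d) / n  # sample share of D = 1
    count = sum(
        1 for pj, dj in zip(p_fitted, d)
        if dj == 1 and pj / p1_hat <= 1.0 / w
    )
    return 1.0 - count / (n * p1_hat)

# Toy inputs (assumed, not from the paper): four units, three with D = 1.
p_fitted = [0.2, 0.5, 0.8, 0.5]
d = [1, 1, 1, 0]
g_small = g_hat_w(p_fitted, d, 0.5)   # 1/w = 2: every D = 1 unit satisfies the inequality
g_mid = g_hat_w(p_fitted, d, 1.0)     # 1/w = 1: two of the three D = 1 units do
g_large = g_hat_w(p_fitted, d, 10.0)  # 1/w = 0.1: none do
```

The estimate is nondecreasing in w and lies in [0, 1] by construction, since the count can never exceed the number of units with Dj=1.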

Assumption 1 (i) and (ii) hold under Assumption (p) (i) and (ii). Now we verify Assumption 1 (iii). Note that for x=1/w,

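\[
F_{X|D}(x;\theta,\tau|1)=E_P\big[I\{X_i(\theta,\tau)\le x\}\mid D_i=1\big]=E_P\big[I\{p(Z_i;\beta,\tau)\le p_1x\}\mid D_i=1\big]=F_{P|D}(p_1x;\beta,\tau|1).
\]

Let

\[
\Gamma_{F,P}(x|1)[\theta-\theta_0,\tau-\tau_0]=x f_{P|D}(p_{1o}x|1)(p_1-p_{1o})+\Gamma_P(p_{1o}x|1)[\beta-\beta_0,\tau-\tau_0],
\]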

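where ΓP(p1ox|1)[β−β0, τ−τ0] is defined in Assumption (p) (iii). Then by Assumption (p) (iii), we conclude: for some δ>0, (θ, τ)∈BΘ×𝒯(δ),

\[
\begin{aligned}
&\left|F_{X|D}(x;\theta,\tau|1)-F_{X|D}(x;\theta_0,\tau_0|1)-\Gamma_{F,P}(x|1)[\theta-\theta_0,\tau-\tau_0]\right|\\
&\quad=\left|F_{P|D}(p_1x;\beta,\tau|1)-F_{P|D}(p_{1o}x|1)-x f_{P|D}(p_{1o}x|1)(p_1-p_{1o})-\Gamma_P(p_{1o}x|1)[\beta-\beta_0,\tau-\tau_0]\right|\\
&\quad=\Big|F_{P|D}(p_1x;\beta,\tau|1)-F_{P|D}(p_1x|1)-\Gamma_P(p_1x|1)[\beta-\beta_0,\tau-\tau_0]\\
&\qquad+\Gamma_P(p_1x|1)[\beta-\beta_0,\tau-\tau_0]-\Gamma_P(p_{1o}x|1)[\beta-\beta_0,\tau-\tau_0]\\
&\qquad+\big[F_{P|D}(p_1x|1)-F_{P|D}(p_{1o}x|1)-x f_{P|D}(p_{1o}x|1)(p_1-p_{1o})\big]\Big|\\
&\quad\le\left|F_{P|D}(p_1x;\beta,\tau|1)-F_{P|D}(p_1x|1)-\Gamma_P(p_1x|1)[\beta-\beta_0,\tau-\tau_0]\right|\\
&\qquad+\left|\Gamma_P(p_1x|1)[\beta-\beta_0,\tau-\tau_0]-\Gamma_P(p_{1o}x|1)[\beta-\beta_0,\tau-\tau_0]\right|\\
&\qquad+\frac{1}{2}\left|x^2\frac{\partial^2}{\partial x^2}F_{P|D}(p_1^{*}x|1)(p_1-p_{1o})^2\right|\\
&\quad\le C_1\|\theta-\theta_0\|^2+C_2\|\tau-\tau_0\|^2,
\end{aligned}
\]

where \(p_1^{*}\) lies between \(p_{1o}\) and \(p_1\).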

Assumption 2 holds under Assumption (s) (i) and Assumption (p) (iv).

It remains to verify Assumption 3. Assumption 3 (i) holds because of Assumption (b) (i). For Assumption 3 (ii), we let

\[
\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,1)=x f_{P|D}(p_{1o}x|1)\big(I\{D_i=1\}-p_{1o}\big)+\psi_i(p_{1o}x;\beta_0,\tau_0,1),
\]

where ψi(p1ox; β0, τ0, 1) is defined in Assumption (b) (ii). Then by Assumption (b) (ii), we obtain:

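\[
\begin{aligned}
&\sup_{P\in\mathcal{P}}P\left(\left|\sqrt{n}\,\Gamma_{F,P}(x|1)[\hat\theta-\theta_0,\hat\tau-\tau_0]-\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,1)\right|>\varepsilon\right)\\
&\quad=\sup_{P\in\mathcal{P}}P\left(\left|\sqrt{n}\,\Gamma_P(p_{1o}x|1)[\hat\beta-\beta_0,\hat\tau-\tau_0]-\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\psi_i(p_{1o}x;\beta_0,\tau_0,1)\right|>\varepsilon\right)\to 0,
\end{aligned}
\]

where EP[ψx,F(Zi, Di; θ0, τ0, 1)]=0 and

\[
\sup_{P\in\mathcal{P}}E_P\left[\sup_{x\in\mathcal{X}}\left|\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,1)\right|^{2+\eta}\right]=\sup_{P\in\mathcal{P}}E_P\left[\sup_{x\in\mathcal{X}}\left|x f_{P|D}(p_{1o}x|1)\big(I\{D_i=1\}-p_{1o}\big)+\psi_i(p_{1o}x;\beta_0,\tau_0,1)\right|^{2+\eta}\right]<\infty
\]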

by Assumption (p) (iii) and Assumption (b) (ii). It remains to verify Assumption 3 (iii):

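\[
\begin{aligned}
&E\left[\sup_{x\in\mathcal{X}:|x-x_1|\le\varepsilon}\left|\psi_{x,F}(Z_i,D_i;\theta_0,\tau_0,1)-\psi_{x_1,F}(Z_i,D_i;\theta_0,\tau_0,1)\right|^2\right]\\
&\quad=E\Bigg[\sup_{x\in\mathcal{X}:|x-x_1|\le\varepsilon}\Big|x f_{P|D}(p_{1o}x|1)\big(I\{D_i=1\}-p_{1o}\big)+\psi_i(p_{1o}x;\beta_0,\tau_0,1)\\
&\qquad\qquad-x_1 f_{P|D}(p_{1o}x_1|1)\big(I\{D_i=1\}-p_{1o}\big)-\psi_i(p_{1o}x_1;\beta_0,\tau_0,1)\Big|^2\Bigg]\\
&\quad\le C\,E\left[\sup_{x\in\mathcal{X}:|x-x_1|\le\varepsilon}|x-x_1|^2\right]+C\,E\left[\sup_{x\in\mathcal{X}:|x-x_1|\le\varepsilon}\left|\psi_i(p_{1o}x;\beta_0,\tau_0,1)-\psi_i(p_{1o}x_1;\beta_0,\tau_0,1)\right|^2\right]\\
&\quad\le C\varepsilon^2+C\varepsilon^{2s_1}
\end{aligned}
\]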

by Assumption (b) (iii) and Assumption (p) (iii).

Using Lemma A.1, we now obtain:

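\[
\begin{aligned}
\nu_n^{G,W}(w)&=-\frac{1}{p_{1o}}n^{-1/2}\sum_{j=1}^{n}\Big[I\{p(Z_j;\beta_0,\tau_0)/p_{1o}\le 1/w\}\,I\{D_j=1\}-p_{1o}[1-G_{W|D}(w|1)]\Big]\\
&\quad-\frac{1}{w}f_{P|D}(p_{1o}/w;\beta_0,\tau_0|1)(\hat p_1-p_{1o})-\Gamma_P(p_{1o}/w|1)[\hat\beta-\beta_0,\hat\tau-\tau_0]\\
&\quad+\frac{1}{p_{1o}}n^{-1/2}\sum_{j=1}^{n}\big[I\{D_j=1\}-p_{1o}\big][1-G_{W|D}(w|1)]+o_P(1)\\
&=n^{-1/2}\sum_{j=1}^{n}\Big\{-\frac{1}{p_{1o}}\Big[I\{p(Z_j;\beta_0,\tau_0)/p_{1o}\le 1/w\}-[1-G_{W|D}(w|1)]\Big]I\{D_j=1\}\\
&\qquad+\frac{1}{w}f_{P|D}(p_{1o}/w;\beta_0,\tau_0|1)\big[I\{D_j=1\}-p_{1o}\big]+\psi_j(p_{1o}/w;\beta_0,\tau_0,1)\Big\}+o_P(1)\\
&=n^{-1/2}\sum_{j=1}^{n}V_{j,W}(w;\theta_0,\tau_0)+o_P(1).
\end{aligned}
\]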

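Step 2 We show: \(\nu_n^{G,V}(v)=n^{-1/2}\sum_{j=1}^{n}V_{j,V}(v;\theta_0,\tau_0)+o_P(1)\).

Assumptions 1–3 in Lemma A.1 can be verified by following Step 1, so we provide only the main expressions. Note that \(\hat G_{V|D}(v|d)=\hat F_{P|D}\big(1-\tfrac{1-\hat p_1}{v}\,\big|\,d\big)\) and

\[
E_P\left[I\left\{p(Z_i;\beta,\tau)\le 1-\frac{1-p_1}{v}\right\}\,\Big|\,D_i=0\right]=F_{P|D}\left(1-\frac{1-p_1}{v};\beta,\tau\,\Big|\,0\right).
\]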

We let

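\[
\Gamma_{F,P}(x|d)[\theta-\theta_0,\tau-\tau_0]=\frac{1}{v}f_{P|D}\left(1-\frac{1-p_{1o}}{v}\,\Big|\,0\right)(p_1-p_{1o})+\Gamma_P\left(1-\frac{1-p_{1o}}{v}\,\Big|\,0\right)[\beta-\beta_0,\tau-\tau_0].
\]

Thus,

\[
\begin{aligned}
\nu_n^{G,V}(v)&=\sqrt{n}\left[\hat G_{V|D}(v|0)-G_{V|D}(v|0)\right]=\sqrt{n}\left[\hat F_{P|D}\left(1-\frac{1-\hat p_1}{v}\,\Big|\,0\right)-G_{V|D}(v|0)\right]\\
&=n^{-1/2}\sum_{j=1}^{n}\left[\frac{1}{\hat p_0}I\left\{p(Z_j;\beta_0,\tau_0)\le 1-\frac{1-\hat p_1}{v}\right\}I\{D_j=0\}-G_{V|D}(v|0)\right]\\
&=n^{-1/2}\sum_{j=1}^{n}\left[I\left\{p(Z_j;\beta_0,\tau_0)\le 1-\frac{1-\hat p_1}{v}\right\}I\{D_j=0\}-p_0G_{V|D}(v|0)\right]\frac{1}{\hat p_0}+\frac{\sqrt{n}(p_0-\hat p_0)}{\hat p_0}G_{V|D}(v|0)\\
&=n^{-1/2}\sum_{j=1}^{n}\left[I\left\{p(Z_j;\beta_0,\tau_0)\le 1-\frac{1-\hat p_1}{v}\right\}I\{D_j=0\}-p_0G_{V|D}(v|0)\right]\frac{1}{p_0}+\frac{\sqrt{n}(p_0-\hat p_0)}{p_0}G_{V|D}(v|0)+o_P(1)\\
&=n^{-1/2}\sum_{j=1}^{n}\frac{1}{1-p_{1o}}\left[I\left\{p(Z_j;\beta_0,\tau_0)\le 1-\frac{1-p_{1o}}{v}\right\}-G_{V|D}(v|0)\right]I\{D_j=0\}\\
&\quad+n^{-1/2}\sum_{j=1}^{n}\frac{1}{1-p_{1o}}\psi_j\left(1-\frac{1-p_{1o}}{v};\beta_0,\tau_0,0\right)\\
&\quad+n^{-1/2}\sum_{j=1}^{n}\frac{1}{v(1-p_{1o})}f_{P|D}\left(1-\frac{1-p_{1o}}{v}\,\Big|\,0\right)\big(I\{D_j=1\}-p_{1o}\big)+o_P(1)\\
&=n^{-1/2}\sum_{j=1}^{n}V_{j,V}(v;\theta_0,\tau_0)+o_P(1).
\end{aligned}
\]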

Step 3 Steps 1 and 2 imply: uniformly in P∈𝒫,

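\[
\left\{\sqrt{n}\left(\hat G_{W|D}(w|1)-G_{W|D}(w|1),\ \hat G_{V|D}(v|0)-G_{V|D}(v|0)\right):(w,v)\in\mathcal{W}\times\mathcal{V}\right\}\Rightarrow\left\{\nu_W(w|1),\nu_V(v|1):(w,v)\in\mathcal{W}\times\mathcal{V}\right\},
\]

where {νW(w|1), νV(v|1):(w, v)∈𝒲×𝒱} is a vector-valued Gaussian process on 𝒲×𝒱 with zero mean and a covariance kernel given by C((w1, v1),(w2, v2)). Finally, we obtain: uniformly in P∈𝒫,

\[
\begin{aligned}
&\sqrt{n}\left(c_1\hat\mu_1^L+c_2\hat\mu_1^U+c_3\hat\mu_0^L+c_4\hat\mu_0^U-\left[c_1\mu_1^L+c_2\mu_1^U+c_3\mu_0^L+c_4\mu_0^U\right]\right)\\
&\quad\Rightarrow c_1\int_0^{1-P_{01}}\frac{\nu_W}{g_{W|D}}\circ G_{W|D}^{-1}(u|1)\,du+c_2\int_{P_{01}}^{1}\frac{\nu_W}{g_{W|D}}\circ G_{W|D}^{-1}(u|1)\,du\\
&\qquad+c_3\int_0^{1-P_{00}}\frac{\nu_V}{g_{V|D}}\circ G_{V|D}^{-1}(u|0)\,du+c_4\int_{P_{00}}^{1}\frac{\nu_V}{g_{V|D}}\circ G_{V|D}^{-1}(u|0)\,du.
\end{aligned}
\]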

Q.E.D.

Proof of Theorem 6.2: We need to show that uniformly in P∈𝒫,

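\[
\left[c_1\int_0^{1-P_{00}}\hat G_{V/W|D}^{-1}(u|0)\,du+c_2\int_{P_{00}}^{1}\hat G_{V/W|D}^{-1}(u|0)\,du\right]
\]

is asymptotically normal for all constants c1, c2. It is sufficient to show that the process

\[
\left\{\sqrt{n}\left(\hat G_{V/W|D}(a|0)-G_{V/W|D}(a|0)\right):a\in\mathcal{A}\right\}
\]

converges weakly to a Gaussian process uniformly in P∈𝒫.

Step 1 We show: uniformly in a∈𝒜 and P∈𝒫,

\[
\sqrt{n}\left(\hat G_{V/W|D}(a|0)-G_{V/W|D}(a|0)\right)=n^{-1/2}\sum_{j=1}^{n}V_{j,V/W}(a;\theta_0,\tau_0)+o_P(1).
\]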

Let

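\[
X_i(\theta,\tau)=\frac{(1-p_1)\,p(Z_i;\beta,\tau)}{p_1\,[1-p(Z_i;\beta,\tau)]}.
\]

Then

\[
I\left\{p(Z_i;\hat\beta,\hat\tau)\le\frac{a\hat p_1}{1-(1-a)\hat p_1}\right\}=I\{X_i(\hat\theta,\hat\tau)\le a\}.
\]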
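The equivalence of the two indicator events follows from multiplying X ≤ a through by p1(1−p)>0 and collecting the terms in p, which gives p[1−(1−a)p1] ≤ a p1. As a quick numerical cross-check, the Python sketch below compares the two events in exact rational arithmetic over a grid of values with p and p1 in (0, 1) and a>0. The helper names `x_transform` and `threshold` and the grid are illustrative assumptions.

```python
from fractions import Fraction

def x_transform(p, p1):
    """X = (1 - p1) * p / (p1 * (1 - p))."""
    return (1 - p1) * p / (p1 * (1 - p))

def threshold(a, p1):
    """Cutoff on the propensity scale: a * p1 / (1 - (1 - a) * p1)."""
    return a * p1 / (1 - (1 - a) * p1)

# Exact rational grids avoid floating-point ties at the boundary.
grid = [Fraction(k, 20) for k in range(1, 20)]     # values of p and p1 in (0, 1)
a_values = [Fraction(k, 4) for k in range(1, 13)]  # values of a in (0, 3]

mismatches = sum(
    (p <= threshold(a, p1)) != (x_transform(p, p1) <= a)
    for p in grid for p1 in grid for a in a_values
)
```

A value of zero for `mismatches` means the two indicators agree at every grid point, including points where the inequality holds with equality.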

Note that

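\[
E_P\big[I\{X_i(\theta,\tau)\le x\}\mid D_i=0\big]=E_P\left[I\left\{p(Z_i;\beta,\tau)\le\frac{p_1x}{1-(1-x)p_1}\right\}\,\Big|\,D_i=0\right]=F_{P|D}\left(\frac{p_1x}{1-(1-x)p_1};\beta,\tau\,\Big|\,0\right).
\]

We have:

\[
\Gamma_{F,P}(x|d)[\theta-\theta_0,\tau-\tau_0]=\frac{x}{[1-(1-x)p_{1o}]^2}f_{P|D}\left(\frac{p_{1o}x}{1-(1-x)p_{1o}}\,\Big|\,0\right)(p_1-p_{1o})+\Gamma_P\left(\frac{p_{1o}x}{1-(1-x)p_{1o}}\,\Big|\,0\right)[\beta-\beta_0,\tau-\tau_0].
\]

Thus,

\[
\begin{aligned}
&\sqrt{n}\left(\hat G_{V/W|D}(a|0)-G_{V/W|D}(a|0)\right)=\sqrt{n}\left(\hat F_{P|D}\left(\frac{a\hat p_1}{1-(1-a)\hat p_1}\,\Big|\,0\right)-G_{V/W|D}(a|0)\right)\\
&\quad=n^{-1/2}\sum_{j=1}^{n}\frac{1}{1-p_{1o}}\left[I\left\{p(Z_j;\beta_0,\tau_0)\le\frac{ap_{1o}}{1-(1-a)p_{1o}}\right\}-G_{V/W|D}(a|0)\right]I\{D_j=0\}\\
&\qquad+n^{-1/2}\sum_{j=1}^{n}\frac{1}{1-p_{1o}}\psi_j\left(\frac{p_{1o}a}{1-(1-a)p_{1o}};\beta_0,\tau_0,0\right)\\
&\qquad+n^{-1/2}\sum_{j=1}^{n}\frac{a}{(1-p_{1o})[1-(1-a)p_{1o}]^2}f_{P|D}\left(\frac{p_{1o}a}{1-(1-a)p_{1o}}\,\Big|\,0\right)\big(I\{D_j=1\}-p_{1o}\big)+o_P(1)\\
&\quad=n^{-1/2}\sum_{j=1}^{n}V_{j,V/W}(a;\theta_0,\tau_0)+o_P(1).
\end{aligned}
\]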

Step 2 Step 1 implies:

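\[
\left\{\sqrt{n}\left(\hat G_{V/W|D}(a|0)-G_{V/W|D}(a|0)\right):a\in\mathcal{A}\right\}
\]

converges weakly to a Gaussian process νV/W(·) with zero mean and covariance kernel

\[
E\big(V_{j,V/W}(a_1;\theta_0,\tau_0)\,V_{j,V/W}(a_2;\theta_0,\tau_0)\big).
\]

By the Functional Delta method, we obtain:

\[
\sqrt{n}\left(\left[c_1\hat\mu_{0|1}^L+c_2\hat\mu_{0|1}^U\right]-\left[c_1\mu_{0|1}^L+c_2\mu_{0|1}^U\right]\right)\Rightarrow c_1\int_0^{1-P_{00}}\frac{\nu_{V/W}}{g_{V/W|D}}\circ G_{V/W|D}^{-1}(u|0)\,du+c_2\int_{P_{00}}^{1}\frac{\nu_{V/W}}{g_{V/W|D}}\circ G_{V/W|D}^{-1}(u|0)\,du.
\]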

Q.E.D.

References

Andrews, D. W. K. 1994. “Empirical Process Methods in Econometrics.” In Handbook of Econometrics, vol. IV, edited by R. F. Engle and D. L. McFadden. North-Holland, Amsterdam. doi:10.1016/S1573-4412(05)80006-6

Andrews, D. W. K. and G. Soares. 2010. “Inference for Parameters Defined by Moment Inequalities Using Generalized Moment Selection.” Econometrica 78: 119–157. doi:10.3982/ECTA7502

Bhattacharya, D. 2007. “Inference on Inequality from Household Survey Data.” Journal of Econometrics 137: 674–707. doi:10.1016/j.jeconom.2005.09.003

Chen, X., O. Linton, and I. van Keilegom. 2003. “Estimation of Semiparametric Models when the Criterion Function is Not Smooth.” Econometrica 71: 1591–1608. doi:10.1111/1468-0262.00461

Chernozhukov, V., H. Hong, and E. Tamer. 2007. “Parameter Set Inference in a Class of Econometric Models.” Econometrica 75: 1243–1284. doi:10.1111/j.1468-0262.2007.00794.x

Cho, W. and C. F. Manski. 2008. “Cross Level/Ecological Inference.” In Oxford Handbook of Political Methodology, edited by H. Brady, D. Collier, and J. Box-Steffensmeier, pp. 547–569. Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199286546.003.0024

Cross, P. J., and C. F. Manski. 1999. “Regressions, Short and Long.” Manuscript.

Cross, P. J., and C. F. Manski. 2002. “Regressions, Short and Long.” Econometrica 70 (1): 357–368. doi:10.1111/1468-0262.00279

DiNardo, J., N. Fortin, and T. Lemieux. 1996. “Labor Market Institutions and the Distribution of Wages, 1973–1992: A Semiparametric Approach.” Econometrica 64: 1001–1044. doi:10.2307/2171954

Dehejia, R., and S. Wahba. 1999. “Causal Effects in Non-Experimental Studies: Re-Evaluating the Evaluation of Training Programs.” Journal of the American Statistical Association 94: 1053–1062. doi:10.1080/01621459.1999.10473858

Duncan, O., and B. Davis. 1953. “An Alternative to Ecological Correlation.” American Sociological Review 18: 665–666. doi:10.2307/2088122

Fan, Y., and S. Park. 2009. “Partial Identification of the Distribution of Treatment Effects and its Confidence Sets.” Advances in Econometrics: Nonparametric Econometric Methods 25: 3–70. doi:10.1108/S0731-9053(2009)0000025004

Fan, Y., and S. Park. 2010. “Sharp Bounds on the Distribution of Treatment Effects and Their Statistical Inference.” Econometric Theory 26: 931–951. doi:10.1017/S0266466609990168

Fan, Y., and K. Song. 2011. “Confidence Sets for the Distribution of Treatment Effects with Covariates.” Working Paper.

Fan, Y., R. Sherman, and M. Shum. 2014. “Identifying Treatment Effects under Data Combination.” Econometrica 82: 811–822. doi:10.3982/ECTA10601

Firpo, S., N. Fortin, and T. Lemieux. 2010. “Decomposition Methods in Economics.” In Handbook of Labor Economics, edited by David Card and Orley Ashenfelter, 4. New York: North-Holland.

Frank, M., R. Nelson, and B. Schweizer. 1987. “Best-Possible Bounds on the Distribution of a Sum – a Problem of Kolmogorov.” Probability Theory and Related Fields 74: 199–211. doi:10.1007/BF00569989

Goldie, C. 1977. “Convergence Theorems for Empirical Lorenz Curves and their Inverses.” Advances in Applied Probability 9: 765–791. doi:10.1017/S0001867800029177

Goodman, L. 1953. “Ecological Regressions and Behavior of Individuals.” American Sociological Review 18: 663–664. doi:10.2307/2088121

Hahn, J. 1998. “On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects.” Econometrica 66: 315–331. doi:10.2307/2998560

Heckman, J., H. Ichimura, J. Smith, and P. Todd. 1998. “Characterizing Selection Bias Using Experimental Data.” Econometrica 66: 1017–1098. doi:10.2307/2999630

Hirano, K., G. W. Imbens, and G. Ridder. 2000. “Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score.” NBER Technical Working Papers 0251, National Bureau of Economic Research, Inc. doi:10.3386/t0251

King, G. 1997. A Solution to the Ecological Inference Problem: Reconstructing Individual Behavior from Aggregate Data. Princeton: Princeton University Press. doi:10.3886/ICPSR01132

Linton, O., E. Maasoumi, and Y.-J. Whang. 2005. “Consistent Testing for Stochastic Dominance Under General Sampling Schemes.” Review of Economic Studies 72: 735–765. doi:10.1111/j.1467-937X.2005.00350.x

Linton, O., K. Song, and Y.-J. Whang. 2010. “An Improved Bootstrap Test of Stochastic Dominance.” Journal of Econometrics 154: 186–202. doi:10.1016/j.jeconom.2009.08.002

Manski, C. F. 1990. “Nonparametric Bounds on Treatment Effects.” American Economic Review 80: 319–323.

Rosenbaum, P. R., and D. B. Rubin. 1983a. “Assessing Sensitivity to an Unobserved Binary Covariate in an Observational Study with Binary Outcome.” Journal of the Royal Statistical Society, Series B 45: 212–218.

Rosenbaum, P. R., and D. B. Rubin. 1983b. “The Central Role of the Propensity Score in Observational Studies for Causal Effects.” Biometrika 70: 41–55. doi:10.1093/biomet/70.1.41

Stoye, J. 2009. “More on Confidence Intervals for Partially Identified Parameters.” Econometrica 77: 1299–1315. doi:10.3982/ECTA7347

van der Vaart, A., and J. Wellner. 1996. Weak Convergence and Empirical Processes. Heidelberg: Springer Verlag. doi:10.1007/978-1-4757-2545-2

Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press.


Supplemental Material:

The online version of this article (DOI: 10.1515/jem-2015-0006) offers supplementary material, available to authorized users.


Published Online: 2015-7-9
Published in Print: 2016-1-1

©2016 by De Gruyter