Published by De Gruyter November 8, 2016

Testing Exogeneity of Multinomial Regressors in Count Data Models: Does Two-stage Residual Inclusion Work?

Andrea Geraci, Daniele Fabbri and Chiara Monfardini


We study a simple exogeneity test in count data models with possibly endogenous multinomial treatment. The test is based on Two Stage Residual Inclusion (2SRI), an estimation method which has been proved to be consistent for a general class of nonlinear parametric models. Results from a broad set of simulation experiments provide novel evidence on important features of this approach. We find differences in the finite sample performance of various likelihood-based tests, analyze their robustness to misspecification arising from neglected over-dispersion or from incorrect specification of the first stage model, and uncover that standardizing the variance of the first stage residuals leads to better results. An original application to testing the endogeneity status of insurance in a model of healthcare demand corroborates our Monte Carlo findings.

Correction note

[Correction added after online publication 8 November 2016: For consistency, $q_{ij}$ was changed to $q_{ij}^{*}$ on p. 4, line 35 and p. 5, line 10. Also, the formatting of $q_i$ was corrected from italics to bold in eqs. 9 and 10.]

Appendix 1

Table A1: Summary Statistics of Dependent Variables Generated in Monte Carlo Study.

Multinomial treatment dummies
Count variable – yi

Summary statistics are computed on the 5000 observations of the first replication of the experiment.

Appendix 2

Figure A1: Empirical Power Plot of Wald Tests using Raw and Standardized Residuals.

Figure A2: Empirical Power Plot of LR and LM Tests using Raw and Standardized Residuals.

Figure A3: True Latent Factors against Estimated Residuals.

Appendix 3

Table A2: NB2 Estimator with Correctly Specified Residuals: Rejection Frequencies of Exogeneity Tests.

Columns: Nom. size; Raw residuals (Emp. size, Emp. power); Standardized residuals (Emp. size, Emp. power).
Wald test (Murphy Topel correction)
Wald test (no correction)
Likelihood ratio test
Lagrange multiplier test

Number of replications of the Monte Carlo experiment: R = 5,000; sample size for each replication: N = 5,000. Raw residuals and standardized residuals are computed after estimation of the first stage equations using, respectively, $\hat{q}_{ij} = d_{ij} - \hat{p}_{ij}$ and $\hat{q}_{ij} = \hat{p}_{ij}^{-1/2}(1 - \hat{p}_{ij})^{-1/2}(d_{ij} - \hat{p}_{ij})$, for $j = 0, 1, 2$.
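The two residual definitions in the note to Table A2 can be illustrated with a minimal sketch. It assumes that the raw residual is the treatment dummy minus its predicted first stage probability, and that the standardized residual divides this difference by the square root of p(1 - p). The function name `first_stage_residuals` and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def first_stage_residuals(d, p_hat):
    """Return (raw, standardized) residuals for one observation.

    d     -- list of 0/1 treatment dummies, one per alternative j
    p_hat -- predicted first stage probabilities, same length as d
             (illustrative values; in practice they would come from
             an estimated multinomial choice model)
    """
    # Raw residual: dummy minus predicted probability.
    raw = [d_j - p_j for d_j, p_j in zip(d, p_hat)]
    # Standardized residual: raw residual scaled by [p (1 - p)]^(-1/2).
    std = [r / math.sqrt(p_j * (1.0 - p_j)) for r, p_j in zip(raw, p_hat)]
    return raw, std


# One individual who chose alternative j = 1 out of j = 0, 1, 2.
d = [0, 1, 0]
p_hat = [0.2, 0.5, 0.3]
raw, std = first_stage_residuals(d, p_hat)
```

Because the predicted probabilities sum to one, the raw residuals sum to zero across alternatives, which is why one residual is typically dropped when the residuals enter a second stage regression.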


Supplemental Material:

The online version of this article offers supplementary material, available to authorized users.

Published Online: 2016-11-8

©2018 Walter de Gruyter GmbH, Berlin/Boston
