Abstract
Linear regression with measurement error in the covariates is a heavily studied topic; however, the statistics and econometrics literature is almost silent on estimating a multi-equation model with measurement error. This paper considers a seemingly unrelated regression model with measurement error in the covariates and introduces two novel estimation methods: a pure Bayesian algorithm (based on Markov chain Monte Carlo techniques) and its mean field variational Bayes (MFVB) approximation. The MFVB method has the added advantage of being computationally fast and can handle big data. An issue pertinent to measurement error models is parameter identification, which is resolved by employing a prior distribution on the measurement error variance. The methods are shown to perform well in multiple simulation studies, where we analyze how the posterior estimates respond to different values of the reliability ratio or the variance of the true unobserved quantity used in the data generating process. The paper further implements the proposed algorithms in an application drawn from the health literature and shows that modeling measurement error in the data can improve model fit.
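To make the role of the reliability ratio concrete, the sketch below simulates a single-equation regression with classical additive measurement error and compares the naive least squares slope with the true coefficient. It is only an illustration under stated assumptions, not the paper's seemingly unrelated regression model or either of its estimators: the function name `attenuation_demo`, the parameter values, and the normal data generating process are all hypothetical choices made here.

```python
import numpy as np


def attenuation_demo(beta=2.0, var_true=1.0, var_error=0.25, n=200_000, seed=0):
    """Illustrative (hypothetical) single-equation simulation.

    The true covariate x is observed only as w = x + u, where u is
    classical measurement error independent of x and of the outcome noise.
    """
    rng = np.random.default_rng(seed)

    # True, unobserved covariate and its error-contaminated measurement.
    x_true = rng.normal(0.0, np.sqrt(var_true), n)
    w = x_true + rng.normal(0.0, np.sqrt(var_error), n)

    # Outcome depends on the true covariate, not on the noisy measurement.
    y = beta * x_true + rng.normal(0.0, 1.0, n)

    # Reliability ratio: share of the observed variance due to the true covariate.
    reliability = var_true / (var_true + var_error)

    # Naive least squares slope from regressing y on the noisy measurement w.
    cov_matrix = np.cov(w, y)
    naive_slope = cov_matrix[0, 1] / cov_matrix[0, 0]
    return reliability, naive_slope


if __name__ == "__main__":
    reliability, naive_slope = attenuation_demo()
    print(f"reliability ratio: {reliability:.3f}")
    print(f"naive slope:       {naive_slope:.3f}")
    print(f"reliability*beta:  {reliability * 2.0:.3f}")
```

Under these assumptions the naive slope settles near the reliability ratio times the true coefficient rather than the coefficient itself. Because the data alone cannot separate the measurement error variance from the variance of the true covariate, some outside information is needed for identification; in the paper that information comes from a prior on the measurement error variance.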
Funding source: Science and Engineering Research Board, Department of Science and Technology, Government of India
Award Identifier / Grant number: MTR/2019/000033/MS
Acknowledgment
We dedicate this article to the memory of Viren K. Srivastava. We thank the editor Antoine Chambaz, associate editor Laura Sangalli and an anonymous referee for their valuable comments. We are also grateful to David Brownstone, Ivan Jeliazkov, Dale Poirier and the participants of the research seminar (2015) at the University of California, Irvine for a variety of helpful comments and suggestions on an earlier version. The last author (Shalabh) acknowledges funding under the MATRICS scheme from the Science and Engineering Research Board, Department of Science and Technology, Government of India.
Author contribution: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Conflict of interest statement: The authors declare no conflicts of interest regarding this article.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/ijb-2019-0120).
© 2020 Walter de Gruyter GmbH, Berlin/Boston