
The International Journal of Biostatistics

Ed. by Chambaz, Antoine / Hubbard, Alan E. / van der Laan, Mark J.



Simple Quasi-Bayes Approach for Modeling Mean Medical Costs

Grace Yoon (ORCID iD: https://orcid.org/0000-0003-3263-1352) / Wenxin Jiang / Lei Liu / Ya-Chen Tina Shih
Published Online: 2019-06-05 | DOI: https://doi.org/10.1515/ijb-2018-0122


Several statistical features of health care costs, such as heteroscedasticity and severe skewness, make it challenging to estimate or predict medical costs. When the interest is in modeling the mean cost, it is desirable to make no assumption about the density function or higher-order moments. Another challenge in developing cost prediction models is the presence of many covariates, which makes variable selection necessary to balance prediction accuracy against model simplicity. We propose Spike-or-Slab priors for Bayesian variable selection based on asymptotically normal estimates of the full model parameters; these estimates are consistent as long as the assumption on the mean cost is satisfied. In addition, the scope of the model search can be reduced by ranking the Z-statistics. The method has four advantages at once: it is robust (it avoids assumptions on the density function or higher-order moments), parsimonious (it performs variable selection), informative (its Bayesian flavor allows posterior probabilities of candidate models to be compared) and efficient (Z-ranking reduces the scope of the model search). We apply this method to the Medical Expenditure Panel Survey dataset.
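The general idea of scoring Z-ranked candidate models from full-model estimates can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact prior or scoring rule, so the zero-mean Gaussian slab with variance `tau2`, the prior inclusion probability `w`, and the function names `log_marginal` and `z_ranked_posterior` are all hypothetical. The inputs `beta_hat` and `V` stand for the full-model estimate and its (for example, sandwich) covariance estimate, and the numbers in the usage lines are made up.

```python
import numpy as np

def log_marginal(beta_hat, V, included, tau2):
    """Log density of beta_hat under N(0, V + tau2 * D), where D marks included
    coefficients. Assumes beta_hat ~ N(beta, V), excluded coefficients exactly 0
    (spike), and included coefficients ~ N(0, tau2) (slab). Hypothetical choice."""
    p = len(beta_hat)
    slab = np.zeros(p)
    slab[included] = tau2
    S = V + np.diag(slab)
    _, logdet = np.linalg.slogdet(S)
    quad = beta_hat @ np.linalg.solve(S, beta_hat)
    return -0.5 * (p * np.log(2 * np.pi) + logdet + quad)

def z_ranked_posterior(beta_hat, V, tau2=10.0, w=0.5):
    """Rank covariates by |Z| and score only the p + 1 nested models that keep
    the top-k covariates, instead of all 2**p subsets."""
    p = len(beta_hat)
    z = beta_hat / np.sqrt(np.diag(V))
    order = np.argsort(-np.abs(z))  # largest |Z| first
    log_post = np.array([
        k * np.log(w) + (p - k) * np.log(1 - w)
        + log_marginal(beta_hat, V, order[:k], tau2)
        for k in range(p + 1)
    ])
    probs = np.exp(log_post - log_post.max())  # normalize on the log scale
    return order, probs / probs.sum()

# Illustrative (made-up) inputs: two strong effects and one weak effect.
beta_hat = np.array([5.0, 0.1, -4.0])
V = 0.25 * np.eye(3)
order, probs = z_ranked_posterior(beta_hat, V)
best_k = int(np.argmax(probs))
print("ranking:", order.tolist(), "best model keeps top", best_k, "covariates")
```

Restricting the search to nested models along the Z-ranking means only p + 1 candidates are scored, which is the sense in which ranking reduces the search scope in this sketch.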

Keywords: Spike-or-Slab prior; variable selection; sandwich variance estimator; health econometrics



About the article

Received: 2018-01-05

Accepted: 2019-04-26

Published Online: 2019-06-05

This work was supported by the Agency for Healthcare Research and Quality (Grant Number: R01 HS 020263, Funder Id: http://dx.doi.org/10.13039/100000133) and National Cancer Institute (Grant Number: T32-CA090301, Funder Id: http://dx.doi.org/10.13039/100000054).

Citation Information: The International Journal of Biostatistics, 20180122, ISSN (Online) 1557-4679, DOI: https://doi.org/10.1515/ijb-2018-0122.


© 2019 Walter de Gruyter GmbH, Berlin/Boston.
