Published by De Gruyter, September 1, 2017

Confidence intervals for heritability via Haseman-Elston regression

Tamar Sofer

Abstract

Heritability is the proportion of phenotypic variance in a population that is attributable to individual genotypes. It is an important measure in both evolutionary biology and medicine, and is routinely estimated and reported in genetic epidemiology studies. In population-based genome-wide association studies (GWAS), mixed models are used to estimate variance components, from which a heritability estimate is obtained. The estimated heritability is the proportion of the model’s total variance that is due to the genetic relatedness matrix (kinship measured from genotypes). Current practice is to estimate the precision of the heritability estimate either by bootstrapping, which is slow, or by a normal asymptotic approximation; however, this approximation fails to hold near the boundaries of the parameter space or when the sample size is small. In this paper we propose to estimate variance components via Haseman-Elston regression, derive the asymptotic distribution of the variance components and of the proportions of variance, and use these distributions to construct confidence intervals (CIs). We further develop the method to obtain unbiased variance component estimators and to construct CIs by meta-analyzing information from multiple studies. We demonstrate our approach on data from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).
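As a rough illustration of the point-estimation step only, the sketch below fits a Haseman-Elston-type regression by ordinary least squares: products of centered phenotypes are regressed on the corresponding kinship entries and on an indicator for the diagonal. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names `haseman_elston` and `simulate_sib_pairs` are hypothetical, the only covariate handled is the mean, the choice to include the diagonal pairs is one formulation among several, and the CI construction and meta-analysis described in the abstract are not shown.

```python
import numpy as np


def haseman_elston(y, K):
    """Hypothetical helper: least-squares fit of r_i * r_j on (K_ij, 1{i == j}).

    Assumes the working model Cov(y) = sigma2_g * K + sigma2_e * I.
    Returns (sigma2_g, sigma2_e, h2) with h2 = sigma2_g / (sigma2_g + sigma2_e).
    """
    y = np.asarray(y, dtype=float)
    K = np.asarray(K, dtype=float)
    n = y.shape[0]
    r = y - y.mean()                     # center; no other covariates in this sketch
    iu = np.triu_indices(n)              # each unordered pair once, diagonal included
    response = np.outer(r, r)[iu]        # phenotype cross-products r_i * r_j
    design = np.column_stack([K[iu], np.eye(n)[iu]])
    (sigma2_g, sigma2_e), *_ = np.linalg.lstsq(design, response, rcond=None)
    h2 = sigma2_g / (sigma2_g + sigma2_e)
    return float(sigma2_g), float(sigma2_e), float(h2)


def simulate_sib_pairs(n_pairs, sigma2_g, sigma2_e, seed=0):
    """Hypothetical toy data: independent pairs with kinship 0.5 within a pair."""
    rng = np.random.default_rng(seed)
    n = 2 * n_pairs
    shared = np.repeat(rng.standard_normal(n_pairs), 2)  # one draw per pair
    own = rng.standard_normal(n)                         # one draw per individual
    g = np.sqrt(sigma2_g) * (shared + own) / np.sqrt(2.0)
    y = g + np.sqrt(sigma2_e) * rng.standard_normal(n)
    K = np.kron(np.eye(n_pairs), np.array([[1.0, 0.5], [0.5, 1.0]]))
    return y, K


if __name__ == "__main__":
    y, K = simulate_sib_pairs(n_pairs=1000, sigma2_g=0.6, sigma2_e=0.4)
    s2g, s2e, h2 = haseman_elston(y, K)
    print(f"sigma2_g={s2g:.3f}  sigma2_e={s2e:.3f}  h2={h2:.3f}")
```

Because the fit is unconstrained, the estimates can fall outside the parameter space (for example, a negative genetic variance) in small samples. The sketch returns them as they are rather than truncating them.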

Acknowledgement

The author thanks Dr. Bruce Weir and Dr. Bill Hill for reviewing earlier versions of the manuscript, the anonymous reviewers, and the staff and participants of HCHS/SOL for their important contributions. This work was supported in part by NHLBI HHSN268201300005C. The Hispanic Community Health Study/Study of Latinos was carried out as a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (N01-HC65233), University of Miami (N01-HC65234), Albert Einstein College of Medicine (N01-HC65235), Northwestern University (N01-HC65236), and San Diego State University (N01-HC65237). The following Institutes/Centers/Offices contribute to the HCHS/SOL through a transfer of funds to the NHLBI: National Institute on Minority Health and Health Disparities, National Institute on Deafness and Other Communication Disorders, National Institute of Dental and Craniofacial Research, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Neurological Disorders and Stroke, NIH Institution-Office of Dietary Supplements.

Supplemental Material:

The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/sagmb-2016-0076).

Published Online: 2017-9-1
Published in Print: 2017-9-26

©2017 Walter de Gruyter GmbH, Berlin/Boston