Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.

Volume 15, Issue 6 (Dec 2016)

Tree-based quantitative trait mapping in the presence of external covariates

Katherine L. Thompson (corresponding author)
  • Department of Statistics, University of Kentucky, Lexington, KY, United States of America
Catherine R. Linnen
  • Department of Biology, University of Kentucky, Lexington, KY, United States of America
Laura Kubatko
  • Departments of Statistics and Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH, United States of America
Published Online: 2016-11-22 | DOI: https://doi.org/10.1515/sagmb-2015-0107

Abstract

A central goal in the biological and biomedical sciences is to identify the molecular basis of variation in morphological and behavioral traits. Over the last decade, improvements in sequencing technologies, coupled with the active development of association mapping methods, have made it possible to link single nucleotide polymorphisms (SNPs) to quantitative traits. However, a major limitation of existing methods is that they are often unable to handle complex but biologically realistic scenarios. Previous work showed that the performance of association mapping methods can be improved by using the evolutionary history within each SNP to estimate the covariance structure among randomly sampled individuals. Here, we propose a method that can analyze a variety of data types, such as data that include external covariates, while considering the evolutionary history among SNPs. Existing methods either model these relationships at a computational cost or fail to model them altogether. By considering the broad-scale relationships among SNPs, the proposed approach is both computationally feasible and informed by the evolutionary history among SNPs. We show that incorporating an approximate covariance structure during the analysis of complex data sets increases performance in quantitative trait mapping, and we apply the proposed method to deer mouse data.

Keywords: coalescent theory; genome-wide association study (GWAS); phylogenetic covariance; quantitative trait mapping (QTM); single nucleotide polymorphisms (SNPs)
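
The idea of letting evolutionary history inform the covariance among sampled individuals can be illustrated with a small sketch. This is not the authors' method. It assumes a known rooted tree with branch lengths and uses the common Brownian-motion convention in which the covariance between two individuals equals the branch length they share on their paths from the root. The node layout `(label, branch_length, children)`, the function name `phylo_covariance`, and the example tree are all hypothetical.

```python
def phylo_covariance(tree):
    """Build a tree-based covariance matrix.

    Each node is a tuple (label, branch_length, children); a leaf has an
    empty list of children. Entry [i][j] of the returned matrix is the
    branch length shared by leaves i and j on their paths from the root
    (an assumed Brownian-motion style convention, not the paper's model).
    """
    names = []  # leaf labels in traversal order
    cov = {}    # (leaf, leaf) -> shared branch length

    def walk(node, depth_above):
        label, length, children = node
        depth = depth_above + length  # distance from the root to this node
        if not children:
            names.append(label)
            cov[(label, label)] = depth  # variance: full root-to-leaf length
            return [label]
        groups = [walk(child, depth) for child in children]
        # Leaves in different child subtrees last share ancestry at this
        # node, so their covariance is this node's depth.
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                for x in groups[a]:
                    for y in groups[b]:
                        cov[(x, y)] = cov[(y, x)] = depth
        return [leaf for group in groups for leaf in group]

    walk(tree, 0.0)
    matrix = [[cov[(r, c)] for c in names] for r in names]
    return names, matrix


# Hypothetical three-individual tree: A and B are close relatives, C is not.
example_tree = (
    "root", 0.0, [
        ("AB", 1.0, [("A", 0.5, []), ("B", 0.5, [])]),
        ("C", 1.5, []),
    ],
)

names, matrix = phylo_covariance(example_tree)
for name, row in zip(names, matrix):
    print(name, row)
```

A matrix of this kind could serve as the error covariance in a generalized least squares regression of a trait on a SNP genotype and any external covariates; that pairing is an illustration of the general idea rather than a description of the published procedure.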

About the article

Published in Print: 2016-12-01


Funding Source: National Science Foundation

Award identifier / Grant number: DEB-1257739

The authors would like to thank Hopi Hoekstra for graciously permitting us to re-analyze the deer mouse data. In addition, we would like to thank the University of Kentucky College of Arts & Sciences for the use of their computational cluster for simulation and real data analysis, as well as the University of Kentucky High Performance Computing Center for the use of the supercomputer for empirical data analysis. Lastly, we would like to thank the reviewers for their insightful feedback, which greatly improved this manuscript. This material is based, in part, upon work supported by the National Science Foundation under Grant No. DEB-1257739 (to CRL).


Citation Information: Statistical Applications in Genetics and Molecular Biology, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2015-0107.
