
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido

Volume 17, Issue 3

Multi-locus data distinguishes between population growth and multiple merger coalescents

Jere Koskela (ORCID iD: http://orcid.org/0000-0002-2836-8777)
Published Online: 2018-06-13 | DOI: https://doi.org/10.1515/sagmb-2017-0011

Abstract

We introduce a low-dimensional function of the site frequency spectrum that is tailor-made for distinguishing coalescent models with multiple mergers from Kingman coalescent models with population growth, and use this function to construct a hypothesis test between these model classes. The null and alternative sampling distributions of the statistic are intractable, but its low dimensionality renders them amenable to Monte Carlo estimation. We construct kernel density estimates of the sampling distributions based on simulated data, and show that the resulting hypothesis test dramatically improves on the statistical power of a current state-of-the-art method. A key reason for this improvement is the use of multi-locus data, in particular averaging observed site frequency spectra across unlinked loci to reduce sampling variance. We also demonstrate the robustness of our method to nuisance and tuning parameters. Finally, we show that the same kernel density estimates can be used to conduct parameter estimation, and argue that our method is readily generalisable for applications in model selection, parameter inference and experimental design.

Keywords: kernel density estimation; multiple merger coalescent; population growth; site frequency spectrum; statistical power

MSC 2010: Primary: 92D10; Secondary: 62M02; 62F03
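
The workflow outlined in the abstract (average site frequency spectra over unlinked loci, estimate the null and alternative sampling distributions of a low-dimensional summary by kernel density estimation on simulated data, then test) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the two-number summary (singleton fraction and lumped tail fraction), the toy simulator that draws sites from fixed class probabilities instead of simulating coalescent genealogies, the spectra `GROWTH` and `MULTI`, the product Gaussian kernel with a Scott-type bandwidth, and all function names.

```python
import math
import random

# Illustrative expected spectra for six frequency classes (made-up numbers).
GROWTH = [0.55, 0.20, 0.10, 0.07, 0.05, 0.03]  # stand-in for the null model
MULTI = [0.55, 0.15, 0.08, 0.07, 0.07, 0.08]   # stand-in for the alternative

def mean_normalised_sfs(loci):
    """Pool site frequency spectra over loci and normalise to sum to one."""
    totals = [sum(col) for col in zip(*loci)]
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("no segregating sites in any locus")
    return [t / grand_total for t in totals]

def summary(loci, tail_start=3):
    """Hypothetical 2-d summary: singleton fraction and lumped tail fraction."""
    sfs = mean_normalised_sfs(loci)
    return (sfs[0], sum(sfs[tail_start:]))

def simulate_loci(class_probs, n_loci, sites_per_locus, rng):
    """Toy simulator: each site lands in a frequency class independently."""
    classes = range(len(class_probs))
    loci = []
    for _ in range(n_loci):
        counts = [0] * len(class_probs)
        for c in rng.choices(classes, weights=class_probs, k=sites_per_locus):
            counts[c] += 1
        loci.append(counts)
    return loci

def fit_kde(points):
    """Product Gaussian kernel density estimate with Scott-type bandwidths."""
    n, d = len(points), len(points[0])
    bandwidths = []
    for j in range(d):
        col = [p[j] for p in points]
        mu = sum(col) / n
        sd = math.sqrt(sum((x - mu) ** 2 for x in col) / (n - 1))
        bandwidths.append(max(sd, 1e-9) * n ** (-1.0 / (d + 4)))

    def density(x):
        total = 0.0
        for p in points:
            k = 1.0
            for j in range(d):
                z = (x[j] - p[j]) / bandwidths[j]
                k *= math.exp(-0.5 * z * z) / (bandwidths[j] * math.sqrt(2 * math.pi))
            total += k
        return total / n

    return density

def estimate_power(null_probs, alt_probs, n_loci, n_sims=200, alpha=0.05, seed=1):
    """Monte Carlo power of a KDE-based likelihood ratio test at level alpha."""
    rng = random.Random(seed)

    def draw(probs):
        return [summary(simulate_loci(probs, n_loci, 30, rng)) for _ in range(n_sims)]

    # Density estimates are fitted on training simulations from each model.
    f_null, f_alt = fit_kde(draw(null_probs)), fit_kde(draw(alt_probs))

    def log_ratio(x):
        return math.log(max(f_alt(x), 1e-300)) - math.log(max(f_null(x), 1e-300))

    # Critical value: empirical (1 - alpha) quantile under fresh null draws.
    null_stats = sorted(log_ratio(x) for x in draw(null_probs))
    critical = null_stats[min(n_sims - 1, int((1 - alpha) * n_sims))]
    # Power: fraction of fresh alternative draws that exceed the critical value.
    return sum(log_ratio(x) > critical for x in draw(alt_probs)) / n_sims

if __name__ == "__main__":
    for n_loci in (1, 20):
        print(n_loci, "loci: estimated power", estimate_power(GROWTH, MULTI, n_loci))
```

Running the script prints an estimated power for one locus and for twenty loci. In this toy setting one would expect the multi-locus estimate to be higher, since pooling spectra across loci shrinks the sampling variance of the summary, but the numbers carry no meaning beyond the illustration.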

About the article

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 17, Issue 3, 20170011, ISSN (Online) 1544-6115, DOI: https://doi.org/10.1515/sagmb-2017-0011.

©2018 Walter de Gruyter GmbH, Berlin/Boston.
