
DNA Barcodes

Ed. by Mitchell, Andrew

1 Issue per year

Emerging Science

Open Access

An exploration of sufficient sampling effort to describe intraspecific DNA barcode haplotype diversity: examples from the ray-finned fishes (Chordata: Actinopterygii)

Jarrett D. Phillips
  • Centre for Biodiversity Genomics, Department of Integrative Biology, University of Guelph, Ontario, N1G 2W1 Canada
Rodger A. Gwiazdowski
Daniel Ashlock
  • Department of Mathematics and Statistics, University of Guelph, Guelph, Ontario, N1G 2W1 Canada
Robert Hanner
  • Corresponding author
  • Centre for Biodiversity Genomics, Department of Integrative Biology, University of Guelph, Ontario, N1G 2W1 Canada
Published Online: 2015-11-26 | DOI: https://doi.org/10.1515/dna-2015-0008


Estimating appropriate sample sizes to measure species abundance and richness is a fundamental problem for most biodiversity research. In this study, we explore a method to measure sampling sufficiency based on haplotype diversity in the ray-finned fishes (Animalia: Chordata: Actinopterygii). To do this, we apply linear regression and hypothesis testing, carried out in the statistical platform R, to haplotype accumulation curves built from DNA barcodes for 18 species of fishes. We use a simple mathematical model to estimate sampling sufficiency from a sample-number based prediction of intraspecific haplotype diversity, given an assumption of equal haplotype frequencies. Our model finds that haplotype diversity for most of the 18 fish species remains largely unsampled, and this appears to be a result of small sample sizes. Lastly, we discuss how our overly simple model may be a useful starting point to develop future estimators for intraspecific sampling sufficiency in studies using DNA barcodes.
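
To make the two central quantities concrete, the following is a minimal Python sketch of a permutation-based haplotype accumulation curve and of the standard Chao1 richness estimator named in the keywords. It is not the authors' R implementation and does not reproduce their sampling-sufficiency model; the function names `accumulation_curve` and `chao1` and the ten-specimen sample are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def chao1(haplotype_labels):
    """Chao1 lower-bound estimate of total haplotype richness."""
    counts = Counter(haplotype_labels)
    observed = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    doubletons = sum(1 for c in counts.values() if c == 2)
    if doubletons > 0:
        return observed + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form, used when no haplotype occurs exactly twice.
    return observed + singletons * (singletons - 1) / 2


def accumulation_curve(haplotype_labels, permutations=1000, seed=0):
    """Mean number of distinct haplotypes seen after 1..N specimens,
    averaged over random orderings of the specimens."""
    rng = random.Random(seed)
    labels = list(haplotype_labels)
    totals = [0] * len(labels)
    for _ in range(permutations):
        rng.shuffle(labels)
        seen = set()
        for i, label in enumerate(labels):
            seen.add(label)
            totals[i] += len(seen)
    return [t / permutations for t in totals]


# Hypothetical sample (illustration only): 10 specimens, 5 observed haplotypes.
sample = ["h1"] * 5 + ["h2"] * 2 + ["h3", "h4", "h5"]
curve = accumulation_curve(sample)
estimate = chao1(sample)
```

A curve that is still rising at its last point, together with a Chao1 estimate well above the observed haplotype count, would suggest that more specimens are needed before the haplotype diversity of a species can be considered sampled.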

This article offers supplementary material which is provided at the end of the article.

Keywords: Chao1 abundance estimator; DNA barcoding; haplotype accumulation curve; method of moments



About the article

Received: 2015-02-26

Accepted: 2015-06-09

Published Online: 2015-11-26

Published in Print: 2015-01-01

Citation Information: DNA Barcodes, ISSN (Online) 2299-1077, DOI: https://doi.org/10.1515/dna-2015-0008.


© 2015. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License (CC BY-NC-ND 3.0).
