
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido


A novel method to accurately calculate statistical significance of local similarity analysis for high-throughput time series

Fang Zhang / Ang Shan / Yihui Luan
Published Online: 2018-11-17 | DOI: https://doi.org/10.1515/sagmb-2018-0019

Abstract

In recent years, a large amount of time series microbial community data has been produced in molecular biological studies, especially in metagenomics. Among the statistical methods for time series, local similarity analysis is used in a wide range of environments to capture potential local and time-shifted associations that cannot be detected by traditional correlation analysis. Initially, the permutation test was widely applied to obtain the statistical significance of local similarity analysis. More recently, a theoretical method has also been developed for this purpose. However, all these methods assume that the time series are independent and identically distributed. In this paper, we propose a new approach based on the moving block bootstrap to approximate the statistical significance of local similarity scores for dependent time series. Simulations show that our method controls the type I error rate reasonably well, while the theoretical approximation and the permutation test perform less well. Finally, our method is applied to human and marine microbial community datasets, indicating that it can identify potential relationships among operational taxonomic units (OTUs) and significantly decrease the rate of false positives.

This article offers supplementary material which is provided at the end of the article.

Keywords: local similarity analysis; moving block bootstrap; statistical significance
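
The idea summarized in the abstract, a moving block bootstrap p-value for a local similarity score, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions that are not stated on this page: the series are z-score standardized, the score is a simplified dynamic-programming local similarity with a maximum delay, and the null distribution is built by resampling the two series independently in overlapping blocks. The function names and the parameters `max_delay`, `block_len` and `n_boot` are hypothetical.

```python
import random


def standardize(v):
    """Z-score a series (assumption: the paper may use another normalization)."""
    n = len(v)
    mean = sum(v) / n
    sd = (sum((a - mean) ** 2 for a in v) / n) ** 0.5
    return [(a - mean) / sd for a in v] if sd > 0 else [0.0] * n


def local_similarity(x, y, max_delay):
    """Simplified local similarity score: the best-scoring pair of aligned
    subsegments (positive or negative association) with |i - j| <= max_delay,
    divided by the series length."""
    n = len(x)
    best = 0.0
    for sign in (1.0, -1.0):
        prev = [0.0] * (n + 1)
        for i in range(1, n + 1):
            cur = [0.0] * (n + 1)
            for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
                # Extend the diagonal alignment, or restart from zero.
                cur[j] = max(0.0, prev[j - 1] + sign * x[i - 1] * y[j - 1])
                best = max(best, cur[j])
            prev = cur
    return best / n


def moving_block_resample(v, block_len, rng):
    """Concatenate randomly chosen overlapping blocks until length n is reached."""
    n = len(v)
    out = []
    while len(out) < n:
        start = rng.randrange(n - block_len + 1)
        out.extend(v[start:start + block_len])
    return out[:n]


def mbb_p_value(x, y, max_delay=3, block_len=5, n_boot=200, seed=0):
    """Approximate a p-value for the local similarity score. Resampling x and y
    independently is an assumption meant to mimic 'no association' while
    keeping short-range dependence within each series."""
    rng = random.Random(seed)
    x, y = standardize(x), standardize(y)
    observed = local_similarity(x, y, max_delay)
    exceed = 0
    for _ in range(n_boot):
        xb = moving_block_resample(x, block_len, rng)
        yb = moving_block_resample(y, block_len, rng)
        if local_similarity(xb, yb, max_delay) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    gen = random.Random(42)
    a = [gen.gauss(0, 1) for _ in range(40)]
    b = [gen.gauss(0, 1) for _ in range(40)]
    score, p = mbb_p_value(a, b)
    print(f"score={score:.3f}, p={p:.3f}")
```

Resampling blocks rather than single time points is what distinguishes this from a plain permutation test: adjacent observations stay together, so autocorrelation within each series is partly preserved in the bootstrap replicates. The block length is a tuning choice that this sketch leaves to the user.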


About the article


The research was supported by the Natural Science Foundation of China (Funder Id: 10.13039/501100001809; grants 11371227, 61432010 and 11626247).


Citation Information: Statistical Applications in Genetics and Molecular Biology, 20180019, ISSN (Online) 1544-6115, DOI: https://doi.org/10.1515/sagmb-2018-0019.

©2018 Walter de Gruyter GmbH, Berlin/Boston.
