Published by De Gruyter, September 19, 2018

Assessing genome-wide significance for the detection of differentially methylated regions

Christian M. Page ORCID logo, Linda Vos, Trine B. Rounge, Hanne F. Harbo and Bettina K. Andreassen

Abstract

DNA methylation plays an important role in human health and disease, and methods for identifying differentially methylated regions are of increasing interest. Statistical methods that properly address multiple testing, i.e. that control genome-wide significance for differentially methylated regions, are currently lacking. We introduce a scan statistic (DMRScan) that overcomes this limitation. We benchmark DMRScan against two well-established methods (bumphunter and DMRcate) in a simulation study based on real methylation data. An implementation of DMRScan is available from Bioconductor. Our method has higher power than the alternative methods across the simulation scenarios, particularly for small effect sizes. DMRScan also offers greater flexibility in statistical modeling and can be used with more complex designs than current methods. DMRScan is the first dynamic approach that properly addresses the multiple-testing challenges in identifying differentially methylated regions. It outperformed the alternative methods in terms of power while keeping the false discovery rate controlled.
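To make the idea of a sliding-window scan statistic concrete, the following is a minimal sketch in Python and not the DMRScan implementation. It assumes per-site test statistics are already available as z-scores, uses windows of k consecutive sites, and combines them with a Stouffer-type sum. The function name `sliding_window_scan` and the threshold value are hypothetical. In particular, the threshold below is an arbitrary placeholder, not a value calibrated to control genome-wide significance.

```python
import math


def sliding_window_scan(z_scores, k, t_k):
    """Flag windows of k consecutive sites whose combined statistic exceeds t_k.

    Assumptions (illustrative only): sites are ordered along the genome,
    each z-score is a per-site test statistic, and the window statistic is
    the Stouffer-type sum  sum(z) / sqrt(k).
    Returns a list of (start_index, window_statistic) tuples.
    """
    n = len(z_scores)
    if k < 1 or k > n:
        return []

    hits = []
    window_sum = sum(z_scores[:k])  # sum over the first window
    for start in range(n - k + 1):
        if start > 0:
            # Slide by one site: add the entering site, drop the leaving one.
            window_sum += z_scores[start + k - 1] - z_scores[start - 1]
        stat = window_sum / math.sqrt(k)
        if abs(stat) > t_k:
            hits.append((start, stat))
    return hits


# Toy per-site z-scores with a run of elevated values in the middle.
z = [0.1, -0.3, 2.5, 3.0, 2.8, 0.2, -0.1]
hits = sliding_window_scan(z, k=3, t_k=3.5)  # t_k = 3.5 is a placeholder
```

With these toy values, only the window starting at index 2 is flagged. A method that controls genome-wide significance would instead choose the threshold for each window size so that the expected number of falsely significant windows across the genome is bounded, accounting for the dependence between overlapping windows and neighboring sites.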

Funding source: University of Oslo

Award Identifier / Grant number: 531217/1231

Funding source: Folkhälsan Research Foundation; The Academy of Finland

Award Identifier / Grant number: 250704

Funding statement: This work was supported by the University of Oslo [Funder Id: 10.13039/501100005366, grant number 531217/1231]; Folkhälsan Research Foundation; The Academy of Finland [grant number 250704]; The Life and Health Medical Fund [grant number 1-23-28]; The Swedish Cultural Foundation in Finland [grant number 15/0897]; The Signe and Ane Gyllenberg Foundation [grant number 37-1977-43]; and The Yrjö Jahnsson Foundation [grant number 11486].

Acknowledgement

We acknowledge Folkhälsan Research Center and the Fin-HIT study group: Sabina Simola, Stephanie Von Kreamer, Jesper Skand, Catharina Sarkkola, Sajan Raju and Elisabete Weiderpass (Helsinki, Finland) for providing data for benchmarking the different models. The Institute for Molecular Medicine Finland (FIMM) provided computational infrastructure and performed the sequencing for this project. We thank Suzanne Campbell and Marissa LaBlanc for critical evaluation of this manuscript.

List of abbreviations

AR(p): Autoregressive process of order p

ChIP: Chromatin immunoprecipitation

DMR: Differentially methylated region

E_k: Expected number of significant windows of size k

FDR: False discovery rate

MCMC: Markov chain Monte Carlo

OU-process: Ornstein-Uhlenbeck process

t_k: Window threshold for sliding windows of size k

Declarations

  1. Ethics: The Coordinating Ethics Committees of the Hospital Districts of Helsinki and Uusimaa approved the study. Informed consent was obtained from all participants as well as from one of their legal guardians.

  2. Availability of data and materials: The R package is available from Bioconductor under the name DMRScan, along with the example data set used in this paper. The R code for comparing the methods can be found in the GitHub repository of the R package: https://github.com/christpa/DMRScan.

  3. Conflict of interest statement: The authors declare that they have no competing interests.

Supplementary Material

The online version of this article offers supplementary material (DOI: https://doi.org/10.1515/sagmb-2017-0050).

Published Online: 2018-09-19

©2018 Walter de Gruyter GmbH, Berlin/Boston