Statistical Applications in Genetics and Molecular Biology
Editor-in-Chief: Stumpf, Michael P.H.
A Heuristic Bayesian Method for Segmenting DNA Sequence Alignments and Detecting Evidence for Recombination and Gene Conversion
1Wroclaw University of Technology, Poland
2Biomathematics and Statistics Scotland (BioSS), United Kingdom
Citation Information: Statistical Applications in Genetics and Molecular Biology. Volume 5, Issue 1, Pages –, ISSN (Online) 1544-6115, DOI: 10.2202/1544-6115.1238, October 2006
We propose a heuristic approach to the detection of evidence for recombination and gene conversion in multiple DNA sequence alignments. The proposed method consists of two stages. In the first stage, a sliding window is moved along the DNA sequence alignment, and phylogenetic trees are sampled from the conditional posterior distribution with MCMC. To reduce the noise intrinsic to inference from the limited amount of data available in the typically short sliding window, a clustering algorithm based on the Robinson-Foulds distance is applied to the trees thus sampled, and the posterior distribution over tree clusters is obtained for each window position. While changes in this posterior distribution are indicative of recombination or gene conversion events, it is difficult to decide when such a change is statistically significant. This problem is addressed in the second stage of the proposed algorithm, where the distributions obtained in the first stage are post-processed with a Bayesian hidden Markov model (HMM). The emission states of the HMM are associated with posterior distributions over phylogenetic tree topology clusters. The hidden states of the HMM indicate putative recombinant segments. Inference is done in a Bayesian sense, sampling parameters from the posterior distribution with MCMC. Of particular interest is the determination of the number of hidden states as an indication of the number of putative recombinant regions. To this end, we apply reversible jump MCMC, and sample the number of hidden states from the respective posterior distribution.
Keywords: DNA sequence alignment; phylogenetics; interspecific recombination; moving window method; probabilistic divergence measure; hidden Markov model; model selection; Bayesian inference; reversible jump Markov chain Monte Carlo
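As an illustration of the first stage described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the MCMC tree sampling has already been done and that each sampled tree is represented as a set of bipartitions (splits) of the taxa. The function names, the greedy clustering rule, and the threshold parameter are hypothetical choices made for this example; the abstract does not say which clustering algorithm is used. The second stage (the Bayesian HMM with reversible jump MCMC) is not covered here.

```python
from collections import Counter


def canonical_split(side, taxa):
    """Return the side of a bipartition that excludes the smallest taxon label."""
    side = frozenset(side)
    return side if min(taxa) not in side else frozenset(taxa) - side


def make_tree(sides, taxa):
    """Represent an unrooted topology as the set of its nontrivial splits."""
    return frozenset(canonical_split(s, taxa) for s in sides)


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: number of splits found in exactly one tree."""
    return len(tree_a ^ tree_b)


def window_starts(alignment_length, width, step):
    """Start columns of a sliding window that stays inside the alignment."""
    return list(range(0, alignment_length - width + 1, step))


def cluster_representatives(trees, threshold):
    """Greedy clustering (an assumption of this sketch): visit topologies from
    most to least frequent and open a new cluster whenever a topology is more
    than `threshold` RF units away from every existing representative."""
    reps = []
    for tree, _ in Counter(trees).most_common():
        if all(rf_distance(tree, r) > threshold for r in reps):
            reps.append(tree)
    return reps


def cluster_posterior(sampled_trees, reps):
    """Fraction of one window's sampled trees whose nearest representative
    (by RF distance) is each cluster."""
    counts = [0] * len(reps)
    for tree in sampled_trees:
        nearest = min(range(len(reps)), key=lambda k: rf_distance(tree, reps[k]))
        counts[nearest] += 1
    return [c / len(sampled_trees) for c in counts]


# Toy data: four taxa, two competing topologies, three window positions.
# In practice the per-window samples would come from an MCMC sampler.
TAXA = ("A", "B", "C", "D")
T_AB = make_tree([{"A", "B"}], TAXA)  # topology ((A,B),(C,D))
T_AC = make_tree([{"A", "C"}], TAXA)  # topology ((A,C),(B,D))

samples_by_window = [
    [T_AB] * 9 + [T_AC] * 1,
    [T_AB] * 8 + [T_AC] * 2,
    [T_AB] * 2 + [T_AC] * 8,
]

# Cluster the pooled samples so that cluster indices agree across windows.
pooled = [tree for window in samples_by_window for tree in window]
reps = cluster_representatives(pooled, threshold=0)
posteriors = [cluster_posterior(window, reps) for window in samples_by_window]
```

In this toy example the posterior mass moves from the first cluster to the second between the second and third windows. That is the kind of change the abstract describes as indicative of recombination or gene conversion, and whose significance the second stage is meant to assess.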