Confidently Estimating the Number of DNA Replication Origins

Anand Bhaskar 1 and Uri Keich 2
  • 1 Computer Science Division, University of California, Berkeley
  • 2 School of Mathematics and Statistics, University of Sydney

We present a method for estimating, and providing a confidence interval for, the number of DNA replication origins in the genome of the yeast Kluyveromyces lactis. The method requires an initial set of verified sites from which a position-specific frequency matrix (PSFM) can be constructed. We further assume that we have access to a sparingly used experimental procedure that can verify the functionality of a few, but not all, computationally predicted sites. While our motivation comes from estimating the number of autonomously replicating sequences (ARSs), our method can also be applied to estimating the genome-wide number of “functional” transcription factor binding sites, where functionality is determined by experimental verification of the transcription factor binding event using, for example, ChIP data. The reliability of our method is demonstrated by correctly predicting the known number of Saccharomyces cerevisiae ARSs as well as the number of S. cerevisiae probes that bind to the transcription factor ABF1.
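
For readers unfamiliar with the term, the following is a minimal sketch of how a PSFM can be built from a set of aligned, equal-length sites. It is an illustration only and is not taken from the article: the function name `build_psfm`, the optional `pseudocount` parameter, and the four toy sequences are all assumptions made for the example.

```python
from collections import Counter

ALPHABET = "ACGT"


def build_psfm(sites, pseudocount=0.0):
    """Return a list of per-column base frequencies for aligned sites.

    `sites` is a list of equal-length DNA strings (hypothetical input).
    `pseudocount` is an optional smoothing constant added to every base
    count; it is an assumption of this sketch, not part of the article.
    """
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + pseudocount * len(ALPHABET)
    psfm = []
    for col in range(width):
        # Count how often each base occurs at this column.
        counts = Counter(site[col] for site in sites)
        psfm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return psfm


# Toy, made-up sites used only to exercise the function.
toy_sites = ["ACGT", "ACGA", "TCGT", "ACTT"]
toy_psfm = build_psfm(toy_sites)
for position, column in enumerate(toy_psfm):
    print(position, column)
```

Each column of the resulting matrix sums to one, so it can be read as the empirical distribution of bases at that position across the verified sites.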



SAGMB publishes significant research on the application of statistical ideas to problems arising from computational biology. The range of topics includes linkage mapping, association studies, gene finding and sequence alignment, protein structure prediction, design and analysis of microarray data, molecular evolution and phylogenetic trees, DNA topology, and database search strategies.