
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Stumpf, Michael P.H.


Volume 14, Issue 2


Study of triplet periodicity differences inside and between genomes

Yulia M. Suvorova
  • Corresponding author
  • Bioinformatics Laboratory, Centre of Bioengineering of the Russian Academy of Sciences, 117312, Prospect 60-tya Oktyabrya, Moscow, Russian Federation
Eugene V. Korotkov
  • Bioinformatics Laboratory, Centre of Bioengineering of the Russian Academy of Sciences, 117312, Prospect 60-tya Oktyabrya, Moscow, Russian Federation
  • Department of Applied Mathematics, National Nuclear Investigational University (MIFI), 115522, Kashirskoe Shosse, 31, Moscow, Russian Federation
Published Online: 2015-02-24 | DOI: https://doi.org/10.1515/sagmb-2013-0063

Abstract

Triplet periodicity (TP) is a distinctive feature of the protein-coding sequences of both prokaryotic and eukaryotic genomes. In this work, we explored TP differences within and between 45 prokaryotic genomes. We formulated two hypotheses about the distribution of TP over a set of coding sequences and generated artificial datasets corresponding to each hypothesis. We found that TP is more similar within a genome than between genomes, and that the TP distribution in a real genome dataset corresponds to the hypothesis under which a common TP pattern exists for the majority of sequences in a genome. Additionally, we performed gene classification based on TP matrices. This classification showed that TP identifies the genome to which a given gene belongs with more than 85% accuracy.
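As an illustration of what a TP matrix can look like, the following minimal Python sketch counts each nucleotide at each of the three codon positions and scores the resulting table with mutual information. The abstract does not define the matrix or the similarity measure, so the 3 × 4 layout, the mutual-information score, and the function names `tp_matrix` and `mutual_information` are assumptions made for illustration, not the authors' implementation.

```python
import math

BASES = "ACGT"

def tp_matrix(seq):
    """Count each base at each of the three codon positions.

    Assumed layout: 3 rows (position in the triplet) x 4 columns (A, C, G, T).
    """
    counts = [[0] * 4 for _ in range(3)]
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:  # skip ambiguous symbols such as N
            counts[i % 3][j] += 1
    return counts

def mutual_information(m):
    """Mutual information (in nats) between triplet position and base."""
    total = sum(sum(row) for row in m)
    if total == 0:
        return 0.0
    row_sums = [sum(row) for row in m]
    col_sums = [sum(m[r][c] for r in range(3)) for c in range(4)]
    mi = 0.0
    for r in range(3):
        for c in range(4):
            if m[r][c]:
                p = m[r][c] / total
                mi += p * math.log(m[r][c] * total / (row_sums[r] * col_sums[c]))
    return mi
```

Under these assumptions, a perfectly periodic toy sequence such as `"ATGATGATG"` yields the largest possible score for a three-row table (ln 3), while a homopolymer such as `"AAAA"` scores zero because the base carries no information about the triplet position.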

Keywords: gene classification; genomes comparison; protein coding genes; triplet periodicity


About the article

Corresponding author: Yulia M. Suvorova, Bioinformatics Laboratory, Centre of Bioengineering of the Russian Academy of Sciences, 117312, Prospect 60-tya Oktyabrya, Moscow, Russian Federation


Published Online: 2015-02-24

Published in Print: 2015-04-01


Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 14, Issue 2, Pages 113–123, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2013-0063.


©2015 by De Gruyter.
