
Statistical Applications in Genetics and Molecular Biology

Editor-in-Chief: Sanguinetti, Guido

Volume 16, Issue 2



A time warping approach to multiple sequence alignment

Ana Arribas-Gil
  • Departamento de Estadística, Universidad Carlos III de Madrid, C/ Madrid, 126 - 28903 Getafe, Spain
Catherine Matias
  • Corresponding author
  • Sorbonne Universités, Université Pierre et Marie Curie, Université Paris Diderot, Centre National de la Recherche Scientifique, Laboratoire de Probabilités et Modèles Aléatoires, 4 place Jussieu, 75252 PARIS Cedex 05, France
Published Online: 2017-04-12 | DOI: https://doi.org/10.1515/sagmb-2016-0043


We propose an approach to multiple sequence alignment (MSA) derived from the dynamic time warping viewpoint and from recent curve synchronization techniques developed in the context of functional data analysis. Starting from pairwise alignments of all the sequences (viewed as paths in a certain space), we construct a median path that represents the MSA we are looking for. We establish a proof of concept that our method could be a useful ingredient to include in refined MSA techniques. We present a simple synthetic experiment as well as a study of a benchmark dataset, together with comparisons with two widely used MSA software packages.

Keywords: Alignment; dynamic time warping; multiple sequence alignment; warping
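
To illustrate what it means to view a pairwise alignment as a path, here is a minimal sketch, not the authors' implementation. It computes a standard global alignment in the style of Needleman and Wunsch (1970) and returns it as a monotone path on the integer lattice from (0, 0) to (len(a), len(b)). A diagonal step aligns two characters, and a horizontal or vertical step inserts a gap. The function name alignment_path and the scoring values (match = 1, mismatch = -1, gap = -1) are assumptions chosen for illustration only.

```python
from typing import List, Tuple

Path = List[Tuple[int, int]]


def alignment_path(a: str, b: str, match: int = 1,
                   mismatch: int = -1, gap: int = -1) -> Path:
    """Globally align a and b; return the alignment as a lattice path.

    The path runs from (0, 0) to (len(a), len(b)). Scoring values are
    illustrative assumptions, not taken from the article.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1], b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a

    # Trace back from the end corner, preferring diagonal steps on ties.
    path: Path = [(n, m)]
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                i, j = i - 1, j - 1
                path.append((i, j))
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        path.append((i, j))
    path.reverse()
    return path


if __name__ == "__main__":
    print(alignment_path("ACGT", "AGT"))
```

Each pairwise alignment becomes one such path. How the median path is built from the collection of pairwise paths is described in the full article and is not reproduced here.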



About the article

Published in Print: 2017-04-25

Citation Information: Statistical Applications in Genetics and Molecular Biology, Volume 16, Issue 2, Pages 133–144, ISSN (Online) 1544-6115, ISSN (Print) 2194-6302, DOI: https://doi.org/10.1515/sagmb-2016-0043.

©2017 Walter de Gruyter GmbH, Berlin/Boston.