
Linguistics Vanguard

A Multimodal Journal for the Language Sciences

Editor-in-Chief: Bergs, Alexander / Cohn, Abigail C. / Good, Jeff


Modeling linguistic evolution: a look under the hood

Chundra Aroor Cathcart (ORCID iD: https://orcid.org/0000-0002-3066-4532)
Published Online: 2018-04-03 | DOI: https://doi.org/10.1515/lingvan-2017-0043


This paper takes a detailed look at some models of evolution used in contemporary diachronic linguistic research, focusing on the continuous-time Markov (CTM) model, a particularly popular choice. I provide an exposition of the mathematics underlying the CTM model, which is seldom discussed in linguistic papers. I show that in some work, a lack of explicit reference to the underlying computation makes results difficult to interpret, particularly in the domain of ancestral state reconstruction. I conclude by outlining some ways in which linguists may be able to exploit these models to investigate a suite of factors that may influence diachronic linguistic change.

Keywords: evolutionary linguistics; diachronic linguistics



About the article

Received: 2017-09-28

Accepted: 2017-12-06

Published Online: 2018-04-03

Citation Information: Linguistics Vanguard, Volume 4, Issue 1, 20170043, ISSN (Online) 2199-174X, DOI: https://doi.org/10.1515/lingvan-2017-0043.


©2018 Walter de Gruyter GmbH, Berlin/Boston.
