Corpus Linguistics and Linguistic Theory

Founded by Gries, Stefan Th. / Stefanowitsch, Anatol

Ed. by Wulff, Stefanie

IMPACT FACTOR 2014: 0.579
5-year IMPACT FACTOR: 0.760
Rank 79 out of 171 in category Linguistics in the 2014 Thomson Reuters Journal Citation Report/Social Sciences Edition

SCImago Journal Rank (SJR) 2014: 0.300
Source Normalized Impact per Paper (SNIP) 2014: 1.285
Impact per Publication (IPP) 2014: 0.594

ERIH category 2011: INT2

Zipf's law and the grammar of languages: A quantitative study of Old and Modern English parallel texts

1 / Douwe Kiela2 / Felix Hill2 / Paula Buttery1

1Department of Theoretical and Applied Linguistics, University of Cambridge, UK

2Computer Laboratory, University of Cambridge, UK

Citation Information: Corpus Linguistics and Linguistic Theory. Volume 10, Issue 2, Pages 175–211, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: 10.1515/cllt-2014-0009, March 2014

This paper reports a quantitative analysis of the relationship between word frequency distributions and morphological features in languages. We analyze a commonly observed process of historical language change: the loss of inflected forms in favour of ‘analytic’ periphrastic constructions. These tendencies are observed in parallel translations of the Book of Genesis in Old English and Modern English. We show that there are significant differences in the frequency distributions of the two texts, and that parts of these differences are independent of the total number of words, style of translation, orthography, or contents. We argue that they derive instead from the trade-off between synthetic inflectional marking in Old English and analytic constructions in Modern English. By exploiting the earliest ideas of Zipf, we show that the syntheticity of the language in these texts can be captured mathematically, a property we tentatively call their grammatical fingerprint. Our findings suggest implications both for the specific historical process of inflection loss and, more generally, for the characterization of languages based on statistical properties.

Keywords: Zipf's law; vocabulary growth curves; diachronic corpus linguistics; syntheticity; analyticity; parallel texts; historical linguistics; Old English
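
As a rough illustration of the kind of quantity the abstract refers to, the sketch below computes a rank-frequency distribution for a tokenized text and estimates a Zipf exponent from it. It is not the authors' method, which the abstract does not specify: the function names, the least-squares fit on log-log values, and the type/token ratio as a crude proxy for syntheticity are all assumptions made for this example.

```python
import math
from collections import Counter


def rank_frequencies(tokens):
    """Return word frequencies sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def zipf_exponent(frequencies):
    """Estimate alpha in f(r) ~ C / r**alpha.

    Assumption: ordinary least squares on (log rank, log frequency).
    This is a simple illustrative estimator, not the paper's procedure.
    """
    if len(frequencies) < 2:
        raise ValueError("need at least two distinct word types")
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying distribution; alpha is its negation.
    return -sxy / sxx


def type_token_ratio(tokens):
    """Distinct word forms divided by total words (hypothetical syntheticity proxy)."""
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    # Toy input only; a real comparison would use full parallel texts.
    sample = "in the beginning god created the heaven and the earth".split()
    freqs = rank_frequencies(sample)
    print("frequencies by rank:", freqs)
    print("estimated Zipf exponent:", zipf_exponent(freqs))
    print("type/token ratio:", type_token_ratio(sample))
```

Under these assumptions, a more synthetic text would be expected to show more distinct word forms per token than an analytic one of comparable length, which is the sort of difference a comparison of parallel texts could examine.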