
Corpus Linguistics and Linguistic Theory

Editor-in-Chief: Gries, Stefan Th.

2 Issues per year

IMPACT FACTOR increased in 2013: 1.000
5-year IMPACT FACTOR: 1.019
Rank 41 out of 169 in category Linguistics in the 2013 Thomson Reuters Journal Citation Report/Social Sciences Edition
ERIH category 2011: INT2

Zipf's law and the grammar of languages: A quantitative study of Old and Modern English parallel texts

1 / Douwe Kiela2 / Felix Hill2 / Paula Buttery1

1Department of Theoretical and Applied Linguistics, University of Cambridge, UK

2Computer Laboratory, University of Cambridge, UK

Citation Information: Corpus Linguistics and Linguistic Theory. Volume 0, Issue 0, ISSN (Online) 1613-7035, ISSN (Print) 1613-7027, DOI: 10.1515/cllt-2014-0009, March 2014

Publication History

Published Online: 2014-03-25

Abstract

This paper reports a quantitative analysis of the relationship between word frequency distributions and morphological features in languages. We analyze a commonly observed process of historical language change: the loss of inflected forms in favour of 'analytic' periphrastic constructions. These tendencies are observed in parallel translations of the Book of Genesis in Old English and Modern English. We show that there are significant differences in the frequency distributions of the two texts, and that some of these differences are independent of the total number of words, style of translation, orthography or contents. We argue that they derive instead from the trade-off between synthetic inflectional marking in Old English and analytic constructions in Modern English. By exploiting the earliest ideas of Zipf, we show that the syntheticity of the language in these texts can be captured mathematically, a property we tentatively call their grammatical fingerprint. Our findings have implications both for the specific historical process of inflection loss and, more generally, for the characterization of languages based on statistical properties.

Keywords: Zipf's law; vocabulary growth curves; diachronic corpus linguistics; syntheticity; analyticity; parallel texts; historical linguistics; Old English
