Published by De Gruyter 2019

Syntax, text type, genre and authorial voice in Old English: A data-driven approach

Bettelou Los and Thijs Lubbers

Abstract

In the course of the 1990s, the Old English part of the Helsinki Corpus was extended and enriched with morphological and syntactic tagging as the result of a number of research projects, culminating in the publication of the final version in 2003 (Taylor et al. 2003). The nature and size of the extant Old English texts are such that they can be compared by genre or register (homily versus narrative, metrical versus non-metrical prose, translated versus non-translated prose) and in some cases by author (Wulfstan versus Ælfric). The question posed here is to what extent such quantitative data can inform our qualitative understanding of the language of these texts, and how properties of the grammar, in combination with text-type characteristics, either constrain or give shape to forms of stylistic variation. The present paper draws in particular on the morphological tags to attempt a data-driven, quantitative stylometric approach that includes n-grams built from such tags, as well as visualizations in the form of correspondence analyses. The biggest challenge is how to move beyond the “Fish fork” (that is, to avoid the circular bootstrapping of looking for features that we already know are significant) and to find hitherto unnoticed features that set texts apart, but that are also meaningful in that they increase our understanding of the interaction between the range of options offered by the syntax and the stylistic choices found in individual texts.
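To make the notion of n-grams over morphological tags concrete, the following is a minimal sketch in Python of how such features might be counted for two texts. It is an illustration under stated assumptions, not the procedure used in the paper: the function names `tag_ngrams` and `relative_frequencies`, the short tag labels and the two toy tag sequences are all hypothetical and are not drawn from the corpus.

```python
from collections import Counter


def tag_ngrams(tags, n):
    """Count contiguous n-grams in a sequence of morphological tags."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


def relative_frequencies(counts):
    """Convert raw n-gram counts into proportions of all n-grams in a text."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical tag sequences for two texts (illustrative labels only).
text_a = ["PRO", "VBD", "D", "N", "P", "D", "N"]
text_b = ["D", "N", "VBD", "PRO", "P", "D", "N"]

bigrams_a = tag_ngrams(text_a, 2)
bigrams_b = tag_ngrams(text_b, 2)

# Tag bigrams attested in one text but not in the other.
only_in_a = set(bigrams_a) - set(bigrams_b)
only_in_b = set(bigrams_b) - set(bigrams_a)
```

Normalizing the counts with `relative_frequencies` makes texts of different lengths comparable. A table of such proportions, with one row per text and one column per tag n-gram, is the kind of input that a correspondence analysis could take; that step is not shown here.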

© 2019 Walter de Gruyter GmbH, Berlin/Munich/Boston