Matthew Brook O'Donnell, Mike Scott, Michaela Mahlberg, Michael Hoey
February 27, 2013
The notion of ‘textual colligation’ predicts that certain lexical items tend to occur at particular points in a text, i.e., at the beginning or end of texts, paragraphs or sentences. This paper describes new corpus-based methods developed to profile words, clusters (n-grams) and concgrams (non-contiguous patterns in variant order) according to their most common textual locations. Groups of co-occurring text-initial items are then analyzed in terms of their discourse function in relation to theories of newspaper structure. This analysis illustrates how methods from corpus linguistics, when targeted at specific textual positions, can complement text-linguistic analyses.
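As a rough illustration of what a positional profile might look like, the following minimal sketch counts how often each cluster (n-gram) falls in the first sentence of a text versus anywhere else. It is not the authors' method: the function names `position_profile` and `text_initial_ratio`, the naive regex-based sentence splitting and tokenization, and the two-way split into "text_initial" and "other" are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def position_profile(texts, n=2):
    """Map each n-gram to counts of text-initial vs. other occurrences.

    Hypothetical helper: sentence splitting and tokenization are deliberately
    naive (regex-based) and would need replacing for real corpus work.
    """
    profile = defaultdict(Counter)
    for text in texts:
        # Assumption: a sentence ends at ., ! or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        for i, sentence in enumerate(sentences):
            slot = "text_initial" if i == 0 else "other"
            tokens = re.findall(r"[a-z']+", sentence.lower())
            for j in range(len(tokens) - n + 1):
                ngram = " ".join(tokens[j:j + n])
                profile[ngram][slot] += 1
    return profile


def text_initial_ratio(profile, ngram):
    """Share of an n-gram's occurrences that fall in a text-initial sentence."""
    counts = profile.get(ngram, Counter())
    total = counts["text_initial"] + counts["other"]
    return counts["text_initial"] / total if total else 0.0


# Toy input invented for illustration only; not data from the paper.
sample_texts = [
    "The government said today that taxes rise. Critics said today it was wrong.",
    "The government announced a plan. The plan is new.",
]
sample_profile = position_profile(sample_texts, n=2)
```

On this toy input, `text_initial_ratio(sample_profile, "the government")` gives the share of occurrences of "the government" found in first sentences. A fuller treatment would also distinguish paragraph-initial and sentence-initial positions and compare the observed counts against what chance would predict.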