Iris Hanique, Mirjam Ernestus, Lou Boves
August 16, 2014
This paper investigates whether individual speakers forming a homogeneous group differ in their choice and pronunciation of words when engaged in casual conversation, and if so, how they differ. More specifically, it examines whether the Balanced Winnow classifier is able to distinguish between the twenty speakers of the Ernestus Corpus of Spontaneous Dutch, who all have the same social background. To examine differences in choice and pronunciation of words, instead of characteristics of the speech signal itself, classification was based on lexical and pronunciation features extracted from hand-made orthographic and automatically generated broad phonetic transcriptions. The lexical features consisted of words and two-word combinations. The pronunciation features represented pronunciation variations at the word and phone level that are typical for casual speech. The best classifier achieved a performance of 79.9% and was based on the lexical features and on the pronunciation features representing single phones and triphones. The speakers must thus differ from each other in these features. Inspection of the relevant features indicated that, among other things, the words relevant for classification generally do not contain much semantic content, and that speakers differ not only from each other in the use of these words but also in their pronunciation.