This study examines the vowel-length contrast in Seoul Korean. Seoul Korean is generally described as having contrastively long vowels, but many descriptions of the language also note that the length contrast is being lost or is already lost in younger speakers’ speech. The first goal of the study is to provide an empirical contribution to this topic by examining the vowel-length contrast in The Reading-Style Speech Corpus of Standard Korean (The National Institute of the Korean Language 2005), which contains read-speech data from 118 speakers of Seoul Korean stratified for gender and age. Specifically, the study aims to examine change in long-vowel realization across different age groups, which we interpret as a reflection of sound change in apparent time (Bailey et al. 1991).
The second goal of the study is to examine the effect of word frequency on this sound change, i.e., whether this sound change affects words of high and low frequency differently, and if so, how. Studies on lexical diffusion show that sound change may affect high- or low-frequency words differently depending on the nature of sound change (Bybee 2001; Phillips 2006). By examining the frequency effect, we can provide evidence about the underlying mechanism of this sound change – whether the change is a phonetically gradual reduction or a structurally motivated phonological shortening – and also contribute to the general literature on the interaction between frequency and sound change. It is also notable that most previous studies of frequency effects on lexical diffusion of sound change examine the final outcome of a sound change or a frequency effect on synchronic variation at a static point in time rather than tracking the effect of frequency on change over time. 1 Our study contributes to filling this gap by examining the interaction of frequency with sound change as the sound change unfolds in (apparent) time.
2 Word frequency and lexical diffusion
Studies on lexical diffusion of sound change observe that changes that are physiologically motivated and phonetically gradient tend to affect high-frequency words first, while changes that are analogical in nature and phonetically abrupt tend to affect low-frequency words first (Schuchardt 1885/1972; Hooper 1976; Phillips 1984; Bybee 2001, Bybee 2002; Bybee and Hopper 2001; Pierrehumbert 2001). 2 An oft-cited example of a phonetic sound change that affects high-frequency words first is t/d deletion in American English; other things being equal, t/d deletion is more frequent in high-frequency words than in low-frequency words (Jurafsky et al. 2001; Bybee 2002; Coetzee and Kawahara 2013). Regularization of the English past tense is an example of change that affects low-frequency words first; high-frequency words like slept, left, and kept retain their irregular past-tense form, while low-frequency words like wept, leapt, and crept may regularize (Hooper 1976).
Such frequency effects on sound change have been put forth as supporting evidence for the usage-based model of phonology (Bybee 2001; Bybee and Hopper 2001) and the exemplar-based model of phonology (Pierrehumbert 2001, Pierrehumbert 2002; Kirchner 2012), since these models assume a direct link between the use of individual word forms and their representations. The resistance of high-frequency words to a regularizing change follows from the assumption that high-frequency words form a stronger and more independent mental representation and therefore can resist changes motivated by analogy to other forms. The propensity of high-frequency words to undergo a reductive sound change follows from the assumption that the lexical representation of a word includes phonetically detailed exemplars, and that this representation is constantly updated with each use of the word. In a leniting sound change, the more frequently a word is used, the more exemplars of lenited tokens accrue, forming the basis for subsequent productions of the word. As a result, more frequent words are expected to undergo a reductive sound change at a faster rate than less frequent words.
However, the claim that phonetically gradient sound change can also be lexically gradual is contested, and evidence for a direct influence of frequency on gradual phonetic sound change remains elusive (Labov 1994, 2010; Dinkin 2008; Walker 2012). Walker (2012), in particular, shows that the frequency effect in Canadian English t/d deletion is minuscule or non-existent once grammatical and phonological effects are factored out. Also, Dinkin (2008) and Walker (2012) note that there is no clear evidence that t/d deletion is in fact a sound change, as opposed to stable synchronic variation. Relatedly, it is important to note that frequency-dependent reduction occurs synchronically even in the absence of sound change. Studies have shown that high-frequency words are produced with a shorter duration than low-frequency words. While some interpret such frequency effects as evidence for the phonetically rich word-specific representations assumed by exemplar-based models (Gahl 2008), others note that such effects can be attributed to the higher resting activation level of frequent words, giving rise to faster lexical access and articulatory planning without resorting to different phonetic representations (Pluymaekers et al. 2005; Bell et al. 2009). In other words, the fact that frequent words tend to be produced with a higher degree of reduction at a static point in time is not, in and of itself, evidence for word-specific phonetically detailed lexical representations, since the same effects can arise from on-line phonetic factors (Pierrehumbert 2002; Ernestus 2014).
By the same reasoning, the fact that high-frequency words are further along in a diachronic sound change does not in itself necessarily support word-specific phonetic representation either. A general lenition bias which moves all words toward more lenited realizations over time, along with synchronic on-line factors that promote further lenition of high-frequency words, can create the appearance of high-frequency words being preferentially targeted by reductive sound change. Such a scenario is schematically represented in Figure 1(a). In the graphs in Figure 1, the x-axis represents time, and the y-axis (‘vowel duration’) represents a phonetic dimension along which reductive sound change progresses, with a lower value representing a further stage in the sound change. The horizontal line indicates a hypothetical category boundary where the phonological label changes. In Figure 1(a), high- and low-frequency words are distributed differently along the phonetic dimension, with high-frequency words more reduced than low-frequency words, an effect attributable to synchronic on-line factors, but high- and low-frequency words undergo the change at the same rate. When the change has progressed far enough, we may encounter a state where only the high-frequency words have advanced far enough to be relabeled or reanalyzed as a different phonological category. 3 It should be noted that the pattern of change in Figure 1(a) is also consistent with a hybrid exemplar model (Pierrehumbert 2006), in which a layer of phonological categories acts to keep the word-specific variation from running rampant. In this model, words can have different phonetic representations, but word-specific variation is kept in check, and the top-down pressure of a shared phonological category keeps sounds in high-frequency words from leniting exponentially away from the rest of the category.
On the other hand, the prediction of the exemplar-based model, in its strongest form (Pierrehumbert 2001, Pierrehumbert 2002), can be schematically represented as in Figure 1(b): high-frequency words are overall more reduced, but they also undergo the change at a faster rate because each production adds reduced exemplars, further precipitating the reductive change. In Figure 1(a), the frequency effect at the outcome of the sound change is purely a reflection of a synchronic frequency effect (Synchrony=Diachrony), while in Figure 1(b), the synchronic frequency effect is expanded over time, i.e., the diachronic frequency effect reflects an expansion of the synchronic frequency effect (Synchrony < Diachrony).
The first two scenarios sketched in Figure 1 illustrate changes which progress in the same direction as physiologically conditioned lenition, with high-frequency words being the first targets of the change. However, the distinction between a phonetically motivated leniting sound change and an analogically motivated sound change is not always straightforward, and it is possible that different mechanisms may underlie what may look like identical sound changes. The merger of /ʊu/ (as in moan) and /ʌu/ (as in mown) in East Anglia, discussed by Trudgill and Foxcroft (1978), provides a well-known example. Some dialects merge the contrast by phonetic ‘approximation’ of the two categories, while other dialects merge the contrast by ‘transfer’ of items from one vowel class to the other vowel class. Thus, a sound change that may seem reductive on the surface, such as vowel shortening, may affect the sounds at the phonemic category label, not at the subphonemic phonetic level, and may therefore affect low-frequency words before high-frequency words.
In fact, a number of cases presented in the literature appear to contradict the purported correlation between the nature of a given sound change and the direction of the frequency effect. But closer examination of these examples reveals additional factors underlying these diachronic patterns, ultimately leading to a more insightful understanding of sound change. For example, Phillips (2001) found that a stress shift in English verbs with the -ate suffix, as in frústráte>frustrate, is more prone to affect high-frequency words, although this sound change is not phonetically gradient or reductive. Phillips (2001) attributes this frequency effect to the fact that high-frequency words are more likely to be analyzed as monomorphemic, rather than as having a stem+suffix structure, allowing the regular stress assignment rule of English to apply.
The accent shift in Ancient Greek discussed by Probert (2006) illustrates a more complex interaction of morphological decompositionality and frequency. 4 Probert (2006) observes that in a group of nouns formed with an adjective-forming affix in Ancient Greek, the accent has shifted from a final accent (expected based on the original adjectival affixes) to a recessive accent (a general default accent pattern). As expected, very high-frequency words were resistant to this regularizing accent shift and retained their original final accent. Somewhat unexpectedly, very low-frequency words also resisted this change. Probert (2006) suggests that very low-frequency words were less prone to ‘demorphologization’ and more likely to be analyzed as containing the adjectival suffix, and hence more likely to follow the stress pattern expected based on the adjectival suffix. In other words, these two examples of accent shift show that the effect of frequency on morphological compositionality can create a ‘low-frequency-last’ effect even for sound changes that are not physiologically conditioned.
The variation in Finnish suffix vowel harmony examined by Duncan (2011) presents another case where a change is not physiologically motivated but nevertheless affects high-frequency words first. Duncan examined the frequency effect on suffix vowel harmony in Finnish loanwords where the stem vowels consist of back vowels followed by neutral vowels. Based on Google searches, she found that high-frequency words are more likely to take a front vowel suffix than low-frequency words. Duncan (2011) suggests that this frequency effect may be due to low-frequency words being more prone to conforming to prescriptive rules in their written form, while high-frequency words are more likely to be written as they are pronounced.
Phillips (2001) also discusses a number of ‘weakening’ sound changes that somewhat unexpectedly affect low-frequency words first. For example, the unrounding of /ö(ː)/ to /e(ː)/ in early Middle English affects low-frequency words more than high-frequency words and also exhibits sensitivity to the grammatical category of words. Phillips (2001) analyzes this change as ‘typologically motivated’; languages without high front rounded vowels tend not to have mid front rounded vowels, and this change affected those dialects of Middle English that had recently lost high front rounded vowels. In other words, the change is not a physiologically motivated lenition but a result of pressure from the phonological system, i.e., a removal of a highly marked structure.
This and other cases of frequency effects led Phillips to propose a refined hypothesis regarding the connection between frequency and sound change: “[s]ound changes which require analysis—whether syntactic, morphological, or phonological—during their implementation affect the least frequent words first” (Phillips 2001: 123), while “changes which ignore the phonological integrity of segments and the morphological composition of words affect the most frequent words first” (Phillips 2001: 134). Therefore, a sound change that looks like a reductive sound change on the surface may in fact affect infrequent words first depending on whether or not the underlying mechanism of the change is analytical.
Assibilation of noun-final coronal plosives in Korean, which turns root-final /t tʰ c cʰ/ into /s/ in nouns, is an example where a seemingly phonetically motivated process (Kim 2001) is more properly analyzed as an analogy to a dominant noun alternation pattern (Kwak 1984; Ko 1989; Ito 2010). As expected for an analogically motivated change, high-frequency words resist this change (Kang 2003, Kang 2007). Devoicing of stem-final /v/ and /z/ in Dutch presents a case where the process is reductive and gradient at the phonetic level but also shows an analogical frequency effect: low-frequency words are more likely to show devoicing, and hypercorrective voicing is also attested, another sign that the change is not purely phonetic (De Schryver et al. 2008). 5
This third possibility of how sound change and lexical frequency may interact is schematically represented in Figure 1(c). In this scenario, the direction of category label change driving the sound change happens to align with the general direction of synchronic phonetic lenition, giving the appearance of a phonetic change. However, the underlying mechanism is in fact analytical and affects the sounds at the level of the phonemic category label, not at the subphonemic phonetic level. As a result, frequent words, despite their generally more lenited phonetic realization within the original category, are more resistant to a change motivated by analogy or phonological markedness. 6 In this scenario, the synchronic effect of frequency is reduced or reversed in the diachronic change (Synchrony>Diachrony).
With this background, we now turn to the case of the loss of long vowels in Seoul Korean. The questions we are posing are: (1) Is this change actually happening? (2) Are high-frequency words more reduced and produced with a shorter vowel duration? (3) Do high- and low-frequency words undergo change differently, and if so, how? In the absence of any sound change, we expect long vowels to be realized with a shorter duration when they occur in high-frequency words than in low-frequency words, due to the synchronic frequency effect on duration (Pluymaekers et al. 2005; Bell et al. 2009). We expect the frequency effects illustrated in Figure 1(b) if the sound change is a phonetically gradual reduction, while we expect the frequency effects illustrated in Figure 1(c) if the sound change has an analytical underpinning. Yet another possibility is that we do not find any frequency effect and the sound change affects all words equally, as illustrated in Figure 1(a), which illustrates the purely synchronic frequency effect on duration.
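The three scenarios of Figure 1 can be made concrete with a toy calculation. The sketch below is only an illustration under assumed linear trajectories: the starting durations, the rates of change, and the function names are all hypothetical and are not taken from the corpus. It tracks the low-minus-high duration gap, which stays constant when both frequency classes change at the same rate (Figure 1(a)), grows when high-frequency words change faster (Figure 1(b)), and shrinks or reverses when low-frequency words change faster (Figure 1(c)).

```python
# Toy illustration only: every number below is made up, not measured.

def trajectory(start_ms, rate_ms_per_step, steps):
    """Linear duration trajectory (ms) from time step 0 to `steps`."""
    return [start_ms - rate_ms_per_step * t for t in range(steps + 1)]


def frequency_gap(low_rate, high_rate, steps=5, low_start=120.0, high_start=110.0):
    """Low-minus-high vowel duration gap (ms) at each time step.

    High-frequency words start out shorter, mimicking the synchronic
    frequency effect on duration.
    """
    low = trajectory(low_start, low_rate, steps)
    high = trajectory(high_start, high_rate, steps)
    return [l - h for l, h in zip(low, high)]


scenarios = {
    "a": frequency_gap(low_rate=4.0, high_rate=4.0),  # same rate of change
    "b": frequency_gap(low_rate=4.0, high_rate=6.0),  # high-frequency words change faster
    "c": frequency_gap(low_rate=8.0, high_rate=4.0),  # low-frequency words change faster
}

for name, gap in scenarios.items():
    print(f"Figure 1({name}): initial gap {gap[0]} ms, final gap {gap[-1]} ms")
```

With these assumed values the gap stays at 10 ms in scenario (a), doubles to 20 ms in (b), and reverses to -10 ms in (c). Real trajectories need not be linear; the sketch only shows how the direction of the gap distinguishes the three predictions.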
3 The vowel-length contrast in Seoul Korean
In this section, we provide a review of previous studies on Seoul Korean vowel length. Seoul Korean is generally described as having a vowel-length contrast (Choe 1959; Huh 1960; Lee 1956, Lee 1960, Lee 1993; Lee and Ramsey 2011; Martin 1992), but many researchers also state that the contrast is being lost in younger speakers’ speech, and such statements are attested as early as 1951 (Martin 1951, Martin 1992; Lee 1960; Park 1994; Sohn 1999; Shim et al. 2013). 7 The minimal pairs in (1) illustrate the contrast.
(1)
|[cʌk-tɑ]||‘to write’||[cʌːk-tɑ]||‘not plenty’|
Studies suggest that the long vowels were first weakened and lost in non-initial position and that the change spread to initial position (Lee 1960; Cha 2012). Thus, in present-day Korean, long vowels tend to be limited to the word-initial syllable, while underlying long vowels in non-initial position shorten. This alternation is illustrated in (2) (Lee 1960; Lee and Ramsey 2011).
(2)
|[pʌːllita]||‘to spread open’||[t’ʌ-pʌllitɑ]||‘to brag’|
Underlying long vowels in word-initial position may also shorten in particular morphophonological contexts. In most monosyllabic verbs and adjectives, the underlying long vowels shorten when a vowel-initial suffix is attached as shown in (3a) or when a passive or causative suffix is attached as shown in (3b). However, there are exceptions: for some verbs and adjectives, the underlying long vowels do not alternate, as shown in (4). 8
(3a)
|[kuːm-t’ɑ]||‘to starve’||[kulm-ʌ]||‘starve-DEC’|
|[nʌː-tʰɑ]||‘to insert’||[nʌ-ɨni]||‘insert-therefore’|
|[puː-t’ɑ]||‘to pour’||[pu-ɨni]||‘to pour’ (s-irregular)|
(3b)
|[kɑːm-t’ɑ]||‘to wind up’||[kɑm-ki-tɑ]||‘to be coiled’ (PASSIVE)|
|[puː-t’ɑ]||‘to pour’||[pul-li-tɑ]||‘to soak’ (CAUSATIVE)|
|[k’oː-tɑ]||‘to twist’||[k’o-i-tɑ]||‘to be entangled’ (PASSIVE)|
(4)
|[maːn-tʰa]||‘plentiful’||[maːn-a]||‘plentiful-DEC’|
A study of recordings from the 1930s shows that the vowel-length contrast is robustly attested in Seoul speakers’ speech from this time period (Cha 2005), but more recent studies are in general agreement that the contrast is being lost in contemporary Seoul Korean. What remains unclear is whether this sound change is a phonetically gradual reduction of long vowels, an analogical change where the underlying long vowel is reanalyzed/misanalyzed as an underlying short vowel, or both.
In the earliest instrumental study we are aware of, Han (1964) examined the vowel durations in 25 vowel-length minimal pairs produced by four Seoul Korean speakers in their 20s and 30s and found an average durational ratio of 2.51:1 between long and short vowels in citation forms of the words. However, this contrast already exhibited a sign of erosion at the time of Han’s study. Han (1964) observes that there is a group of long-vowel words that were only produced as long by a subset of the informants. This suggests that some long-vowel words were consistently produced with long vowels while others were not. Han (1964: 14) goes on to speculate that “this is largely due to the lack of distinctive notation in the Korean orthography. [...] When dealing with less common words or learned words, only a limited number of people may distinguish them in their conscious speech.” What this statement suggests is that the phonetic contrast between long and short vowels is robust and that the change is one of misanalysis of long vowels as short vowels, which affects less frequent words first.
More recent quantitative studies find similar lexical and individual variation in the realization of the vowel-length contrast. Zhi et al. (1990) examined three minimal pairs produced by three speakers (all males, 19–26 years of age) and found that two of the three speakers produced the contrast consistently with a duration ratio ranging from 1.45:1 to 2.01:1, while one speaker did not produce the contrast for two of the pairs and produced a contrast in the reverse direction for one of the pairs. Kahng (1995) found that two of her three speakers (all male and in their 20s) produced a consistent vowel-length duration contrast for all or most of the 16 minimal pairs examined, but one speaker showed the correct contrast only for a small subset. 9 Kong and Moon (2002), on the other hand, found that all three of their speakers (one male each in their 20s, 30s, and 40s) produced a substantial and consistent length contrast (a ratio of 1.5:1 to 2:1) for the seven minimal pairs examined. What these studies, taken together, suggest is that while the vowel length contrast is not completely lost in younger speakers’ speech, it exhibits speaker variation (some speakers retain the contrast while others do not) and word-specific effects (the length contrast is retained in some words but not in others). The duration ratio between long and short vowels in more recent studies tends to be smaller, at around 1.5:1 to 2:1, than the ratio of 2.51:1 found in Han’s (1964) study. But given the difference in the methodology across these studies (Han’s study measured words in isolation, while the other studies measured words embedded in a sentence), a direct comparison of observed ratios may not be meaningful.
Studies that directly compare vowel-length production across different age groups also present a similar picture of lexical and speaker variation and find an age-dependent trend of long-vowel loss or reduction (Park 1985; Jung and Hwang 2000; Kim 2003). Park (1985) examined the realization of the vowel-length contrast by 30 Seoul Korean speakers stratified for age. The study examined the speakers’ declarative knowledge of vowel length in 277 commonly used words as well as their production patterns. 10 Table 1 summarizes the key findings.
Table 1: Variation in long-vowel production by age in Seoul Korean speakers (adapted from Park 1985) (%).
|Speaker’s age||≥ 60s||50s||40s||30s||20s||10s|
|(a) Proportion of words produced with a long vowel||62.00||62.75||62.50||49.75||30.75||16.75|
|(b) Proportion of words produced with a long vowel out of those the speaker judged to contain a long vowel||94.92||93.29||74.63||69.64||50.39||29.12|
In the study, long vowels are defined as those produced 1.5–2 times longer than short vowels in an identical phonological context. Based on this criterion, the overall proportion of words produced with a long vowel declines in younger speakers’ speech. While the speakers in their 40s or older produce long vowels for over 60% of the words examined, the rate is lower for younger speakers (Table 1(a)). Younger speakers also show a large discrepancy between their judgments of vowel length and their actual productions: the proportion of long-vowel production for words the speakers themselves judged to contain a long vowel is over 90% for speakers in their 50s and 60s, but the rate drops to below 30% for speakers in their teens (Table 1(b)). Park (1985) interprets this discrepancy between production and judgment as an indication that younger speakers are making a random choice when judging the vowel length of a given word. However, another possibility is that their exposure to older speakers’ speech provides them with implicit knowledge of the vowel-length pattern even though their own production grammar does not make a consistent enough distinction to pass the 1.5:1 ratio criterion. 11 Park (1985) also notes that among the vowels that are produced and classified as long, there is a generational difference in the ratio of long vs. short vowels. The ratio is around 2.0:1 for older speakers but between 1.5:1 (the lower limit possible by definition) and 1.7:1 for younger speakers. 12 Park’s (1985) result suggests that the vowel-length contrast loss is phonetically gradual but also lexically diffused: in younger speakers’ speech, the vowel-length contrast is still retained, but their long vowels are shorter in duration, and the contrast is more robustly preserved in some words than in others.
Jung and Hwang (2000) examined the production of 24 minimal pairs (all Sino-Korean disyllabic words) by 12 Seoul Korean speakers stratified for gender and age (20s to 70s) and found an effect of both factors. When the duration was averaged across all word pairs, the older speakers showed a statistically significant duration difference between long and short vowels, while younger speakers did not. For middle-aged speakers, males patterned with the older speakers, while females patterned with the younger speakers. After examining the minimal-pair-specific durational contrast for each speaker group, Jung and Hwang (2000) reached the conclusion that the change is one of lexical diffusion, where the set of lexical items with long vowels becomes smaller and smaller in younger speakers’ speech, rather than a phonetically gradual shortening that affects all long vowels.
A survey by the National Institute of the Korean Language examined the production of 29 long-vowel words and 12 short-vowel words by 350 Korean speakers from the Seoul Metropolitan area, stratified for age, gender, and level of education (Kim 2003). The productions were recorded and transcribed, but the criteria for determining the vowel length of a particular production are not provided, making it hard to interpret the findings. Figure 2 summarizes the data reported in Kim (2003). The plot shows the proportion of long-vowel realization, aggregated over all words and speakers.
The results show that there is a positive correlation between the age of speakers and the overall percentage of long-vowel realization, i.e., older speakers produced more long vowels, as one would expect from the general trend of long-vowel reduction in younger speakers. Interestingly, however, this trend is true of both underlying long and short vowels. In other words, older speakers not only produced more long vowels for underlying long vowels than younger speakers, but they also did so for underlying short vowels. Kim (2003) interprets this as evidence that even older speakers are losing the contrast, not only mispronouncing long vowels as short but also pronouncing short vowels as long. However, since no information is provided as to the specific criteria for determining long vs. short vowel realization, we need to be cautious in interpreting the results. One alternative possibility is that the older speakers tend to have a slower speech rate and may have produced all vowels generally longer than younger speakers, and this may have affected the categorization of the vowels by the fieldworkers, who were all young Seoul Korean speakers. Regardless of how we interpret a substantial percentage of long-vowel realizations of underlying short vowels by older speakers, we do find that the older (50–60-year-old) speakers’ productions show a statistically significant difference between long- and short-vowel realizations, while those of younger speakers (20s, 30s, and 40s) do not. 13
To summarize, previous studies are generally consistent with the view that younger speakers produce a more reduced vowel-length contrast. There are suggestions that the change is phonetically gradual and also suggestions of lexical diffusion, with low-frequency words particularly susceptible to the change.
The data for this study come from The Reading-Style Speech Corpus of Standard Korean (The National Institute of the Korean Language 2005), which contains read speech of 60 male and 60 female speakers of Korean residing in the Seoul metropolitan area. The age of the speakers ranged from 19 to 71 at the time of recording in 2003. For two of the speakers, the sound files were missing or had errors, and our analysis is based on data from the other 118 speakers. The distribution of the speakers by gender and year of birth, inferred from the age and the year of recording, is given in Table 2.
Table 2: Distribution of speakers in the NIKL Corpus by gender and decade of birth. (Counts in parentheses include the two speakers with missing files.)
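As a minimal sketch of how a decade of birth can be inferred from the reported age and the 2003 recording year, consider the following. The function names are hypothetical, and the simple subtraction is an assumption: without the birthday, the inferred year may be off by one.

```python
RECORDING_YEAR = 2003  # year of recording stated in the text


def birth_year(age_at_recording, recording_year=RECORDING_YEAR):
    """Approximate year of birth (may be off by one without the birthday)."""
    return recording_year - age_at_recording


def birth_decade(age_at_recording, recording_year=RECORDING_YEAR):
    """Decade of birth as its first year, e.g. 1930 for the 1930s."""
    return (birth_year(age_at_recording, recording_year) // 10) * 10


# The youngest (19) and oldest (71) ages reported for the corpus.
for age in (19, 71):
    print(age, birth_year(age), f"{birth_decade(age)}s")
```

Under this assumption the reported age range of 19 to 71 maps onto birth decades from the 1930s to the 1980s.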
The speech material consists of well-known short stories and essays, totaling 930 sentences. Of these, 403 sentences were read by all speakers, while the rest were read by younger speakers only. More information about the corpus is available in several publications (Yoon and Kang 2012, Yoon and Kang 2014; Kang to appear; Yoon to appear). As the vowel-length contrast is mainly retained in word-initial syllables, the current study examined only the vowels in word-initial syllables in the 403 sentences read by all speakers. The analysis includes only monophthongal vowels; vowels preceded by an on-glide (/j/ or /w/) are excluded. In the 403 sentences, a total of 1,260 lemma types, 2,252 word-form types, and 3,368 word-form tokens contain a monophthongal vowel in a word-initial syllable. Of those 3,368 vowel tokens, 563 (=16.7%) are long, and the rest are short. The vowel-length specifications are based on the vowel-length marking in the Great Dictionary of the Korean Language. 14
Determining whether a particular speaker’s particular vowel token is realized as phonologically long or short is a difficult problem. 15 The duration of a segment is affected not only by its phonological length but also by a number of contextual factors, such as vowel height, syllable structure, preceding or following segments, word length, position in a prosodic phrase, and speech rate, among others (Lehiste 1970; Klatt 1976; Maddieson 1985; Crystal and House 1988; Zhi 1993; Turk and Shattuck-Hufnagel 2000, Turk and Shattuck-Hufnagel 2007; Yun 2009; Yoon to appear). As a result, there is no absolute value of duration or any other acoustic measure that can identify a vowel as long or short. Nevertheless, we expect that if vowel length is contrastive in a speaker’s speech, then all else being equal, long vowels should show a longer duration than short vowels. Therefore, in this study, we will use the phonetic duration of a vowel as our measure and examine how the effect of phonological vowel length on phonetic duration changes across speakers of different age groups, while controlling for other factors that are known to affect segment duration. 16
The duration measurements were extracted using the forced-alignment system for Korean developed by the second author (Yoon and Kang 2012, Yoon and Kang 2014; Yoon to appear). 17 Files that contained gross errors in alignment, due to incorrect file-text matching in the original corpus or to disfluencies or reading errors by the speakers, were discarded, and as a result, around 1% (418 out of 47,554 files) of all the files were excluded from analysis. The automatic aligner analyzes the signal in 10 ms frames and assigns each frame a segment label; therefore, the resolution for duration measurements is 10 ms. In the analyses provided below, only vowels with a duration of less than 300 ms were included. Manual inspection of the tokens with a duration longer than 300 ms indicated that most such extreme values were alignment errors, interjections, or vowels in utterance- or phrase-final position. Also excluded were tokens for which no pitch is detected (which tend to involve alignment errors), as well as completely devoiced and extremely short vowels. After this elimination process, we had a total of 373,733 tokens of vowels in word-initial syllables produced across 118 speakers. In our corpus data (and presumably also in the language in general) underlying long vowels are in the minority, comprising around 17.0% of monophthongal vowels in word-initial syllables. The breakdown of the tokens by vowel quality and underlying vowel length based on the Great Dictionary of the Korean Language is provided in Table 3. The contrast between /ɛ/ and /e/ is generally lost in younger Seoul speakers’ speech, but the contrast is marginally retained in older speakers’ speech (Kang to appear). We therefore treat the two vowels as distinct.
Table 3: Proportion of underlying long and short vowels by vowel quality in the analyzed data.
|ɑ (ㅏ)||ɛ (ㅐ)||e (ㅔ)||ʌ (ㅓ)||o (ㅗ)||i (ㅣ)||ɨ (ㅡ)||u (ㅜ)||All|
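The frame-based duration measure and two of the exclusion criteria described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names and the token representation (a frame count plus a pitch flag) are assumptions made for the example, and the criteria for "completely devoiced and extremely short vowels" are left out because no thresholds are reported.

```python
FRAME_MS = 10          # the aligner labels the signal in 10 ms frames
MAX_DURATION_MS = 300  # only vowels shorter than 300 ms are analyzed


def duration_ms(n_frames):
    """Duration of a segment that spans n_frames aligner frames."""
    return n_frames * FRAME_MS


def keep_token(n_frames, pitch_detected):
    """Return True if a vowel token passes the two filters sketched here.

    Hypothetical helper: tokens without detected pitch and tokens of
    300 ms or more are dropped.
    """
    return pitch_detected and duration_ms(n_frames) < MAX_DURATION_MS


# Share of files discarded for gross alignment errors (418 of 47,554).
excluded_percent = 418 / 47554 * 100
print(f"{excluded_percent:.2f}% of files excluded")
```

Because durations are multiples of 10 ms, a token of 30 frames (300 ms) is the shortest one removed by the upper cutoff.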
The frequency counts in our analysis are based on the lemma frequency list published by the National Institute of the Korean Language (Cho 2002). 18 This frequency list is based on a corpus of over 1.5 million words, and homophones are disambiguated. 19 For the analysis below, we converted frequency into a categorical variable with three levels (low, mid, and high). The categories are determined by k-means clustering of log frequency counts of 3,368 word-forms in the read text. 20 The frequency range of the three categories and the number of tokens for each category is summarized in Table 4.
|Number of tokens||105,910||162,602||105,221||373,733|
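The three-way frequency split can be illustrated with a one-dimensional k-means clustering of log counts. The sketch below is only an assumption about how such a split might be computed: the text states that k-means was applied to log frequency counts but does not describe the implementation, so the initialization at evenly spaced quantiles, the function names, and the example counts are all hypothetical.

```python
import math


def kmeans_1d(values, k=3, max_iterations=100):
    """Lloyd's algorithm on one-dimensional data; returns sorted centers."""
    data = sorted(values)
    # Assumed initialization: one center at the middle of each of k quantile bands.
    centers = [data[int((i + 0.5) * len(data) / k)] for i in range(k)]
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for value in data:
            nearest = min(range(k), key=lambda j: abs(value - centers[j]))
            clusters[nearest].append(value)
        # Keep the old center if a cluster happens to be empty.
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def frequency_category(log_count, centers):
    """Assign low/mid/high by nearest center (three sorted centers assumed)."""
    names = ["low", "mid", "high"]
    nearest = min(range(len(centers)), key=lambda j: abs(log_count - centers[j]))
    return names[nearest]


# Hypothetical lemma counts, not taken from the frequency list.
counts = [1, 2, 3, 50, 60, 70, 5000, 6000, 7000]
centers = kmeans_1d([math.log(c) for c in counts])
print([frequency_category(math.log(c), centers) for c in (2, 55, 5500)])
```

Clustering on the log scale keeps the few extremely frequent lemmas from dominating the distances, which is presumably why log counts rather than raw counts were used.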
We start with descriptive summaries of the data. Figure 3(a) shows the mean vowel duration of long and short vowels aggregated over speakers’ decade of birth. As expected, the duration of long vowels is negatively correlated with speakers’ year of birth. However, the duration of the short vowels also shows this negative correlation. The general negative correlation between vowel duration and year of birth is attributable to the speech rate difference across different age groups: younger speakers generally speak faster than older speakers. This interpretation is supported by other studies that examined the effect of age on speech rate in the same corpus. Kang (2014) found that younger speakers produce shorter vowels overall, and this was the case not only for the vowels in the first syllable of a word (the focus of the current study), but also for the vowels in the second syllable. Bang et al. (to appear) calculated the speech rate of each speaker in the same corpus by calculating the number of syllables and phones per second and found that younger speakers show a faster speech rate.
To examine the realization of the long vs. short vowel contrast while controlling for the variation in speech rate, the durations of the long vowels are converted to a ratio by dividing the long vowel duration by the mean short vowel duration of each speaker. The mean long/short ratio, aggregated over speakers’ decade of birth, is summarized in Figure 3(b). The ratio of long to short vowels is over 1.20:1 for speakers born in the 1930s, but the ratio falls below 1:1 for speakers born in the 1980s. This is consistent with the general trend of length contrast reduction suggested in the literature reviewed in Section 3.
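The speaker-normalized ratio described here can be sketched as follows. The token format (speaker, length category, duration in ms), the function name, and the example values are assumptions introduced for illustration; only the normalization itself, dividing each long-vowel duration by the same speaker's mean short-vowel duration, comes from the text.

```python
from collections import defaultdict


def long_vowel_ratios(tokens):
    """Divide each long-vowel duration by the speaker's mean short-vowel duration.

    tokens: list of (speaker, length, duration_ms) tuples, where length is
    "long" or "short" (a hypothetical representation).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for speaker, length, duration in tokens:
        if length == "short":
            sums[speaker] += duration
            counts[speaker] += 1
    short_means = {speaker: sums[speaker] / counts[speaker] for speaker in sums}
    return [(speaker, duration / short_means[speaker])
            for speaker, length, duration in tokens
            if length == "long" and speaker in short_means]


# Hypothetical tokens from an older speaker (s1) and a younger speaker (s2).
tokens = [
    ("s1", "short", 80.0), ("s1", "short", 100.0), ("s1", "long", 108.0),
    ("s2", "short", 60.0), ("s2", "long", 57.0),
]
for speaker, ratio in long_vowel_ratios(tokens):
    print(speaker, round(ratio, 2))
```

A ratio above 1 means the long vowel is longer than that speaker's average short vowel; a ratio below 1 means it is shorter, so raw speech-rate differences between speakers no longer enter the comparison.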
We now examine the effect of frequency on vowel-length realization and its change over time. Figure 3(c) is the same graph as Figure 3(b) except that words of high, mid, and low frequency are plotted separately. We can make two general observations about this graph. First of all, contrary to the predictions of all three hypotheses in Figure 1, the high-frequency words do not have a shorter duration than the low-frequency words. As for the pattern of change over time, in older generations of speakers (those born in the 1950s or earlier), words of all three frequency types undergo reduction at a comparable rate, but in younger speakers (those born in the 1960s or later), mid- and low-frequency words continue to shorten, but high-frequency words resist further reduction. This distinct trajectory for high-frequency words seems to suggest that the high-frequency words are more resistant to the sound change than mid- or low-frequency words, in line with Figure 1(c).
Note, however, that the words of different frequency levels consist of words of different phonological shapes occurring in different phrasal contexts, and we know that such factors have substantial effects on vowel duration. As a result, we cannot compare the durations of different word groups directly and read the frequency effect off of the descriptive statistics. In order to examine the frequency effect properly, we need to control other factors that affect vowel duration. To accomplish this, we conducted a statistical analysis using a linear mixed-effects model (Baayen et al. 2008), which examines the effect of frequency on vowel duration and the interaction of frequency with speakers’ year of birth, while controlling for other factors. The analysis was carried out using the lmer function in the lme4 package (Bates et al. 2012) for R (R Development Core Team 2013).
The data included in the model are 63,365 long vowels in word-initial syllables. In this analysis, we model the effect of speaker age and word frequency on the realization of long-vowel duration. To normalize the speaker-specific durational variation, the duration of each long vowel is divided by the same speaker’s mean short-vowel duration. This duration ratio is the dependent variable. Fixed-effect predictors included in the model are summarized in Table 5. There are three speaker-level predictors: year of birth (YOB), gender (Gender), and by-speaker mean short-vowel duration (Rate). We expect the duration ratio will be reduced as the speaker’s year of birth increases and therefore expect a negative coefficient for YOB. From the exploratory figure above, we observe that the effect of YOB is not linear, and a quadratic term is also included (YOB^2) to model the curved shape of the trajectory. The mean short-vowel duration, which we interpret as an indicator of speech rate (Rate), is also included in the model. This predictor is added to take into account the possibility that faster speakers are less likely to retain the durational contrast between long and short vowels. Speaker’s gender (Gender) is included to test whether male and female speakers produce the contrast differently. Studies of other sound changes in the same corpus – the voice onset time merger between lenis and aspirated stops (Kang 2014; Bang et al. to appear) and the /e/-/ɛ/ merger (Kang to appear) – found that females are ahead of males in these sound changes, in keeping with the cross-linguistic tendency for females to lead sound change (Labov 1990). If the vowel-length merger follows the same pattern, we expect male speakers to show a larger long-vowel duration ratio than female speakers.
List of fixed-effect predictors.
|Year of birth (YOB, YOB^2)||continuous, quadratic|
|Frequency (Freq)||factor (low, mid, high)|
|Gender (Gender)||factor (male, female)|
|Speech rate (Rate)||continuous|
|Preceding consonant (Prec)||factor (null, m, n, p’, t’, k’, s’, c’, p, t, k, c, s, ph, th, kh, ch, h)|
|Vowel quality (Vowel)||factor (a, ɛ, e, ʌ, o, i, u, ɨ)|
|Syllable structure (Syll)||factor (closed.obs, closed.son, open.tense, open.C, open.nonC)|
|Word length (WordLeng)||factor (mono, poly)|
|Phrase final (IPFinal)||factor (non-final, final)|
|Phrase initial (IPInitial)||factor (non-initial, initial)|
Several word-level predictors are included in the model. The main predictor of interest for us is frequency (Freq). It is a factor with three levels (low, mid, and high), as discussed in connection with Table 4. This factor is Helmert-coded to compare low vs. mid and then low+mid vs. high words. Based on the general tendency for high-frequency words to show more reduction, we expect a lower duration ratio for high-frequency words. To examine how frequency interacts with the sound change over time, the interaction of Freq with YOB is also included in the model. The three hypotheses sketched in Figure 1 predict different patterns of interaction. Figure 1(a) predicts no interaction, and Figure 1(b) and (c) predict an interaction but in opposite directions. According to Figure 1(b), the YOB effect on long-vowel duration is stronger in high-frequency words (i.e., has a steeper slope) than in low-frequency words, while according to Figure 1(c), the YOB effect on long-vowel duration is weaker in high-frequency words (i.e., has a flatter slope) than in low-frequency words.
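To make the Helmert comparisons concrete, the sketch below shows how the two Freq coefficients relate to the three group means. It assumes the contrast scaling of R's `contr.helmert` (columns −1, 1, 0 and −1, −1, 2), which is an assumption: the text does not state which scaling was used. The function name and the cell means are hypothetical.

```python
# Assumed Helmert contrast codes for (low, mid, high), as in R's contr.helmert.
HELMERT = {
    "low":  (-1, -1),
    "mid":  (1, -1),
    "high": (0, 2),
}


def helmert_coefficients(mean_low, mean_mid, mean_high):
    """Recover intercept and contrast coefficients from three cell means.

    With the codes above, each cell mean equals
    intercept + c1 * b_low_vs_mid + c2 * b_lowmid_vs_high.
    """
    intercept = (mean_low + mean_mid + mean_high) / 3
    b_low_vs_mid = (mean_mid - mean_low) / 2
    b_lowmid_vs_high = (mean_high - (mean_low + mean_mid) / 2) / 3
    return intercept, b_low_vs_mid, b_lowmid_vs_high


# Hypothetical mean duration ratios for low-, mid-, and high-frequency words.
cell_means = {"low": 1.2, "mid": 1.2, "high": 0.9}
b0, b1, b2 = helmert_coefficients(cell_means["low"], cell_means["mid"], cell_means["high"])
for level, (c1, c2) in HELMERT.items():
    # Reconstruct each cell mean from the coefficients as a consistency check.
    print(level, round(b0 + c1 * b1 + c2 * b2, 3))
```

Under this scaling, a negative second coefficient means that high-frequency words have a lower duration ratio than the average of low- and mid-frequency words, which is the direction expected for a reductive frequency effect.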
A number of control predictors that are known to affect vowel duration are also included. These include preceding consonants (Prec), vowel quality (Vowel), syllable structure (Syll), word length (WordLeng), and phrasal positions (IPFinal and IPInitial). For the factor of syllable structure (Syll), based on an exploratory analysis, five levels of syllable structure are defined. Closed.obs and closed.son represent closed syllable contexts where the coda consonant is an obstruent or a sonorant, respectively. Closed.son includes cases where the coda consonant is underlyingly an obstruent but surfaces as a sonorant due to a regular assimilation process. Open syllables are divided into three levels. Open.tense refers to open syllables followed by a fortis or aspirated consonant. These consonants have long closure duration and are known to shorten the preceding vowel (Zhi 1993; Yun 2009). Open.C refers to open syllables followed by a lenis or sonorant consonant, while Open.nonC refers to open syllables followed by /h/, another vowel, or a word boundary. 21 WordLeng differentiates monosyllabic vs. polysyllabic words. IPFinal and IPInitial distinguish vowels occurring at a phrase boundary from those occurring in phrase-medial position. The phrasal boundary is defined by the presence of a ‘silent pause’ assigned by the automatic forced aligner (Yoon to appear). We expect the vowels to be longer in absolute phrase-initial or phrase-final position than in phrase-medial position.
To reduce collinearity, numerical variables (YOB and Rate) are centered, and all categorical variables (Gender, Prec, Vowel, Syll, WordLeng, IPFinal, and IPInitial) are sum-coded, except for Freq, which is Helmert-coded as explained above. 22 The random effects include Word and Speaker. For Speaker, only a random intercept is included, and for Word, a random intercept and a random slope adjustment to YOB (YOB + YOB^2) are included.
We now turn to the results. All factors included in the model are statistically significant as determined by a Wald chi-square test except for Rate, which is only marginally significant. The Anova function of the car package (Fox et al. 2013a) is used for this test. The test statistics are summarized in Table 6.
Wald chi-square test of predictors in the linear mixed-effects model.
|YOB+YOB^2 * Freq||12.539||4||0.014||*|
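The p-value in the row above can be checked against the chi-square distribution directly. The sketch below is an illustration, not the `car` implementation: it uses the closed-form upper-tail probability that holds for even degrees of freedom, and the function name is hypothetical.

```python
import math


def chi2_upper_tail_even_df(statistic, df):
    """Upper-tail probability of a chi-square statistic for even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Interaction term from the table row above: chi-square = 12.539, df = 4.
print(round(chi2_upper_tail_even_df(12.539, 4), 3))
```

The same function applied to a statistic of 1.5806 with df = 2 gives approximately 0.4537. Tests with odd degrees of freedom, such as a single-parameter predictor with df = 1, are outside the scope of this shortcut.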
Table 7 summarizes a coefficient estimate for each predictor and related test statistics. For each fixed-effect predictor, a coefficient estimate, a standard error, a t-test statistic, and a p-value are provided. The p-values are determined using a t-test with the degrees of freedom calculated by taking the number of observations (63,365) and subtracting the number of fixed-effect parameters (Baayen 2008). The model as a whole explains 49.1% of the variance in the data.
List of fixed-effect predictors, coefficient estimates, standard errors, and p-values.
|Prec (null vs. /m/)||−0.143||0.039||−3.646||<0.001||***|
|Prec (null vs. /n/)||−0.142||0.044||−3.253||0.001||**|
|Prec (null vs. /p’/)||−0.269||0.112||−2.399||0.016||*|
|Prec (null vs. /t’/)||0.093||0.112||0.834||0.405|
|Prec (null vs. /k’/)||−0.278||0.086||−3.24||0.001||**|
|Prec (null vs. /c’/)||−0.347||0.188||−1.845||0.065||.|
|Prec (null vs. /p/)||−0.194||0.061||−3.179||0.001||**|
|Prec (null vs. /t/)||−0.216||0.045||−4.765||<0.001||***|
|Prec (null vs. /k/)||−0.396||0.041||−9.627||<0.001||***|
|Prec (null vs. /c/)||−0.317||0.041||−7.761||<0.001||***|
|Prec (null vs. /s/)||−0.378||0.036||−10.647||<0.001||***|
|Prec (null vs. /pʰ/)||−0.364||0.09||−4.027||<0.001||***|
|Prec (null vs. /tʰ/)||−0.346||0.188||−1.844||0.065||.|
|Prec (null vs. /cʰ/)||−0.513||0.087||−5.892||<0.001||***|
|Prec (null vs. /h/)||−0.512||0.041||−12.354||<0.001||***|
|Vowel (/a/ vs. /ɛ/)||−0.097||0.037||−2.628||0.009||**|
|Vowel (/a/ vs. /e/)||−0.163||0.07||−2.325||0.02||*|
|Vowel (/a/ vs. /ʌ/)||−0.246||0.036||−6.872||<0.001||***|
|Vowel (/a/ vs. /o/)||−0.154||0.03||−5.205||<0.001||***|
|Vowel (/a/ vs. /i/)||−0.383||0.039||−9.812||<0.001||***|
|Vowel (/a/ vs. /u/)||−0.437||0.043||−10.109||<0.001||***|
|Vowel (/a/ vs. /ɨ/)||−0.483||0.065||−7.433||<0.001||***|
|Syll (closed_obs vs. closed_son)||0.065||0.034||1.898||0.058||.|
|Syll (closed_obs vs. open_tense)||0.020||0.06||0.33||0.742|
|Syll (closed_obs vs. open_C)||0.260||0.034||7.622||<0.001||***|
|Syll (closed_obs vs. open_nonC)||0.647||0.051||12.645||<0.001||***|
|WordLeng (mono vs. poly)||−0.195||0.047||−4.114||<0.001||***|
|IPFinal (non-final vs. final)||1.266||0.029||43.907||<0.001||***|
|IPInitial (non-initial vs. initial)||0.409||0.014||28.701||<0.001||***|
|Gender (female vs. male)||0.033||0.01||3.358||0.003||**|
|Freq (low vs. mid)||−0.015||0.028||−0.54||0.589|
|Freq (low+mid vs. high)||−0.133||0.029||−4.53||<0.001||***|
|YOB * Freq (low vs. mid)||−3.002||2.202||−1.363||0.173|
|YOB^2 * Freq (low vs. mid)||−0.736||1.353||−0.544||0.587|
|YOB * Freq (low+mid vs. high)||4.091||2.07||1.976||0.048||*|
|YOB^2 * Freq (low+mid vs. high)||2.875||1.21||2.377||0.017||*|
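The relation between the t statistics and the p-values in the table can be illustrated as follows. Because the degrees of freedom are in the tens of thousands, the t distribution is practically indistinguishable from the standard normal, so this sketch uses the normal approximation; that substitution and the function name are assumptions of the example, not the computation reported in the text.

```python
import math


def two_sided_p_normal(t_value):
    """Two-sided p-value under the standard normal approximation.

    Only appropriate when the degrees of freedom are very large,
    as they are for a model fitted to 63,365 observations.
    """
    return math.erfc(abs(t_value) / math.sqrt(2.0))


# t statistics copied from the frequency rows of the table above.
rows = [
    ("Freq (low vs. mid)", -0.54),
    ("Freq (low+mid vs. high)", -4.53),
    ("YOB * Freq (low+mid vs. high)", 1.976),
    ("YOB^2 * Freq (low+mid vs. high)", 2.377),
]
for label, t_value in rows:
    print(f"{label}: p = {two_sided_p_normal(t_value):.3f}")
```

Note that this approach treats the degrees of freedom as known, which is a simplification for mixed-effects models.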
We will discuss each predictor in turn along with the partial-effect plots in Figures 4 and 5, which display the predicted values for each factor with each of the other predictors held constant at its average. 23 The effect function in the effects package (Fox et al. 2013b) is used to calculate the partial-effect estimates and the 95% confidence intervals. We discuss the control factors first and then discuss the main factors of interest.
The consonant that precedes the vowel (Prec) affects the vowel’s duration. Overall, consonants that have long aspiration or frication noise tend to shorten the following vowel, as shown in Figure 4(a). Vowel quality (Vowel), more specifically vowel height, also affects vowel duration: high vowels (/i, ɨ, u/) are shorter than mid vowels (/ʌ, o, e, ɛ/), and mid vowels are shorter than the low vowel /a/, as shown in Figure 4(b). 24 The syllable structure (Syll) also significantly affects vowel duration. Vowels are shorter in closed syllables (Closed.obs or Closed.son) than in open syllables (except for those open syllables followed by a fortis or aspirated consonant [Open.tense]). Open syllables followed by a lenis or sonorant consonant (Open.C) are shorter than open syllables followed by /h/, a vowel, or a word boundary (Open.nonC). 25 This effect is summarized in Figure 4(c). WordLeng, IPFinal, and IPInitial all show a significant effect in the expected direction, as shown in Figure 4(d)–(f). Vowels are longer in monosyllabic than in polysyllabic words, and vowels are longer in phrase-initial or phrase-final position than in phrase-medial position.
Next we turn to the speaker-level factors. As for Gender, female speakers overall have a more reduced long-vowel duration ratio than male speakers, as shown in Figure 4(g). This effect is in line with the general trend of female speakers leading various sound changes, both in this corpus (Kang 2014, Kang to appear) and in sound changes more generally (Labov 1990). Speakers’ speech rate (Rate), as defined by the mean short vowel duration for the speaker, shows that the faster speakers (i.e., speakers with a shorter mean short vowel duration) reduce the vowel length contrast more, as shown in Figure 4(h), and this effect is marginally significant.
We now turn to the factors of primary interest: YOB, Freq, and their interaction. Speakers’ year of birth (YOB) is confirmed to be significant; the younger the speakers, the more reduced the long vs. short vowel contrast, in agreement with previous studies that suggested reduction of this contrast in younger speakers’ speech. The significant quadratic term (YOB^2) indicates that the rate of contrast reduction slows down in the very young speakers’ speech. This is apparent as a curved shape in the partial-effect plot in Figure 4(i).
Before we move on to the next variable, we want to consider the potential confound of age and speech rate. As discussed above, younger speakers in general produce all vowels with shorter durations, not only long vowels but also short vowels, and there is a significant correlation between speakers’ age (YOB) and their speech rate (Rate), i.e., speakers’ mean short vowel duration (t=−7.8661; df=116; p<0.001). Given this correlation between the two predictors, we need to consider an alternative hypothesis that the age effect is not a reflection of sound change in progress but only an epiphenomenon of a rate effect, i.e., that younger speakers tend to speak faster, and faster speakers tend to show a more reduced durational contrast. To test this alternative hypothesis, we conducted a more stringent test of the YOB effect by replacing YOB with the residual of YOB against Rate as a predictor. In other words, this new model attributes all explanatory power shared by YOB and Rate to Rate, and YOB is given the minimum credit possible. In this new model, the effect of Rate is significant (χ²=82.828, df=1, p<0.001), not surprisingly, but the effect of YOB+YOB^2 remains significant as well (χ²=86.164, df=2, p<0.001), and so does its interaction with Freq (χ²=44.374, df=4, p<0.001). This confirms that speakers’ age has a robust and independent effect on durational contrast over and above any effect attributable to the speech rate difference across generations.
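The residualization step can be sketched as follows: YOB is regressed on Rate by ordinary least squares, and the residuals, which are by construction uncorrelated with Rate, replace YOB as the predictor. The function names and the five example speakers are hypothetical. The second helper converts the reported t statistic into a correlation coefficient, on the assumption that the test is a Pearson correlation test, which the text does not state explicitly.

```python
import math


def residualize(y, x):
    """Residuals of y after a simple least-squares regression on x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def r_from_t(t_value, df):
    """Correlation coefficient implied by a Pearson test statistic."""
    return t_value / math.sqrt(t_value ** 2 + df)


# Hypothetical speakers: year of birth and mean short-vowel duration (ms).
yob = [1935, 1945, 1965, 1970, 1985]
rate = [90.0, 85.0, 80.0, 75.0, 70.0]
yob_residual = residualize(yob, rate)
print([round(value, 2) for value in yob_residual])

# Under the Pearson assumption, t = -7.8661 with df = 116 implies r of about -0.59.
print(round(r_from_t(-7.8661, 116), 2))
```

Because the residuals carry only the part of YOB that Rate cannot predict, any effect they show in the model cannot be attributed to differences in speech rate.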
As for the effect of frequency (Freq) on vowel duration, when various structural differences across low-, mid-, and high-frequency words are controlled for, an effect of frequency emerges. Long vowels in high-frequency words are significantly shorter than those in mid- and low-frequency words, while there is no significant difference between low- and mid-frequency words, as shown in Figure 4(j).
Now that we have established the effects of YOB and Freq in the expected directions, we turn to the interaction of YOB and Freq. Recall from Figure 3(c), repeated as Figure 5(a), that vowels in mid- and low-frequency words show a generally linear trajectory, and the shortening trend in the speech of older speakers is sustained in the speech of younger speakers. For high-frequency words, on the other hand, the shortening trend bottoms out for the younger speakers. Our model confirms this interaction of frequency and age; the YOB effect on vowel-duration ratio differs significantly between high-frequency words and low- and mid-frequency words. The YOB effect does not differ between low- and mid-frequency words. This interaction is visually represented in Figure 5(b).
This partial-effect plot reveals that long vowels are indeed far more reduced in high-frequency words than in low- and mid-frequency words, and they seem to reach the endpoint (the end of the S-curve; Labov 1994) much earlier than mid- or low-frequency words. Therefore, after controlling for the relevant phonological factors, it does not seem to be the case that the high-frequency words are particularly more resistant to this sound change, as one might be tempted to infer from Figure 5(a). Instead, high-frequency words seem to stop moving further along the change because they have already reached the endpoint and cannot reduce further. This trajectory therefore seems inconsistent with an interpretation of this particular sound change as analogical in nature (cf. Figure 1(c)). To examine if the rate of change differs by frequency during the time when high-frequency words are still progressing, we fit another model including only the data from speakers who were born before 1960 and found no significant interaction between YOB and Freq (χ²=1.5806, df=2, p=0.4537). 26 In other words, our data show no difference in the rate of change across frequency types, and the data better support the constant-rate hypothesis sketched in Figure 1(a) over the other two hypotheses sketched in Figure 1(b) and (c), according to which high- and low-frequency words change at different rates.
Let us now review the questions we raised at the outset of this paper and see how our data can answer them. The first question was whether the vowel-length contrast merger is actually happening in Seoul Korean, and our answer is yes. We found a clear effect of speaker’s year of birth on the durational contrast and confirmed that younger speakers indeed produce a more reduced contrast than older speakers. We also checked to make sure that this effect is not an epiphenomenon of a speech-rate effect and found that the age effect is robust even after the faster speech rate of younger speakers is factored out. Furthermore, we found that the change is almost complete, such that the long/short ratio falls below 1:1 and plateaus out in the youngest speakers’ speech, suggesting that we are observing the end stage of an S-curve in this change.
The other questions we posed were about the effect of frequency. We noted that a synchronic frequency effect on phonetic reduction may give rise to an appearance of a diachronic frequency effect, if the frequency effect is examined at a static point in time. To remedy this methodological confound, we tried to disentangle a diachronic effect of frequency on sound change from a synchronic effect on phonetic reduction by examining its dynamic effect (i.e., frequency effect on the rate of change in duration) as well as its static effect (i.e., frequency effect on overall duration). We found a robust effect of frequency on duration: high-frequency words have a substantially shorter duration than mid- or low-frequency words of comparable phonological structure and context. We did find a significant effect of frequency on the trajectory of sound change, but not in the way that we hypothesized. High-frequency words differed from mid- and low-frequency words in that they have reached the endpoint earlier and stopped progressing further in younger speakers’ speech, while mid- and low-frequency words continue to reduce in younger speakers’ speech. However, when we examined the slope of the change, when high-frequency words, as well as mid- and low-frequency words, were all still progressing, words of different frequency types did not differ in their overall rate or slope of change. In short, of the three patterns of frequency sketched in Figure 1, Figure 1(a) most closely resembles the pattern we found.
We now turn to the question of what this frequency effect, or lack thereof, suggests about the nature of this sound change and the nature of lexical representation. First of all, we did not find any evidence from frequency effects that the loss of the vowel-length contrast in Seoul Korean is analogical or analytical in nature, that is, we did not find high-frequency words to be more resistant to the change overall. We also did not find strong support for the exemplar-based model of lexical representation, which includes word-specific distribution of phonetically detailed stored exemplars that are updated with each use of the word. Such a model would predict a faster rate of reduction for high-frequency words over low-frequency words in a reductive sound change. The observed frequency effect, or lack thereof, is compatible with a model without word-specific phonetic representations, where the frequency effect on duration comes from online factors that affect phonetic implementation of speech sounds, not from stored tokens of word-specific variants. We also note that the lack of a diachronic frequency effect in our data may still be compatible with a hybrid exemplar model (Pierrehumbert 2006), where a layer of phonological categories acts to keep the word-specific variation from running rampant. Under such a hybrid model, the diachronic frequency effect on sound change may be more visible in the early to middle stages of the change. As a sound change nears its end, the frequency-conditioned word-specific variation cannot expand further but can only reduce. We may therefore be catching the vowel shortening sound change too late in the process to observe the relevant effect. Future studies of frequency effects on sound change may provide clearer tests of the status of word-specific phonetic representation by examining the sound change as it unfolds over time, especially in the early to middle stages of the change.
The paper benefited from valuable feedback from the Language Variation and Change Group and the Phonology-Phonetics Reading Group at the University of Toronto and the audience at the 22nd Manchester Phonology Meeting and the 14th Conference on Laboratory Phonology, especially Naomi Nagy, Morgan Sonderegger, and Kevin Tang. The authors thank two anonymous reviewers for helpful comments, Jessamyn Schertz and Timothy Vance for thorough and careful editorial suggestions that improved the paper, and the research assistants, Yaruna Cooblal, Sohyun Hong, Roobika Karunananthan, Julianna So, Shawna-Kaye Tucker, and Cindy Yee, for conducting a manual check of the alignments in a subset of the corpus data. The research reported in the paper is funded by the Social Sciences and Humanities Research Council of Canada Partnership Development Grant (#890-2012-25).
Baayen, R. Harald. 2008. Analyzing linguistic data: A practical introduction to statistics using R. Cambridge: Cambridge University Press.
Baayen, R. Harald, Douglas J. Davidson & Douglas M. Bates. 2008. Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59. 390–412.
Bailey, Guy, Tom Wikle, Jan Tillery & Lori Sand. 1991. The apparent time construct. Language Variation and Change 3. 241–264.
Bang, Hye-Young, Morgan Sonderegger, Yoonjung Kang, Meghan Clayards & Tae-Jin Yoon. To appear. The effect of word frequency on the timecourse of tonogenesis in Seoul Korean. Proceedings of the 18th International Congress of Phonetic Sciences.
Bates, Douglas M., Martin Maechler & Ben Bolker. 2012. lme4: Linear mixed-effects models using S4 classes. R package version 0.999999-0. http://cran.r-project.org/web/packages/lme4/index.html (accessed 16 May 2013).
Bell, Alan, Jason M. Brenier, Michelle Gregory, Cynthia Girand & Dan Jurafsky. 2009. Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language 60. 92–111.
Bybee, Joan. 2001. Phonology and language use. Cambridge: Cambridge University Press.
Bybee, Joan. 2002. Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change 14. 261–290.
Bybee, Joan & Paul Hopper. 2001. Introduction to frequency and the emergence of linguistic structure. In Joan Bybee & Paul Hopper (eds.), Frequency and the emergence of linguistic structure, 1–24. Amsterdam: John Benjamins.
Cha, Jae-Eun. 2005. 1930nyentaeuy hankwuke umcangey tayhan yenkwu: pothonghakkyo uy umseng calyolul cwungsimulo [The length of vowels in 1930s: A research on the sounds data of Joseoneodokbon]. Mincokmunhwayeonkwu 43. 105–128.
Cha, Jae-Eun. 2012. Tanileuy umcangi pokhapeey panyengtoynun yangsang: ul cwungsimulo [How simple words’ vowel lengths are reflected in compound words: Focusing on The Kuen Dictionary]. Studies in Linguistics 22. 277–294.
Cho, Nam-Ho. 2002. Hyentay Kwuke Sayong Pinto Cosa: Hankwuke Haksupyong Ehwi Sencengul Wihan Kicho Cosa [A survey of word frequency in Contemporary Korean]. Seoul, Korea: The National Institute of the Korean Language.
Choe, Hyeon-Bae. 1959. Wuli Malpon [Our grammar]. Seoul, Korea: Chungumsa, 1937.
Coetzee, Andries & Shigeto Kawahara. 2013. Frequency biases in phonological variation. Natural Language and Linguistic Theory 31. 47–89.
Crystal, Thomas H. & Arthur S. House. 1988. Segmental durations in connected-speech signals: Syllabic stress. Journal of the Acoustical Society of America 83. 1574–1585.
De Schryver, Johan, Anneke Neijt, Pol Ghesquière & Mirjam Ernestus. 2008. Analogy, frequency, and sound change. The case of Dutch devoicing. Journal of Germanic Linguistics 20. 159–195.
Dinkin, Aaron J. 2008. The real effect of word frequency on phonetic variation. Penn Working Papers in Linguistics 14. 97–106.
Do, Youngah, Chiyuki Ito & Michael Kenstowicz. 2014. Accent classes in South Kyengsang Korean: Lexical drift, novel words and loanwords. Lingua 148. 147–182.
Duncan, Liisa. 2011. Variation in Finnish loan words: Evidence from Google. In Ain Haas & Peter B. Brown (eds.), Proceedings of the XIVth, XVth, and XVIth Conferences of the Finno-Ugric Studies Association of Canada: The Uralic World and Eurasia, 107–126. Providence: Rhode Island College.
Ernestus, Mirjam. 2014. Acoustic reduction and the roles of abstracts and exemplars in speech processing. Lingua 142. 27–41.
Fidelholtz, James L. 1975. Word frequency and vowel reduction in English. Chicago Linguistic Society 11. 200–213.
Fox, John & Jangman Hong. 2009. Effects displays in R for multinomial and proportional-odds logit models: Extensions to the effects package. Journal of Statistical Software 32. 1–24.
Fox, John, Sanford Weisberg, Douglas Bates, David Firth, Michael Friendly, Spencer Graves, Richard Heiberger, Rafael Laboissiere, Georges Monette, Henric Nilsson, Brian Ripley, Achim Zeileis & R Core. 2013a. car: Companion to applied regression. R package version 2.0-16. http://cran.r-project.org/web/packages/car/index.html (accessed 16 May 2013).
Fox, John, Sanford Weisberg, Jangman Hong, Robert Andersen & Steve Taylor. 2013b. Effects: Effect displays for linear, generalized linear, multinomial-logit, proportional-odds logit models and mixed-effects models. R package version 2.2-4. http://cran.r-project.org/web/packages/effects/index.html (accessed 14 May 2013).
Gahl, Susanne. 2008. Time and thyme are not homophones: The effect of lemma frequency on word durations in spontaneous speech. Language 84. 474–496.
Han, Mieko S. 1964. Studies in the phonology of Asian languages II: Duration of Korean vowels. Los Angeles, CA: University of California.
Hay, Jennifer. 2006. Factors influencing speech perception in the context of a merger-in-progress. Journal of Phonetics 34. 458–484.
Hinskens, Frans. 2014. Grammar or Lexicon. Or: Grammar and lexicon? Rule-based and usage-based approaches to phonological variation. Lingua 142. 1–26.
Hooper, Joan Bybee. 1976. Word frequency in lexical diffusion and the source of morphophonological change. In W. M. Christie (ed.), Current progress in historical linguistics, 95–105. Amsterdam: North Holland.
Hothorn, Torsten, Frank Bretz, Peter Westfall, Richard M. Heiberger, Andre Schuetzenmeister & Susan Scheibe. 2013. multcomp: Simultaneous inference in general parametric models. R package version 1.2-17. http://cran.r-project.org/web/packages/multcomp/ (accessed 16 May 2013).
Huh, Woong. 1960. Kwuke Umwunlon [Korean phonology]. Seoul: Chungumsa.
Ito, Chiyuki. 2010. Analogy and lexical restructuring in the development of nominal inflection from Middle to Contemporary Korean. Journal of East Asian Linguistics 19. 357–383.
Jung, Myung-Sook & Kwuk-Jung Hwang. 2000. Kwuke hancaeuy cangtanumey tayhan silhemumsenghakcek yenkwu [A phonetic study on the vowel length in Sino-Korean words]. Emunnoncip 42. 285–299.
Jurafsky, Daniel, Alan Bell, Michelle Gregory & William D. Raymond. 2001. Probabilistic relation between words: Evidence from reduction in lexical production. In Joan Bybee & Paul Hopper (eds.), Frequency and the emergence of linguistic structure, 229–254. Amsterdam: John Benjamins.
Kahng, Soon-Kyong. 1995. Phyocunmalkwa munhwaeuy cangtanmoum punsek: choyso taylipelul cwungsimulo [Analysis of long and short vowels of South and North Korean: Concentrating on minimal pairs]. Tongsemwunhayenkwu 3. 3–26.
Kang, Yoonjung. 2003. Sound changes affecting noun-final coronal obstruents in Korean. In W. McClure (ed.), Japanese/Korean Linguistics 12, 128–139. Stanford, CA: CSLI Publications.
Kang, Yoonjung. 2007. Frequency effects and regularization in Korean noun variations. Paper presented at the Workshop on Variation, Gradience, and Frequency in Phonology, Stanford University.
Kang, Yoonjung. 2014. Voice onset time merger and development of tonal contrast in Seoul Korean stops: A corpus study. Journal of Phonetics 45. 76–90.
Kang, Yoonjung. To appear. A corpus-based study of positional variation in Seoul Korean vowels. In Theodore Levin, Ryo Masuda & Michael Kenstowicz (eds.), Japanese/Korean Linguistics 23. Stanford, CA: CSLI Publications.
Kim, Ju-Phil. 1990. ‘Phyocwune moum’uy simuy kyengwiwa haysel [Deliberation process for defining ‘standard vowels’]. Kwukesaynghwal 22. 190–207.
Kim, Sun-Chel. 2003. Phyocwune Palum Silthae Cosa II [A survey of standard pronunciation II]. Seoul: The National Institute of the Korean Language.
Kirchner, Robert. 2012. Modeling exemplar-based phonologization. In Abigail C. Cohn, Cecile Fougeron & Marie K. Huffman (eds.), The Oxford Handbook of Laboratory Phonology. Oxford: Oxford University Press.
Klatt, Dennis H. 1976. Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America 59. 1208–1221.
Ko, Eon-Suk. 2002. The phonology and phonetics of word level prosody and its interaction with phrase level prosody: A study of Korean in comparison to English. Philadelphia: University of Pennsylvania Ph.D. dissertation.
Ko, Kwang-mo. 1989. Cheyen kkuthuy pyenhwa t>s ey tayhan saylowun haysek [Explaining the noun-final change t>s in Korean]. Enehak 11. 3–22.
Kong, Su-Jin & Seung-Jae Moon. 2002. A study of contextual assimilation manifested in Korean long/short vowel contrast. Proceedings of the Meeting of the Acoustical Society of Korea 21. 281–284.
Kreft, Ita & Jan de Leeuw. 1998. Introducing multilevel modeling. London: Sage.
Kwak, Chunggu. 1984. Cheyenekanmal seltancaumuy machalumhwaey tayhaye [On the spirantization of apical consonants in the final position of nouns]. Kwukekwukmunhak 91. 1–22.
Labov, William. 1990. The intersection of sex and social class in the course of linguistic change. Language Variation and Change 2. 205–254.
Labov, William. 1994. Principles of linguistic change: Internal factors. Cambridge, MA: Blackwell.
Labov, William. 2010. Principles of linguistic change: Cognitive and cultural factors. Cambridge, MA: Blackwell.
Lee, Hi Sung. 1956. Kwukehakkaysel [Introduction to Korean Linguistics]. Seoul: Minjung Seokwan.
Lee, Ki-Moon & S. Robert Ramsey. 2011. A history of the Korean language. Cambridge: Cambridge University Press.
Lee, Sung Nyong. 1960. Kwukehak Nonko [Studies on Korean linguistics]. Seoul: Tongyang Chwulphansa.
Lehiste, Ilse. 1970. Suprasegmentals. Cambridge, MA: MIT Press.
Maddieson, Ian. 1985. Phonetic cues to syllabification. In V. A. Fromkin (ed.), Phonetic linguistics: Essays in honor of Peter Ladefoged, 203–221. Orlando: Academic Press.
Martin, Samuel E. 1992. A reference grammar of Korean: A complete guide to the grammar and history of the Korean language. Tokyo: Tuttle Publishing.
Park, Jeong-Woon. 1994. Variation of vowel length in Korean. In Young-Key Kim-Renaud (ed.), Theoretical issues in Korean linguistics, 175–188. Stanford, CA: Center for the Study of Language and Information.
Park, Ju-Kyeng. 1985. Hyentay hankwukeuy cangtanumey kwanhan yenkwu [A study on segmental length in Contemporary Korean]. Malsori 11–14. 121–131.
Phillips, Betty. 2001. Lexical diffusion, lexical frequency, and lexical analysis. In Joan Bybee & Paul Hopper (eds.), Frequency and the emergence of linguistic structure, 123–136. Amsterdam: John Benjamins.
Phillips, Betty. 2006. Word frequency and lexical diffusion. New York: Palgrave.
Pierrehumbert, Janet B. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Joan Bybee & Paul Hopper (eds.), Frequency and the emergence of linguistic structure, 137–157. Amsterdam: John Benjamins.
Pierrehumbert, Janet B. 2002. Word-specific phonetics. In Carlos Gussenhoven & Natasha Warner (eds.), Laboratory Phonology VII, 101–140. Berlin: Mouton de Gruyter.
Pluymaekers, Mark, Mirjam Ernestus & R. Harald Baayen. 2005. Lexical frequency and acoustic reduction in spoken Dutch. The Journal of the Acoustical Society of America 118. 2561–2569.
Probert, Philomen. 2006. Ancient Greek accentuation: Synchronic patterns, frequency effects, and prehistory. Oxford: Oxford University Press.
R Development Core Team. 2013. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.r-project.org/ (accessed 16 May 2013).
Schleef, Erik. 2013. Glottal replacement of /t/ in two British capitals: Effects of word frequency and morphological compositionality. Language Variation and Change 25. 201–223.
Schuchardt, Hugo. 1972. Reprint with English translation. Über die Lautgesetze: Gegen die Junggrammatiker. In Theo Vennemann & Terence H. Wilbur (eds.), Schuchardt, the Neogrammarians, and the transformational theory of phonological change, 1–72. Frankfurt: Athenäum. Original edition, Berlin: Oppenheim, 1885.
Shim, Jiyoung, Jieun Kiaer & Jaeeun Cha. 2013. The sounds of Korean. Cambridge: Cambridge University Press.
Sohn, Ho-Min. 1999. The Korean language. Cambridge: Cambridge University Press.
Sonderegger, Morgan. In press. Testing for frequency and structural effects in an English stress shift. Proceedings of the Annual Meeting of the Berkeley Linguistics Society (2010).
Sonderegger, Morgan & Partha Niyogi. 2013. Variation and change in English noun/verb pair stress: Data, dynamical systems models, and their interaction. In Alan C. L. Yu (ed.), Origins of sound patterns: Approaches to phonologization, 262–284. Oxford: Oxford University Press.
Trudgill, Peter & Tina Foxcroft. 1978. On the sociolinguistics of vocalic mergers: Transfer and approximation in East Anglia. In Peter Trudgill (ed.), Sociolinguistic patterns in British English, 69–79. London: Edward Arnold.
Turk, Alice E. & Stefanie Shattuck-Hufnagel. 2000. Word-boundary-related duration patterns in English. Journal of Phonetics 28. 397–440.
Turk, Alice E. & Stefanie Shattuck-Hufnagel. 2007. Multiple targets of phrase-final lengthening in American English words. Journal of Phonetics 35. 445–472.
Walker, James A. 2012. Form, function, and frequency in phonological variation. Language Variation and Change 24. 397–415.
Yoon, Tae-Jin. To appear. A corpus-based study on the layered duration in Standard Korean. In Theodore Levin, Ryo Masuda & Michael Kenstowicz (eds.), Japanese/Korean Linguistics 23. Stanford, CA: CSLI Publications.
Yoon, Tae-Jin & Yoonjung Kang. 2012. A forced-alignment-based study of declarative sentence-ending ‘da’ in Korean. Proceedings of the 6th International Conference on Speech Prosody 2012, 559–562. Shanghai: Tongji University Press.
Yoon, Tae-Jin & Yoonjung Kang. 2014. Hankwuke tayyonglyang palhwa malmungchiuy tanmoum punsek [Monophthong analysis on a large-scale speech corpus of read-style Korean]. Malsoriwa Umsengkwahak 6. 139–145.
Yun, Ilsung. 2009. Vowel duration and the feature of the following consonant. Malsoriwa Umsengkwahak 1. 41–46.
Zhi, Minje. 1993. Soriuy kili [Segment duration]. Saykwuke Saynghwal 3. 39–57.
Zhi, Minje, Jung-Chul Lee, Eung-Bae Kim & Yong-Ju Lee. 1990. Acoustic phonetic studies for speech synthesis by rule of Korean I: Acoustic analysis and perception experiment on vowel quantity. Proceedings of the 1990 Meeting of the Korean Institute of Communications and Information Sciences. 146–150.
Other studies that show a high-frequency-words-first effect include Fidelholtz (1975) and Schleef (2013), and studies that show a low-frequency-words-first effect include Do et al. (2014), Kang (2003), and Sonderegger and Niyogi (2013), among others.
We thank Ranjan Sen for bringing this case to our attention.
De Schryver et al. (2008) suggest that orthography likely played a role in this frequency effect. Dutch orthography reflects the standard form of these words, and speakers would have more exposure to the spelling of high-frequency words than low-frequency words. This may have led to a frequency effect in the opposite direction from that expected in a reductive sound change.
The picture gets even more complicated, since there are cases where we find frequency effects in opposite directions within a single sound change. For example, Sonderegger (in press) examined the English diatonic stress shift in noun–verb pairs, where noun–verb pairs that originally have stress on the second syllable shift the stress to the initial syllable for nouns. The overall frequency effect is in the expected direction, i.e., the stress shift is more likely for low-frequency than high-frequency words, but the frequency effect interacts with the phonological factors such that when the phonological shape of the word strongly prefers a stress shift, the frequency effect is attenuated or even reversed.
The standard view is that the long vowels of contemporary Seoul Korean developed from the rising tone of Middle Korean (Martin 1992; Lee and Ramsey 2011). However, see Ko (2002) for an alternative view that vowel length in Contemporary Korean is a phonetic exponent of phonological accent.
Ko (2002) observes that the adjectives that are exceptions to this vowel shortening process tend to be high in frequency. Our corpus contains 28 monosyllabic verb or adjective lemmas that have an underlying long vowel and appear with a vowel-initial suffix, a potential context for shortening. Of those 28 verbs and adjectives, 19 are alternating types and 9 are non-alternating types, according to the Great Dictionary of the Korean Language (see footnote 14). The non-alternating types are indeed higher in lemma frequency (mean=1815.4, SD=2880.7) than the alternating types (mean=475.7, SD=525.3). Relatedly, an anonymous reviewer points out that an analogical mechanism may underlie this particular pattern of vowel shortening. However, as the reviewer also notes, this frequency effect is likely indicative of a synchronic alternation pattern of older speakers who retain the vowel-length contrast, not an effect of sound change in progress. So, in our analysis below, we take the surface outcome of this shortening process, as indicated in the Great Dictionary of the Korean Language, as the starting point of the general vowel shortening in current Seoul Korean. A statistical test shows that the vowels in this morpho-phonological shortening context do not pattern differently from other vowels in terms of the frequency effect. The test used a mixed-effects regression model similar to the one presented in Table 6, except that the morphological context and its interaction with frequency and YOB were included.
Kahng (1995) also examined the productions of three speakers of Munhwae (standard North Korean) and found that they tend not to retain the vowel-length contrast.
Each token was categorized as long or short based on the author’s perception supplemented with spectrogram-based measurements. A vowel was considered to be long if it was 1.5–2 times longer than the short vowel in a comparable phonological context.
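The duration criterion in this note can be sketched as a simple ratio test. The Python function below is only an illustration: the function name, the millisecond units, and the use of 1.5 as the cut-off (the lower bound of the 1.5–2 range) are assumptions, and the original categorization also relied on auditory judgment, which the sketch does not model.

```python
def classify_length(duration_ms: float, short_reference_ms: float,
                    threshold: float = 1.5) -> str:
    """Label a vowel token 'long' or 'short' by its duration ratio.

    duration_ms        -- measured duration of the token (hypothetical unit: ms)
    short_reference_ms -- duration of a short vowel in a comparable
                          phonological context
    threshold          -- assumed cut-off; 1.5 is the lower bound of the
                          1.5-2 range mentioned in the note
    """
    if short_reference_ms <= 0:
        raise ValueError("reference duration must be positive")
    ratio = duration_ms / short_reference_ms
    return "long" if ratio >= threshold else "short"


# Hypothetical measurements, not data from the study.
print(classify_length(150.0, 80.0))  # ratio 1.875
print(classify_length(100.0, 80.0))  # ratio 1.25
```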
Park (1985) makes a number of other interesting observations about the generational difference in vowel-length production. In older speakers’ speech, the mid back unrounded vowel shows quality differentiation conditioned by the vowel length: the long /ʌː/ is realized as more raised and central than the short /ʌ/, while younger speakers produce the long and short /ʌ/ with the same quality. Park (1985) also observes that younger speakers may lengthen the coda consonant in a syllable with a long vowel, while for older speakers, the vowel proper carries the durational contrast. Our study did not examine whether younger speakers may retain the vowel-length contrast by lengthening the coda consonant. We leave this topic for future research.
Separate mixed-effects logistic regression models for each age group (dependent variable: Vowel Length Realization; fixed effect: Underlying Vowel Length; random effect: Word) show that the underlying vowel length is significant for the speakers in their 50s (z=1.977, p=0.0481) and 60s (z=2.413, p=0.0158) but not for the speakers in their 20s (z=0.015, p=0.988), 30s (z=1.002, p=0.316), and 40s (z=0.842, p=0.4).
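The p-values in this note follow from the reported z statistics under a standard normal reference distribution. As a check, the minimal Python sketch below recomputes two-sided p-values from the reported z values. The function name and the conventional 0.05 threshold are assumptions of the sketch, and small differences in the last digit are expected because the z values are rounded.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under the standard normal."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(z) / math.sqrt(2.0))


# z values as reported in the note, keyed by age group.
reported_z = {"20s": 0.015, "30s": 1.002, "40s": 0.842,
              "50s": 1.977, "60s": 2.413}

p_values = {group: two_sided_p(z) for group, z in reported_z.items()}

# Assumed significance threshold of 0.05.
significant = [group for group, p in p_values.items() if p < 0.05]

for group, p in p_values.items():
    print(f"{group}: z={reported_z[group]:.3f}, p={p:.4f}")
print("significant at 0.05:", significant)
```

Under these assumptions, only the speakers in their 50s and 60s fall below the 0.05 threshold, in line with the note.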
The perceptual categorization by native speakers used in Kim’s (2003) survey is not practical for a corpus study of this scale, and it introduces its own problems, as discussed above.
However, this aggregated measure of phonetic duration does not distinguish between two types of change: one in which all long vowels are reduced in phonetic duration over time, and one in which long vowels are not reduced in phonetic duration but some long-vowel words change to short vowels.
An on-line interface for the aligner is available at http://www.yoonjungkang.com/korean-phonetic-aligner.html.
Lemma frequency was chosen partly because of the availability of a reliable frequency list with homophones carefully disambiguated. Future work will examine the effect of frequency based on word-form frequency counts.
We thank Kevin Tang for suggesting k-means clustering as a way to define the frequency levels. In converting a frequency count to a log frequency, 1 was added to the frequency count to avoid undefined values for zero-frequency items.
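The two steps in this note, the add-one log transform and the clustering of log frequencies into levels, can be sketched as follows. This is a minimal illustration in Python, not the procedure used in the study: the natural-log base, the number of clusters (k = 3), the deterministic initialization, and the frequency counts are all assumptions made for the example.

```python
import math


def log_frequency(count: int) -> float:
    """Log frequency with 1 added first, so a zero count maps to 0.0."""
    return math.log(count + 1)  # natural log is an assumption


def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's algorithm on one-dimensional data.

    Initial centres are picked at evenly spaced positions in the sorted
    data (an assumed, deterministic initialization).
    """
    ordered = sorted(values)
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster happens to be empty.
        updated = [sum(c) / len(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def assign_level(value, centres):
    """Index of the nearest centre (0 = lowest frequency level)."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


# Hypothetical lemma frequency counts, not taken from the study.
counts = [0, 1, 2, 40, 55, 60, 3000, 4500, 5000]
logs = [log_frequency(c) for c in counts]
centres = kmeans_1d(logs, k=3)
levels = [assign_level(x, centres) for x in logs]
print(levels)
```

Clustering the log frequencies rather than the raw counts keeps a handful of very frequent lemmas from dominating the distance computation.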
/h/ systematically deletes in some morpho-phonological contexts and optionally deletes in intervocalic contexts in casual speech.
Centering is particularly important for YOB, a variable of particular interest in the study. Centering the variable makes the interpretation of the intercept more meaningful, as the estimated value is for a speaker with an average age rather than for a speaker born in year 0. More importantly, centering reduces collinearity. According to Kreft and de Leeuw (1998: 135–137), “[c]entering is good for technical purposes, since it removes high correlations between the random intercept and slopes, and high correlations between first- and second-level variables and crosslevel interactions [...] Centering stabilizes the model, and allows one to look at coefficients as more or less independent estimates.” But, as a reviewer correctly points out, centering does not remove collinearity between two distinct fixed effect predictors, e.g., YOB and Rate. The issue of collinearity between these two predictors is dealt with below.
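One effect of centering can be shown with a small numerical example: the correlation between a predictor and a term derived from it, here its square, is close to 1 for raw birth years but drops sharply once the variable is centered. The Python sketch below uses hypothetical birth years and a hand-written Pearson correlation; it illustrates the general point and does not reproduce the models in the study. In line with the reviewer's remark, it has no bearing on the correlation between two distinct predictors such as YOB and Rate.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical years of birth, not the speakers in the corpus.
yob = [1930.0, 1940.0, 1950.0, 1960.0, 1970.0, 1980.0, 1990.0]

# Raw predictor and its quadratic term are almost perfectly correlated.
raw_r = pearson(yob, [y ** 2 for y in yob])

# Centering: subtract the mean so that 0 corresponds to the average YOB.
mean_yob = sum(yob) / len(yob)
centred = [y - mean_yob for y in yob]
centred_r = pearson(centred, [c ** 2 for c in centred])

print(f"raw:     r(YOB, YOB^2) = {raw_r:.6f}")
print(f"centred: r(YOB, YOB^2) = {centred_r:.6f}")
```

With birth years spread symmetrically around their mean, as in this toy data set, the centered correlation vanishes; with real, asymmetric data it would be reduced rather than eliminated.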
For categorical predictors, this means that the values are “set to their proportional distribution in the data by averaging over contrasts” (Fox and Hong 2009: 5).
A Tukey post-hoc pairwise comparison using the glht function of the multcomp package (Hothorn et al. 2013) generally confirms the three-way differentiation of high, mid, and low vowels. The differences across different categories are significant, and the differences within categories are not significant, with the following exceptions. No differences are found for /a/ vs. /e/ and /a/ vs. /ɛ/, and /ɛ/ is significantly longer than /ʌ/.
A Tukey post-hoc pairwise comparison confirms the following three levels of differentiation: closed.obs, closed.son, open.tense.
In this model, only a linear YOB term is used. A model including a quadratic term, and models with a different age cut-off (born before 1970 or born before 1950) also show no significant interaction of YOB and Freq.