1.1 Language background
Like German and English, Frisian belongs to the West Germanic languages. It includes three main varieties: West Frisian, East Frisian (or Saterland Frisian), and North Frisian, each of which comprises a number of regional dialects that are not necessarily mutually intelligible. Despite this regional spread, Frisian is considered an endangered language with overall fewer than 500,000 speakers who live along the coastline of the North Sea from the Netherlands via Germany to Southern Denmark. Figure 1 presents a map on which the core areas of the three main varieties of Frisian are shown in black.
From a phonetic point of view, Frisian is on the whole an understudied language, even though some varieties and dialects are better investigated than others. In particular, while at least some consonant and/or vowel characteristics of all three main varieties have been analyzed in modern experimental and acoustic studies (e.g., de Graaf and Tiersma 1980; Willkommen 1991; Tröster 1996; Bohn 2004), similarly detailed intonational analyses are rare and only available for West Frisian (Hoekstra 1991; Tiersma 1999; Peters 2010a, Peters 2010b; Peters et al. 2014, Peters et al. 2015) and East Frisian (Peters 2008).
Intonational analyses of North Frisian have so far been restricted to auditory descriptions that date back to the early twentieth century. For example, Tedsen (1906: 25) describes the intonation patterns of North Frisian in terms of musical notations and states, amongst other things, that the accented syllable has a higher pitch than the surrounding ones, that the high pitch does not set in before the accented-syllable onset, and that the rise towards the high pitch is smaller than the subsequent fall, typically a second vs. a fifth.
The lack of modern intonational analyses is all the more problematic because the North Frisian dialects have been losing both territory and numbers of speakers since the seventeenth century. One of the North Frisian mainland dialects vanished in 1981 (Walker and Wilts 2001), and the remaining nine North Frisian dialects together have fewer than 5,000 speakers (Århammar 2007), despite their protection by the European Charter for Regional or Minority Languages.
1.2 Aim and questions
Against the background set out in Section 1.1, it is the aim of the present study to lay the cornerstone for a phonological analysis of North Frisian intonation, starting from the detailed phonetic description of two pitch-accent patterns, one of which is quite distinctive and perceptually very salient to Northern Standard German ears, for reasons that we outline below.
Pitch accents in West Germanic languages like Northern Standard German are characterized by falling and/or rising pitch movements that start inside, lead into, or span the accented syllable (’t Hart et al. 1990; Niebuhr and Kohler 2004; Xu and Xu 2005; Niebuhr 2007; Peters et al. 2015). The high pitch at the beginning or end of the fall or rise, respectively, is typically a “clearly identifiable”, local event (Mücke et al. 2006: 298). So, pitch accents have overall a pointed shape, even though the slopes of rise and fall can vary in length and steepness due to, for example, intonational context, syllable structure, or a trade-off with pitch-accent alignment (e.g., Ladd et al. 1999; Wichmann et al. 2000; Ladd et al. 2000; Schepman et al. 2006; Niebuhr et al. 2011). Figure 2(a) provides examples of pointed pitch accents. They were realized by a Northern Standard German speaker in the utterance desto fester hüllte sich der Wanderer in seinen Mantel ein (‘the more closely did the traveler fold his cloak around him’) from the North Wind and the Sun fable.
Exceptions to the immediate succession of rising and falling movements in the realization of pitch accent patterns occur, if at all, under very limited conditions in West Germanic languages. The prosodic context can lead to plateau-shaped pitch accent realizations, for example, when the general pitch range is compressed, when the accented syllable is framed by certain sonorant consonants or occurs in a certain lexical-stress context, and/or when the corresponding rhythmic foot includes a large number of unaccented syllables (House et al. 1999; Wichmann et al. 2000; Knight 2004, Knight 2008; Braun 2005; Knight and Nolan 2006). Plateaux also occur for functional reasons, for example, under certain types of emphasis, when the accent is part of a stepped ‘calling contour’, or when two pitch accents are linked together in a ‘hat pattern’ (Grabe et al. 2005; Ambrazaitis and Niebuhr 2008; Niebuhr 2010; Dombrowski 2013).
The starting point for our study was the observation that North Frisian intonation seems to have plateaux that are not bound to the limited conditions outlined above. The plateaux are as clearly pronounced as in ‘calling contours’ and ‘hat patterns’, but, instead of spanning unaccented syllables and being related to phrasing or phrase-final tonal phenomena, the North Frisian plateaux are associated with single pitch accents. That is, they span accented syllables. Their key auditory characteristic is a combination of a sustained high pitch and a postponed-sounding falling pitch movement. Figure 2(b) gives an example, showing the same passage from the North Wind and the Sun fable as in Figure 2(a), but produced by a native speaker of North Frisian: Ji feester hääl ham de waanersmaan iin uun san mantel (lit. ‘the more-closely held him the wanderer in in his cloak’). The pitch accent patterns across the gray boxes 1–3 have plateau-like rather than pointed peak shapes. Moreover, Figure 3 displays two single-accent intonation phrases from the North Frisian (Fering) speaker OBO, the first one with a plateau-shaped and the second one with a pointed pitch-accent pattern.
To our knowledge, plateau-shaped pitch-accent patterns such as those displayed in Figures 2(b) and 3 are quite extraordinary in terms of intonational typology. The combination of sustained high pitch and the postponed-sounding fall creates the impression of a ‘halted’ or ‘disfluent’ accent pattern to the ears of a Northern Standard German listener. If similarly pronounced plateau-shaped pitch accents are mentioned at all in the literature, then they figure as a result of artificially extended pitch-accent peaks in the resynthesized stimuli of perception experiments (e.g., Gili Fivela and D’Imperio 2010; D’Imperio et al. 2010; Niebuhr 2011; Barnes et al. 2011, Barnes et al. 2012).
We wanted to know (i) whether we would be able to find more instances of such plateau-shaped pitch-accent patterns across different speakers, i.e., whether pitch accents with a high plateau are a regular feature of North Frisian intonation. If the answer to this major question was ‘yes’, then we were further interested in (ii) how the alignment and scaling properties of plateau-shaped accents can be described with reference to established segmental landmarks, (iii) how plateau-shaped accents differ in terms of alignment and scaling from pitch accents with a pointed peak shape, which – as is evident from Figures 2(b) and 3 – do also occur in North Frisian, and (iv) whether the plateau-shaped accent is a contextually determined variant of the pointed accent, or whether it represents a separate phonological pitch-accent category with a separate communicative function.
We analyzed spontaneous field interviews with elderly people recorded in the 1950s or 1960s. This has two advantages. First, our results benefit from the high ecological validity of spontaneous speech. Second, we can increase the internal validity of our data. The material comes from native speakers of North Frisian who grew up and lived in the main at a time when North Frisian still shaped all aspects of the speakers’ lives, and before commuting, modern mass media, and mass tourism became routine. Today’s speakers do not show the same high language competence and unbiased use of North Frisian, as knowledge and use of North Frisian have substantially decayed under the influence of the dominant Northern Standard German. Thus, using 50-year-old recordings ensures that all our results reflect original North Frisian intonation, rather than the intonational patterns of Northern Standard German or other intermediate patterns.
2.1 Speech material
The speech material we used is part of a still ongoing language documentation project of North Frisian. It was recorded in the 1950s and 1960s by the North Frisian Dictionary Centre (Nordfriesische Wörterbuchstelle) at Kiel University. Fieldworkers were, among others, Dietrich Hofmann, Nils Århammar, Hans Christian Nickelsen, and Alastair Walker. Recordings were made on audio tape, but were digitized later in the years 2001–2005 at a 44.1 kHz sampling rate and a 16-bit quantization.
The fieldworkers interviewed the speakers in a quiet room at the speakers’ homes and asked questions about the speakers’ lives, for example, where they grew up and/or went to school, which events or experiences shaped their lives, which professions they had, and what their families did. In this way, the fieldworkers elicited a number of spontaneous narratives from the speakers, altogether about 7–15 minutes per speaker. These narratives were very fluently and naturally produced, as storytelling was still a common practice and an important form of entertainment at that time. The informal atmosphere during the interviews was further supported by the fact that the fieldworkers themselves were skilled or even native speakers of North Frisian and had been known to the speakers for a long time.
From about 500 hours of digitized (but non-annotated) speech material on CD-ROMs, we selected one regional variety of North Frisian for the present study. It is spoken by a few hundred speakers on the small island of Föhr, off the German coast, south of the German-Danish border (see Figure 1). The name of the variety spoken on the island of Föhr is Fering.
Fering was used here, as it represents one of the major dialects of North Frisian and is probably currently the best described (Walker 1990; Walker and Wilts 2001; Hoekstra 2007, 2013; Wilts et al. 2010). In addition, the geographical isolation of Fering on an island was considered in principle advantageous with respect to language contact and possible undermining influences from Northern Standard German.
Our analysis is based on the fieldwork interviews of 10 speakers, 4 females and 6 males. All of them were more than 60 years old when the interview took place. That is, they were all born at the end of the nineteenth century or the beginning of the twentieth century (between 1870 and 1910). Accordingly, they grew up and spent most of their life at a time when Fering was by default learned as the native language and shaped all aspects of everyday life on the island. Further information about our speakers, all of whom have passed away by now, is summarized in Table 1.
2.3 Sub-samples of pointed and plateau-shaped pitch accents
The first author – a trained phonetician with long experience in describing and annotating intonation contours – conducted an auditory search for plateau-shaped pitch accents in the fieldwork interviews of our 10 Fering speakers. Relevant tokens were determined on the basis of their particular ‘halted’-sounding melodic quality, which is very striking to the ears of Northern Standard German listeners.
The search took place in a silent sound studio at Kiel University. The 93 minutes of speech material were listened to through headphones and played stepwise in sections of about 30 seconds.
The search yielded 177 tokens. However, in order to keep the material of this initial study as simple and homogeneous as possible, we focused on terminal-falling nuclear pitch accents that occurred towards the ends of regular (i.e., non-interrupted/truncated) intonation phrases. The prosodic filtering left a total of 93 tokens that constituted the plateau-shaped sub-sample.
In forming this sub-sample, we took into account the fact that Fering has a quantity distinction between phonologically short vowels on the one hand and phonologically long vowels, diphthongs, and triphthongs on the other (Bohn 2004). Given the well-known cross-linguistic influences of vowel quantity on the realization of pitch accents (Kohler 1991; Asu and Nolan 1999; Ladd et al. 2000; Schepman et al. 2006), we further divided our sub-sample of plateau-shaped pitch accents into a short-vowel condition with 24 tokens and a long-vowel condition with 69 tokens. Furthermore, we included vowel quantity as a separate independent variable in our statistics.
In a second step, the speech material was searched again in order to create a reference sub-sample of pointed accent contours. That is, this time the first author searched for pitch patterns which sounded like the regular L+H* pitch accents of Northern Standard German (Grice and Baumann 2002; Niebuhr and Ambrazaitis 2006). In order to ensure that the sub-sample of reference contours is as comparable as possible to the sub-sample of target contours, the former was formed in parallel to the latter with respect to prosodic context, vowel quantity, and speaker. So, each plateau-shaped pitch accent in the target sub-sample had a counterpart in the reference sub-sample, which was produced in nuclear position by the same speaker in the same recording session and in combination with the same phonological vowel quantity. Moreover, the accented syllable was similarly structured (e.g., with or without a sonorant/obstruent syllable onset and coda) and located at a similar distance to preceding prenuclear accents or adjacent phrase boundaries. Correspondingly, the reference sub-sample also consisted of 93 pitch accent tokens, 24 of them realized in short-vowel contexts and 69 in long-vowel contexts.
All 186 pitch-accent tokens (2×93) were subjected to a multiparametric acoustic analysis.
2.4 Acoustic analysis
The following measurements were taken for each pitch-accent token by means of the advanced speech signal processing tool XASSP, developed by the former Institute of Phonetics and Digital Speech Processing (IPDS) at Kiel University (Scheffers and Thon 1991): the F0 maximum of the pitch-accent peak (in Hz), the points in time (in ms) before and after the peak maximum at which F0 is 1 semitone (st) lower than the maximum (these two points will be referred to as rise offset and fall onset), the F0 value at rise onset (in Hz) as well as its point in time (in ms), the F0 value at fall offset (in Hz) as well as its point in time (in ms), the onset and offset of the accented vowel (in ms), and the onset and offset of the accented syllable (in ms).
The raw measurements above were used to calculate 10 parametric acoustic variables, which can be grouped along three themes: duration pattern, pitch-accent pattern, and alignment pattern. The key features of the acoustic analysis are illustrated in Figure 4.
The duration pattern includes two variables: the duration of the accented vowel, and the duration of the accented syllable (both in ms).
The pitch-accent pattern is represented by three variables: the range of the F0 rise, the range of the F0 fall (both in st), and peak duration, i.e., the duration of the high F0 section in between rise offset and fall onset (in ms).
Regarding the alignment pattern, we derived five variables from our raw measurements: F0-maximum alignment relative to the accented-vowel onset, rise-onset alignment relative to the onset of the accented syllable, fall-offset alignment relative to the offset of the accented syllable, the alignment of the onset of the high F0 section, and the alignment of the offset of the high F0 section. Unlike the first three alignment variables, which were measured in milliseconds, the onset and offset alignments of the high F0 section were expressed in percentage relative to the vowel onset or offset, respectively.
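The five alignment variables can be illustrated with a short sketch. It assumes that both high-section alignments are expressed as percentages of the accented-vowel duration, counted from the vowel onset, which is only one possible reading of the description above. The landmark names and the example times are hypothetical and are not values from the corpus.

```python
from dataclasses import dataclass


@dataclass
class Landmarks:
    """Hand-labelled time points of one pitch-accent token, all in ms."""
    syllable_onset: float
    syllable_offset: float
    vowel_onset: float
    vowel_offset: float
    rise_onset: float
    rise_offset: float   # start of the high F0 section
    f0_maximum: float
    fall_onset: float    # end of the high F0 section
    fall_offset: float


def alignment_variables(lm: Landmarks) -> dict:
    """Derive the five alignment variables from the raw landmarks."""
    vowel_duration = lm.vowel_offset - lm.vowel_onset
    return {
        # Three variables in ms (negative = before the reference point).
        "max_re_vowel_onset_ms": lm.f0_maximum - lm.vowel_onset,
        "rise_onset_re_syllable_onset_ms": lm.rise_onset - lm.syllable_onset,
        "fall_offset_re_syllable_offset_ms": lm.fall_offset - lm.syllable_offset,
        # Two variables in percent of the vowel duration (assumed definition).
        "high_onset_pct": 100.0 * (lm.rise_offset - lm.vowel_onset) / vowel_duration,
        "high_offset_pct": 100.0 * (lm.fall_onset - lm.vowel_onset) / vowel_duration,
    }


# Hypothetical token with an 80 ms vowel inside a 220 ms syllable.
token = Landmarks(
    syllable_onset=100, syllable_offset=320,
    vowel_onset=160, vowel_offset=240,
    rise_onset=90, rise_offset=170,
    f0_maximum=200, fall_onset=250, fall_offset=360,
)
alignment = alignment_variables(token)
for name, value in alignment.items():
    print(f"{name}: {value:.1f}")
```

Under this definition, a value above 100% for the end of the high section means that the fall only starts after the vowel offset.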
Measuring F0 ranges had to take into account the fact that the data came from male and female speakers. Nolan (2003) suggests on the basis of imitation experiments that gender-independent representations of intonation patterns should best be expressed in either ERB or semitone values. As semitones performed slightly better than ERB values in Nolan’s study, we decided to represent our F0 ranges in semitones in order to neutralize gender differences.
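The standard conversion from a frequency ratio to semitones can be sketched as follows. The function name and the example frequencies are hypothetical; the sketch only illustrates why the same step in Hz corresponds to different semitone intervals in a lower and a higher register, whereas a fixed frequency ratio yields the same interval in both.

```python
import math


def semitones(f_from_hz: float, f_to_hz: float) -> float:
    """Interval from f_from_hz to f_to_hz in semitones (positive = rising)."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


# The same 50 Hz rise in a lower and in a higher register (hypothetical values).
low_register = semitones(100.0, 150.0)
high_register = semitones(200.0, 250.0)

# A doubling of F0 is one octave, i.e., 12 st, in any register.
octave_low = semitones(100.0, 200.0)
octave_high = semitones(200.0, 400.0)

print(f"50 Hz rise from 100 Hz: {low_register:.2f} st")
print(f"50 Hz rise from 200 Hz: {high_register:.2f} st")
print(f"octaves: {octave_low:.1f} st and {octave_high:.1f} st")
```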
All segmental and F0 landmarks were determined by three trained phoneticians who had long experience in labeling speech signals. One of them was the first author. In those cases in which landmarks were difficult to determine, decisions were discussed and made on the basis of inter-observer agreement (between at least two of the three labelers).
F0 errors like octave errors or false voiceless markings (which are not infrequent for elderly voices) were manually corrected, either by temporarily changing the default settings of the F0 analysis or by measuring the durations of F0 periods in the waveform display. The elbows on both sides of the pitch accent pattern, i.e., rise onset and fall offset, were determined by visual inspection of the F0 course. We did not use an automatic elbow detection algorithm, as previous studies showed that these algorithms are not superior to human labelers in terms of consistency and adequacy (e.g., del Giudice et al. 2007). On the contrary, unlike automatic elbow detection algorithms, the trained phoneticians were able to use their metalinguistic knowledge for separating, as far as possible, relevant turning points of the macro-prosodic intonation contour from micro-prosodic F0 perturbations (upward or downward spikes) at CV or VC boundaries (Kohler 1990).
Segment boundaries were also set manually and with reference to the guidelines summarized by Machač and Skarnitzl (2009).
The reference points in the segmental string against which F0 alignment patterns were measured are all points that have proved to be relevant in previous studies for consistently describing temporal shapes of pitch accents and reliably detecting phonological, dialectal, and structural differences between them (see Ladd 2008 for a summary). Moreover, the reference points selected here are consistent with the conclusion of Schepman et al. (2006: 1) that “tonal alignment is best expressed relative to a nearby segmental landmark”.
F0-peak maxima and peak durations (i.e., the durations of the high F0 sections in between rise offset and fall onset; see Figure 4) were always easily measurable. If the same F0 maximum value occurred two or more times within a high F0 section, which happened in only 9 (i.e., 4.8%) of our 186 tokens, then we applied a common method and defined the first value as the peak maximum (e.g., Braun 2005). Note that the rising and falling F0 ranges were not calculated with reference to the F0-peak maximum. Rather, we took the onsets and offsets of the high F0 section as points of reference (see Figure 4). Thus the range of the rise was the F0 difference between rise onset and onset of the high F0 section. Likewise, the range of the fall was the F0 difference between fall offset and offset of the high F0 section.
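The following sketch illustrates how peak duration and the two F0 ranges follow from these definitions. It is not the XASSP procedure: it assumes a sampled contour that starts at the hand-labelled rise onset and ends at the fall offset, and it locates the two 1-st points by linear interpolation between samples, which is a simplification introduced here. The function names and the contour values are hypothetical.

```python
import math


def semitones(f_from_hz, f_to_hz):
    """Interval from f_from_hz up to f_to_hz in semitones."""
    return 12.0 * math.log2(f_to_hz / f_from_hz)


def crossing_time(t_a, f_a, t_b, f_b, f_target):
    """Time at which a straight line between two F0 samples reaches f_target."""
    return t_a + (f_target - f_a) * (t_b - t_a) / (f_b - f_a)


def describe_peak(times_ms, f0_hz, threshold_st=1.0):
    """Measure one accent contour that runs from rise onset to fall offset."""
    f_max = max(f0_hz)
    peak = f0_hz.index(f_max)  # first occurrence if the maximum is repeated
    f_target = f_max * 2.0 ** (-threshold_st / 12.0)  # threshold_st below the maximum

    # Walk outwards from the maximum to the first sample at or below the target.
    left = peak
    while left > 0 and f0_hz[left] > f_target:
        left -= 1
    right = peak
    while right < len(f0_hz) - 1 and f0_hz[right] > f_target:
        right += 1
    if f0_hz[left] > f_target or f0_hz[right] > f_target:
        raise ValueError("contour does not drop by the threshold on both sides")

    rise_offset = crossing_time(times_ms[left], f0_hz[left],
                                times_ms[left + 1], f0_hz[left + 1], f_target)
    fall_onset = crossing_time(times_ms[right - 1], f0_hz[right - 1],
                               times_ms[right], f0_hz[right], f_target)
    return {
        "peak_time_ms": times_ms[peak],
        "rise_offset_ms": rise_offset,
        "fall_onset_ms": fall_onset,
        "peak_duration_ms": fall_onset - rise_offset,
        # Ranges are measured to the edges of the high section, not to the maximum.
        "rise_range_st": semitones(f0_hz[0], f_target),
        "fall_range_st": semitones(f0_hz[-1], f_target),
    }


# Hypothetical contour; the maximum of 200 Hz occurs twice.
times = [0, 50, 100, 150, 200, 250, 300]
contour = [160.0, 180.0, 200.0, 200.0, 198.0, 180.0, 150.0]
result = describe_peak(times, contour)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because both ranges are taken from the edges of the high section, each is exactly 1 st smaller than the corresponding distance to the F0 maximum.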
The 1-st step, by which the beginning and ending points of peak duration were determined, was chosen with reference to experiments like those of Pierrehumbert (1979), ’t Hart (1981), and Mack and Gold (1986). They show that 1-st intervals are about the minimum pitch changes that hearers can detect in speech stimuli. Under psychoacoustic conditions, i.e., for stimuli with constant spectral properties like steady tones or synthetic vowels, listeners can detect pitch changes that are much smaller than 1 st (Lehiste 1970). However, speech is a continuously changing, highly dynamic signal, and these cognitively more demanding conditions increase the just noticeable difference for pitch changes considerably. ’t Hart (1981: 811) arrives at the conclusion that “only differences of more than 3 semitones play a part in communicative situations”. Against this background, we consider our 1-st threshold a conservative estimate. Some studies used slightly different semitone thresholds for defining pitch plateaux, but as Knight and Nolan (2006: 24) point out with reference to their pilot analyses, “there is not a great deal of difference between plateaux identified according to these different measures”.
Remijsen and Ayoker (2014) also used 1-st steps along the frequency axis to determine high F0 measuring points similar to our rise offset and fall onset. Moreover, like the F0 trimming algorithm in Remijsen and Ayoker’s study, we also disregarded those F0 points whose 1-st distance from the F0 peak maximum was clearly due to micro-prosodic F0 perturbations at consonant-vowel transitions. In these cases, the next plausible F0 value with a 1-st distance from the peak maximum was selected as rise offset or fall onset.
2.5 Statistical analyses
The acoustic measurements of our 2×93 pitch-accent tokens were processed in a two-way multivariate analysis of variance (MANOVA). The analysis was based on the parametric variables listed in Section 2.4 and tested if the variance in these variables could be significantly explained by the two fixed factors Pitch Pattern (pointed vs. plateau-shaped accents) and Vowel Quantity (long vs. short). In the case of a significant main effect, multiple post-hoc comparisons with Bonferroni corrections of significance levels were conducted between the levels of the corresponding factor.
The 10 speakers were included in the MANOVA as a covariate, in order to take between-speaker variation and possible artefacts due to uneven speaker/gender ratios in the fixed-factors conditions into account (over and above expressing F0 distances in gender-independent semitones rather than in absolute Hz values [Nolan 2003]).
The results section provides effect sizes in terms of partial eta squared (ƞ²p) in addition to p-levels (Levine and Hullett 2002). When multiplied by 100, partial eta squared values represent the percentages of variance associated with each main effect and interaction. So, the higher the partial eta squared, the more successful the main effect or interaction is in explaining the empirical variance. All statistical analyses were done with SPSS (v.21).
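As a reading aid, partial eta squared can be recovered from a reported F value and its degrees of freedom with the standard identity ƞ²p = (F·df_effect) / (F·df_effect + df_error). The sketch below is not part of the SPSS analysis; the function name is hypothetical, and the example uses the F value reported for the covariate Speaker and accented-syllable duration in the results (F[1,213]=131.3).

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F[1,213] = 131.3, as reported for the covariate Speaker (syllable duration).
eta = partial_eta_squared(131.3, 1, 213)
print(f"partial eta squared: {eta:.3f}")
print(f"variance associated with the effect: {100 * eta:.1f}%")
```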
We found two significant effects of the covariate Speaker in the MANOVA. The effects concerned the two segmental variables, i.e., accented-syllable duration (F[1,213]=131.3, p<0.001, ƞ²p=0.381) and accented-vowel duration (F[1,213]=21.4, p<0.001, ƞ²p=0.091). Multiple-comparison tests showed that the significances were caused by differences between the four female speakers on the one hand and the six male speakers on the other. So, the duration differences were clearly gender-related. They replicated the well-known cross-linguistic finding that segmental durations are longer for female than for male speakers (Simpson 2009). In line with the conclusion of Nolan (2003) that semitone measurements should be comparable across males and females (and ultimately across the individual pitch ranges of speakers in general), the covariate Speaker was found to have no effect on our F0 range measurements. The same applied to all F0 alignment variables.
After partialling out individual effects in the data by means of the covariate Speaker, the MANOVA yielded the following results for our dependent variables.
3.1 Duration pattern
As one would expect, there is a strong main effect of the fixed factor Vowel Quantity. Compared with the short-vowel condition, both vowels and syllables had larger durations in the long-vowel condition. The average duration varied between 130 and 180 ms for long vowels, as opposed to only 70–85 ms for short vowels; see Figure 5 (accented syllable: F[1,182]=42.8, p<0.001, ƞ²p=0.167; accented vowel: F[1,182]=233.1, p<0.001, ƞ²p=0.523).
Additionally, accented vowels were up to 74 ms longer for plateau-shaped than for pointed peaks. On average, the differences were 50 ms in the case of long and 12 ms in the case of short vowels. The two differences and their non-identical sizes resulted in a significant main effect of the fixed factor Pitch Pattern on vowel duration (F[1,182]=31.2, p<0.001, ƞ²p=0.128), as well as in a significant interaction of Pitch Pattern and Vowel Quantity (F[1,182]=16.1, p<0.001, ƞ²p=0.070); see Figure 5.
3.2 Pitch-accent pattern
The results for the pitch-accent pattern are summarized in Figure 6. They are overall more complex than those of the duration pattern. Pointed pitch accents had larger F0 movements on both sides of the peak. In particular, the range of the rise was greater for pointed than for plateau-shaped accents. Plateau-shaped accents typically had F0 rises smaller than 3 st (1.6–3.9 st). In contrast, pointed accents often had rises of 5 st or larger (2.1–8.3 st). On the other side of the pitch peak, F0 fell on average by 5.2 st for plateau-shaped accents, as opposed to 6.4 st for pointed accents. Accordingly, both range variables (rise and fall) showed significant main effects of the fixed factor Pitch Pattern (rise: F[1,182]=116.2, p<0.001, ƞ²p=0.353; fall: F[1,182]=27.9, p<0.001, ƞ²p=0.116).
While plateau-shaped accents had smaller F0 ranges, their peak duration, i.e., the high F0 section within 1 st of the peak maximum, was clearly longer than that of pointed accents. The peaks of plateau-shaped accents extended on average over 107 or 169 ms. Peak durations of plateau-shaped accents could get up to 263 ms long and never fell below a minimum duration of 87 ms, whereas peak durations of pointed accents were on average only 53 or 72 ms long and never exceeded 96 ms. These differences in peak duration of plateau-shaped and pointed accents are reflected in the MANOVA in a significant main effect of Pitch Pattern (F[1,182]=293.8, p<0.001, ƞ²p=0.579).
Moreover, as both peak durations and the differences in peak duration between plateau-shaped and pointed accents were larger for long than for short vowels, the MANOVA also yielded a significant main effect of the fixed factor Vowel Quantity (F[1,182]=70.5, p<0.001, ƞ²p=0.249), and a significant interaction of Pitch Pattern and Vowel Quantity for peak duration (F[1,182]=24.8, p<0.001, ƞ²p=0.104).
3.3 Alignment pattern
The pitch-accent alignment pattern changes as a function of both Pitch Pattern and Vowel Quantity and is summarized in Figure 7. Note that boxes and lines were drawn manually and are not exactly proportional to one another in terms of time and frequency.
Nonetheless, the most important result of the alignment measurements is clearly visible in Figure 7. That is, not all parts of the accent contour were affected to the same extent by the fixed factors. The most strongly affected part was the high-F0 area around the peak maximum. Furthermore, almost all effects in this area are related to Vowel Quantity. The F0 peak maximum moved leftward from the long-vowel to the short-vowel condition, i.e., 15 ms closer to the accented-vowel onset (F[1,182]=47.2, p<0.001, ƞ²p=0.181). The beginning of the high F0 section, i.e., the rise offset, always started shortly after the accented-vowel onset, irrespective of the shape of the pitch pattern, but slightly (4%) later in relation to the total vowel duration when the vowel was phonologically short (F[1,182]=22.3, p<0.001, ƞ²p=0.095).
As regards the end of the high section, i.e., fall onset alignment, it was noted in Section 3.2 that the peak duration of plateau-shaped accents was only compressed up to a certain limit. Peak durations did not decrease below a minimum value of 87 ms. In contrast, peak durations of pointed accents decreased far below this limit. For this reason, pointed accents allowed the end of the high F0 section to move leftward (to a relatively earlier point in time), when the vowel quantity changed from long to short. In contrast, due to the peak compression limit of plateau-shaped accents, the same change in vowel quantity shifted the end of the high F0 section of plateau-shaped accents to the right (to a relatively later point in time).
So, in combination with long vowels, the high F0 section spanned, on average, the first half (53%) of the vowel for pointed accents, and the initial two-thirds (65%) of the vowel for plateau-shaped accents. In combination with short vowels, the high F0 section shrank to the initial 44% of the vowel for pointed accents, but spanned 107%, i.e., the entire vowel as well as a part of the subsequent (sonorant) consonant, for plateau-shaped accents. This complex fall-onset alignment pattern manifested itself in the MANOVA in significant main effects of Pitch Pattern (F[1,182]=346.5, p<0.001, ƞ²p=0.619) and Vowel Quantity (F[1,182]=11.9, p<0.01, ƞ²p=0.053), and a significant interaction between the two fixed factors (F[1,182]=50.0, p<0.001, ƞ²p=0.190).
The only significant main effect outside the high-F0 area around the peak maximum concerns the rise onset. Its alignment is affected by the fixed factor Pitch Pattern. Although the rise towards the pitch-accent peak always sets in right before the accented-syllable onset (see also the plateau examples in Figures 2b, 3, and 4), it is on average located 34 ms closer to the syllable onset for plateau-shaped than for pointed accents (F[1,182]=192.9, p<0.001, ƞ²p=0.474).
All other main effects and interactions are non-significant. This includes potential influences of Vowel Quantity and Pitch Pattern on the offset of the F0 fall after the pitch-accent peak. The fall offset consistently occurred at average distances of 38–45 ms after the accented-syllable offset for long and short vowels as well as for both pointed and plateau-shaped accents; see also Figures 3 and 4.
The present study started from the observation that high-tone pitch accents in Fering are not limited solely to pointed peaks. In addition to the pointed pattern, which is the default pattern of individual pitch accents in West Germanic languages, we observed instances of a peak pattern with a high plateau in between the rise and the fall. This pattern gives the corresponding pitch accents a very distinctive ‘halted’ sound. This initial informal observation raised four questions:
(I) Are we able to find more instances of such plateau-shaped pitch-accent patterns across different speakers, i.e., are such accents with a high pitch plateau a regular feature of North Frisian intonation?
(II) How can we describe the alignment and scaling properties of plateau-shaped accents with reference to established segmental landmarks?
(III) How do plateau-shaped accents differ in terms of alignment and scaling from high-tone pitch accents in North Frisian with a pointed peak shape?
(IV) Is the plateau-shaped accent a contextually determined variant of the pointed accent, or is it rather a separate phonological pitch-accent category with a separate communicative function?
Sections 4.1–4.3, below, provide answers to questions (I)–(IV), based on an auditory search for pointed and plateau-shaped nuclear pitch accents in old fieldwork interviews of 10 elderly speakers, and a subsequent acoustic-prosodic analysis of extracted pointed and plateau-shaped tokens from prosodically comparable contexts.
4.1 Question I: Frequency of plateau accents
The 177 tokens yielded by the auditory search in the fieldwork interviews (93 of which were acoustically analyzed) are a considerable quantity. This shows that ‘halted’-sounding, plateau-shaped pitch accents are at least not an exotic or accidental phenomenon in Fering. The plateau-shaped pitch accents were also not just an idiosyncratic or gender-related phenomenon, as they were found across all 10 male and female speakers, each of whom contributed more than 10 tokens to the 177-token sample.
In addition, our auditory approach probably only detected the most salient instances of plateau-shaped accents. The actual number of plateau-shaped pitch accents in the fieldwork interviews may be higher than 177. It is quite possible that some relevant pitch accent tokens escaped the observer’s (the first author’s) trained ear, especially since he is not a native speaker of Fering. For example, this could apply to plateau-shaped accents that were realized with weaker prominence levels, in high speaking-rate sections of utterances, or in the middle of the 30-second analysis window. Subsequent studies should investigate our interview data in more detail and also search for plateau-shaped pitch accents on the basis of F0 courses. In the present study, we deliberately refrained from such a visual approach in order to avoid circular reasoning (i.e., an influence of our dependent F0 variables on the selection of tokens).
A total number of 177 tokens means on average about one plateau-shaped pitch accent every 30 seconds. As intonation phrases were mostly between 5 and 9 seconds long (which seems to be a typical duration range across languages [Tseng et al. 2004; Peters et al. 2005]), there was a plateau-shaped pitch accent in every fourth to ninth phrase.
Furthermore, in order to estimate how large a number 177 tokens really is, we can compare them with frequency statistics of two Standard German speech corpora, the Kiel Corpus of Spontaneous Speech (Peters 2005), and the IMS Radio News Corpus (Rapp 1998). The Kiel Corpus of Spontaneous Speech (Vol. I+II) is 160 minutes long and prosodically fully annotated. Peters et al. (2005) found a total of 1,245 terminal-falling nuclear pitch accents at the end of multi-accent phrases in these 160 minutes, the majority (703) of them being of the ‘medial’ type, i.e., H*. Another 181 are ‘early’ pitch accents, i.e., H+L*; and 361 pitch accents are ‘late’ ones, i.e., L*+H. The prosodic condition investigated by Peters et al. is similar to the analysis condition of the present study, which yielded 93 plateau-shaped pitch-accent tokens. However, we have to take into account the fact that our corpus is 40% smaller than that investigated by Peters et al. So, if our corpus were similarly large, we would have probably found about 40% more, i.e., 130 plateau-shaped pitch-accent tokens. This is less than any of the frequencies reported by Peters et al. for the Kiel Corpus of Spontaneous Speech. Yet, 130 is roughly in the same order of magnitude as the frequency of occurrence of ‘early’ (H+L*) pitch accents (181).
The IMS Radio News Corpus includes 90 minutes of prosodically annotated speech. Based on this material, Schweitzer et al. (2009) found – across all prosodic conditions – 1,223 ‘late’ L*+H pitch accents, 704 ‘early’-like H*+L accents, and 162 ‘medial’ H* accents. Our Fering corpus has about the same length as the IMS Radio News Corpus, and we found 177 plateau-shaped pitch-accent tokens across all prosodic conditions. That is, there were at least as many plateau-shaped pitch-accent tokens in the Fering corpus as there were H* accents in the IMS Radio News Corpus, if not slightly more.
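The corpus comparisons above can also be expressed as tokens per minute, which puts corpora of different length on a common scale. The following is only an illustrative sketch: the token counts and the durations of the two German corpora are those cited above, whereas the Fering duration of 90 minutes is an assumption derived from the statement that the corpus has "about the same length" as the IMS Radio News Corpus. The function and variable names are hypothetical.

```python
# Tokens per minute for the cited accent counts.
# Assumption: the Fering corpus is taken to be 90 minutes long
# ("about the same length as the IMS Radio News Corpus").
def per_minute(tokens, minutes):
    """Return the token rate per minute, rounded to two decimals."""
    return round(tokens / minutes, 2)

# Kiel Corpus of Spontaneous Speech: 160 minutes, nuclear terminal falls.
kiel = {label: per_minute(count, 160)
        for label, count in {"H*": 703, "L*+H": 361, "H+L*": 181}.items()}

# Fering plateau accents in the comparable nuclear condition.
fering_nuclear = per_minute(93, 90)

# IMS Radio News Corpus (90 minutes) vs. Fering, all prosodic conditions.
ims_h_star = per_minute(162, 90)
fering_all = per_minute(177, 90)

print(kiel)            # {'H*': 4.39, 'L*+H': 2.26, 'H+L*': 1.13}
print(fering_nuclear)  # 1.03
print(ims_h_star, fering_all)  # 1.8 1.97
```

Under these assumptions, the Fering rate in the nuclear condition lies below all three Kiel rates but close to that of the ‘early’ accents, and a rate of about 1.97 tokens per minute across all conditions corresponds to roughly one token every 30 seconds.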
In view of the numeric comparisons, and taking into account that plateau-shaped pitch-accent tokens were neither a speaker-specific nor a gender-specific phenomenon, we conclude that plateau-shaped pitch accents can indeed be considered a regular feature of North Frisian (Fering) intonation. However, they are surely not the most frequent type of pitch accent in Fering intonation, as it was, for example, much easier for us to find pointed high-tone pitch accents in the required prosodic contexts for the reference sample.
4.2 Questions II–III: Alignment and scaling properties of pointed and plateau-shaped accents
Although pointed and plateau-shaped pitch accents sound very dissimilar, the two pitch accent patterns share a number of alignment properties: the rise offset before the peak maximum as well as the F0 peak maximum itself are aligned closely after the accented-vowel onset for both accent patterns. The maximum is reached earlier in the vowel if the latter is phonologically short. This leftward shift under shorter vowel durations is consistent with findings in many previous studies across languages (Kohler 1991; Asu and Nolan 1999; Ladd et al. 2000; Schepman et al. 2006). The rise offset before the peak maximum moves in the opposite direction for shorter vowel durations, i.e., to the right (see below). A further shared property is that the two F0 elbows surrounding the peak – rise onset and fall offset – are both aligned outside the accented syllable. Unlike the peak maximum, their alignment is not influenced by vowel quantity. Last but not least, in terms of scaling, pointed and plateau-shaped pitch accents are both asymmetrical insofar as the rising movements are on average smaller than the falling movements. For example, in the case of plateau-shaped accents, we measured a rising interval of about a second (2 st) and a falling interval of about a fifth (5 st).
However, over and above these acoustic similarities, there are also major differences between pointed and plateau-shaped pitch accents. For example, pointed accents have larger F0 ranges. This is particularly true of the rising movements, which also set in twice as far from the accented-syllable onset as those of the plateau-shaped pitch accents.
Furthermore, the rise of a pointed accent changes almost immediately into a fall. In contrast, in the case of plateau-shaped accents, there is a long high F0 section with a narrow range of 1 st that precedes, but mainly follows, the absolute peak maximum. This section of sustained high pitch spans two-thirds of a phonologically long vowel and covers virtually the entire vowel if the latter is phonologically short. So, from the perspective of the accented vowel, the fall onset after the peak maximum moves to the right in short-vowel conditions. As was mentioned above, the same rightward shift occurs for the rise offset (of both plateau-shaped and pointed accents), but to a lesser degree.
One of our reviewers pointed out that for pitch-accent landmarks to move to the right (rather than to the left) when segment (vowel) durations decrease is at odds with known alignment patterns. This is undoubtedly true, but we think that our findings should not be over-interpreted in this respect for two reasons. First, when viewed from the perspective of peak duration, the fall onset of the plateau-shaped accent moves to the left as well, just as the preceding F0 peak maximum does. That is, the peak duration of plateau-shaped accents decreases in short-vowel contexts as compared to long-vowel contexts (see Figure 7). That the fall onset moves rightward from the perspective of the vowel offset is only due to the fact that the change from long to short vowel causes a much stronger decrease of vowel duration than of peak duration. Similarly, when expressed in absolute (ms) distances rather than percentages, the rise offsets are not moved rightward in short vowel contexts, but remain constantly aligned right after the accented-vowel onset.
Second, rise offset and fall onset were established to measure peak duration and are simply defined by a 1-st scaling difference to the peak maximum. That is, they were included as measuring points and not in order to represent tonal targets. But, even if there are tonal targets around the peak maximum (see footnote 4 in Section 4.4), then they are probably not precisely and consistently represented by F0 values at a 1-st distance from the peak maximum. So, we did not expect our two measuring points around the peak maximum to follow known alignment shifts of established tonal targets. Semitone thresholds other than 1 st could have yielded different alignment patterns.
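The 1-st criterion for rise offset and fall onset can be sketched in code. The following is a minimal illustration and not the measurement procedure actually used for this study: it assumes an F0 contour given as time stamps (ms) and F0 values (Hz), expresses each value in semitones below the peak maximum via the standard relation 12·log2(f_peak/f), and takes the outermost samples that stay within 1 st of the maximum as rise offset and fall onset. The function name, the two toy contours, and the lack of interpolation between samples are assumptions made for illustration only.

```python
import math

def peak_landmarks(times_ms, f0_hz, threshold_st=1.0):
    """Return (rise_offset, peak_time, fall_onset, peak_duration) in ms.

    Rise offset and fall onset are the outermost samples of the contiguous
    stretch around the F0 maximum that lies within `threshold_st` semitones
    of that maximum. Hypothetical helper; no interpolation between samples.
    """
    if len(times_ms) != len(f0_hz) or not f0_hz:
        raise ValueError("need equally long, non-empty time and F0 lists")
    peak_idx = max(range(len(f0_hz)), key=lambda i: f0_hz[i])
    peak_hz = f0_hz[peak_idx]

    def within(i):
        # Distance below the peak in semitones: 12 * log2(peak / f0).
        return 12.0 * math.log2(peak_hz / f0_hz[i]) <= threshold_st

    left = peak_idx
    while left > 0 and within(left - 1):
        left -= 1
    right = peak_idx
    while right < len(f0_hz) - 1 and within(right + 1):
        right += 1
    duration = times_ms[right] - times_ms[left]
    return times_ms[left], times_ms[peak_idx], times_ms[right], duration

# Invented toy contours (not measured data), sampled every 10 ms.
times = [0, 10, 20, 30, 40, 50, 60, 70, 80]
pointed = [150, 160, 175, 190, 200, 188, 172, 158, 150]
plateau = [180, 186, 198, 200, 199, 197, 195, 192, 170]

print(peak_landmarks(times, pointed))  # (30, 40, 40, 10)
print(peak_landmarks(times, plateau))  # (20, 30, 70, 50)
```

With a different `threshold_st`, the same contours yield different landmark positions, which illustrates why thresholds other than 1 st could produce different alignment patterns.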
The decisive points with respect to alignment are, in our opinion, the difference in peak duration between pointed and plateau-shaped accents, and the fact that the peak duration of plateau-shaped accents shrinks in short-vowel contexts only moderately and up to a certain limit, which is well above that of pointed accents (see Section 4.4). The other alignment properties of rise offset and fall onset are additional details and depend on the point of view and the measuring unit.
A final point is worth noting: the musical notations of North Frisian intonation that Tedsen (1906) created more than a century ago describe accents characterized by an overall high pitch level that sets in close to the syllable onset. The high pitch levels are preceded by a rising interval of a second and followed by a falling interval of a fifth. The acoustical alignment and scaling properties that we found in response to questions (II)–(III) for the plateau-shaped pitch accent agree remarkably well with Tedsen’s early auditory description. It is quite possible that Tedsen’s and our descriptions refer to the same pitch accent phenomenon. Tedsen characterizes the pitch accent in question as one of the basic patterns of North Frisian intonation, which is also consistent with our conclusion in Section 4.1 that plateau-shaped accents are a regular feature of this language variety.
4.3 Question IV: Pitch-accent categories
A number of facts support the conclusion that pointed and plateau-shaped pitch accents are separate pitch-accent categories rather than contextual variants of one another.
First, the pointed and plateau-shaped pitch accents occurred in comparable prosodic contexts across the two sub-samples. That is, the two accents were in non-complementary distribution. For each plateau-shaped pitch accent in the target sub-sample, there was a pointed counterpart in the reference sub-sample. This pointed accent was also in nuclear position and produced by the same speaker in the same recording session, in combination with the same vowel quantity and syllable structure, and at the same distance from preceding pitch accents or adjacent phrase boundaries as the corresponding plateau-shaped accent.
Second, following a suggestion of one of our reviewers, we tested whether there were traces of a continuum between pointed and plateau-shaped accents in the form of a trade-off between F0 range and peak duration. Such a trade-off would manifest itself in a negative correlation between the two F0 variables. A series of four correlation analyses were conducted using Pearson’s correlation coefficient, with separate tests for rising and falling slopes as well as for the pointed and plateau-shaped sub-samples. All tests were non-significant (–0.07<r<0.04, df=91). Thus, there is no evidence for a continuum between pointed and plateau-shaped accents.
Third, also following a suggestion of one of our reviewers, we additionally conducted a linear discriminant analysis based on all dependent variables in order to assess how distinct pointed and plateau-shaped pitch accents are, i.e., how well they can be separated from one another. The canonical discriminant function was significant (χ²=352.9, p<0.001) and able to correctly classify 90.9% of the pitch accents as coming from either the pointed or the plateau-shaped sub-sample. The error rate was slightly higher for the pointed than for the plateau-shaped accents (12.9% vs. 5.4%). However, overall the classification can be considered very precise. That is, the difference between pointed and plateau-shaped accents is very clear cut, especially when this difference is represented in terms of multiple acoustic variables. There is no continuous transition from one sub-sample to the other, a finding that is consistent with the correlation analysis described above. In terms of the standardized canonical coefficients – i.e., the discriminant weights – the successful separation of pointed and plateau-shaped accents mainly relied on fall-onset alignment (0.732), peak duration (0.706), and the range of the rise (–0.422). All other variables were far less powerful in separating pointed and plateau-shaped accents. For example, the next most relevant discriminant variable is the range of the fall. However, with a discriminant weight of –0.224, it is only half as important as the range of the rise and 3.5 times less powerful than fall-onset alignment and peak duration.
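The overall classification rate can be related to the two reported error rates by simple arithmetic. The sketch below is a consistency check rather than a re-analysis: it assumes that each sub-sample comprises 93 tokens (in line with the matched design of target and reference sub-samples) and that the reported percentages are rounded to one decimal place. All names are hypothetical.

```python
# Consistency check of the reported classification figures.
# Assumption: 93 tokens per sub-sample (matched pointed/plateau pairs).
TOKENS_PER_SAMPLE = 93

def misclassified(error_rate_percent, n=TOKENS_PER_SAMPLE):
    """Whole number of tokens closest to the reported error percentage."""
    return round(error_rate_percent / 100 * n)

pointed_errors = misclassified(12.9)  # pointed sub-sample
plateau_errors = misclassified(5.4)   # plateau-shaped sub-sample

total = 2 * TOKENS_PER_SAMPLE
correct = total - pointed_errors - plateau_errors
accuracy = round(100 * correct / total, 1)

print(pointed_errors, plateau_errors, accuracy)  # 12 5 90.9
```

Under these assumptions, 12 pointed and 5 plateau-shaped tokens were misclassified, so that 169 of 186 tokens (90.9%) were assigned to the correct sub-sample.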
Fourth, regarding a functional difference between pointed and plateau-shaped accents, we got the impression during the compilation of the plateau-shaped sample that the corresponding accents occurred predominantly in phrases containing conditional verb forms, negations, modal particles, interjections, emphatic pronouns, or causal, concessive, comparative, or contrastive conjunctions. A frequency count confirmed our assumption: 77.9% of the plateau-shaped pitch accents (= 72 out of 93 tokens) occurred in phrases that contained one or more of the listed lexical expressions. In contrast, the listed lexical expressions were only found in 31.2% of the phrases with pointed accents (see Figure 8). This is a significantly different distribution (χ²=48.130, df=1, p<0.001).
The distributional difference suggests that pointed and plateau-shaped accents do not have the same communicative function. But, how exactly the functional difference can be characterized will have to be determined in follow-up studies on the basis of semantic-pragmatic production and perception experiments. Taking into consideration the semantics of the lexical expressions listed above, we tentatively conclude that plateau-shaped accents are used in expressive contexts, i.e., when speakers do not just highlight important information in a matter-of-fact fashion, which probably applies to pointed accents. Compared with the latter, plateau-shaped accents additionally mark the important piece of information as surprising, unexpected, stunning, incredible, indignant, undesired, or unintended. Such an expressive function would fit in with the larger accented-syllable and accented-vowel durations that were found for plateau-shaped accents (Baumann et al. 2007; Kügler 2008; Breen et al. 2010; Niebuhr 2010; Dorn and Ni Chasaide 2011; Görs and Niebuhr 2012). Further support for our conclusion comes from informal interviews with native speakers of Fering (taken by the first author in 2012).
In summary, the pointed and plateau-shaped accents represent two distinct intonational profiles whose clear-cut differences cannot be caused by (intrinsic) contextual factors, as the profiles were realized in non-complementary distributions across the two sub-samples. Moreover, differences in the distribution of lexical expressions between the two sub-samples provide initial evidence for a functional difference between pointed and plateau-shaped accents (which may be related to expressive highlighting of information). We conclude on this basis that pointed and plateau-shaped accents are separate pitch-accent categories in the intonational phonology of North Frisian, represented by the island variety of Fering.
4.4 Perspectives for phonological representation
Two aspects have to be taken into account in connection with the question of phonological representation. First, although the results of our acoustic and initial lexical-distribution analyses clearly suggest that pointed and plateau-shaped accents are two phonologically different pitch-accent categories, ultimate evidence for this conclusion must come from functionally oriented studies in which our characterization of the differences in the meanings of pointed and plateau-shaped accents is taken as a starting point for judging stimulus continua or eliciting semantically and phonetically controlled speech data, such as read texts. Second, pitch-accent categories in an intonational phonology should be represented such that their distinctive features indicate how each category differs from all others within the same paradigm. We provided evidence for two nuclear pitch-accent categories here, but it is reasonable to assume – with regard to the number of pitch accents in other languages – that the corresponding paradigm of Fering consists of more than two pitch accents.
For both of these reasons, we can only make suggestions for a preliminary representation in this paper. Using a data-driven approach, we base our suggestions on the acoustic analysis.
The results of the acoustic analysis showed that pointed and plateau-shaped pitch accent patterns mainly differ from each other in three respects. First, unlike the plateau-shaped accent, the pointed accent is characterized by a pronounced F0 rise in terms of both duration and range, which is an important prerequisite for perceiving the rise as an actual pitch movement (House 1990). Second, unlike the pointed accent, the plateau-shaped accent features an extensive, high F0 plateau within the accented syllable. By spanning most of the accented vowel, this plateau coincides with the high-intensity area of the syllable. Moreover, plateau duration does not fall below 87 ms, even if short vowels have much shorter durations. Third, the onset of the F0 fall (1 st below the F0 peak maximum) is located much later in the accented vowel (or syllable) for plateau-shaped than for pointed accents.
In summary, we have, on the one hand, a pronounced rise that is also perceptually very salient. On the other hand, we have an equally salient high and long plateau which delays the subsequent F0 fall towards the end of the vowel or after the vowel offset. These three features yielded the largest effect sizes in the MANOVA and were additionally most effective in separating pointed and plateau-shaped accents in the discriminant analysis. Therefore, they should be selected as distinctive features in the phonological representation of the two proposed accent categories.
So, taking into account the salience of the (long and high) rise, our preliminary suggestion is to represent the pointed pitch accent in the form of a bitonal L+H* sequence. In contrast, for the plateau-shaped pitch accent, we suggest a preliminary representation as H*+L.
The H*+L representation captures the fact that the F0 falls of plateau-shaped accents are much more pronounced than the small initial F0 rises. Moreover, the H*+L representation is also compatible with the observations in Remijsen (2013), which describe a tonal contrast with a difference in fall-onset alignment in Dinka, a Western Nilotic language spoken in South Sudan, similar to that found between pointed and plateau-shaped accents. He points out with reference to the cognitive restrictions in pitch processing revealed by House (1990: 134) that only the later-aligned F0 fall (which starts clearly after the accented vowel) can actually be perceived by listeners as a falling pitch movement. This, in combination with our own auditory impression (see Section 1.2), is another empirical argument that the plateau-shaped accent with its late-aligned fall onset is adequately represented by tones that reflect the fall in a high-low sequence.
The common element in the two suggested phonological representations – H* – reflects that both pitch accents reached their F0 peak maximum at the same place within the accented syllable, i.e., shortly after the vowel onset (Figure 7). This is clear evidence for an identical high tonal target. In addition, this high tonal target behaves like other high pitch accent targets: it moves to the left (i.e., closer to the accented-vowel onset) for shorter vowel durations, if only to a small extent.
One aspect of the suggested preliminary L+H* vs. H*+L contrast can in principle still be further refined. On the one hand, it is reasonable to assume that the pointed pitch accent has only a single high tonal target. The peak duration of the pointed accent is about the minimal time that speakers need in order to change the direction of F0 movements from rising to falling (F0 deceleration followed by F0 acceleration in the opposite direction [Xu and Sun 2002]). This supports our L+H* analysis. On the other hand, in the case of plateau-shaped accents, the time interval between F0 peak maximum and fall onset is very long compared to what it takes to reach and leave a single high tonal target. Moreover, a closer look at the alignment patterns within the high F0 section of plateau-shaped accents revealed a significant correlation between the fall onset and the accented-vowel durations in both quantity conditions; see Figure 9 (long vowels: r=0.664, df=67, p<0.001; short vowels: r=0.493, df=22, p<0.05).
These correlations suggest that the fall onset in plateau-shaped accents is a further tonal target that speakers head for towards the end of the accented vowel (the non-significant correlations in Section 4.3 rule out the possibility that the rising or falling F0 ranges also determine fall-onset alignment).
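The reported significance levels can be checked with the standard conversion of a Pearson coefficient into a t statistic, t = r·sqrt(df / (1 − r²)). The sketch below applies this conversion to the two coefficients reported for fall-onset alignment and, for comparison, to the largest absolute coefficient from the trade-off analysis in Section 4.3. The function name is illustrative, and the critical values mentioned in the comments are rounded two-tailed table values.

```python
import math

def t_from_r(r, df):
    """t statistic for a Pearson correlation r with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))

t_long = t_from_r(0.664, 67)       # fall onset vs. duration, long vowels
t_short = t_from_r(0.493, 22)      # fall onset vs. duration, short vowels
t_tradeoff = t_from_r(-0.07, 91)   # range vs. peak duration (Section 4.3)

# Two-tailed critical t for p = 0.05 is 2.074 at df = 22 and about 1.99
# at df = 91; for p = 0.001 at df = 67 it is roughly 3.4.
print(round(t_long, 2), round(t_short, 2), round(t_tradeoff, 2))
# 7.27 2.66 -0.67
```

The first two values exceed the respective critical values, whereas the trade-off coefficient stays far below its critical value, in line with the significance levels reported in the text.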
In summary, it seems that realizing plateau-shaped accents involves two high tonal targets. The first one is reached right after the accented-vowel onset and varies only marginally with vowel quantity/duration; and the second one is located close to the accented-vowel offset and considerably shifted leftward or rightward for shorter or longer vowels, respectively. 5
If the assumption of two high tonal targets is correct, then the next question would be how such a pattern should be represented phonologically. A sequence of two high tones, i.e., H+H, would be an obvious solution. However, we think that a phonological representation in the form of tonal spreading, i.e., H→ (in the notation of Peters 2014), would be more adequate for three reasons. First, H+H – or H+H+L for the entire accent – would violate the theoretical assumptions of the autosegmental-metrical framework that pitch accents can only be bitonal, and that the accented syllable is associated with only one tonal target (Ladd 2008). Second, H+H would weaken the falling characteristic of the pitch accent, and would unnecessarily raise the complex question of whether the first or the second tone is the starred tone (H*+H or H+H*; Arvaniti et al. 2000). Third, a representation in the form of tonal spreading better expresses the fact that the first tonal target is both more stably aligned than the second and also consistently higher. However, we must leave it to follow-up studies to determine whether the left edge of H→ has a separate secondary association in the segmental string, and if so, which specific point in the F0 course (see footnote 4) is anchored to which segmental landmark. Our data are inconclusive with respect to the segmental anchor point, although it seems that the accented-vowel offset is somehow involved.
Representing the plateau-shaped accent as H→*+L goes beyond the required minimum of contrastive features. That is, ‘→’ is redundant information. From a phonological point of view, L+H* vs. H*+L is sufficient to mark the assumed distinction between pointed and plateau-shaped accents, and thus we will adhere to this simpler symbolization for now. However, H→*+L would become relevant should follow-up studies reveal a third pitch accent that shows a late-aligned fall similar to the plateau-shaped accent, but without the preceding high F0 plateau. Most West Germanic languages have such a pitch accent (see Kohler 2005 and Niebuhr 2007 for German; Pierrehumbert and Steele 1989 for English; or Gussenhoven 2004 for Dutch), and the first accent pattern in Figure 2(b) suggests that Fering has this pitch-accent category as well.
4.5 ‘Lifters’ and ‘Flatteners’
Based on a data-driven set of lexical criteria gathered during the compilation of the pitch accent samples, we provided initial evidence in Section 4.3 that pointed and plateau-shaped pitch accents occur in different semantic-pragmatic contexts. This initial evidence led us to tentatively conclude that plateau-shaped accents are used in expressive contexts, i.e., when speakers mark important information as surprising, unexpected, stunning, incredible, indignant, undesired, or unintended. By comparison, the pointed accent seems to highlight important information in a matter-of-fact fashion. If these functional assumptions are true (and we have no counterevidence so far), then the present findings represent a typologically interesting case of substitute features in intonational phonology.
When it comes to signaling contrastive or, more generally, expressive information, the pitch accents involved consistently show across languages a later peak alignment and a larger F0 range than their non-contrastive/non-expressive counterparts (Ladd and Morton 1997; Gordon 2004; Xu and Xu 2005; Baumann et al. 2007; Breen et al. 2010; Dorn and Ni Chasaide 2011; Wang et al. 2011; Görs and Niebuhr 2012; see Ladd 2008 and Chen 2012 for summaries). Gussenhoven (2002: 52) notes in the context of his idea of substitute variables in F0 variation that “both higher and later peaks elicit more ‘unusual occurrence’ interpretations”, and in fact speakers make use of both features at the same time when producing contrastive/expressive pitch accents, but to individual and language-specific degrees.
The plateau-shaped pitch accents of Fering are assumed to be related to signaling contrastive/expressive information, and in terms of the fall onset, they are in fact later aligned than the pointed pitch accents. However, the F0 ranges of plateau-shaped pitch accents were smaller than those of the pointed pitch accents rather than larger.
At this point, the acoustic point of view must be extended to include perception. It is known from the experiments of Knight (2008) that plateau-shaped peaks do sound higher than pointed peaks with the same physical F0 range/level, and Knight concludes on this basis that “it is possible that a plateau can also be used as a substitute variable for high F0” (p. 242).
The crucial point is that Fering could be an example of Knight’s finding. That is, Fering could follow the same pattern as many (if not all) other language varieties in that contrastive/expressive pitch accents are later aligned and higher than their non-contrastive/non-expressive counterparts. The only notable exception would be that Fering does not increase the F0 range in order to make contrastive/expressive pitch accents sound higher in pitch. Rather, the impression of higher pitch is achieved in Fering by increasing the duration of high F0, particularly in the high-intensity area of the accented vowel.
In other words, Fering speakers could be ‘F0 Flatteners’ rather than ‘F0 Lifters’ when it comes to raising the perceived pitch of pitch accents. Follow-up studies will be needed to test whether this provocative statement is supported by empirical evidence, and if so, whether it is restricted to H tones in nuclear pitch accents or also includes pre-nuclear H tones and/or high boundary tones.
5 Conclusion and outlook
The present study dealt with North Frisian, a highly endangered variety of Frisian, spoken on the west coast of the German state of Schleswig-Holstein. Our primary aim was to lay an initial cornerstone for the phonological analysis of North Frisian intonation, represented by the Föhr island variety of Fering. This aim was achieved; we found two nuclear pitch accent patterns with acoustically clearly distinct F0 profiles in non-complementary distribution, as well as initial indirect evidence for separate communicative functions of these pitch-accent profiles, based on a difference in semantic-pragmatic context. This contextual difference suggests that the two pitch accents highlight important information either with an expressive component (plateau shape) or without one (pointed shape).
We concluded from the different form-function links that the two pitch accents represent a paradigmatic opposition in the intonational phonology of Fering, and with reference to effect sizes and discriminant weights of our dependent acoustic variables, we preliminarily analyzed the phonological contrast as L+H* vs. H*+L. Regarding the unusual plateau shape of the latter accent, we also suggested the alternative representation H→*+L. Finally, relating the plateau shape of the accent to its assumed communicative function, we advanced the idea that Fering speakers could be ‘Flatteners’ rather than ‘Lifters’ in the sense that they raise the perceived pitch level of the H tone not by increasing the F0 range but by increasing the F0 peak duration.
Building on the results of the present study, one of the most important tasks for subsequent studies will be to put the contrast discovered between pointed (L+H*) and plateau-shaped (H*+L) pitch accents on a more solid footing. Elicitation tasks based on target words in semantically controlled contexts, and/or perception experiments with semantic judgments are needed to provide direct evidence that the two assumed pitch-accent categories have different communicative functions. Moreover, the two pitch accents should be analyzed in systematically varied syllable structure and time-pressure conditions (cf. Caspers and van Heuven 1993; Remijsen 2013), in order to possibly refine our phonological representations, for example, with respect to the suggested tonal spreading of the H tone in plateau accents, the related target at the plateau’s right edge, and its possible secondary association.
As regards the idea of Fering speakers being ‘Flatteners’ rather than ‘Lifters’, we need to elicit data in which factors that narrow or widen the local and global F0 range are systematically varied (phrasing, information structure, Lombard speech, phonetic bracketing, etc.), and in which possible speaker-individual effects can be taken into account (as in the case of ‘F0 Shapers’ and ‘F0 Aligners’ [Niebuhr et al. 2011]).
Furthermore, we have completely ignored in the present study the fact that plateau-shaped accents also occur next to pointed accents in pre-nuclear position (see Figures 2(b) and 3). The question of whether or not this pre-nuclear difference is formally and functionally identical to the nuclear difference needs to be addressed (cf. Mücke et al. 2006). Ultimately, we expect to find further intonational categories. For example, the initial and final accents in Figure 2(b) suggest that Fering has at least two additional types of pitch accents: one with a late-aligned pointed peak, and one in which the peak maximum is located so early in the accented syllable that F0 constantly falls throughout the accented vowel. These two types of pitch accents will require a separate acoustic and functional analysis. As regards boundary tones in Fering, Witte (2015) presented initial evidence for a tripartite phonological distinction between a constant rise, a constant fall, and a combination of rise and subsequent high plateau.
In parallel to broadening the phonetic-phonological basis of North Frisian intonation, a second line of research would examine the generalizability of our findings. We deliberately decided to start from spontaneous data from speakers who were born more than 100 years ago and interviewed more than 50 years ago, when North Frisian still shaped all aspects of the speakers’ lives and influences of Northern Standard German, mass media, and mass tourism were relatively small compared to the present. For the same reason (and because of its detailed documentation), we chose the Fering dialect that is spoken on a small island off the German west coast. A question for future research is to what extent our findings apply to current speakers of Fering and other dialects of North Frisian. We know for sure that the difference between pointed (L+H*) and plateau-shaped (H*+L) pitch accents still exists in modern Fering, but it is possible that the plateau-shaped accents occur less frequently today than in our old dataset.
Acknowledgments
First, great thanks are due to our two anonymous reviewers, John Haig, and our editor, Haruo Kubozono, for their useful and detailed comments on the manuscript as originally submitted. Moreover, we would like to express our gratitude to Ernst Dombrowski, Erik Thomas, Lenka Weingartová, and Francesco Cangemi for sharing data and discussing the phonetic and phonological aspects of plateau-shaped pitch accents. We are also very much indebted to Rike Huf and Mareike Voß for preparing and helping us with the acoustic analysis. Finally, we would like to thank our informants on the island Föhr for their readiness to support us with the collection, interpretation, and translation of speech material.
References
Århammar, Nils. 2007. Das Nordfriesische, eine bedrohte Minderheitensprache in zehn Dialekten: eine Bestandsaufnahme. In Horst Munske (ed.), Sterben die Dialekte aus? Vorträge am Interdisziplinären Zentrum für Dialektforschung an der Friedrich-Alexander-Universität Erlangen-Nürnberg, 1–29. Nuremberg: University of Erlangen-Nuremberg.
Arvaniti, Amalia, D. Robert Ladd & Ineke Mennen. 2000. What is a starred tone? Evidence from Greek. In Michael Broe & Janet B. Pierrehumbert (eds.), Papers in laboratory phonology V, 119–131. Cambridge: Cambridge University Press.
Asu, Eva-Liina & Francis Nolan. 1999. The effect of intonation on pitch cues to the Estonian quantity contrast. Proceedings of 14th International Congress of Phonetic Sciences, San Francisco, USA, 1873–1876.
Barnes, Jonathan, Alejna Brugos, Nanette Veilleux & Stefanie Shattuck-Hufnagel. 2011. Voiceless intervals and perceptual completion in F0 contours: Evidence from scaling perception in American English. Proceedings of 17th International Congress of Phonetic Sciences, Hong Kong, China, 108–111.
Barnes, Jonathan, Alejna Brugos, Stefanie Shattuck-Hufnagel & Nanette Veilleux. 2012. On the nature of perceptual differences between accentual peaks and plateaux. In Oliver Niebuhr (ed.), Prosodies – Context, function, communication (Language, Context, and Cognition 12), 93–118. Berlin/New York: de Gruyter.
Baumann, Stefan, Johannes Becker, Martine Grice & Doris Mücke. 2007. Tonal and articulatory marking of focus in German. Proceedings of 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, 1029–1032.
Bohn, Ocke-Schwen. 2004. How to organize a fairly large vowel inventory: The vowels of Fering (North Frisian). Journal of the International Phonetic Association 34. 161–173.
Braun, Bettina. 2005. Production and perception of thematic contrast in German. Oxford: Peter Lang.
Breen, Mara, Evelina Fedorenko, Michael Wagner & Edward Gibson. 2010. Acoustic correlates of information structure. Language and Cognitive Processes 25. 1044–1098.
Caspers, Johanneke & Vincent J. van Heuven. 1993. Effects of time pressure on the phonetic realization of the Dutch accent-lending pitch rise and fall. Phonetica 50. 161–171.
Chen, Yiya. 2012. Message-related variation. In Abigail Cohn, Cécile Fougeron & Marie K. Huffman (eds.), Oxford handbook of laboratory phonology, 103–114. Oxford: Oxford University Press.
de Graaf, Tjeerd & Peter Tiersma. 1980. Some phonetic aspects of breaking in West Frisian. Phonetica 37. 109–120.
del Giudice, Alex, Ryan Shosted, Kathryn Davidson, Mohammad Salihie & Amalia Arvaniti. 2007. Comparing methods for comparing pitch ‘elbows’. Proceedings of 16th International Congress of Phonetic Sciences, Saarbrücken, Germany, 1117–1120.
D’Imperio, Mariapaola, Barbara Gili Fivela & Oliver Niebuhr. 2010. Alignment perception of high intonational plateaux in Italian and German. Proceedings of 5th International Conference of Speech Prosody, Chicago, USA, 1–4.
Dombrowski, Ernst. 2013. Semantic features of ‘stepped’ versus ‘continuous’ contours in German intonation. Phonetica 70. 247–273.
Gili Fivela, Barbara & Mariapaola D’Imperio. 2010. High peaks versus high plateaux in the identification of two pitch accents in Pisa Italian. Proceedings of 5th International Conference on Speech Prosody, Chicago, USA, 1–4.
Gordon, Matthew. 2004. The intonational realization of contrastive focus in Chickasaw. In Chungmin Lee, Matthew Gordon & Daniel Büring (eds.), Topic and focus: Cross-linguistic perspectives on meaning and intonation, 65–78. Dordrecht: Kluwer.
Görs, Karin & Oliver Niebuhr. 2012. Hocus Focus – How prosodic profiles of contrastive focus emerge and change in different elicitation contexts. In 6th International Conference of Speech Prosody, Shanghai, China, 262–265.
Grabe, Esther, Greg Kochanski & John Coleman. 2005. The intonation of native accent varieties in the British Isles: Potential for miscommunication? In Katarzyna Dziubalska-Kołaczyk & Joanna Przedlacka (eds.), English pronunciation models: A changing scene, 311–337. Bern: Peter Lang.
Grice, Martine & Stefan Baumann. 2002. Deutsche Intonation und GToBI. Linguistische Berichte 191. 267–291.
Gussenhoven, Carlos. 2004. Transcription of Dutch intonation. In Sun-Ah Jun (ed.), Prosodic typology – The phonology of intonation and phrasing, 118–145. Oxford: Oxford University Press.
Hoekstra, Jarich. 1991. Oer it beklamjen fan ferhâldingswurden yn it Frysk, it Hollânsk en it Ingelsk. Us Wurk 40. 67–103.
Hoekstra, Jarich. 2007. Fragen zum Possessivpronomen im Fering-Öömrang (Nordfriesisch). Us Wurk 56. 89–113.
Hoekstra, Jarich. 2013. Another quantificational variability effect: The indefinite pronoun neemen ‘no one’ as a floating quantifier and as a negative adverb in Fering-Öömrang (North Frisian). Lingua 134. 194–209.
House, D. 1990. Tonal perception in speech (Travaux de l’institut de linguistique de Lund 24). Lund: Lund University Press.
Knight, Rachel-Anne. 2004. The realisation of intonational plateaux: Effects of foot structure. In Lluïsa Astruc & Marc Richards (eds.), Cambridge occasional papers in linguistics 1, 157–164. Cambridge: Cambridge University Press.
Knight, Rachel-Anne. 2008. The shape of nuclear falls and their effect on the perception of pitch and prominence: Peaks vs. plateaux. Language and Speech 51. 223–244.
Knight, Rachel-Anne & Francis Nolan. 2006. The effect of pitch span on intonational plateaux. Journal of the International Phonetic Association 36. 21–38.
Kohler, Klaus J. 1990. Macro and micro F0 in the synthesis of intonation. In John Kingston & Mary E. Beckman (eds.), Papers in laboratory phonology I, 115–138. Cambridge/New York: Cambridge University Press.
Kohler, Klaus J. 1991. A model of German intonation. Arbeitsberichte des Instituts für Phonetik und Digitale Sprachverarbeitung (AIPUK) 25. 295–360.
Kohler, Klaus J. 2005. Timing and communicative functions of pitch contours. Phonetica 62. 88–105.
Ladd, D. Robert. 2008. Intonational phonology. Cambridge: Cambridge University Press.
Ladd, D. Robert, Dan Faulkner, Hanneke Faulkner & Astrid Schepman. 1999. Constant “segmental anchoring” of F0 movements under changes in speech rate. Journal of the Acoustical Society of America 106. 1543–1554.
Ladd, D. Robert, Ineke Mennen & Astrid Schepman. 2000. Phonological conditioning of peak alignment in rising pitch accents in Dutch. Journal of the Acoustical Society of America 106. 2685–2696.
Ladd, D. Robert & Rachel Morton. 1997. The perception of intonational emphasis: Continuous or categorical? Journal of Phonetics 25. 313–342.
Lehiste, Ilse. 1970. Suprasegmentals. Cambridge: MIT Press.
Levine, Timothy R. & Craig R. Hullett. 2002. Eta squared, partial eta squared and the misreporting of effect size in communication research. Human Communication Research 28. 612–625.
Machač, Pavel & Radek Skarnitzl. 2009. Principles of phonetic segmentation. Praha: Nakladatelství Epocha.
Mack, Molly & Bernard Gold. 1986. The effect of linguistic content upon the discrimination of pitch in monotone stimuli. Journal of Phonetics 14. 333–337.
Mücke, Doris, Martine Grice, Johannes Becker, Anne Hermes & Stefan Baumann. 2006. Articulatory and acoustic correlates of prenuclear and nuclear accents. Proceedings of 3rd International Conference of Speech Prosody, Dresden, Germany, 297–300.
Niebuhr, Oliver. 2007. The signalling of German rising-falling intonation categories – The interplay of synchronization, shape, and height. Phonetica 64. 174–193.
Niebuhr, Oliver. 2010. On the phonetics of intensifying emphasis in German. Phonetica 67. 170–198.
Niebuhr, Oliver. 2011. Alignment and pitch-accent identification – Implications from F0 peak and plateau contours. Arbeitsberichte des Instituts für Phonetik und Digitale Sprachverarbeitung (AIPUK) 38. 77–95.
Niebuhr, Oliver & Gilbert I. Ambrazaitis. 2006. Alignment of medial and late peaks in German spontaneous speech. Proceedings of the 3rd International Conference of Speech Prosody, Dresden, Germany, 161–164.
Niebuhr, Oliver, Mariapaola D’Imperio, Barbara Gili Fivela & Francesco Cangemi. 2011. Are there “shapers” and “aligners”? Individual differences in signalling pitch accent category. Proceedings of 17th International Congress of Phonetic Sciences, Hong Kong, China, 120–123.
Niebuhr, Oliver & Klaus J. Kohler. 2004. Perception and cognitive processing of tonal alignment in German. In 1st International Symposium on Tonal Aspects of Languages (TAL): Emphasis on Tone Languages, Beijing, China, 155–158.
Peters, Benno. 2005. The Kiel corpus of spontaneous speech. Arbeitsberichte des Instituts für Phonetik und Digitale Sprachverarbeitung (AIPUK) 35a. 1–6.
Peters, Benno, Klaus J. Kohler & Thomas Wesener. 2005. Melodische Satzakzentmuster in prosodischen Phrasen deutscher Spontansprache - Statistische Verteilung und sprachliche Funktion. Arbeitsberichte des Instituts für Phonetik und Digitale Sprachverarbeitung (AIPUK) 35a. 7–54.
Peters, Jörg. 2008. Saterfrisian intonation. An analysis of historical recordings. Us Wurk 57. 141–169.
Peters, Jörg. 2010a. Intonation des Niederdeutschen. Eine Untersuchung zu Weener (Rheiderland). Jahrbuch des Vereins für niederdeutsche Sprachforschung 133. 105–140.
Peters, Jörg. 2010b. Tonal variation of West Germanic languages. In Thomas Stolz, Esther Ruigendijk & Jürgen Trabant (eds.), Linguistik im Nordwesten. Beiträge zum 1. Nordwestdeutschen Linguistischen Kolloquium, 2008, Bremen, 79–102. Bochum: Brockmeyer.
Peters, Jörg. 2014. Intonation. Heidelberg: Winter.
Peters, Jörg, Judith Hanssen & Carlos Gussenhoven. 2014. The phonetic realization of focus in West Frisian, Low Saxon, High German, and three varieties of Dutch. Journal of Phonetics 46. 185–209.
Peters, Jörg, Judith Hanssen & Carlos Gussenhoven. 2015. The timing of nuclear falls: Evidence from Dutch, West Frisian, Dutch Low Saxon, German Low Saxon, and High German. Laboratory Phonology 6. 1–52.
Pierrehumbert, Janet B. 1979. The perception of fundamental frequency declination. Journal of the Acoustical Society of America 66. 363–369.
Pierrehumbert, Janet B. & Shirley A. Steele. 1989. Categories of tonal alignment in English. Phonetica 46. 181–196.
Rapp, Stefan. 1998. Automatisierte Erstellung von Korpora für die Prosodieforschung. Stuttgart: University of Stuttgart Ph.D. dissertation.
Remijsen, Bert. 2013. Tonal alignment is contrastive in falling contours in Dinka. Language 89. 297–327.
Remijsen, Bert & Otto G. Ayoker. 2014. Contrastive tonal alignment in falling contours in Shilluk. Phonology 31. 435–462.
Scheffers, Michel & Werner Thon. 1991. Workstation and signal processing software for experimental phonetics. Proceedings of 12th International Congress of Phonetic Sciences, Aix-en-Provence, France, 486–489.
Schepman, Astrid, Robin Lickley & D. Robert Ladd. 2006. Effects of vowel length and “right context” on the alignment of Dutch nuclear accents. Journal of Phonetics 34. 1–18.
Schweitzer, Katrin, Michael Walsh, Bernd Möbius, Arndt Riester, Antje Schweitzer & Hinrich Schütze. 2009. Frequency matters: Pitch accents and information status. Proceedings of 12th Conference of the European Chapter of the Association for Computational Linguistics, Athens, Greece, 728–736.
Simpson, Adrian P. 2009. Phonetic differences between male and female speech. Language and Linguistics Compass 3. 621–640.
Tedsen, Julius K.V. 1906. Der Lautstand der föhringischen Mundart. Zeitschrift für Deutsche Philologie 38. 468–513.
’t Hart, Johan. 1981. Differential sensitivity to pitch distance, particularly in speech. Journal of the Acoustical Society of America 69. 811–821.
’t Hart, Johan, René Collier & Antonie Cohen. 1990. A perceptual study of intonation – An experimental-phonetic approach to speech melody. Cambridge: Cambridge University Press.
Tiersma, Peter M. 1999. Frisian reference grammar. Ljouwert: Fryske Akademy.
Tröster, Stefan. 1996. Phonologischer Wandel im Saterländischen durch Sprachkontakt. Niederdeutsches Jahrbuch 119. 179–191.
Tseng, Chiu-Yu, Shaohuang Pin & Ye-Lin Lee. 2004. Speech prosody: Issues, approaches and implications. In Gunnar Fant, Hiroya Fujisaki, J. Cao & Yi Xu (eds.), From traditional phonology to modern speech processing, 417–438. Beijing: Foreign Language Teaching and Research Press.
Walker, Alastair. 1990. Frisian. In Charles V. J. Russ (ed.), The dialects of modern German, 1–30. Oxon: Routledge.
Walker, Alastair & Ommo Wilts. 2001. Die nordfriesischen Mundarten. In Horst H. Munske & Nils Århammar (eds.), Handbook of Frisian studies, 284–304. Tübingen: Niemeyer.
Wang, Bei, Ling Wang & Tursun Qadir. 2011. Prosodic realization of focus in six languages/dialects in China. Proceedings of 17th International Congress of Phonetic Sciences, Hong Kong, China, 144–147.
Wichmann, Anne, Jill House & Toni Rietveld. 2000. Discourse constraints on peak timing in English: Experimental evidence. In Antonis Botinis (ed.), Intonation, 162–183. Dordrecht: Kluwer.
Willkommen, Dirk. 1991. Sölring. Phonologie des Nordfriesischen Dialekts der Insel Sylt. Kiel: Kiel University Press.
Wilts, Ommo, Elene Braren & Nickels Hinrichsen. 2010. Wurdenbuk foer feer an oomram (Nordfriesische Wörterbuchstelle der Uni Kiel). Wittdün auf Amrum: Quedens.
Witte, Tim. 2015. Phrasenfinale Intonationsverläufe des Fering-Öömrang Friesischen. Kiel, Germany: Kiel University B.A. thesis.
Xu, Yi & Xuejing Sun. 2002. Maximum speed of pitch change and how it may relate to speech. Journal of the Acoustical Society of America 111. 1399–1413.
Xu, Yi & Q. Emily Wang. 2001. Pitch targets and their realization: Evidence from Mandarin Chinese. Speech Communication 33. 319–337.
Xu, Yi & Ching X. Xu. 2005. Phonetic realization of focus in English declarative intonation. Journal of Phonetics 33. 159–197.
The audio files of Figures 2–4, as well as 20 plateau examples, are available for download from https://www.isfas.uni-kiel.de/de/linguistik/forschung/audio-examples-for-labphon-paper-niebuhr-hoekstra-2015/view.
One of our reviewers suggested that we completely stylize the F0 analysis output in PRAAT at 1 st and select all turning points on this basis, as this would make turning-point detection more objective. We think that this is not necessarily true and consider our method sufficiently objective for the following reasons. First, although a 1-st stylization of the F0 course in PRAAT could in fact reduce the number of possible, visually determined turning-point candidates, it often still leaves labelers with several alternative candidates to choose from. So, an F0 stylization (no matter at what arbitrarily chosen frequency interval) does not fully exempt prosodic labelers from making subjective but reasonable decisions on F0 turning points based on their training and experience. Second, our acoustic analysis is based on three trained labelers, who additionally made a majority decision in problematic cases. This approach reduces individual influences on the measurements. Third, since our analysis addresses an intonationally largely unexplored variety of Frisian, none of the labelers had a preconceived opinion about where relevant F0 turning points should be located relative to the segmental string. Fourth, three of the five turning points – rise offset, peak maximum, and fall onset – were objectively defined as the first local F0 peak maximum and the F0 points 1 st below this maximum.
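The objective definition named in the fourth reason can be illustrated with a minimal Python sketch. It is not the procedure used in our analysis, which was carried out by trained labelers in PRAAT. The function names `first_local_peak`, `crossing_time`, and `turning_points`, the parameter `drop_st`, the linear interpolation between F0 samples, and the restriction to a single accent-sized excerpt of a voiced F0 contour are all assumptions made for illustration.

```python
import math


def first_local_peak(values):
    """Return the index of the first local maximum in a sequence (hypothetical helper)."""
    for i in range(1, len(values) - 1):
        if values[i] > values[i - 1] and values[i] >= values[i + 1]:
            return i
    raise ValueError("no local maximum found in the contour")


def crossing_time(t0, v0, t1, v1, level):
    """Linearly interpolate the time at which the contour crosses `level` (assumes v0 != v1)."""
    return t0 + (level - v0) * (t1 - t0) / (v1 - v0)


def turning_points(times, f0_hz, drop_st=1.0):
    """Locate the peak maximum and the points `drop_st` semitones below it.

    `times` (seconds) and `f0_hz` (Hz) are equally long sequences covering one
    accent-sized excerpt. The 1-st default mirrors the criterion in the text;
    everything else is an illustrative assumption.
    """
    if drop_st <= 0:
        raise ValueError("drop_st must be positive")
    # Semitones relative to the first sample; only differences matter here.
    st = [12.0 * math.log2(f / f0_hz[0]) for f in f0_hz]
    peak = first_local_peak(st)
    level = st[peak] - drop_st

    # Rise offset: walk left from the peak to the nearest sample at or below the level.
    i = peak
    while i > 0 and st[i] > level:
        i -= 1
    rise_offset = None  # stays None if the excerpt never drops that low before the peak
    if st[i] <= level:
        rise_offset = crossing_time(times[i], st[i], times[i + 1], st[i + 1], level)

    # Fall onset: walk right from the peak to the nearest sample at or below the level.
    j = peak
    while j < len(st) - 1 and st[j] > level:
        j += 1
    fall_onset = None  # stays None if the excerpt never drops that low after the peak
    if st[j] <= level:
        fall_onset = crossing_time(times[j - 1], st[j - 1], times[j], st[j], level)

    return {"rise_offset": rise_offset, "peak_max": times[peak], "fall_onset": fall_onset}
```

Under these assumptions, a plateau-shaped contour yields a larger interval between `rise_offset` and `fall_onset` than a pointed contour with the same rise and fall slopes, so the sketch only shows how the three points follow mechanically from the maximum and the 1-st criterion. Real F0 tracks contain microprosodic perturbations and voiceless gaps, which a naive first-local-maximum search such as this one does not handle.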
More precisely, we do not claim that the fall onset exactly corresponds to this additional tonal target. As was stated in Section 4.2, the fall onset was simply defined by a 1-st F0 difference after the peak maximum and introduced as a mere measuring point of peak duration. So, how, where, and at what frequency/scaling distance from the F0 peak maximum the assumed second high tonal target actually manifests itself in the F0 course has to be determined in follow-up studies. In addition to a detailed analysis of the F0 course, these follow-up studies could use the derivative-based approach of Xu and Wang (2001) in order to define and reliably determine the second high tonal target.