Published by De Gruyter Mouton, March 17, 2016

Acoustic Evidence of New Sibilants in the Pronunciation of Young Polish Women

Bartłomiej Czaplicki, Marzena Żygis, Daniel Pape and Luis M. T. Jesus

Abstract

We present the results of an acoustic study showing that the Polish sibilant system is undergoing changes in the speech of young university-educated women. The results, based on the acoustic analysis of 16 speakers’ pronunciation, reveal that the new variants of alveolo-palatals are characterised by spectral peaks at higher frequencies and higher centre of gravity values than their Standard Polish counterparts. In addition, spectral moments, spectral slopes and the formants of preceding vowels differentiate the new variants from Standard Polish alveolo-palatals. We provide the rationale for the development of the new variants by referring to (i) a functional approach involving contrast optimisation in the sibilant system, (ii) a sociolinguistic approach that makes use of a sound-symbolic association between energy concentration in higher frequency regions and smallness and (iii) a speech disorder.

1. Introduction

The Polish system of sibilants is relatively complex and involves three places of articulation, shown in (1).[1]

  (1) a. dento-alveolars: s z
      b. retroflexes (post-alveolars): ʂ ʐ
      c. alveolo-palatals: ɕ ʑ

The Polish system of sibilants is currently undergoing a change in the speech of young women. The innovation involves alveolo-palatals and can be characterised as a change in progress in its initial stages. As a useful approximation, the new variants of alveolo-palatals can be described as involving a type of exaggerated palatalisation of dento-alveolars and will be represented as /sʲ/, /zʲ/ and the corresponding affricates for the purposes of the present study.[2] In its current distribution, the change is age- and gender-related and restricted to young women. The change has recently come to the attention of many Poles. In a daily radio programme that addresses its listeners’ language-related questions, the author, a linguist, comments on it. Some representative quotes from the programme are given in (2) (translations are our own; see Appendix A for the original quotes).

  (2) Quotes from the radio programme “Co w mowie piszczy” [3]

    “Women chirp sweetly.”

    “It is supposed to make them sound kind, sweet and womanly.”

    “They speak like small children.”

    “For goodness’ sake, we don’t have to retrogress.”

    “We can’t accept this.”

    “[...] infantile and pretentious pronunciation.”

    “The worst thing that can happen to us is to hear [this pronunciation] used by a grown woman.”

Such admonitions voiced by a linguist in response to her listeners’ questions indicate that (i) the change has reached common awareness, (ii) it is heavily stigmatised, (iii) its occurrence is restricted to women, and (iv) it shares some characteristics with the way children speak. The change has also been noticed by many non-linguists. For example, in a Polish edition of the programme “Top Model”, one of the participants who has this trait in her pronunciation is criticised for being too infantile (“Co ty się tak pieścisz?” ‘Why are you talking like a baby?’), to which she responds that she is aware of sounding childish. Particularly interesting is that the clip shows that this speaker can suspend the innovative pronunciation and switch to more standard variants of the sibilants at will under specific social conditions (e.g., using a more formal register).[4] In summary, there is evidence that the alveolo-palatal sibilants show signs of a change in the pronunciation of some young women.

On the basis of such evidence pointing to an increasing social awareness of the change and our own impressions,[5] we assume that there are two distinct realisations of alveolo-palatal sibilants among young women in current Polish.

The aim of this paper is to analyse the new variants and their Standard Polish counterparts from the acoustic point of view. We used the multitaper method (Shadle 2006; Shadle 2012; Lousada, Jesus and Pape 2012) to analyse the spectral properties of the sibilants.

In an attempt to find an explanation for the change, we mainly refer to (i) a functional approach involving contrast optimisation in the sibilant system and (ii) a sociolinguistic approach that makes reference to the existence of a sound-symbolic association between energy concentration in higher frequency regions and smallness (Ohala 1994). It is proposed that the new developments in Polish sibilants serve to enhance this iconic function of palatalisation and produce an impression of “over-palatalisation”, signalling youth and childishness.

This paper is structured as follows. Section 2 provides the description of the acoustic experiment and Section 3 presents the results related to a comparison of the new variants of sibilants and their counterparts in standard speech. In 3.1 a linear discriminant analysis is presented showing which of the investigated variables are the most useful in discriminating the new variant from the Standard Polish variant. A discussion of the results follows. Section 4 focuses on the rationale behind the change and its functional and sociolinguistic aspects. We present a multi-dimensional analysis including all the investigated parameters measured for all sibilants in order to compare the Standard Polish system with the system including the new variants. Section 5 concludes.

2. Acoustic analysis

In this section we present the experimental design and the results of an acoustic experiment in which we investigate the alveolo-palatal fricative /ɕ/ and affricate /t͡ɕ/. We hypothesise that new variants of these sounds pronounced by young female speakers are significantly different from the Standard Polish variants. In addition, in section 4.1 we provide the results of an acoustic analysis of the retroflex /ʂ ʐ/ and dento-alveolar /s z/ sibilants in order to gain more insight into the whole sibilant system.

2.1. Experimental design

In our material, the sibilants /s/, /ʂ/, /ɕ/, /t͡s/, /t͡ʂ/ and /t͡ɕ/ appeared in word-initial, stressed /#_a/ and word-medial, unstressed /a_a/ positions of words embedded in (i) a carrier sentence (Powiedziała __ do ciebie, ‘She said __ to you’) and (ii) a coherent text. In order to include more variability, the words within the coherent text appear in different vowel contexts and at different places in sentences. The words appearing in the carrier sentence were always bisyllabic and stressed on the first syllable. The sentences were repeated five times and the text was read twice. All the words as well as the text are provided in Appendix B.

Sixteen native speakers of Standard Polish took part in this experiment. They were all women studying at the University of Warsaw in Poland, aged 19–23. All the speakers came from central Poland (Masovia), from both urban and rural locations. We made sure that they spoke Standard Polish with no regional traits. We pre-selected our speakers on the basis of the perceptual impression of the first author, confirmed by the second author, both native speakers of Polish. Eight speakers who pronounce the new variant and eight speakers who pronounce the Standard Polish alveolo-palatal /ɕ/ took part in the experiment.[6][7]

All recordings were conducted in a quiet room at the University of Warsaw, Poland. The recordings were made using a Sony ICD-MX20 Solid-State Recorder at a sample rate of 44,100 Hz and 16 bits. We used the internal microphone of the Solid-State Recorder, which has a non-linear, steeply decreasing frequency response above 11,000 Hz. Our spectral analyses were band-limited from 20 Hz to 11,000 Hz, which provides a suitable frequency range, previously used for the computation of multitaper spectra and the resulting spectral moments (Lousada et al. 2012 and Żygis et al. 2012). For the formant analysis the audio data were down-sampled to 11,025 Hz. The items were analysed with Praat version 5.2 (Boersma and Weenink 2011) and MATLAB R2007b.

For the purposes of the present study, six places in the spectrogram of the signal were determined by placing the cursor at the following points:

  (3) Marking points

    Point 1: The beginning of the vowel preceding the consonant (V1).

    Point 2: The end of the vowel preceding the consonant (V1).

    Point 3: The beginning of the frication noise of the sibilant.

    Point 4: The end of the frication noise of the sibilant.

    Point 5: The beginning of the vowel following the consonant (V2).

    Point 6: The end of the vowel following the consonant (V2).

All six landmarks in (3) are exemplified in the oscillogram and spectrogram of the Polish name [kaɕa] ‘Kate’ in Figure 1. It should be noted that in the case of affricates two additional points were marked, i.e., the beginning and end of the burst (preceding the noise part of the affricate). We excluded the transitional parts from Point 2 to Point 3 and from Point 4 to Point 5 from consideration for two reasons: (i) it was extremely difficult or in some cases even impossible to conduct a formant analysis there and (ii) these transitions were often produced with voicing and were therefore not suited to be considered part of the voiceless fricative. While the onsets and offsets of the vowels were based on the clear appearance/disappearance of the higher formants (especially F2 and F3), the onsets and offsets of the sibilants were based on the clear appearance/disappearance of frication noise. Both procedures are standard techniques for vowel and sibilant annotation.

Figure 1 Oscillogram and spectrogram of Polish [aɕa] from the word Kasia [kaɕa] ‘Kate’.


Since both the spectral properties of sibilants and the formants of the neighbouring vowels significantly contribute to differences in Polish sibilants (Jassem 1962, 1968, 1979; Nowak 2003), we investigated the following acoustic parameters:

  (4) Parameters:

    (i) The frequency of the highest spectral peak of the spectrum from 20 Hz to 11,000 Hz;

    (ii) The four spectral moments according to the Praat 5.2 formulae (see Praat Help pages for the formulae used): Centre of gravity (COG); Standard Deviation of the spectrum (STD); Skewness; Kurtosis;

    (iii) The spectral slopes m1 and m2 (Jesus and Shadle 2002);

    (iv) The frequencies of formants F1, F2, and F3 for the vowels preceding and following the target sibilant (endpoint frequencies);

    (v) The formant frequency range of F1, F2, and F3 for the vowels preceding and following the target sibilant;

    (vi) The frication duration.

Regarding (4i–4iii), all spectral values were calculated at the midpoint of the frication by computing multitaper spectra, which are better suited to sibilant spectra than other spectral estimation methods such as the Fast Fourier Transform (FFT) and Linear Predictive Coding (LPC) (see also Żygis et al. 2012 for a discussion of the multitaper method). A 23-ms window was placed at the frication noise midpoint (512-point Hamming window). The power spectral density (PSD) was estimated via Thomson’s (2000) multitaper method (linear combination with unity weights of individual spectral estimates and the default FFT length available in the MathWorks Signal Processing Toolbox Version 6.2 (MathWorks 2007: 470–475)).

The highest spectral peak was computed in the frequency range from 20 Hz to 11,000 Hz. This range is wider than those used previously (in Forrest et al. 1988; Zsiga 1993; Jongman et al. 2000; and Gordon et al. 2002, the fricative spectra were analysed for the frequency range 0–10,000 Hz for each fricative, and Harrington 2000 analysed the fricative spectra from 0 Hz to 7,000 Hz). We increased the frequency range to 11,000 Hz in order to include all relevant spectral details of the more fronted fricatives (e.g. /s/). We consistently checked manually that there was no low-frequency energy in the noise part of the fricative spectra (e.g. due to spurious voicing), so the 20 Hz lower frequency limit is justified for our analyses.
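The spectral estimate and peak search described above can be sketched in Python as follows. This is only an illustration, not the MATLAB routine used in the study: the function names, the time-bandwidth product `nw=4`, the use of seven tapers and the synthetic test signal are all assumptions made for the example. Only the unity-weight averaging of the individual spectral estimates, the 23-ms frame at the frication midpoint and the 20–11,000 Hz search band are taken from the text.

```python
import numpy as np

def dpss_tapers(n, nw, k):
    """First k Slepian (DPSS) tapers of length n (rows, unit energy).

    Computed as eigenvectors of the standard symmetric tridiagonal
    matrix whose eigenvectors are the discrete prolate spheroidal sequences.
    """
    w = nw / n
    idx = np.arange(n)
    diag = ((n - 1 - 2 * idx) / 2.0) ** 2 * np.cos(2 * np.pi * w)
    off = idx[1:] * (n - idx[1:]) / 2.0
    mat = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)
    _, vecs = np.linalg.eigh(mat)      # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k].T      # keep the k largest

def multitaper_psd(frame, fs, nw=4.0, k=7):
    """Average the k tapered periodograms with unity weights (nw, k assumed)."""
    tapers = dpss_tapers(len(frame), nw, k)
    spectra = np.abs(np.fft.rfft(tapers * frame, axis=1)) ** 2
    psd = spectra.mean(axis=0) / fs
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return freqs, psd

def highest_peak(freqs, psd, fmin=20.0, fmax=11000.0):
    """Frequency of the largest PSD value inside the analysis band."""
    band = (freqs >= fmin) & (freqs <= fmax)
    return float(freqs[band][np.argmax(psd[band])])

def midpoint_frame(signal, fs, start_s, end_s, win_s=0.023):
    """Cut a win_s-long frame centred on the midpoint of [start_s, end_s]."""
    mid = int(round(0.5 * (start_s + end_s) * fs))
    half = int(round(win_s * fs / 2))
    return signal[mid - half: mid + half]

# Synthetic stand-in for a frication interval: a 5 kHz component in weak noise.
fs = 44100
rng = np.random.default_rng(0)
t = np.arange(int(0.2 * fs)) / fs
signal = np.sin(2 * np.pi * 5000 * t) + 0.1 * rng.standard_normal(t.size)

frame = midpoint_frame(signal, fs, start_s=0.05, end_s=0.15)
freqs, psd = multitaper_psd(frame, fs)
peak_hz = highest_peak(freqs, psd)
```

Because multitaper estimates smooth the spectrum over a band of roughly ±`nw`/frame duration, the reported peak of a narrow component can fall anywhere within that band around its true frequency.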

We also computed four spectral parameters for the frication noise using Praat 5.2 (Boersma and Weenink 2011): centre of gravity (COG), standard deviation of the spectrum, skewness and kurtosis. The centre of gravity is the pairwise weighting of spectral amplitude with frequency and indicates the average central frequencies for the complete spectrum. The standard deviation (STD) indicates how much frequencies in the spectrum deviate from the centre of gravity. Skewness shows if the spectrum is skewed towards lower or higher frequencies, while kurtosis can be seen as a measure of spectral peakedness.
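The four moments just described can be written out as a short Python sketch. It assumes a power spectrum as input (which corresponds to Praat's default power setting of 2 applied to the magnitude spectrum) and follows the usual definitions: the centre of gravity is the power-weighted mean frequency, and skewness and kurtosis are the normalised third and fourth central moments, with 3 subtracted from kurtosis. The function name and the two synthetic spectra are illustrative assumptions.

```python
import numpy as np

def spectral_moments(freqs, power):
    """Return COG, SD, skewness and (excess) kurtosis of a power spectrum."""
    freqs = np.asarray(freqs, dtype=float)
    power = np.asarray(power, dtype=float)
    total = power.sum()
    cog = (freqs * power).sum() / total          # first moment
    dev = freqs - cog
    mu2 = (dev ** 2 * power).sum() / total       # central moments
    mu3 = (dev ** 3 * power).sum() / total
    mu4 = (dev ** 4 * power).sum() / total
    return {
        "cog": cog,
        "sd": np.sqrt(mu2),
        "skewness": mu3 / mu2 ** 1.5,
        "kurtosis": mu4 / mu2 ** 2 - 3.0,
    }

freqs = np.arange(0.0, 10001.0, 10.0)

# Symmetric bell-shaped spectrum centred on 5 kHz: skewness and kurtosis near 0.
symmetric = spectral_moments(freqs, np.exp(-0.5 * ((freqs - 5000.0) / 500.0) ** 2))

# Spectrum with most energy at low frequencies and a long high-frequency tail:
# positive skewness, i.e. right-skewed.
low_heavy = spectral_moments(freqs, np.exp(-freqs / 2000.0))
```

Under this convention a positive skewness value means that energy is concentrated below the centre of gravity with a tail towards higher frequencies, and a negative value means the opposite.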

Figure 2 Examples of both individual fricative spectra (left panels) and affricate spectra (right panels) comparing a typical speaker of the new variant (upper panels) with a typical speaker of Standard Polish (lower panels).


The spectral slopes m1 and m2 were computed from 500 Hz to the average spectral peak frequency (m1) and from the average spectral peak frequency to 11,000 Hz (m2).[8] The average spectral peak frequency is the overall mean of all highest spectral peaks at the given acoustic landmark (the midpoint of the frication noise), rounded to the nearest kHz. Jesus and Shadle (2002) and Lousada et al. (2012) showed that this computation of the average peak frequency and its rounding is well suited to finding the boundary between the two parts of the spectra (the low-frequency part from 500 Hz to the average peak frequency (m1) and the high-frequency part from the average peak frequency to the highest frequency (m2)) (see Jesus and Shadle 2002: 447 for further information). In Jesus and Shadle (2002), a value of 4,000 Hz was found for post-alveolar fricatives of European Portuguese, and in Lousada et al. (2012: 10), a value of 3,900 Hz was found for the burst of dental stops. However, in order to apply this measure to the present study, we had to adapt it to our Polish data, assuming (and thus simplifying) that the place of articulation is more or less consistent for corresponding phonemes. In our case, we found a value of 5,000 Hz for all alveolo-palatals. Thus, m1 is the slope of the spectral regression line for the frequency range between 500 Hz and 5,000 Hz, and m2 is the slope of the spectral regression line for the range between 5,000 Hz and 11,000 Hz.
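The two-band regression described above can be illustrated with a small Python sketch. It assumes a spectrum already expressed in dB and fits an ordinary least-squares line in each band; the function name, the choice to express the slope per kHz and the synthetic spectrum are assumptions made for the example, while the band edges (500 Hz, 5,000 Hz, 11,000 Hz) come from the text.

```python
import numpy as np

def spectral_slopes(freqs_hz, level_db, f_low=500.0, f_split=5000.0, f_high=11000.0):
    """Slopes of the regression lines below (m1) and above (m2) f_split.

    Frequencies are converted to kHz before fitting, so the slopes are in
    dB per kHz (an assumption about units made for this sketch).
    """
    f = np.asarray(freqs_hz, dtype=float)
    db = np.asarray(level_db, dtype=float)
    low = (f >= f_low) & (f <= f_split)
    high = (f >= f_split) & (f <= f_high)
    m1 = np.polyfit(f[low] / 1000.0, db[low], 1)[0]
    m2 = np.polyfit(f[high] / 1000.0, db[high], 1)[0]
    return float(m1), float(m2)

# Synthetic spectrum: rises by 2 dB per kHz up to 5 kHz, then falls by 4 dB per kHz.
freqs = np.arange(0.0, 11001.0, 50.0)
level = np.where(freqs <= 5000.0,
                 2.0 * freqs / 1000.0,
                 10.0 - 4.0 * (freqs - 5000.0) / 1000.0)

m1, m2 = spectral_slopes(freqs, level)
```

A positive m1 describes a spectrum that rises towards the split frequency, and a more negative m2 describes a steeper fall-off above it.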

Figure 3 Mean multitaper spectra for /t͡ɕ/ for each speaker (light grey) and the corresponding overlaid regression lines (black) used to calculate the spectral slopes m1 and m2.

Before outlining the spectral measurements in detail, we provide individual examples of fricative and affricate spectra (Figure 2) and mean spectra for each speaker (Figure 3). Figure 2 presents examples of individual fricative multitaper spectra (left panels) and affricate spectra (right panels) comparing a speaker of the new variant (upper panels) with a speaker of Standard Polish (lower panels). In Figure 3, we present mean multitaper spectra of /t͡ɕ/ over all items for each speaker (light colour) and the corresponding regression lines (dark colour) comparing Standard Polish and the new variant.

Regarding (4iv–4v), the vowel formant frequencies F1, F2, and F3 were measured at the onsets and offsets of both preceding and following vowels, i.e. at Points 1, 2, 5, and 6. The offset was defined as the end of the stable formant structure. The formants of the vowel segments were measured semi-automatically by means of Linear Predictive Coding (LPC). Prior to formant analysis (and only for the formant analysis) the audio signals were downsampled to 11,025 Hz (so that formant peak picking was restricted to the first five formants up to 5,500 Hz). The LPC spectra were calculated using the following parameters: pre-emphasis frequency 50 Hz; analysis window duration 0.0256 s; time step 0.001 s; prediction order 13. Five peaks from the LPC spectrum derived by peak picking were considered as formant candidates. Since in some cases a certain formant value could not be detected by the peak-picking algorithm, for each spectrum the three formant candidate values were visually inspected and manually corrected if necessary in order to determine the correct formant values.

If the formant range values obtained were positive, the formant of the vowel preceding the consonant was considered to be rising and the formant of the vowel following the consonant was considered to be falling. If the values were negative, the formant of the preceding vowel was falling and the formant of the following vowel was rising.
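The sign convention in this paragraph is consistent with a range computed as the formant value at the vowel edge adjacent to the consonant minus the value at the far edge of the vowel (Point 2 minus Point 1 for the preceding vowel, Point 5 minus Point 6 for the following vowel). The text does not spell the formula out, so this definition, like the function names and the example values below, is an assumption made purely for illustration.

```python
def formant_range(f_near_consonant_hz, f_far_from_consonant_hz):
    """Assumed definition: value at the consonant edge minus value at the far edge."""
    return f_near_consonant_hz - f_far_from_consonant_hz

def formant_movement(range_hz, vowel):
    """Translate the sign of the range into a movement over time.

    vowel is "preceding" (V1) or "following" (V2) relative to the sibilant.
    """
    if vowel not in ("preceding", "following"):
        raise ValueError("vowel must be 'preceding' or 'following'")
    if range_hz == 0:
        return "level"
    positive = range_hz > 0
    if vowel == "preceding":
        # Time runs towards the consonant, so a higher value at the edge is a rise.
        return "rising" if positive else "falling"
    # Time runs away from the consonant, so a higher value at the edge is a fall.
    return "falling" if positive else "rising"

# Hypothetical F1 values (Hz), not measurements from the study.
v1_range = formant_range(600.0, 680.0)   # Point 2 minus Point 1
v2_range = formant_range(700.0, 650.0)   # Point 5 minus Point 6
v1_label = formant_movement(v1_range, "preceding")
v2_label = formant_movement(v2_range, "following")
```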

Regarding (4vi), the duration of the frication was measured from Point 3 to Point 4 (see Figure 1).

2.2. Statistics

Statistical analyses were conducted in the R environment (R Development Core Team 2010). We used linear mixed effects models to analyse the influence of the fixed effects VARIANT [new variant, Standard Polish variant], STRESS [stressed, unstressed] and SPEECH STYLE [carrier sentence, text] as well as REPETITION [9] on the following dependent variables: the highest spectral peak of the complete spectrum (20–11,000 Hz); Centre of Gravity; Standard Deviation; Skewness; Kurtosis; Spectral slopes m1 and m2; F1, F2, F3 frequencies of the preceding and following vowel; F1, F2 and F3 frequency range of the preceding and following vowel; and duration. In addition, interactions between the fixed effects were included in the initial models. To account for variability among speakers and items and to minimise Type I error (Barr et al. 2013), random intercepts for participants and items as well as by-speaker slopes for STRESS, SPEECH STYLE and REPETITION were also included. Furthermore, random-effects terms showing very high correlations were excluded from the model. The residuals of the models were tested for their distribution; the analysis showed that they were normally or nearly normally distributed. By means of ANOVAs the maximal models were tested against less complex models, and the best-fitting model was taken as the final model. All p-values were based on the Satterthwaite approximation available in the package “lmerTest” (Kuznetsova et al. 2015), which provides different kinds of tests for linear mixed-effects models implemented in the “lme4” package (Bates et al. 2015). See also our note commenting on p-values in Appendix C. The lme models were run for /ɕ/ and /t͡ɕ/ separately.

The results presented in section 3 are based on 699 tokens (394 containing /ɕ/ and 305 containing /t͡ɕ/; 325 tokens were produced in the carrier sentence and 374 in the text). The multidimensional analysis presented in section 4.1 is based on 1942 tokens.

3. Results

In the following, we compare the Standard Polish alveolo-palatals /ɕ/ and /t͡ɕ/ and their new variants. The presentation of individual parameters follows the order given in (4). Note that the boxes presented in all figures correspond to the range between the 25th and 75th percentile; the black dot represents the median and the whiskers correspond to ±1.5 inter-quartile range; data above or below this range are outliers and are represented as points in the graph. Very few instances of extreme outliers have been removed.

Parameter (i): The highest spectral peak frequency in the frequency range from 20 Hz to 11,000 Hz.

Figure 4 shows the results of the spectral peak frequency of the burst found in the Standard Polish (SP) and new variant (NV) of /ɕ/ and /t͡ɕ/ in the spectral range from 20 to 11,000 Hz.

The results reveal that the spectral peak frequency is significantly higher in the new variant than in Standard Polish for both /ɕ/ and /t͡ɕ/ (/ɕ/: NV 5,490 Hz vs. SP 4,176 Hz, t=8.024, p<.001; /t͡ɕ/: NV 5,546 Hz vs. SP 4,164 Hz, t=6.706, p<.001).

Figure 4 Box plots for frequency of spectral peaks of /ɕ/ and /t͡ɕ/ as pronounced in Standard Polish and in the new variant.

Figure 5 Box plots for centre of gravity (top) and standard deviation (bottom) of /ɕ/ and /t͡ɕ/ as pronounced in Standard Polish and in the new variant.

Figure 6 Box plots for skewness (top) and kurtosis (bottom) of /ɕ/ and /t͡ɕ/ as pronounced in Standard Polish and in the new variant.

Parameter (ii): The spectral moments COG, STD, skewness and kurtosis.

Figure 5 presents the results obtained for measures of centre of gravity and standard deviation.

The centre of gravity values of /ɕ/ and /t͡ɕ/ were significantly higher in the new variant than in Standard Polish (/ɕ/: NV 5,367 Hz vs. SP 4,375 Hz, t=5.327, p<.001; /t͡ɕ/: NV 5,205 Hz vs. SP 4,468 Hz, t=3.894, p<.01). In addition, as far as /ɕ/ is concerned, COG values were lower in words embedded in carrier sentences than in words embedded in the text (t=−4.733; p<.001). Standard deviation did not differ significantly between the new variant and Standard Polish for either fricatives or affricates.

Figure 6 presents the results of the two other spectral moments, i.e., skewness and kurtosis.

Skewness was significantly different in the Standard Polish alveolo-palatal sibilants in comparison to their new variants. The new variants of both /ɕ/ and /t͡ɕ/ displayed lower skewness values than their Standard Polish counterparts (/ɕ/: NV −0.079 vs. SP 0.338, t=−2.050, p=.06 (at the level of statistical tendency); /t͡ɕ/: NV −0.317 vs. SP 0.179, t=−2.822, p<.05). Thus, the spectra of the Standard Polish /ɕ/ and /t͡ɕ/ are right-skewed, in contrast to the spectra of the new variants, which are skewed to the left. Kurtosis was not significantly different in SP and the NV.

Figure 7 Box plots for the spectral slope measures m1 (top) and m2 (bottom) of /ɕ/ and /t͡ɕ/ as produced in Standard Polish and in the new variant.

Parameter (iii): The spectral slopes m1 and m2.

Figure 7 shows the results for the two spectral slopes m1 and m2 for both /ɕ/ and /t͡ɕ/ in Standard Polish and in the new variant.

The two measures m1 and m2 of the regression lines of frication spectra are very useful in differentiating the Standard Polish and new variants of /ɕ/ and /t͡ɕ/. The first regression line slope (m1) of /t͡ɕ/ was significantly lower in the new variant than in Standard Polish (/t͡ɕ/: NV 1.31 dB/kHz2 vs. SP 2.51 dB/kHz2, t=−2.348, p<.01). The second regression line slope (m2) of both sibilants shows significantly lower values in the new variant when compared to the Standard Polish variant (/ɕ/: NV −4.76 dB/kHz2 vs. SP −4.4 dB/kHz2, t=−3.738, p<.01; /t͡ɕ/: NV −4.91 dB/kHz2 vs. SP −4.6 dB/kHz2, t=−4.623, p<.001). In sum, for the phoneme /t͡ɕ/ the spectral slope was significantly lower for the new variant as compared to Standard Polish for the frequency range from 500 Hz to 5,000 Hz (m1). For the higher frequency range from 5,000 Hz to 11,000 Hz, both /ɕ/ and /t͡ɕ/ show a significantly steeper spectral decrease for the new variant as compared to Standard Polish;[10] cf. also Figure 3, where the mean multitaper spectra of /t͡ɕ/ in Standard Polish and the new variant are shown. As can be seen, all mean spectra clearly rise more steeply in the lower frequency range (up to 5,000 Hz) for Standard Polish as compared to the new variant.

Figure 8 Frequency of F1 and F2 of the vowel preceding /ɕ/ and /t͡ɕ/ in Standard Polish and in the new variant.

Figure 9 Frequency of F3 of the vowel preceding and following /ɕ/ and /t͡ɕ/ in Standard Polish and in the new variant.

For the higher frequency range (5,000 Hz to 11,000 Hz) there is a statistically significant difference between the two variants, namely, the slopes of the regression lines of the new variants are steeper.

Parameter (iv): The formants F1, F2, and F3 of the vowels preceding and following the consonant.

Figure 8 shows the results for the first and second formant at the end of the vowel preceding /ɕ/ and /t͡ɕ/ in the Standard Polish and new variants.

As far as the frequency of F1 of the preceding vowel is concerned, the results point to a lower F1 in the new variant of /t͡ɕ/ than in its Standard Polish counterpart (NV 594 Hz vs. SP 665 Hz, t=−2.730, p<.05). In addition, REPETITION also had a significant effect (/ɕ/: t=−2.006, p<.05; /t͡ɕ/: t=−2.204, p<.05). Regarding the frequency of F2 of the preceding vowel, no significant differences were found between the new variants and the Standard Polish sibilants.

Furthermore, F3 of the vowel preceding /ɕ/ differed significantly between the two variants (NV 2,889 Hz vs. SP 2,728 Hz, t=3.777, p<.01).

Figure 10 F1 and F2 formant frequency range of the preceding vowel in Standard Polish and in the new variant.


Figure 9 shows the results of F3 for both the preceding and the following vowel.

The results also showed that the F3 frequency of the following vowel was significantly higher in the new variants in comparison to the Standard Polish variants. This conclusion applies to both the fricative /ɕ/ and the affricate /t͡ɕ/ (/ɕ/: NV 2,960 Hz vs. SP 2,831 Hz, t=2.763, p<.05; /t͡ɕ/: NV 2,939 Hz vs. SP 2,810 Hz, t=2.567, p<.05). In the case of /ɕ/, SPEECH STYLE also showed a significant effect (t=−3.007, p<.01). In the case of /t͡ɕ/, the influence of REPETITION differed significantly between the Standard Polish and the new variant (t=−2.325, p<.05).

By contrast, neither F1 nor F2 of the following vowel showed significant differences between Standard Polish and the new variant.

Parameter (v): Formant frequency range of F1, F2, and F3 of the vowels preceding and following the consonant.

Figure 10 presents results of F1 and F2 formant frequency range of the preceding vowel.

The results show that only the F1 frequency range of the vowel [a] preceding the sibilants /ɕ/ and /t͡ɕ/ was significantly different in the Standard Polish and new variants; the F1 frequency range was lower in the new variant (/ɕ/: NV −87 Hz vs. SP −33 Hz, t=−2.086, p<.05; /t͡ɕ/: NV −121 Hz vs. SP −67 Hz, t=−2.086, p=.055 (at the level of statistical tendency)). No significant difference between the two variants was found for the F2 and F3 frequency range of the preceding vowel. Similarly, the F1, F2 and F3 frequency ranges of the following vowel were not significantly different in Standard Polish and the new variant.

Parameter (vi): Frication duration

No significant differences were found in the duration of frication of both the fricative and the affricate: while Standard Polish /ɕ/ has an average duration of 0.098 s, the average value of the new variant was 0.101 s. The fricative part of the affricate was 0.061 s in Standard Polish and 0.058 s in the new variant.

In all the analyses presented above the influence of SPEECH STYLE and STRESS was significant only in a few models. The results regarding speech style are consistent to some extent with the study of Maniwa et al. (2009), which showed that English fricatives exhibit different acoustic properties in a conversational vs. a clear style. REPETITION was more often significant, albeit exclusively in formants and duration, and more frequently in affricates than in fricatives; see the results of the statistical analyses in Appendix C.

3.1. Linear discriminant analysis

The analysis presented above points to significant differences in several parameters between the new and Standard Polish sibilants. The purpose of the following linear discriminant analysis (LDA) is to show which of the investigated variables are the most useful in discriminating the new variant from the Standard Polish variant and what percentage of the data can be correctly classified by using this statistical method. We used the LDA with jackknifed (i.e., leave-one-out) predictions. The LDA has previously been used successfully for the discrimination of English fricatives (Jongman et al. 2000; McMurray and Jongman 2011). In line with the analysis presented above, we will keep fricatives and affricates in separate groups.
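To make the jackknifed procedure concrete, the following Python sketch fits a two-class linear discriminant with a pooled covariance matrix and evaluates it by leaving out one token at a time. It is a generic illustration rather than the analysis reported here: the function names are hypothetical, priors are assumed to be proportional to class size, and the two-dimensional data are randomly generated and have no connection to the measurements of this study.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class LDA: returns the weight vector and decision threshold."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance matrix.
    s0 = (X0 - m0).T @ (X0 - m0)
    s1 = (X1 - m1).T @ (X1 - m1)
    pooled = (s0 + s1) / (len(X) - 2)
    w = np.linalg.solve(pooled, m1 - m0)
    # Threshold at the midpoint of the class means, shifted by the log prior ratio.
    threshold = w @ (m0 + m1) / 2.0 - np.log(len(X1) / len(X0))
    return w, threshold

def predict_lda(model, X):
    w, threshold = model
    return (X @ w > threshold).astype(int)

def jackknife_accuracy(X, y):
    """Leave-one-out: refit without token i, then classify token i."""
    hits = 0
    for i in range(len(X)):
        keep = np.arange(len(X)) != i
        model = fit_lda(X[keep], y[keep])
        hits += int(predict_lda(model, X[i:i + 1])[0] == y[i])
    return hits / len(X)

# Randomly generated two-feature data for two classes (purely synthetic).
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(40, 2))
class1 = rng.normal(loc=[4.0, 1.0], scale=1.0, size=(40, 2))
X = np.vstack([class0, class1])
y = np.array([0] * 40 + [1] * 40)

accuracy = jackknife_accuracy(X, y)
```

Because each token is classified by a model that never saw it, the jackknifed accuracy is a less optimistic estimate than the accuracy obtained by classifying the training data themselves.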

All the acoustic parameters presented in (4) were included as predictors in the LDA regardless of the fact that some of them were not significant in the linear regression models. The LDA outputs a single function that best discriminates both variants of fricatives (Standard Polish vs. new variant). Since the spectral peak and COG strongly correlated with each other (0.79), the former was excluded from the model. In total, 19 predictors were fitted in the LDA, 9 of which did not significantly contribute to the discrimination of the Standard Polish versus the new variant. Hence, the remaining 10 parameters were fitted into the final model. The model was highly significant (Wilks’ Lambda = 0.5251, χ2 = 197.103, df = 10, p<.0001), confirming that the parameters are able to successfully discriminate the two types of fricative sibilants. In total 83.3% of the data were correctly classified. In particular, the model correctly classified 89.3% of the tokens with the Standard Polish variant and 76.3% with the new variant. The correlation ratio of each parameter with the discriminant function, F-statistics and p-values are presented in Table 1. The results show that the discriminator correlates most strongly with COG as well as the third formant of the preceding and following vowel.

Table 1

Results of linear discriminant analysis for fricatives.

Parameters             Correlation ratio   F-statistic   p-value
COG                    0.342               163.230       p<.0001
Standard Deviation     0.022               7.300         p<.01
Skewness               0.046               15.303        p<.001
Kurtosis               0.038               12.336        p<.001
m1                     0.062               20.574        p<.0001
Formant_V1_End_F2      0.045               14.904        p<.001
Formant_V1_End_F3      0.138               49.954        p<.0001
Formant_V2_Begin_F3    0.130               46.609        p<.0001
Formant_Range_V1_F1    0.016               5.078         p<.05
Duration               0.015               5.048         p<.05

As far as affricates are concerned, 13 out of 20 parameters turned out to significantly discriminate the Standard Polish from the new variant. Again, the frequency peak strongly correlated with the COG (0.75) and was therefore excluded from further consideration. The final model fitted with 12 parameters was highly significant (Wilks’ Lambda = 0.4236, χ2 = 101.354, df = 12, p<.0001). The correlation values for the variables together with F-statistics and p-values are presented in Table 2. The model correctly classified 91.2% of the data: 94% of the Standard Polish affricates and 88.3% of the new affricates were correctly classified. The discriminant correlated most strongly with COG, as in the case of the fricatives, but a high correlation ratio is also found with Skewness and the first formant, including the formant range of the preceding vowel.

Table 2

Parameters of linear discriminant analysis for affricates.

Parameters             Correlation ratio   F-statistic   p-value
COG                    0.216               34.303        p<.0001
Standard Deviation     0.064               8.539         p<.01
Skewness               0.157               23.121        p<.0001
m1                     0.071               9.579         p<.01
m2                     0.043               5.625         p<.05
Formant_V1_End_F1      0.162               23.982        p<.0001
Formant_V1_End_F2      0.101               13.984        p<.001
Formant_V1_End_F3      0.057               7.619         p<.01
Formant_V2_Begin_F2    0.032               4.112         p<.05
Formant_V2_Begin_F3    0.031               0.968         p<.05
Formant_Range_V1_F1    0.106               0.893         p<.001
Formant_Range_V2_F2    0.043               0.956         p<.05

The results of the linear discriminant analyses are presented in Figure 11. The left panels show the linear discriminant for Standard Polish /ɕ/ versus its corresponding new variant and the right panels show the linear discriminant for Standard Polish /ʨ/ versus its corresponding new variant.

In summary, the results point to significant spectral differences between Standard Polish sibilants and their new variants. The analysis shows that the parameters used in the LDA correctly classify 83.3% of the data in the case of the fricatives and 91.2% of the data in the case of the affricates.

4. Finding motivation for the change

Having established that the new variant is acoustically significantly different from the standard variant, we now attempt to shed some light on the possible causes of this change. We advance three hypotheses referring to (i) contrast optimisation, (ii) sociolinguistics and (iii) a speech disorder.

Figure 11 Histograms for the observations of Standard Polish and new variants of fricatives (left panels) and affricates (right panels) plotted according to their linear discrimination function values.

4.1. Functional hypothesis – contrast optimisation

A large literature in phonology has focused on identifying mechanisms promoting contrast maintenance in linguistic systems (Gillieron 1918; Martinet 1955). It has been demonstrated that contrasts in phonological inventories tend to be evenly dispersed. This cross-linguistic tendency has been variably attributed to active constraints within the grammar that refer directly to properties of the relation of contrast (e.g. Flemming 2004, Padgett 2001, Padgett and Żygis 2007) or to a production-perception feedback loop within self-organising systems (Pierrehumbert 2001; Wedel 2007).

Regarding our study, the question arises whether and how the new variant has changed the relations among the sibilants. In order to answer this question, we provide more insight into the system containing the new variant and compare it to the sibilant system of Standard Polish. For this purpose we chose multidimensional scaling, which illustrates the distances among objects, i.e. the sibilant phonemes in our case. Usually, this exploratory multivariate technique is used to capture perceptual or cognitive properties of stimulus objects in a low-dimensional, extended Euclidean space (Kruskal and Wish 1978). In the present study the distances are expressed by means of acoustic parameters. Based on all 20 acoustic parameters presented in (4), which were calculated for /s ʂ ɕ/ and the corresponding affricates, we obtained two two-dimensional systems: one for speakers producing Standard Polish sibilants and the other for speakers producing the new variant. Such a representation gives us more insight into the differences between the systems.
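
The logic of comparing two three-member systems by their pairwise distances can be sketched as follows. This is not the multidimensional scaling procedure used in the study: for exactly three objects, a two-dimensional configuration that reproduces all pairwise distances can be built by simple triangulation, whereas general multidimensional scaling requires an eigendecomposition or iterative stress minimisation. The centroid vectors below are invented, standardised values in three dimensions, whereas the study used 20 acoustic parameters.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length parameter vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def triangle_layout(d_ab, d_ac, d_bc):
    """Place three objects in 2-D so that all pairwise distances are preserved."""
    a = (0.0, 0.0)
    b = (d_ab, 0.0)
    # The law of cosines gives the x-coordinate of C; Pythagoras gives y.
    cx = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    cy = math.sqrt(max(d_ac ** 2 - cx ** 2, 0.0))
    return a, b, (cx, cy)

def layout_system(centroids):
    """centroids: dict mapping exactly three labels to parameter vectors."""
    (la, va), (lb, vb), (lc, vc) = centroids.items()
    d_ab, d_ac, d_bc = euclidean(va, vb), euclidean(va, vc), euclidean(vb, vc)
    a, b, c = triangle_layout(d_ab, d_ac, d_bc)
    coords = {la: a, lb: b, lc: c}
    dists = {(la, lb): d_ab, (la, lc): d_ac, (lb, lc): d_bc}
    return coords, dists

# Hypothetical standardised category centroids; NOT data from the study.
standard_system = {"s": (1.2, -0.5, 0.3), "ʂ": (-1.0, 0.8, -0.2), "ɕ": (0.1, 0.2, 0.9)}
new_system = {"s": (1.2, -0.5, 0.3), "ʂ": (-1.0, 0.8, -0.2), "sʲ": (0.9, 1.6, 1.8)}

std_coords, std_dists = layout_system(standard_system)
new_coords, new_dists = layout_system(new_system)
for pair, dist in {**std_dists, **new_dists}.items():
    print(pair, round(dist, 3))
```

In this toy example the third category lies farther from both /s/ and /ʂ/ in the second system, which is the kind of pattern a larger triangle in a two-dimensional plot would express.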

The results are shown in Figure 12. The larger triangle stands for the sibilant system inferred from speakers who produced the new alveolo-palatal fricative and the blue triangle for the speakers with Standard Polish fricatives. The subscript [n] denotes the new system.

Figure 12 Multidimensional scaling of Polish fricatives.

The results presented in Figure 12 reveal that (i) the distance between the palatalised /sʲ/ and the retroflex /ʂ/ in the new system is greater than the distance between /ɕ/ and /ʂ/ in the system of Standard Polish and (ii) the distance between the palatalised /sʲ/ and /s/ in the new system is greater than the distance between /ɕ/ and /s/ in Standard Polish. These results strongly suggest that, due to the innovative realisation of alveolo-palatals, the sibilant system of Standard Polish is becoming more optimal in terms of acoustic and presumably perceptual distances. Exactly the same conclusion can be drawn with respect to affricates (see Figure 13). In addition, it seems that the distance between the dento-alveolar affricate segments in the new and Standard Polish systems is greater than the distance between the corresponding /s/ segments. This in turn suggests that the dento-alveolar affricates may also be undergoing changes, even if the differences are not readily perceptible to language users.[11]

Figure 13 Multidimensional scaling of Polish affricates.

4.2. Sociolinguistic hypothesis

This section begins with evidence indicating that sociophonetic variation is a function of age, gender and other social factors. This finding becomes relevant in the discussion of the sociolinguistic aspects of the change in the Polish sibilant system. It is argued that the innovative alveolo-palatals have acoustic cues that evoke the image of childishness and that these properties have a certain appeal for some young women. The new variant becomes an identity marker and diffuses through a community, differentiating social groups.

The fact that speech is inherently variable has been known for a long time. However, it was not until the second half of the twentieth century that speech variability became the subject of systematic studies. Speech is variable due to articulatory constraints and the natural laws of aerodynamics and acoustics operating within the vocal tract (Ohala 1983). However, not all variation is explainable in terms of purely phonetic considerations. Labov (1963), in a study conducted among the inhabitants of Martha’s Vineyard, set the stage for variationist studies when he observed that the choice among linguistic variants is neither random nor biologically determined but depends on social factors. Systematic variation has been studied extensively in relation to such social factors as age, gender, social class, ethnicity, group affiliations and geographical origin.

Age is an important determinant of sociophonetic variation. In a study based in Milton Keynes, a town that experienced high rates of in-migration, Kerswill and Williams (2000) found that the extent to which children of in-migrants adopted features of the local dialect differed according to age: 4-year-olds showed a considerable number of features of their parents’ dialects, the dialects of 8-year-olds were more homogeneous as a group and 12-year-olds showed almost no traces of their parents’ dialects. A likely explanation for these differences relates to the different ways of socialisation and the source of input that children receive at these particular ages. Four-year-olds are cared for mostly in their family homes and receive most input from their parents. With increasing age, children connect more with their peers and this becomes their major source of linguistic input. Adolescence has been identified as the age when children are under the strongest influence from their peers, which shows up in their linguistic output. This is unlikely to be a coincidence because adolescence is the time when a person initiates the process of constructing a social identity vis-a-vis the peer group (Eckert 2000).

Foulkes et al. (2005) looked at pre-aspiration and found that its rates in children differed not only as a function of age but also as a function of gender. Boys and girls at the age of two years showed no differences in the usage of pre-aspiration. The differences between the production of boys and girls began to be clearly discernible at the age of three years and six months. Foulkes and Docherty (2006: 424) concluded that children’s speech production may “show signs of recognising the social indexicality of linguistic forms, although it may take some time for this recognition to develop and be reflected in speech output”.

Recent years have witnessed a surge of interest in sociophonetic studies, i.e. studies that focus on identifying phonetic variants that convey social categories or speaker attributes. Naslund (1993) looked at the male-female production of /s/ in American English and found that women tended to use a more fronted, slit variant of /s/, while men tended to use a more alveolar, grooved variant. These gender-related differences in the production of /s/ were already noticeable at the age of 8 in the great majority of the boys and girls that Naslund studied. As the anatomy of the vocal tract of prepubescent boys and girls is similar, the reported differences in the production of /s/ are very likely to have the function of coding membership in a particular social group (i.e. social-indexing). This study shows that gender-related phonetic variants are acquired very early. In a similar vein, the results of experiments reported in Fuchs and Toda (2010) indicate that the sex differences in the production of fricatives among adults result from active articulatory manipulations, not simply anatomic differences. Stuart-Smith (2007) investigated the production of /s/ in Glaswegian English in relation to sex, age and social class. She reported that younger, working-class girls produced a more retracted variant of /s/ than younger, middle-class girls or middle-aged women of both social classes. It is unlikely that the retracted variant of /s/ is an instance of misarticulation, as it would be difficult to explain why its occurrence is limited to a particular social class. It is far more likely that the retracted variant of /s/ reflects talkers’ tacit or overt social-indexing. Similarly, Foulkes and Docherty (2000) showed that the usage of labiodental variants of /r/ in varieties of English spoken in the United Kingdom shows traits of social-indexing, in spite of superficial similarities to variants used by children.

In sum, these and many other studies suggest that age, gender, social class and many other social categories and speaker attributes are important determinants of phonetic variation. In addition, it is evident that phonetic variation is not reducible to anatomic differences. It will be argued that both age and gender play a role in the social distribution of the new variants of alveolo-palatals.

We now turn to evidence indicating that certain acoustic properties of sounds have an iconic function and connote “childishness”. Ohala (1994) hypothesises that the concentration of energy in higher frequency regions that characterises certain palatal(ised) consonants and certain front vowels (as well as high pitch) is universally associated with smallness.[12] This has to do with the fact that smaller and/or younger individuals have smaller vocal tracts and emit sounds with higher pitch. As a result, in animal communication, energy concentration in higher frequency regions (including high pitch) universally signals subordinacy, politeness and non-threatening attitude (Ohala 1994). A similar correlation has been identified in language. Kochetov and Alderete (2011) use the term “expressive palatalisation” (EP) to refer to this iconic relation between certain palatalised consonants and the meaning of “smallness”, “childishness” and “affection”. EP is a relatively common property of sound symbolism, diminutive morphology, hypocoristics and “babytalk”, i.e. conventionalised adults’ speech directed to small children (Kochetov and Alderete 2011). The data below provide specific examples of EP in Japanese sound symbolism and babytalk (Kochetov and Alderete 2011: 346) and in Polish hypocoristics.

  1. (5a)

    Japanese sound symbolism (mimetics):

    [ʧ̲oko-ʧ̲oko] vs. [t̲oko-t̲oko] ‘moving like a small child’ vs. ‘trotting’

    [kaʧ̲a-kaʧ̲a] vs. [kat̲a-kat̲a] ‘the sound of keys hitting against each other’ vs. ‘the sound of a hard object hitting the hard surface’

    [pʲ̲oko-pʲ̲oko] vs. [p̲oko-p̲oko] ‘hopping around in a childish bobbing motion’ vs. ‘making holes here and there’

  2. (5b)

    Japanese babytalk:

    [os̲arus̲an] > [oʧ̲aruʧ̲an] ‘monkey (honorific)’

    [kuʦ̲u] > [kuʧ̲u] ‘shoe’

    [tabemas̲uka] > [tabemaʧ̲uka] ‘Will you eat?’

    [omiz̲u nominasai] > [omiʤ̲u nominaʧ̲ai] ‘Drink your water!’

  3. (5c)

    Polish hypocoristics:

    [magdalɛna] > [magduɕ̲a], [maʥ̲a] ‘Magdalena’, proper name

    [marta] > [martuɕ̲a], [marʨ̲a] ‘Marta’, proper name

    [mixaw] > [mixaɕ̲] ‘Michał’, proper name

    [darjuʂ] > [daruɕ̲] ‘Dariusz’, proper name

Palatalisation of certain consonants modifies the meaning of the forms in (5) by supplying the additional feature of “smallness”, “childishness” and “affection”. Kochetov and Alderete (2011: 368) note that sound substitutions involving palatalised sibilants in the babytalk register are found in languages as diverse as Spanish, Thai, Estonian, Cree and Kannada. The appearance of alveolo-palatals in the Polish hypocoristics in (5c) can be interpreted as sound symbolic, connoting the meaning of “smallness” and “affection”. It is interesting to note that while alveolo-palatals are commonly used for EP, the corresponding retroflexes do not appear in this function, cf. especially the last item in (5c), where a retroflex fricative is replaced with an alveolo-palatal fricative in the hypocoristic form. This indicates that retroflexes, although historically originating from palatalisation (Długosz-Kurczabowa and Dubisz 2006), are no longer perceived and produced as phonetically palatalised, in contrast to alveolo-palatals.

Having shown that Polish alveolo-palatals take part in sound-meaning associations connoting smallness and affection, we proceed to provide motivation for the new developments. We use the results of our acoustic analysis and propose that the innovation involving alveolo-palatals, /ɕ ʨ/ > /sʲ tsʲ/, enhances the iconicity of these sounds by shifting energy weight into higher frequency regions, while at the same time maintaining the phonetic cues of palatalisation.[13] As a result, this effect may be treated as a kind of exaggerated palatalisation or “over-palatalisation”. It is significant that this innovation has been introduced by young women and is virtually absent from the speech of young men of the same age. This is in line with Ohala’s (1994) observation that males in the animal kingdom tend to strive for lower-pitched voices (with energy concentration in lower frequency regions), as this is associated with the image of physical largeness and authority. Energy weight in higher frequency regions, on the other hand, is universally linked with smallness because smaller larynxes and vocal tracts generate higher-pitched voices. Higher pitch is also characteristic of young individuals. It is hypothesised that a possible motivation behind the emergence of the new variant among young women has to do with transmitting the image of youth. Such “over-palatalisation” effects often evoke an impression of the speaker’s childishness.[14] A similar linguistic trait has been identified in the Songyuan dialect of Chinese, where young women are reported to replace /ɕ/ with /sʲ/ in the standard lexical contrast /s ʂ ɕ/ under the influence of the Beijinghua “feminine accent” (Li 2005; Beckman 2012). This situation closely parallels the discussed developments in Polish. However, it is not necessary to assume that the young women who adopt these phonetic variants “want to” sound childish or youthful. In another scenario, the choice is motivated by peer pressure and is not teleological. The young women simply imitate their peers and the desire to sound childish does not constitute the prime motivation for the change. It is possible that the stereotypes and stigmatisations are later pushed onto the new variants of sibilants by older speakers. Non-teleological explanations along these lines have been suggested for the use of creaky voice by American women (Yuasa 2010).

The process of change invariably involves its diffusion through a community. Manly (1930), Weinreich et al. (1968) and more recently Aitchison (2001) have argued that language change is most commonly observed among adolescents. As mentioned at the beginning of this section, this is rooted in social behaviour. Young children are quite alert to and pick up on the sound changes present in the speech of their caregivers. That is, adults (usually women) are the primary models for imitation by young children (Aitchison 2001: 210). This changes when children go to school. “At the preadolescent stage, we find the beginnings of a move from parent-oriented to peer-oriented networks” (Kerswill 1996: 196). When children reach adolescence, their social networks are fully developed and their speech comes to resemble the speech of their peers. The role of their parents’ speech is consequently diminished. It is at this age that linguistic innovations initiated by some influential adolescents are the most likely to spread to their peers. Such linguistic traits become identity markers setting off a particular group from other, especially older speakers. When adolescents grow up, their speech usually becomes more in line with the speech of older speakers (Aitchison 2001: 210).

Thus, it seems no coincidence that the newly introduced variants of alveolo-palatal sibilants, /sʲ tsʲ/, are employed exclusively by young individuals, as the peer-oriented social pressures among members of this age group are the strongest. Currently, we are witnessing an initial stage of a linguistic change. Whether the new variant catches on and is adopted by a wider community depends on the strength of social networks among adolescents and its perception as an identity marker within this group. It may also happen that the new variant, being age-graded, will disappear as soon as its users enter adulthood.[15]

4.3. Speech disorder

Another hypothesis regarding the appearance of the new variant relates the realisation of the Standard Polish /ɕ/ as /sʲ/ to anatomy and, more specifically, to a higher palate and the impossibility of reaching the palatal place of articulation by the tongue.[16] While we do not exclude this possibility in terms of initiating the sound change, we strongly believe that due to (i) a frequent appearance of this pronunciation type, (ii) the fact that it is employed by young female speakers, (iii) similarities to other languages and (iv) the fact that some speakers can suspend this pronunciation in more formal registers, the spread of this change is more likely to be rooted in sociolinguistics. Moreover, if the change was due to a speech impairment, it would be difficult to explain its rapid spread in recent times, as opposed to several centuries ago.[17]

In a possible scenario, the change is initiated by a group of influential speakers with a higher palate. However, the subsequent spread of the change must be sociolinguistically conditioned, otherwise it would be difficult to explain the fact that its occurrence is limited to young women. Similarly, while the first hypothesis, contrast optimisation, is useful in providing the initial motivation for the change, it is still necessary to explain the social distribution of the innovative sibilants. In short, sociolinguistic factors play the central role in the proposed account of the change and its propagation.

5. Summary and conclusion

The results of the acoustic experiment show that the new sounds are produced with a significantly higher spectral peak in the range from 20 Hz to 11,000 Hz than the alveolo-palatal sibilants of Standard Polish. While the COG values were higher for the new variants than for their Standard Polish counterparts, the new variants of both /ɕ/ and /ʨ/ displayed significantly lower skewness values than the corresponding Standard Polish segments. The spectra of the Standard Polish /ɕ/ and /ʨ/ were right-skewed, in contrast to the spectra of the new variants, which were skewed to the left.
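
The relation between COG, skewness and the direction of skew can be illustrated with a small sketch that treats a power spectrum as a probability distribution over frequency. The spectra below are invented toy values, and the exact computation used in the study (spectral estimation method, frequency range, normalisation) may differ. Reporting kurtosis as excess kurtosis (minus 3) is likewise an assumption made here.

```python
def spectral_moments(freqs_hz, power):
    """Return (COG, standard deviation, skewness, excess kurtosis)."""
    total = sum(power)
    weights = [p / total for p in power]  # normalise to sum to 1
    cog = sum(f * w for f, w in zip(freqs_hz, weights))
    variance = sum((f - cog) ** 2 * w for f, w in zip(freqs_hz, weights))
    sd = variance ** 0.5
    skewness = sum((f - cog) ** 3 * w for f, w in zip(freqs_hz, weights)) / sd ** 3
    kurtosis = sum((f - cog) ** 4 * w for f, w in zip(freqs_hz, weights)) / sd ** 4 - 3
    return cog, sd, skewness, kurtosis

# Hypothetical toy spectra (arbitrary power units); NOT data from the study.
freqs = [2000, 4000, 6000, 8000, 10000]
low_peak = [1, 5, 3, 2, 1]   # energy concentrated low, tail towards high frequencies
high_peak = [1, 2, 3, 5, 1]  # mirror image: energy concentrated high

cog_low, _, skew_low, _ = spectral_moments(freqs, low_peak)
cog_high, _, skew_high, _ = spectral_moments(freqs, high_peak)
print(cog_low, round(skew_low, 3))
print(cog_high, round(skew_high, 3))
```

In this toy case the spectrum with energy concentrated at higher frequencies has both a higher COG and a negative (left) skew, mirroring the direction of the differences summarised above.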

As for spectral slopes, the first slope (m1) of /ʨ/ was significantly lower in the new variant than in Standard Polish, i.e., for /ʨ/ there was a steeper rise in the Standard Polish spectra than in the spectra of the new variant in the frequency region from 500 Hz to 5,000 Hz. The second spectral slope (m2) of both sibilants showed significantly lower values in the new variant than in the Standard Polish variant, i.e., the spectra of the new variant fell more steeply in the higher frequency region above 5,000 Hz than the spectra of the Standard Polish speakers.
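
A spectral slope of this kind can be thought of as the slope of a regression line fitted to the spectrum (in dB) over a frequency band. The sketch below shows an ordinary least-squares fit in plain Python. The band limits of 500 Hz and 5,000 Hz follow the text, while the upper limit of 11,000 Hz for the second band and the toy spectrum itself are assumptions made only for illustration. The study's own fitting procedure may differ.

```python
def band_slope(freqs_hz, level_db, lo_hz, hi_hz):
    """Least-squares slope (dB per Hz) of the spectrum within [lo_hz, hi_hz]."""
    pts = [(f, l) for f, l in zip(freqs_hz, level_db) if lo_hz <= f <= hi_hz]
    n = len(pts)
    mean_f = sum(f for f, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    num = sum((f - mean_f) * (l - mean_l) for f, l in pts)
    den = sum((f - mean_f) ** 2 for f, _ in pts)
    return num / den

# Hypothetical toy spectrum: rises up to 5 kHz, then falls. NOT study data.
freqs = list(range(500, 11001, 500))
levels = [20 + 0.004 * (f - 500) if f <= 5000 else 38 - 0.003 * (f - 5000)
          for f in freqs]

m1 = band_slope(freqs, levels, 500, 5000)     # rising part of the spectrum
m2 = band_slope(freqs, levels, 5000, 11000)   # falling part of the spectrum
print(m1, m2)
```

A lower m1 corresponds to a shallower rise below 5,000 Hz, and a lower (more negative) m2 to a steeper fall above it.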

Furthermore, the F1 frequency of the preceding vowel differed in the new variants in comparison to the Standard Polish variants, whereas the F3 frequency of both the preceding vowel (only in the case of /ɕ/) and the following vowel was significantly higher in the new variants than in the Standard Polish variants. Finally, only the F1 frequency range of the vowel [a] preceding the sibilants /ɕ/ and /ʨ/ differed significantly between the Standard Polish and the new variants: the F1 frequency range was significantly lower in the new variants.

The linear discriminant analysis showed that both spectral parameters and formant values sufficiently discriminate the Standard Polish and new variant. The discriminant correctly classified 89.3% of the Standard Polish vs 76.3% of the new fricatives and 94% of the Standard Polish vs 88.3% of the new affricates.

We have attempted to provide the rationale for the development of the new variants of sibilants. The first explanation invokes contrast optimisation. We used multidimensional scaling to show that the innovative system of sibilants is more dispersed in terms of contrast and such systems tend to be preferred in the world’s languages. The second explanation assumes that energy concentration in higher frequency regions enhances the cues of iconic expressive palatalisation and is used to connote the meaning of “youth” and “childishness”. This innovation then spreads among adolescent girls and young women and becomes an identity marker. It is propagated among adolescent girls and young women because peer-oriented social pressures are the strongest among members of these age groups. It remains to be seen whether this linguistic trait will establish its role as a marker of a Polish “feminine accent” in the future. The third explanation relates the change to a speech disorder. While all three hypotheses can be used to explain the emergence and propagation of the change, the speech disorder hypothesis appears to be the least likely. Of the remaining two hypotheses, the sociolinguistic one is the most plausible as it manages to explain why the change is restricted to a particular social group, i.e. young women.


Institute of English Studies, University of Warsaw, Nowy Świat 4, 00-497 Warszawa, Poland

References

Aitchison, J. 2001. Language change: Progress or decay? (3rd edition.) Cambridge: Cambridge University Press.

Barr, D.J., R. Levy, Ch. Scheepers and H.J. Tily. 2013. “Random effects structure for confirmatory hypothesis testing: Keep it maximal”. Journal of Memory and Language 68. 255–278.

Bates, D., M. Maechler, B. Bolker, S. Walker, R.H.B. Christensen, H. Singmann, B. Dai and G. Grothendieck. 2015. lme4. R package version 1.1-10.

Boersma, P. and D. Weenink. 2011. “Praat: Doing phonetics by computer” [Computer program]. Version 5.1.05 and Version 5.2.45. Retrieved 28 Sep 2011 from .

Beckman, M. 2012. “Aligning the timelines of phonological acquisition and change”. Paper given at the 2nd Workshop on Sound Change. Munich.

Długosz-Kurczabowa, K. and S. Dubisz. 2006. Gramatyka historyczna języka polskiego. Warszawa: Wydawnictwa Uniwersytetu Warszawskiego.

Eckert, P. 2000. Linguistic variation as social practice. Oxford: Blackwell.

Evers, V., H. Reetz and A. Lahiri. 1998. “Crosslinguistic acoustic categorization of sibilants independent of phonological status”. Journal of Phonetics 26. 345–370.

Flemming, E. 2004. “Contrast and perceptual distinctiveness”. In: Hayes, B., R. Kirchner and D. Steriade (eds.), Phonetically-based phonology. Cambridge: Cambridge University Press. 232–276.

Forrest, K., G. Weismer, P. Milenkovic and R.N. Dougall. 1988. “Statistical analysis of word-initial voiceless obstruents: Preliminary data”. Journal of the Acoustical Society of America 84. 115–124.

Foulkes, P. and G. Docherty. 2000. “Another chapter in the story of /r/: ‘Labiodental’ variants in British English”. Journal of Sociolinguistics 4. 30–59.

Foulkes, P. and G. Docherty. 2006. “The social life of phonetics and phonology”. Journal of Phonetics 34. 409–438.

Foulkes, P., G. Docherty and D.J.L. Watt. 2005. “Phonological variation in child directed speech”. Language 81. 177–206.

Fuchs, S. and M. Toda. 2010. “Do differences in male versus female /s/ reflect biological or sociophonetic factors?” In: Fuchs, S., M. Toda and M. Żygis (eds.), Turbulent sounds. An interdisciplinary guide. Berlin: Mouton de Gruyter. 281–302.

Gillieron, J. 1918. Genealogie des mots qui designent l’abeille. Paris: Champion.

Gordon, M., P. Barthmaier and K. Sands. 2002. “A cross-linguistic acoustic study of voiceless fricatives”. Journal of the International Phonetic Association 32. 141–174.

Hamann, S. 2003. The phonetics and phonology of retroflexes. Utrecht: LOT.

Harrington, J. 2000. Phonetic analysis of speech corpora. Malden, MA: Wiley-Blackwell.

Jassem, W. 1962. “Noise spectra of Swedish, English, and Polish fricatives”. Proceedings of the Speech Communication Seminar, Stockholm, Royal Institute of Technology, Speech Transmission Laboratory. 1–4.

Jassem, W. 1968. “Acoustic description of voiceless fricatives in terms of spectral parameters”. In: Jassem, W. (ed.), Speech analysis and synthesis. Warsaw: Państwowe Wydawnictwo Naukowe. 189–206.

Jassem, W. 1979. “Classification of fricative spectra using statistical discriminant functions”. In: Lindblom, B. and S. Ohman (eds.), Frontiers of speech communication research. New York: Academic Press. 77–91.

Jesus, L.M.T. and C.H. Shadle. 2002. “A parametric study of the spectral characteristics of European Portuguese fricatives”. Journal of Phonetics 30(3). 437–464.

Jongman, A., R. Wayland and S. Wong. 2000. “Acoustic characteristics of English fricatives”. Journal of the Acoustical Society of America 108. 1252–1263.

Kerswill, P. 1996. “Children, adolescents and language change”. Language Variation and Change 8(2). 177–202.

Kerswill, P. and A. Williams. 2000. “Creating a new town koine: Children and language change in Milton Keynes”. Language in Society 29. 65–115.

Kochetov, A. and J. Alderete. 2011. “Scales and patterns of expressive palatalization: Experimental evidence from Japanese”. Canadian Journal of Linguistics 56(3). 345–376.

Kruskal, J.B. and M. Wish. 1978. Multidimensional scaling. Beverly Hills, CA: Sage Press.

Kuznetsova, A., P.B. Brockhoff and R.H.B. Christensen. 2015. Package “lmerTest”. R package version 2.0-29.

Labov, W. 1963. “The social motivation of a sound change”. Word 19. 273–309.

Li, F. 2005. “An acoustic study on feminine accent in the Beijing Dialect”. In: Qian Gao (ed.), Proceedings of the 17th North American Conference on Chinese Linguistics. University of Southern California Linguistics Publications. 219–224.

Lorenc, A. and R. Święciński. 2014. “Application of phonetics in speech therapy: A case of abnormally convex tongue setting in Polish”. In: Szpyra-Kozłowska, J., E. Guz, P. Steinbrich and R. Święciński (eds.), Recent developments in applied phonetics. Lublin: Wydawnictwo KUL. 287–324.

Łobacz, P. and K. Dobrzańska. 1999. “Opis akustyczny głosek sybilantnych w wymowie dzieci przedszkolnych”. Audiofonologia 14. 5–26.

Lousada, M., L.M.T. Jesus and D. Pape. 2012. “Estimation of stops’ spectral place cues using multitaper techniques”. D.E.L.T.A. 28(1). 1–26.

Maniwa, K., A. Jongman and T. Wade. 2009. “Acoustic characteristics of clearly spoken English fricatives”. Journal of the Acoustical Society of America 125. 3962–3973.

Manly, J.M. 1930. “From generation to generation”. In: Borgholm, N., A. Brusendorff and C.A. Bodelsen (eds.), A grammatical miscellany offered to Otto Jespersen on his 70th birthday. London: George Allen and Unwin. 287–289.

Martinet, A. 1955. Economie des changements phonetiques. Bern: Francke.

MathWorks. 2007. Signal Processing Toolbox 6 User’s Guide. Natick: MathWorks.

McMurray, B. and A. Jongman. 2011. “What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectation”. Psychological Review 118. 219–246.

Naslund, D.T. 1993. “The /s/ phoneme: A gender issue”. Unpublished manuscript, University of Minnesota, Duluth.

Nowak, P.M. 2006. “The role of vowel transitions and frication noise in the perception of Polish sibilants”. Journal of Phonetics 34(2). 139–152.

Ohala, J.J. 1983. “The origin of sound patterns in vocal tract constraints”. In: MacNeilage, P.F. (ed.), The production of speech. New York: Springer. 189–216.

Ohala, J.J. 1994. “The frequency code underlies the sound-symbolic use of voice pitch”. In: Hinton, L., J. Nichols and J.J. Ohala (eds.), Sound symbolism. Cambridge: Cambridge University Press. 325–347. Search in Google Scholar

Padgett, J. 2001. “Contrast dispersion and Russian palatalization”. In: Hume, E. and K. Johnson (eds.), The role of speech perception in phonology. San Diego, CA: Academic Press. 187–218. Search in Google Scholar

Padgett, J. and M. Żygis. 2007. “The evolution of sibilants in Polish and Russian”. Journal of Slavic Linguistics 15(2). 291–324. Search in Google Scholar

Pierrehumbert, J. 2001. “Exemplar dynamics, word frequency, lenition, and contrast”. In: Bybee, J. and P. Hopper (eds.), Frequency effects and the emergence of linguistic structure. Amsterdam: John Benjamins. 137–157. Search in Google Scholar

R Development Core Team. 2010. “R: A language and environment for statistical computing”. Vienna: R Foundation for Statistical Computing. Search in Google Scholar

Rochoń, M. and B. Pompino-Marschall. 1999. “The articulation of secondarily palatalized coronals in Polish”. Proceedings of XIVth International Congress of Phonetic Sciences, San Francisco. 1897–1900. Search in Google Scholar

Shadle, C.H. 1991. “The effect of geometry on source mechanisms of fricative consonants”. Journal of Phonetics 19. 409–424. Search in Google Scholar

Shadle, C.H. 2006. “Acoustic Phonetics”. In: Brown, K. (ed.), Encyclopedia of Language and Linguistics. (2nd ed., vol. 9.) Oxford: Elsevier. 442–460. Search in Google Scholar

Shadle, C.H. 2012. “The acoustics and aerodynamics of fricatives”. In: Cohn, A., C. Fougeron and M.K. Huffman (eds.), The Oxford handbook of laboratory phonology. Oxford: Oxford University Press. 511–526. Search in Google Scholar

Shadle, C.H. and S.J. Mair. 1996. “Quantifying spectral characteristics of fricatives”. In: Proceedings of the International Conference on Spoken Language Processing (ICSLP 96), Philadelphia. 1517–1520. Search in Google Scholar

Stuart-Smith, J. 2007. “Empirical evidence for gendered speech production: /s/ in Glaswegian”. In: Cole, J. and J. Hualde (eds.), Laboratory phonology 9. Berlin: Mouton de Gruyter. 65–86. Search in Google Scholar

Thomson, D.J. 2000. “Multitaper analysis of nonstationary and nonlinear time series data”. In: Fitzgerald, W., R. Smith, A. Walden and P. Young (eds.), Nonlinear and nonstationary signal processing. Cambridge: Cambridge University Press. 317–394.

Wedel, A. 2007. “Feedback and regularity in the lexicon”. Phonology 24. 147–185.

Weinreich, U., W. Labov and M.I. Herzog. 1968. “Empirical foundations for a theory of language change”. In: Lehmann, W.P. and Y. Malkiel (eds.), Directions for historical linguistics: A symposium. Austin: University of Texas Press. 95–195.

Yuasa, I.P. 2010. “Creaky voice: A new feminine voice quality for young urban-oriented upwardly mobile American women?” American Speech 85(3). 315–337.

Zarębina, M. 1965. Kształtowanie się systemu dźwiękowego dziecka [The development of a child’s sound system]. Wrocław–Warszawa–Kraków: Ossolineum.

Zsiga, E. 1993. Features, gestures, and the temporal aspects of phonological organization. (PhD dissertation, Yale University.)

Żygis, M. 2003. “Phonetic and phonological aspects of Slavic sibilant fricatives”. ZAS Papers in Linguistics 32. 175–213.

Żygis, M. 2006. Contrast optimization in Slavic sibilant systems. (Habilitationsschrift, Humboldt-Universität zu Berlin.)

Żygis, M. and S. Hamann. 2003. “Perceptual and acoustic cues of Polish coronal fricatives”. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona 3–9 August. 395–398.

Żygis, M. and J. Padgett. 2010. “A perceptual study of Polish sibilants, and its implications for historical sound change”. Journal of Phonetics 38(2). 207–226.

Żygis, M., D. Pape and L. Jesus. 2012. “(Non)retroflex Slavic affricates and their motivation. Evidence from Czech and Polish”. Journal of the International Phonetic Association 42(3). 281–329.

APPENDIX A

Original admonitions taken from the radio programme “Co w mowie piszczy” [‘What is happening in speech’] made against the new variants of sibilants, aired on the 7th of February 2012 on Polish Radio, Programme 3.

Source: .

“Panie słodko szczebioczą.”

“Ma to je uczynić łagodnymi, słodkimi, kobiecymi.”

“drogie szczebiotki”

“Tak mówią małe dzieci.”

“Na litość boską, nie musimy cofać się w rozwoju.”

“Nie możemy się na to godzić.”

“wymowa infantylna i pretensjonalna”

“Najstraszniejsze co nam się może przydarzyć, to usłyszeć ją w wykonaniu dorosłej kobiety.”

APPENDIX B: MATERIAL

Carrier Sentence Material

Carrier sentence: Powiedziała __ do ciebie ‘She said __ to you’.

Carrier sentence words:

| | a_a | #_a |
|---|---|---|
| /ɕ/ | Kasia /kaɕa/ (proper name) | siaka /ɕaka/ ‘another’ |
| | waciak /va͡tɕak/ ‘quilted jacket’ | ciasno /tɕasnɔ/ ‘tightly’ |
| /s/ | kasa /kasa/ ‘check-out counter’ | sama /sama/ ‘alone’ |
| | taca /ta͡tsa/ ‘platter’ | cały /tsawɨ/ ‘entire’ masc. |
| /ʂ/ | kasza /kaʂa/ ‘grits’ | szata /ʂata/ ‘clothes’ |
| | kacza /ka͡ʈʂa/ ‘duck’ adj.fem. | czasy /ʈʂasɨ/ ‘times’ |

Coherent Text Material

Coherent text:

Mimo że ciężko pracuje, Kasia ma teraz trudne czasy w pracy. Rzadko kiedy dziadek sadza ją samą na kasie. Madzia cały czas musi patrzeć czy kasa jest bezpieczna. Nasza gaża nie zawsze pokrywa koszty. Madzia i Kasia mają zakaz ubierania się w pasiaste ubrania. Całe ciało ma być zakryte. Czasem chodzą do Kazia, żeby się poskarżyć. Rady Kazia są mało użyteczne. Często im mówi „Ziarnko do ziarnka a uzbiera się miarka”. Czasami żartuję, że mój szalony braciak maczał w tym palce. W takich sytuacjach siadam w kuchni i piekę ciasto marchewkowe. Ta mazia, która często mi wychodzi nie smakuje nawet dziadkowi. A jednak nie jest mi go żal, bo nikt nie jest tak nadziany jak on.

Translation:

Although she works hard, Kasia is going through difficult times at work. Hardly ever does grandpa make her sit alone at the check-out counter. Madzia has to make sure that the check-out counter is safe. Our pay does not always cover the costs. Madzia and Kasia have a ban on wearing striped clothes. The whole body must be covered. Occasionally, they go to Kazio to complain. Kazio’s advice is not very useful. He often tells them “A penny saved is a penny gained”. Sometimes I say jokingly that my crazy brother has played a part in it. On such occasions, I sit in the kitchen and bake a carrot cake. The substance that often results is not tasty even for grandpa. But I don’t feel sorry for him because no one is as rich as he is.

Coherent text words

| | a_a | #_a |
|---|---|---|
| /ɕ/ | Kasia /kaɕa/ (proper name) × 2; pasiaste /paɕastɛ/ ‘striped’ neut.nom.pl. | siadam /ɕadam/ ‘I sit’ |
| | braciak /bra͡tɕak/ ‘brother’ | ciasto /tɕastɔ/ ‘cake’ |
| | | ciało /tɕawɔ/ ‘body’ |
| /s/ | kasa /kasa/ ‘check-out counter’ | samą /samɔw̃/ ‘alone’ fem.acc.sg. |
| | | całe /tsawɛ/ ‘entire’ neut. |
| /ʂ/ | nasza /naʂa/ ‘our’ fem. | szalony /ʂalɔnɨ/ ‘crazy’ masc. |
| | maczał ‘he dipped’ | czasy /ʈʂasɨ/ ‘times’ |
| | | czasem /ʈʂasɛm/ ‘sometimes’ |

APPENDIX C: STATISTICAL RESULTS

Table 1

Mean values (left column) and standard deviations (right column) of spectral peaks and spectral moments.

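| Measure | Standard Polish /ɕ/ | | Standard Polish | | New Variant /ɕ/ | | New Variant | |
|---|---|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
| Peak 20 Hz–16 kHz [Hz] | 4176 | 638 | 4164 | 565 | 5546 | 839 | 5546 | 719 |
| COG [Hz] | 4375 | 453 | 4468 | 537 | 5367 | 635 | 5205 | 753 |
| STD | 1428 | 262 | 1649 | 352 | 1567 | 407 | 1898 | 501 |
| Skewness | 0.350 | 0.681 | 0.179 | 0.643 | −0.080 | 0.736 | −0.317 | 0.621 |
| Kurtosis | 2.845 | 2.588 | 1.602 | 1.560 | 3.841 | 3.075 | 2.166 | 2.371 |
| m1 | 3.061 | 1.293 | 2.511 | 1.376 | 2.436 | 1.660 | 1.310 | 1.500 |
| m2 | −4.404 | 0.467 | −4.604 | 0.468 | −4.762 | 0.495 | −4.918 | 0.461 |

Table 2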

Mean values (left column) and standard deviations (right column) of formant frequencies F1, F2, F3 of the preceding (V1) and following vowel (V2) as well as formant ranges of V1 and V2.

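| Measure | Standard Polish /ɕ/ | | Standard Polish | | New Variant /ɕ/ | | New Variant | |
|---|---|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
| F1_V1 [Hz] | 710 | 99 | 665 | 81 | 658 | 112 | 594 | 74 |
| F2_V1 [Hz] | 1777 | 118 | 1688 | 150 | 1862 | 149 | 1794 | 131 |
| F3_V1 [Hz] | 2728 | 149 | 2776 | 194 | 2889 | 137 | 2902 | 143 |
| F1_V2 [Hz] | 579 | 55 | 582 | 57 | 596 | 68 | 590 | 57 |
| F2_V2 [Hz] | 1815 | 155 | 1828 | 166 | 1863 | 153 | 1880 | 130 |
| F3_V2 [Hz] | 2831 | 150 | 2810 | 130 | 2960 | 117 | 2939 | 116 |
| F1range_V1 [Hz] | −33 | 85 | −67 | 82 | −87 | 97 | −121 | 79 |
| F2range_V1 [Hz] | 138 | 121 | 109 | 146 | 157 | 131 | 148 | 131 |
| F3range_V1 [Hz] | 267 | 201 | 153 | 252 | 249 | 197 | 179 | 143 |
| F1range_V2 [Hz] | −58 | 87 | −62 | 84 | −51 | 89 | −49 | 81 |
| F2range_V2 [Hz] | 146 | 123 | 162 | 206 | 153 | 126 | 174 | 184 |
| F3range_V2 [Hz] | 97 | 159 | 184 | 221 | 109 | 137 | 182 | 176 |

Table 3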

Mean values (left column) and standard deviations (right column) of duration.

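| Measure | Standard Polish /ɕ/ | | Standard Polish | | New Variant /ɕ/ | | New Variant | |
|---|---|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
| Duration [s] | 0.098 | 0.0197 | 0.061 | 0.011 | 0.101 | 0.0218 | 0.058 | 0.014 |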

Results of the mixed-effects models[18]

| Measure | Factor | /ɕ/ t-value | /ɕ/ p-value | t-value | p-value |
|---|---|---|---|---|---|
| Peak [20 Hz–11 kHz] | Variant (new) | 8.024 | < 0.001*** | 6.706 | < 0.001*** |
| | Stress (unstressed) | −0.009 | 0.993 | 2.266 | 0.024* |
| | Speech style (carrier) | −1.562 | 0.119 | 0.721 | 0.471 |
| | Repetition | 0.529 | 0.597 | −0.682 | 0.504 |
| Centre of gravity (COG) | Variant | 5.327 | < 0.001*** | 3.894 | 0.001** |
| | Stress | −0.160 | 0.888 | −0.562 | 0.624 |
| | Speech style | −4.523 | < 0.001*** | 0.132 | 0.904 |
| | Repetition | −0.126 | 0.900 | −0.803 | 0.423 |
| Standard deviation (STD) | Variant | 1.353 | 0.197 | 1.733 | 0.105 |
| | Stress | 0.380 | 0.780 | 1.153 | 0.508 |
| | Speech style | 1.955 | 0.067 | 0.885 | 0.559 |
| | Repetition | −0.774 | 0.448 | −0.598 | 0.550 |
| Skewness | Variant | −2.050 | 0.060 | −2.822 | 0.012* |
| | Stress | −0.909 | 0.466 | −3.209 | (0.101) |
| | Speech style | −1.053 | 0.293 | −0.039 | 0.972 |
| | Repetition | −1.382 | 0.183 | −0.245 | 0.807 |
| Kurtosis | Variant | 0.877 | 0.395 | 0.939 | 0.364 |
| | Stress | 2.351 | (0.220) | −1.254 | 0.210 |
| | Speech style | −1.630 | 0.115 | −0.974 | 0.330 |
| | Repetition | 0.475 | 0.635 | −0.350 | 0.726 |
| m1 | Variant | −1.639 | 0.124 | −2.348 | 0.034* |
| | Stress | −1.587 | 0.277 | −1.530 | 0.290 |
| | Speech style | −1.606 | 0.110 | −1.226 | 0.429 |
| | Repetition | −1.405 | 0.174 | 0.906 | 0.365 |
| m2 | Variant | −3.738 | 0.002* | −4.623 | < 0.001*** |
| | Stress | 0.258 | 0.824 | 0.072 | 0.942 |
| | Speech style | −0.475 | 0.636 | −3.061 | 0.002* |
| | Repetition | −0.187 | 0.851 | −0.810 | 0.427 |
| F1_V1 | Variant | −1.586 | 0.135 | −2.730 | 0.016* |
| | Speech style | 2.851 | 0.004* | 0.348 | 1.000 |
| | Repetition | −2.006 | 0.045* | −2.204 | 0.029* |
| F2_V1 | Variant | 0.812 | 0.431 | 1.891 | 0.079 |
| | Speech style | 1.508 | 0.150 | −0.166 | 1.000 |
| | Repetition | 0.290 | 0.772 | 1.269 | 0.207 |
| F3_V1 | Variant | 3.777 | 0.002** | 1.706 | 0.113 |
| | Speech style | −4.015 | < 0.001*** | 0.009 | 1.000 |
| | Repetition | 0.917 | 0.360 | −2.158 | 0.033* |
| F1_V2 | Variant | 0.538 | 0.599 | 0.477 | 0.640 |
| | Stress | −0.714 | 0.553 | 0.248 | 0.821 |
| | Speech style | 0.453 | 0.655 | −0.173 | 0.874 |
| | Repetition | −0.943 | 0.346 | −2.769 | 0.006** |
| F2_V2 | Variant | 0.763 | 0.458 | 0.850 | 0.409 |
| | Stress | −1.008 | 0.419 | −1.362 | 0.337 |
| | Speech style | −1.389 | 0.180 | 0.724 | 0.541 |
| | Repetition | 0.690 | 0.491 | 0.091 | 0.337 |
| F3_V2 | Variant | 2.763 | 0.015* | 2.567 | 0.022* |
| | Stress | −0.415 | 0.719 | −0.765 | 0.510 |
| | Speech style | −3.007 | 0.006** | 0.890 | 0.436 |
| | Repetition | −1.208 | 0.227 | −2.325 | 0.020* |
| F1range_V1 | Variant | −2.179 | 0.047* | −2.086 | (0.055) |
| | Speech style | −0.095 | 0.925 | −0.353 | 1.000 |
| | Repetition | −0.013 | 0.989 | −0.174 | 0.862 |
| F2range_V1 | Variant | −0.006 | 0.995 | 0.988 | 0.341 |
| | Speech style | 1.316 | 0.205 | 0.569 | 1.000 |
| | Repetition | −0.926 | 0.355 | 1.602 | 0.112 |
| F3range_V1 | Variant | 0.860 | 0.404 | −0.318 | 0.756 |
| | Speech style | −0.420 | 0.680 | 0.507 | 1.000 |
| | Repetition | −1.281 | 0.202 | −2.125 | 0.036* |
| F1range_V2 | Variant | −0.010 | 0.992 | 0.302 | 0.767 |
| | Stress | 0.419 | 0.717 | 2.322 | 0.161 |
| | Speech style | 5.488 | < 0.001*** | 1.437 | 0.261 |
| | Repetition | 0.419 | 0.717 | 0.498 | 0.161 |
| F2range_V2 | Variant | 0.302 | 0.766 | 1.399 | 0.185 |
| | Stress | 0.807 | 0.504 | −1.426 | 0.251 |
| | Speech style | −2.553 | 0.016* | −1.124 | 0.341 |
| | Repetition | −0.128 | 0.898 | −0.647 | 0.518 |
| F3range_V2 | Variant | 1.358 | 0.179 | −0.162 | 0.873 |
| | Stress | −0.578 | 0.622 | 1.147 | 0.333 |
| | Speech style | −4.228 | < 0.001*** | −0.388 | 0.721 |
| | Repetition | −0.705 | 0.484 | −0.616 | 0.538 |
| Duration | Variant | 0.590 | 0.564 | −0.532 | 0.603 |
| | Stress | −0.088 | 0.937 | −1.306 | 0.292 |
| | Speech style | 1.830 | 0.081 | 1.887 | 0.153 |
| | Repetition | −4.477 | < 0.001*** | −3.073 | < 0.001*** |

Published Online: 2016-3-17
Published in Print: 2016-3-1

© 2016 Faculty of English, Adam Mickiewicz University, Poznań, Poland