Insertion of vowels in English syllabic consonantal clusters pronounced by L1 Polish speakers


This study aimed to verify whether Polish speakers of English insert a vowel in word-final clusters containing a consonant and a syllabic /l/ or /n/ as a result of L1-L2 transfer. L1 Polish speakers are mostly unaware of the existence of syllabic consonants; hence, they apply Polish phonotactics and articulate a vocalic sound before the final sonorant, which is thereby deprived of its syllabicity. This phenomenon was examined among L1 Polish speakers, first-year students of English studies, and the recording sessions were repeated a year later. Since, over that time, the students received instruction in phonetics and phonology as well as general practical language training, the results show the occurrence of vowel insertion at two different levels of advanced command of English. Where vowels were inserted, their quality and length were measured and analysed. In English, pronouncing the vowel /ə/ before a syllabic consonant is possible, yet not usual. Another aim of this study was therefore to examine to what extent the vowels articulated by the subjects differ from the standard pronunciation of non-final /ə/. The quality differences between the vowels articulated in words ending in /l/ and /n/ were examined, as well as the potential influence of the difference between /l/ and /n/ on the occurrence of vowel reduction. Even though Polish phonotactics permit numerous consonantal combinations in all word positions, it proved challenging for L1 Polish speakers to pronounce word-final consonantal clusters containing either of the two syllabic sonorants. This result carries practical implications for the methodology of teaching English phonetics.


Introduction
The Polish language permits numerous complex consonantal clusters in all word positions. In the word-initial position, there are words such as pstrąg /pstrɔŋk/ "trout" and krztusiec /ˈkʃtuɕɛʦ̑/ "pertussis." Word-medial consonantal combinations occur in mężczyzna /mɛw̃ʃˈʧɨzna/ "man" or przetrwanie /pʃɛˈtr̥fãɲɛ/ "survival," whereas, in the word-final position, there are words such as gwóźdź /ɡvuɕʨ̑/ "nail" and pójść /pujɕt͡ɕ/ "to go." Yet, Polish, unlike English, lacks syllabic consonants, that is, consonants which constitute the peak of the syllable. There is also no nasal or lateral release. That is why L1 Polish learners of foreign languages are mostly unaware of the existence of syllabic consonants. In English, there is a plethora of word-final consonantal clusters containing syllabic consonants, which constitute a syllable nucleus even though no vowel occurs. Therefore, L1 Polish speakers may have difficulty articulating such clusters while speaking English, as they might apply Polish phonotactic constraints and articulate a vocalic sound before the final sonorant, depriving it of its syllabicity.
Vowel insertion in word-final /-Cl/ and /-Cn/ English consonantal clusters pronounced by L1 Polish speakers seems to remain unexplored. Therefore, the study aims to verify whether advanced L1 Polish speakers of English insert a vowel in word-final clusters containing a consonant and a syllabic /l/ or /n/. In other words, the intention is to examine the possible language transfer from L1 to L2. Another aim of this study is, if vowels are inserted, to examine their quality and length so as to verify to what extent these parameters differ from the standard pronunciation of non-final /ə/. The secondary goal of the study is to assess whether formal instruction in phonology and supervised pronunciation practice prove effective in eliminating non-standard pronunciation.
The paper is divided into four parts. In the first one, the theoretical background of the research topic will be presented in order to introduce the characteristics of a syllable, discuss sonority hierarchy, as well as syllabic consonants and schwa. The next section is devoted to the methodology used in the study and its hypotheses. It is followed by the analysis of the results obtained and the analysis of variance. The last part of the paper concerns conclusions drawn from the results, limitations of the study, and suggestions for further investigation.

Theoretical background
To introduce the topic of consonantal clusters including syllabic sonorants, defining the syllable is an apt starting point. The definitions of a syllable in Polish and English turn out to be very similar. As a rule, it is a basic unit of speech with a mandatory part in the form of a vocalic peak (nucleus), which may be preceded and/or followed by consonants, that is, by an onset and a coda (Clark and Yallop 1995: 68). Thus, in both languages, the occurrence of a vowel is specified as the main condition for a syllable to exist. Yet, in some English syllables, a consonant may also serve as the peak; for instance, alveolar laterals and nasals in a word-final position become syllabic (Ladefoged and Johnson 2015: 243). This means that they constitute a syllable without requiring any vowel. To exemplify, such consonantal sequences include /-Cl/ as in couple [ˈkʌpl̩], /-Cn/ as in button [ˈbʌtn̩], /-Cm/ as in rhythm [ˈrɪðm̩], /-Cr/ as in memory [ˈmemr̩ɪ], and /-Cŋ/ as in bacon [ˈbeɪkŋ̍].
Every language has a set of constraints permitting certain combinations of sounds. Some sound sequences are permitted in a particular language, while others do not occur as they do not conform to restrictions regarding the combination or location of sounds (Yule 2010: 45). Phonotactic constraints concern various units of speech, yet the syllable is the most applicable point of reference for any phonotactic facts (Sobkowiak 2008: 220). Certain constraints derive from the Sonority Sequencing Principle, which is a phonotactic rule shaping the structure of the syllable. According to Ladefoged (1996: 245), "the sonority of a sound can be estimated from measurements of the acoustic intensity of a group of sounds that have been spoken on comparable pitches and with comparable degrees of length and stress." The Sonority Sequencing Principle assumes that each syllable has only one peak, which occurs in the nucleus, and that the sonority of the accompanying consonants rises towards the nucleus (Parker 2002: 8). One of the first attempts to classify the sequences in syllable structure was made at the end of the nineteenth century by Sievers (1881). The peak of the syllable, the sonant, was characterised by the greatest sonority. As for the remaining elements of a syllable, the consonants, the closer they are to the peak, the more sonorous they are. He analysed types of syllables frequently occurring in language, such as mla, mra, alm, and arm, and compared them to those used rarely, e.g. lma, rma, aml, and amr. This analysis led to the conclusion that the liquids are more sonorous than the nasals. The sonority scale was further developed by, among others, Jespersen (1904), Saussure (1914), and Grammont (1933) until Foley (1972) proposed a scalar feature of resonance, which gained widespread recognition and affected the evolution of syllable theory. The scale was organised from the least to the most sonorous sounds: oral stops, fricatives, nasals, liquids, glides, and vowels.
Further research on sonority included the work of Hooper (1976), who suggested that the syllable-final consonant has to be more sonorous than the following syllable-initial consonant. This view was analysed by Murray and Vennemann (1983: 520) and resulted in the Syllable Contact Law, which assumes that, at the syllable boundary C1·C2, the more sonority falls from C1 to C2, the more such a sequence is preferred. Clements (1990: 299) observed that, in the simplest syllables, sonority rises maximally at the beginning of the syllable and drops minimally at the end, which can be perceived as quasiperiodic. Yet, he noticed that there were also more complex syllables, departing from the optimal profile. To clarify the analysis of the structure of syllables, Clements (1990: 303) proposed the Sonority Dispersion Principle, in which a syllable is divided into two overlapping demisyllables, one comprising the onset and nucleus and the other the nucleus and coda. Owing to that, one part of the syllable is independent of the other with regard to sonority.
As for syllabic liquids and nasals, which come right after glides on the sonority scale, they are perceived as phonological hermaphrodites (Scheer 2004: 283). Their physiological body remains consonantal, yet their phonological behaviour proves to be vocalic. That is why they are the only sounds, apart from vowels, that can constitute the peak of the syllable. Among the various sound sequences occurring in English, syllabic consonants could be among the most problematic for Polish speakers of English, even though Polish phonotactics allow many complex consonantal clusters. The only phenomenon in Polish phonology that resembles syllabic consonants is that of the so-called trapped consonants. These are sonorants occurring between sounds of a higher sonority, e.g. the consonant [r] in krtań "larynx," or between an obstruent and another sonorant. This aspect of Polish phonotactics was widely described by Rubach (1997), who labelled such consonants extrasyllabic. However, they differ from syllabic ones with regard to phonological processes (Kijak 2008: 72).
Whether a vowel may be inserted before a syllabic consonant in English is controversial. According to Roach (1991: 79), articulating a vowel before a syllabic sonorant would be considered a mispronunciation. As can be seen in Table 1, in some learners' dictionaries, the phonetic transcription of words containing a syllabic sonorant does not include any preceding vowel, while the remaining ones provide a transcription including the superscript schwa 'ə'. In the Longman Pronunciation Dictionary, 3rd edition (2008), Wells describes this variant as admissible, although learners of English are recommended to omit the vowel. To sum up, articulating a vowel before a (non)syllabic consonant remains a variant that is accepted even though it is not frequently used by native speakers.
Though frequently characterised as featureless, the schwa remains an unstressed, central, open-mid/close-mid, and unrounded sound (cf. Figure 1). The classical and idealised characterisation of schwa consists of the formant frequencies F1 = 500 Hz, F2 = 1,500 Hz, F3 = 2,500 Hz, yet in many cases, depending on the word position, they prove to be variable (Silverman 2011: 629). The word-final schwa is classified as mid-central, whereas in the word-internal position, the formant frequencies of schwa are influenced by the surrounding sounds. Word-internal schwa is rather high and varies widely in backness (Flemming and Johnson 2007). Since schwa assimilates to its context, it should smoothly move from the articulatory position of the preceding sound towards the position of the following one (Browman and Goldstein 1992). The variability in the quality of schwa widens as the duration of the sound decreases. The tongue movements required to articulate schwa are difficult to complete in a limited time, particularly when its target position is far from the surrounding sounds (Flemming 2007: 12). The average length of /ə/ in a non-final word position is 64 ms (Flemming and Johnson 2007). This is the position in which it occurs in syllables that are routinely realised as non-vocalic, whose articulation by L1 speakers of Polish is discussed in what follows.
As for vowel reduction in Polish, its native speakers sometimes reduce words in careless speech as if a consonant were syllabic, e.g. bym /bɨm/ "I would" into [bm̩], or tylko /ˈtɨlkɔ/ "only" into [ˈtl̩kɔ] (Sobkowiak 2008: 233). Yet, such instances are sporadic. Therefore, Polish speakers of foreign languages are not aware of the pronunciation of words containing reduced sounds unless they are explicitly taught this aspect of the foreign language's phonology. Moreover, they are not familiar with syllabic consonants. As a result, the word cotton [ˈkɒtn̩] might be pronounced as *[ˈkɒtɒn] or [ˈkɒtən]. The latter variant is not marked with an asterisk since it may also be used by native speakers, although infrequently, as indicated above.
This study was conducted to observe to what extent L1 Polish speakers, advanced learners of English, insert vowels before syllabic consonants and what the quality of such vowels is. The vowels which occurred were analysed concerning their formant frequencies as well as duration. They were then compared with the standard realisation of non-final /ə/ as proposed by Silverman (2011). Also, the vowel quality was analysed separately for the following syllabic consonants /n/ and /l/ to observe potential assimilation.

Methodology
The experiment designed for this study involved recording sessions with ten L1 Polish female undergraduate students (aged 19-21) during their first year of English studies. The prerequisite for admission to this major is at least level B1 of English. During the recording sessions, they were asked to read different sets of sentences in order to collect data for a group project. The subset used for this study was only a part of the substantial research material and consisted of 15 sentences including 10 tokens ending with a consonant and either /l/ or /n/: eagle, waffle, hidden, apple, table, miracle, little, kitten, lesson, and riddle. All the words occurred in the same syntactic environment and each token occurred three times. The tokens occurred as citations in carrier sentences, as illustrated in (1).
(1) He thought "apple," but he said "waffle."
Meg wrote "apple" on the board and "waffle" on the wall.
Sarah types "apple," but John typed "waffle."
To avoid any data manipulation, the subjects were instructed only to read out loud the sentences printed on a piece of paper. They were supposed to complete the task at their own pace. The samples were recorded with a condenser microphone via the digital audio workstation REAPER. The recording sessions took place in one of the small lecture rooms (approx. 20 m²) at the Faculty of English Studies, with windows facing the yard in order to minimise the noise coming from the street.
During the first year, all the subjects participated in a 60-hour practical phonetics course (British English) over two semesters and in a 15-hour lecture series on phonology. Therefore, the same subjects were asked to participate in the recording sessions a year later, to verify whether one year of formal phonetic instruction had improved their pronunciation abilities. The second recording session took place in the same room as the year before. The sentences the subjects were supposed to read also remained the same.
The speech samples were analysed with PRAAT software. Every token was examined for the occurrence of a vowel by means of the formant tracker. If vowels were articulated, their first and second formant frequencies were measured to determine the quality of the pronounced sounds. Then, the length of the inserted vowels was measured to compare them with the characteristics of the standard non-final schwa. The number of tokens in which a vowel occurred was statistically analysed by means of the chi-square test to verify the influence of the syllabic consonant quality (/l/ vs /n/) on the occurrence of a vowel. To examine whether there are statistically significant differences in the quality of vowels inserted before /n/ and /l/, their formant frequencies were analysed using one-way analysis of variance (ANOVA) with the quality of the inserted vocalic sounds as the dependent variable and the consonant type (/n/ vs /l/) as the independent one.
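The one-way ANOVA described above can be illustrated with a minimal sketch. The function name `one_way_anova` and the F2 values below are hypothetical and serve only to show how the F statistic is formed from between-group and within-group variation; they are not the study's measurements, and the authors' actual statistical software is not specified in the text.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical F2 values in Hz (illustration only, NOT data from the study)
f2_before_l = [1100.0, 1200.0, 1150.0, 1250.0]
f2_before_n = [1700.0, 1800.0, 1750.0, 1850.0]

f_stat, df_b, df_w = one_way_anova([f2_before_l, f2_before_n])
print(f"F({df_b}, {df_w}) = {f_stat:.1f}")
```

With two groups, as here, the between-group degrees of freedom equal 1, which matches the form F(1, N) in which such a result is conventionally reported.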
The study addressed the following research questions. Do L1 Polish speakers, as a matter of L1-L2 transfer, insert vowels before syllabic consonants? If such a phenomenon occurs, to what extent is the inaccurate pronunciation fossilised? Is it possible to improve their pronunciation? Does the quality of /l/ and /n/ influence the occurrence and the formant frequencies of potentially inserted vowels? The hypotheses for this study are as follows:
H1: The words pronounced by the subjects do include inserted vowels before syllabic consonants.
H5: The following consonant affects whether a vocalic sound is articulated. That is, different numbers of inserted vowels will occur in syllables including /n/ and /l/, respectively.
H6: The vowels inserted before /l/ do not differ from those articulated before /n/ in (1) the first formant, (2) the second formant, and (3) duration.
H7.1: There is no difference in the vowels inserted before /l/ between year I and year II regarding (1) occurrence, (2) F1, (3) F2, and (4) duration.
H7.2: There is no difference in the vowels inserted before /n/ between year I and year II regarding (1) occurrence, (2) F1, (3) F2, and (4) duration.

Analysis
As the results demonstrate, the majority of the subjects articulated a vowel between the consonant and the syllabic /n/ or /l/. Out of the 600 token realisations recorded within the space of two years, vowel articulation was observed 487 times (81%). This included 253 out of 300 tokens in the first year (84%). In the second year of studies, this number decreased to 234 (78%). The difference between the number of vowels inserted in year I and in year II was statistically significant (df = 1, χ² = 3.936, p < 0.05). The percentage of vowels inserted before the syllabic consonant differs among the tokens, as illustrated in Table 2.
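The chi-square comparison of the two years can be retraced from the counts reported above. The following is a minimal sketch, assuming a Pearson chi-square test on a 2×2 table without continuity correction; the text does not state which variant was used, and the helper name `chi_square_2x2` is introduced here for illustration only.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts reported in the text: vowels inserted out of 300 tokens per year
tokens_per_year = 300
year1_inserted = 253
year2_inserted = 234

chi2 = chi_square_2x2(
    year1_inserted, tokens_per_year - year1_inserted,
    year2_inserted, tokens_per_year - year2_inserted,
)

# Critical value of the chi-square distribution for df = 1 at alpha = 0.05
CRITICAL_05_DF1 = 3.841

print(round(chi2, 3), chi2 > CRITICAL_05_DF1)
```

Under this assumption the sketch gives the reported value of 3.936, which lies just above the 0.05 critical value for one degree of freedom.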
As can be seen in Table 3, the number of articulated vocalic sounds also varies among the lexemes pronounced by individual students. Riddle is the word in which all the subjects articulated a vowel between /d/ and /l/. In the word eagle, the subjects articulated a vowel between /g/ and /l/ 42 times: the first-year data show 24 instances of vowel insertion out of 30, and the second-year data show 18. The table displays the words in ascending order of the frequency of vowel articulation.
The presented instances of vowel insertion demonstrate that subjects' pronunciation is influenced by L1-L2 transfer. Yet, the statistically significant difference between year I and year II shows that such influence is not fossilised. While there is a considerable improvement, the percentage of inserted vowels is still high (81%).
As discussed in the introduction, according to the rules of English phonology, a vowel in a position immediately preceding a syllabic sonorant can be pronounced, even if it does not usually occur (Wells 2008). The vowel articulated in such contexts is schwa, an unstressed, central, open-mid/close-mid, and unrounded sound with an average length of 64 ms. As can be seen in Table 4, the F1 and F2 frequencies for the first and the second year, as well as the vowel duration, differ among all the tokens. For example, in the word apple, the average F1 value of the vowel is 512 Hz in the first year and 545 Hz in the second year. For F2, the mean frequency is 1,069 Hz in the first year and 1,041 Hz in the second. These F1 and F2 values classify the vowel as open-mid and back, respectively.
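To make the comparison with schwa concrete, the sketch below measures how far the mean formant values reported for apple lie from the idealised schwa (F1 = 500 Hz, F2 = 1,500 Hz) cited in the theoretical background. The function name and the use of a plain Euclidean distance in Hz are assumptions made for illustration; the study itself does not report such a distance, and a perceptual scale might weight the two formants differently.

```python
import math

# Idealised schwa formants cited in the theoretical background
SCHWA_F1_HZ = 500.0
SCHWA_F2_HZ = 1500.0


def distance_from_schwa(f1_hz, f2_hz):
    """Euclidean distance in the F1-F2 plane (raw Hz, an illustrative choice)."""
    return math.hypot(f1_hz - SCHWA_F1_HZ, f2_hz - SCHWA_F2_HZ)


# Mean values reported in the text for the vowel inserted in "apple"
apple_year1 = (512.0, 1069.0)
apple_year2 = (545.0, 1041.0)

dist_year1 = distance_from_schwa(*apple_year1)
dist_year2 = distance_from_schwa(*apple_year2)

# A negative F2 deviation means a lower F2 than schwa, i.e. a backer vowel
f2_deviation_year1 = apple_year1[1] - SCHWA_F2_HZ

print(f"year I: {dist_year1:.0f} Hz, year II: {dist_year2:.0f} Hz")
```

In both years almost all of the distance comes from F2, which is in line with the classification of the vowel in apple as back rather than central.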
For all subjects and tokens, the mean formant frequencies in the first year are F1 = 497 Hz and F2 = 1,361 Hz. The calculations repeated a year later show a change of the first formant to 508 Hz and of the second one to 1,370 Hz, which may lead to the conclusion that, on average, the vowels pronounced before the potentially syllabic sonorants approximate schwa. Yet, the minimal and maximal formant frequencies in Table 4 demonstrate major differences in the vowel realisations and reveal their randomness. First formant frequencies fluctuate between 256 and 757 Hz, whereas F2 values fluctuate between 705 and 2,626 Hz. The average vowel duration measured in the first year was 92 ms, and it decreased to 85 ms in the second year. Thus, there is a statistically significant difference between year I and year II, as t = −3.258, p = 0.0013 (t-test for dependent means with unmatched pairs removed). Nevertheless, the vowels proved to be somewhat longer than the standard non-final schwa.
Table 5 shows the classification of vowels articulated by the Polish subjects before syllabic consonants. The average values categorise the inserted vowels as open-mid central sounds. Regarding the differences between year I and year II, the sound classification remains the same. Yet, the minimal and maximal parameters show that the sounds differ greatly in their quality. Table 6 shows the percentages of inserted vowels separately for words containing /l/ and /n/ for every subject. The results do not indicate that the quality of the final consonant affects the occurrence of vowel insertion.
Yet, the vowels inserted before /l/ and /n/ prove to differ in the duration, as illustrated in Table 7. The vocalic sounds preceding syllabic /l/ were longer than those articulated before /n/. As for the former, the vowels inserted in year I lasted on average 98 and 89 ms in year II, which results in a total average of 94 ms. As for the latter, the vowels lasted on average 79 ms in year I and 76 ms in year II. In total, the vowels inserted before /n/ lasted on average 77 ms.
The obtained results also reveal a substantial difference in the second formant frequencies of the words ending with /n/ and /l/. F2 values for the inserted vowels in kitten, lesson, and hidden classify these sounds as front, whereas the second formant frequencies in apple, waffle, little, riddle, miracle, table, and eagle are characteristic of either back or rather central vowels. Figure 2 exemplifies the discrepancy between the realisations of inserted vowels articulated by one subject in the word-final consonantal clusters containing the syllabic /n/ and /l/. A one-way ANOVA was conducted to compare the effect of the syllabic sonorant on the quality of the vowel. The analysis of variance shows that the effect of syllabic consonant type (/n/ or /l/) on the articulated vowel's second formant values is significant, F(1, 254) = 314.998, p < 0.001. The result achieved a year [...]. The factorial analysis of variance among separate subjects shows a significant difference only in 8 instances (subjects 2, 5, 8, and 9 in the first year and subjects 2, 6, 8, and 10 in the second one). In the remaining thirteen situations, there is no significant effect of the type of the syllabic consonant on the first formant value of the inserted vowel, as can be seen in Table 8.
[Table 6: Percentages of vowels inserted before /l/ and /n/. Columns: vowels inserted before /l/ (%), vowels inserted before /n/ (%).]
The possible effect of consonant quality was also assessed. The potential difference between the numbers of vowels realised before /n/ and /l/ and their change over time was tested by means of the chi-square test. In the first year, vowels were articulated in 172 out of 210 cases before /l/ (82%) and 81 times out of 90 before /n/ (90%). The difference was statistically insignificant. In the second year, the occurrence of vowels went down to 160 before /l/ (76%) and 74 before /n/ (82%). In that year, too, the difference in the occurrence of vowels preceding /l/ and /n/, respectively, did not reach significance. While for the whole sample there was a weakly significant reduction in the total number of vowels realised in the second year compared with the first (see above), it could not be demonstrated that this change was different for vowels preceding /l/ and /n/, respectively. As for the duration of the vowel, it was significantly shorter before /n/ than before /l/ in both years (t-test for independent samples, year one: t = 4.3216, p < 0.005, year two: t = 2.497, p < 0.05). The differences in the quality of the vocalic sounds preceding /l/ and /n/, respectively, and their changes over time were on the whole negligible.
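The relation between the per-consonant durations and the overall yearly averages can be checked with a short sketch. It assumes that the overall mean is the count-weighted mean of the /l/ and /n/ group means, using the numbers of inserted vowels reported above; since the group means are themselves rounded in the text, the result is approximate, and the helper name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Count-weighted mean of (count, group_mean) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * group_mean for count, group_mean in groups) / total


# (number of inserted vowels, mean duration in ms) before /l/ and before /n/
year1 = pooled_mean([(172, 98), (81, 79)])
year2 = pooled_mean([(160, 89), (74, 76)])

print(round(year1), round(year2))
```

Under this assumption the pooled values round to the overall averages of 92 ms and 85 ms given in the analysis, and they show that the longer vowels before /l/ dominate the yearly means because /l/ tokens are more than twice as numerous as /n/ tokens.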

Conclusion
The analysis of syllabic consonants revealed that hypothesis H1 has been confirmed, as the subjects articulated a vowel between the consonant and the syllabic sonorant in 81% of cases. The remaining 19% [...]. Therefore, H2, concerning the difference in vowel quality and duration, has been refuted.
In the first-year data, there were 253 instances (84%) of vowel insertion; the subjects thus made progress by articulating only 234 vowels (78%) in the subsequent year. The decrease in the number of inserted vowels between year I and year II has proved to be statistically significant, which means that, with regard to the occurrence, hypothesis H3 has been refuted; therefore, the L1-L2 influence is not fossilised (notwithstanding that each subject should be considered separately with regard to improvement and phonetic awareness). As for H4, it was partly confirmed, as the subjects have not demonstrated significant progress regarding the quality of the vowels (F1 and F2) between year I and year II. However, as for the vowel duration, the average measured in the first year was 92 ms, and it decreased to 85 ms in the second year. Thus, there is a statistically significant difference as p = 0.00151 (two-tailed t-test for dependent means). Yet, the vowels still proved to be somewhat longer than the standard non-final schwa.
It could not be demonstrated that the phonetic difference between /l/ and /n/ affects the occurrence of preceding vowels as predicted in hypothesis H 5 . However, a considerable difference between words containing /l/ and /n/ was observed in the quality and duration, so hypothesis H 6 could be refuted. The differences in F2 proved to be statistically significant, whereas those in F1 were only partly significant, i.e. disappeared in the second year of studies.
The last two hypotheses (H7.1, H7.2) concerned the differences between year I and year II with regard to the occurrence, quality, and duration of the vowels inserted before /l/ and /n/, respectively. As for H7.1, it has been partly refuted since there is a significant difference in the number of vowels inserted before /l/ and in their duration, yet not in their quality. The occurrence of the vocalic sounds decreased by 6 percentage points, from 82% in year I to 76% in year II, which proved to be statistically significant (df = 1, χ² = 3.936, p < 0.05). The average quality has not changed substantially: in year I, F1 was 503 Hz and F2 was 1,171 Hz, whereas in year II, F1 was 513 Hz and F2 was 1,170 Hz. The duration has changed by 9 ms. Moving to H7.2, the hypothesis has been confirmed as the percentage of vowels inserted before /n/ changed by 8 percentage points (90% in year I, 82% in year II). The quality of the vocalic sounds has not changed considerably: in year I, on average, F1 was 482 Hz and F2 was 1,767 Hz, whereas in year II, F1 was 499 Hz and F2 was 1,770 Hz. The average duration of the vowels was 79 ms in year I and 76 ms in year II. The longer duration of the vocalic sounds and its reduction in the second year may suggest that the reason for the insertion is difficulty in articulation rather than a lack of awareness of syllable structure. This is also suggested by the differences between the mean durations of the vowels inserted before different consonants.
To sum up, vowel insertion and its quality remained almost stable, and stable for all practical purposes, in spite of intense pronunciation training and the raising of phonological awareness through explicit teaching of English syllable structure, including vowelless syllables. That may mean that more emphasis should be put on teaching the sound system of English, allophonic variation, and phonological processes at the early stages of education in order for learners to gain control over sounds and sound sequences that are unfamiliar in Polish or in other native languages.
This study examined only one phenomenon concerning L1-L2 transfer in pronunciation. It would be beneficial to explore other English sound sequences that are possibly problematic for L1 Polish speakers. Since the study spanned only one year, the change or otherwise in the subjects' pronunciation skills would be more visible if it were continued in year III, or even until year V (if the subjects chose to pursue an MA degree). It would also be beneficial to examine to what extent native speakers consider vowel insertion an error or an irritation, and whether it disturbs the act of communication. The results might indicate that such deviation from standard English pronunciation does not pose a problem and that the instances of L1-L2 transfer are inconspicuous, especially as English has become the modern-day lingua franca and L2 English speakers may now be freer to introduce their own pronunciation idiosyncrasies deriving from their mother tongues.