On the two rhotic schwas in Southwestern Mandarin: when homophony meets morphology in articulation

Abstract: This is an acoustic and articulatory study of the two rhotic schwas in Southwestern Mandarin (SWM), i.e., the er-suffix (a functional morpheme) and the rhotic schwa phoneme. Electromagnetic Articulography (EMA) and ultrasound results from 10 speakers show that the two rhotic schwas were both produced exclusively with the bunching of the tongue body. No retroflex versions of the two rhotic schwas were found, nor was retraction of the tongue root into the pharynx observed. On the other hand, the er-suffix and the rhotic schwa, though homophonous, significantly differ in certain types of acoustic and articulatory measurements. In particular, more pronounced lip protrusion is involved in the production of the rhotic schwa phoneme than in the er-suffix. It is equally remarkable that contrast preservation is not an issue because the two rhotic schwas are in complementary distribution. Taken together, the present results suggest that while morphologically-induced phonetic variation can be observed in articulation, gestural economy may act to constrain articulatory variability, resulting in the absence of retroflex tongue variants in the two rhotic schwas, the only two remaining r-colored sounds in SWM.


Introduction
This work is an acoustic and articulatory study of the two rhotic schwas in an understudied dialect group of Mandarin, Southwestern Mandarin (henceforth SWM), namely the er-suffix and the vowel phoneme /ɚ/. Impressionistically speaking, the two rhotic schwas are homophonous, distinguished only in that the er-suffix is a functional morpheme. Therefore, the two rhotic schwas in SWM present an interesting case study of the rhotic vowels from a typological perspective. To set the stage, general descriptions of er-suffixation, the rhotic schwa phoneme, and the acoustic and articulatory properties of the rhotic vowels/approximants are provided in the sections that follow.

Er-suffixation in Mandarin
Er-suffixation (a.k.a. the r-suffix, the rhotic suffix, or érhuà 'er-ize/ization') is perhaps one of the most well-known morpho-phonological processes in Mandarin Chinese. Diachronically speaking, the documented cases of érhuà date back at least to the Ming dynasty (1368-1644 C.E.; Li 1986). Along with other sources, this suffix was primarily derived by means of attaching er 'child' to a stem to form diminutives. For obvious reasons, previous studies have overwhelmingly focused on er-suffixation in Beijing Mandarin, on which Standard Chinese (Pǔtōnghuà "common speech of the Chinese language") is based. There is no doubt that the er-suffix is an "r-like sound" (Hartman 1944: 33), although Chao's seminal work (1970 [1968]) on modern Chinese grammar describes the er-suffix as a subsyllabic suffix /-l/ in Beijing Mandarin. Previous acoustic studies have confirmed that F3 is lowered in er-suffixed vowels (or, that the F3-F2 distance is small and relatively stable; see Huang 2010, Lee and Zee 2014, Shi 2003, Xing 2021, among others), which is conventionally taken as an indication of rhoticity (first observed in Potter et al.'s (1947) book, Visible Speech, as cited in Delattre and Freeman (1968), but see Lindau (1985)). The ongoing debate, nevertheless, is whether the er-suffix is "segment-bound," forming a sequence of a non-rhotic vowel plus the rhotic schwa/approximant, e.g., [paɚ], or realized as rhotacization throughout the whole of the rime, e.g., [a˞]. In the Chinese-language literature, Li (1986) claims that the er-suffix is a retroflex apical vowel (/ʅ/) and proposes that the er-suffix may be attached to a stem, forming a diphthong, i.e., {aʅ, əʅ}, or merged into a stem, resulting in a rhotacized rime, i.e., {a ʅ, ə ʅ, u ʅ}. Lin and Shen (1995), among others, hold that rhoticity is almost synchronous with the vowel across the board. However, a more prevalent view, as far as we know, is that the er-suffix is a (subsyllabic) rhotic schwa [ɚ] (i.e., the second part of a diphthong), as described by Duanmu (2007), Wang (1997), Lee and Zee (2014), and Lin (2007), among others.
Results from articulatory studies may shed light on the debate over the phonetic realizations of er-suffixation. Lee's (2005) Electromagnetic Articulography (EMA) results from three Beijing Mandarin speakers show that er-suffixation is realized as a subsyllabic /ɚ/, forming a sequence of a non-rhotic vowel plus /ɚ/, when the rime ends with a non-back vowel. On the other hand, the entire rime is rhotacized when the (unsuffixed) rime ends with a back vowel, e.g., [u˞]. Jiang et al. (2019) report similar EMA results for the er-suffixed forms in Northeastern Mandarin, a closely related dialect of Beijing Mandarin. Similarly, through a qualitative exploration of the dynamics of the er-suffixed /au/ in Beijing Mandarin, Xing (2021: 121) remarks that "rhoticity is present from the beginning of the vowel" in her ultrasound results: [a˞u˞]. In sum, previous studies basically all agree that the er-suffix may be a subsyllabic /ɚ/ (i.e., the second part of a diphthong), or may lead to a rhotacized rime, depending on the context, as far as Northern Mandarin (here, Beijing and Northeastern Mandarin) is concerned.
Regarding the well-established retroflex versus bunched tongue shapes of the /ɹ/ sound in American English (Delattre and Freeman (1968), et seq.), Lee and Zee's (2014: 386) EMA results indicate that the er-suffix in Beijing Mandarin "does not result in retroflexing but rhotacizing the vowels," because "the tongue tip or tongue front is not curled up and backward, and the underside of the tongue does not touch the anterior part of the hard palate." Jiang et al. (2019) also report that er-suffixation consistently involves a bunched tongue configuration in Northeastern Mandarin. On the other hand, results of ultrasound studies instead indicate that the er-suffix may be produced with either a retroflex or a bunched tongue shape by Mandarin speakers from Beijing (Xing 2021) or from Beijing, Hebei, and Shandong (Chen and Mok 2021). Details aside, Xing's (2021) finding is that the retroflex variant is the dominant type (14 out of 18 participants) among Beijing Mandarin speakers, while there are more bunched variants (8 out of 12 speakers) identified in Chen and Mok (2021).
Regarding the other components in the articulation of rhotics, first, Lee and Zee (2014: 386) remark that "the tongue body is retracted towards the pharynx" during er-suffixation in Beijing Mandarin. Xing (2021) also makes a similar observation based on her ultrasound results. Second, it is still not clear if the production of the er-suffix involves lip rounding, a known characteristic of the English rhotic schwa (Delattre and Freeman (1968), et seq.).
Finally, little attention has been paid to the rhotic schwa phoneme /ɚ/, the stand-alone rhotic schwa in Mandarin. Jiang et al. (2019) report that the rhotic schwa is produced with tongue tip (TT) raising and involves substantial movement of the tongue when gliding from initial to final vowel quality (or, diphthongization) in Northeastern Mandarin, whereas Chen and Mok (2021) find more instances of bunched tongue configurations (8 out of 12 speakers from Beijing, Hebei and Shandong) in their ultrasound results of the rhotic schwa (their syllabic /ɹ/).
The brief description above suggests that the er-suffix (and the rhotic schwa) in Beijing and Northeastern Mandarin may well be subject to distinct articulatory realizations, in the same way as the consonantal and syllabic /ɹ/'s in American English (see Mielke et al. 2016 for a recent overview). In addition, tongue root retraction may also be found in the production of er-suffixation. Therefore, the entry point of the present study is to contribute more empirical data to a growing body of work on the (un)expected diversity of closely related languages/dialects such as the different varieties of Mandarin, by investigating the acoustic and articulatory properties of the two rhotic schwas in SWM.

The two rhotic schwas in Southwestern Mandarin
Southwestern Mandarin (SWM), with over 250 million native speakers, is the most widely spoken variety of Mandarin Chinese (Li 1997). SWM belongs to one of the eight groups of Mandarin Chinese and is mainly spoken in Southwest China, including Sichuan, Chongqing, Yunnan, Guizhou, most areas of Hubei, and some areas of Hunan, Shaanxi, Guangxi and Jiangxi (Li 2009; see also the colored areas in Figure 1). In the present study, our data were collected from young speakers from different subdialects in the representative group of SWM: the Chéngdū-Chóngqìng group (often abbreviated as the Chéng-yú dialect group in the Chinese-language literature), spoken in western Hubei, Chongqing, and eastern Sichuan (Wurm et al. 1987), indicated in yellow in Figure 1. It is widely accepted that the (sub)dialects in SWM are highly stable and homogeneous in terms of phonetic and phonological patterning, since they descend from the Mandarin dialect spoken by a continuous influx of immigrants from the same neighboring provinces of Hubei, Hunan, and Jiangxi during the Ming and Qing dynasties (Li 1997).
The er-suffix has been semantically bleached¹ in SWM; more importantly, unlike its counterpart in Northern Mandarin (Beijing and Northeastern alike), the er-suffix in SWM features the following unique characteristics. First, there are only four output forms of the er-suffix in SWM: {ɚ, jɚ, wɚ, ɥɚ} (Yang 2002, Zheng 1987; recall the debate over the subsyllabic /ɚ/ vs. rhotacized vowel in Section 1.1). Second, with some rare exceptions, the stem must be a polysyllabic word, which is typically a reduplicated disyllabic word. Third, "rime usurpation" is obligatory in er-suffixation in SWM (cf. Zimmermann's (2013) analysis of mora usurpation in Yine), meaning that the entire rime must be completely deleted to accommodate the er-suffix, except for the high and rounded vocoids (more precisely, the high/rounded vowels as well as the prenuclear glides), which are preserved under glide formation. Representative examples of the four variants are provided in Table 1, where tones are omitted and √ means a lexical root.
On the other hand, SWM has a rhotic schwa phoneme: /ɚ/ (see also fn. 5). This phoneme cannot be combined with an onset or a coda to form a syllable, so its distribution is highly restricted in the lexicon; only a few real words/morphemes exist, e.g., ɚ² 'two', ɚ³ 'bait', ɚ³ '√ear: lexical root for "ear" (bound morpheme)', etc. In other words, the rhotic schwa phoneme /ɚ/ may be regarded as a marginal phoneme in SWM (see, e.g., Hall 2013). It is equally remarkable that there are only two rhotic/r-colored sounds in SWM, namely the /ɚ/ phoneme and the er-suffix. In contrast, Beijing Mandarin has a rhotic onset phoneme, which is represented as 'r' in Pinyin romanization and is transcribed as an apical post-alveolar approximant /ɹ̺/ in Lee and Zee (2003). This syllable-initial /ɹ/ sound is produced with a bunched tongue posture in all 12 speakers from Northern China, according to Chen and Mok (2021), and in 10 out of 18 speakers from Beijing (Xing 2021). This rhotic/r-colored phoneme, however, corresponds to a voiced alveolar fricative /z/ in SWM, presumably as the result of onset fortition. Furthermore, the famous three-way contrast (alveolar vs. retroflex vs. alveopalatal; see, e.g., Duanmu (2007), Lee and Zee (2003, 2014), Lin (2007) and references cited therein) in Mandarin sibilants has been lost in most varieties of SWM (in particular, the Chéngyú group; see Figure 1), resulting in a two-way contrast of sibilants: alveolar versus alveopalatal. Therefore, the "retroflex" apical vowel in Beijing Mandarin, which always co-occurs with the "retroflex" sibilants, is lost in SWM as well. Note that the "retroflex" apical vowel is transcribed as an apical postalveolar approximant, based on Lee and Zee's (2014) EMA results (but see Lee-Kim 2014). In sum, the variants of the er-suffix, as well as the contrasts in the sound inventory, have been simplified to a significant extent in SWM, in comparison to Northern Mandarin. The significance of these cross-dialectal differences will be addressed in Section 4.1.
This section closes with a description of the sound system of Chengdu Chinese, the representative variety of SWM (He and Rao 2014). Like Standard Chinese in Duanmu's (2007) analysis, Chengdu Chinese has a maximal syllable template, CGVX (where C = Consonant, G = Glide, V = Vowel, and X = Nasal coda or Glide), with the following consonant phonemes: {p, pʰ, t, tʰ, k, kʰ, ts, tsʰ, tɕ, tɕʰ, m, n/l, ɲ, ŋ, f, v, s, z, ɕ, x}, vowel phonemes: {i, u, y, a, o, e, ɚ} and four lexical tones in Chao's tone notation: {T1: 45, T2: 21, T3: 42 and T4: 213}. See also Section 4.3 for a description of word-level prosody in SWM.

Why the rhotic vowels in Southwestern Mandarin are "special"
Rhotic vowels are typologically rare (Maddieson 1984); nevertheless, SWM presents an interesting case of the cross-linguistic rarity of the rhotic vowels from a completely novel angle. Precisely, the phonemic /ɚ/ and the er-suffix are not, impressionistically speaking, distinguishable at all, the only difference being that the er-suffix is a functional morpheme. Importantly, the fact that the er-suffix is not a phoneme per se has not yet received due attention in the literature, and it is one that we believe carries significant consequences. Specifically, regarding the relationship between morphemic status and phonetic implementation of homophonous affixes and their "non-morphemic" counterparts, Plag et al. (2017) and subsequent works examine distinct acoustic realizations of the non-morphemic /s/ and /z/ versus the /s/ and /z/ morphemes (e.g., plural, genitive, etc.) in a corpus study, and suggest that morphological structures may have a bearing on surface phonetic realization.
In view of this, we raise the possibility that the er-suffix and the /ɚ/ phoneme may differ in their phonetic realization as well. Support for this view follows from the EMA results reported in Jiang et al. (2019), according to which only the rhotic schwa phoneme, not the er-suffix, is produced with tongue tip raising in Northeastern Mandarin. The present study is thus an attempt to distinguish between the phonetic characteristics of the er-suffix and the /ɚ/ phoneme in SWM from data collected from multiple speakers using EMA and ultrasound imaging methods. The novelty of the present study is that the contentive (the /ɚ/ phoneme) versus functional (the er-suffix) divide is systematically investigated by comparing the acoustic and articulatory measurements of the rhotic schwas in an understudied dialect group of Mandarin, SWM.
Four research questions to be addressed are listed below.
i. Are the er-suffix and the rhotic schwa phoneme produced with both retroflexion and bunching variants?
ii. "Within-group" comparisons: are the er-suffixes attached to different stems produced identically in acoustics and articulation?
iii. Are the er-suffix and the rhotic schwa phoneme produced identically in acoustics and articulation?
iv. Are the er-suffix and the rhotic schwa phoneme produced with lip rounding and/or tongue root retraction?
The paper is organized as follows. Section 2 is a description of experimental methods and data analysis. The results of our articulatory and acoustic data are presented in Section 3. Section 4 discusses the findings of the study. Finally, Section 5 concludes this paper.
2 Experimental methods

Participants
Ten native speakers (9 female) of Southwestern Mandarin participated in this study. They were undergraduate or graduate students in their twenties at the time of the experiments (average = 23.3 y.o., SD = 2.95) and were born and raised in the Chéngyú dialect group-speaking areas (see Figure 1; specifically, 5 from Yichang, Hubei, 3 from Enshi, Hubei, 1 from Chengdu, Sichuan, and 1 from Guang'an, Sichuan). It was confirmed via background screenings that they acquired Standard Chinese only as part of their school education. The participants had no self-reported speech or hearing problems. They all gave written informed consent and received compensation for their participation. Due to a data recording issue, we report EMA results from only seven participants. The ultrasound image data are based on the results of all ten participants.

Materials
The recording materials comprise 69 meaningful words, including (i) 33 unsuffixed monosyllabic words, (ii) 32 er-suffixed disyllabic forms, and (iii) 4 disyllabic words containing the /ɚ/ phoneme in word-final position. The syllable structures of the stimuli include CV, CGV, CVG and C(G)VN, where C = {p, t, k, tɕʰ}, G = {j, w, ɥ}, V = {i, y, e, a, o, u, ɚ} and N = {n, ŋ}. Tones are not controlled for, primarily because tone values may not be identical across all of the subdialects under investigation. However, actual pitch values for each tone category are quite similar. Below are some representative examples (Table 2). See Appendix A for the complete wordlist.

Recording procedures
Prior to recording, participants were asked to read a newspaper paragraph in SWM.
The participants were then asked to read a randomized list of the target words from a computer screen in a sound-proof room in the phonetics lab, National Tsing Hua University. The stimuli were displayed using the Articulate Assistant Advanced (AAA, Articulate Instruments) software and each slide was shown for 4 s. The participants were asked to embed the target words in the carrier phrase "__, pa __ pa", glossed as "__, give __ Sentence Final Particle" and meaning "(Speaking of) __, just give __ (to me)!" in SWM. Six repetitions were collected for each token, and in order to control for outside factors, only the more naturally rendered second occurrence of a stimulus in the carrier phrase was analyzed and reported. A total of 2,898 EMA tokens (= 69 words × 6 repetitions × 7 participants) were analyzed and reported, and 4,140 tokens (= 69 words × 6 repetitions × 10 participants) were analyzed and reported for the ultrasound image results.

Apparatuses
The articulatory data were recorded concurrently using EMA (WAVE; Northern Digital Inc.) at a sampling rate of 200 Hz, and ultrasound (Micro system; Articulate Instruments Ltd.) at 65 fps. Acoustic data were simultaneously recorded using a Sennheiser unidirectional shotgun microphone at 24 kHz. Regarding the EMA experiment, seven sensors were attached to the tongue, lips, upper incisors and lower incisors (jaw) using the instant dental adhesive α QUIN (BSA), together with the dental cement GC Fuji I. Specifically, three sensors were affixed midsagittally to the tongue: one on the tongue tip, about 0.5 cm back from the anatomical tip, one on the dorsum of the tongue, as far back as comfortable, and one midway between the tongue tip and tongue back sensors. One sensor was affixed to the lower incisors to track jaw movements and two additional sensors were placed on the vermilion border of the upper and lower lips. Three reference sensors were also placed on the left and right mastoid processes and upper incisor to correct for head movement. The occlusal plane was identified from a bite plane using a fixed triangular protractor with three sensors glued to it. A palate trace was collected using a spare sensor attached to a stir stick; participants were instructed to trace the stick from the back of the hard palate to their front teeth (Rebernik et al. 2021). The articulatory dataset produced by the EMA recordings was post-processed and analyzed using custom MATLAB scripts. Ultrasound data were collected using a transducer with a 92° field of view, set at a depth of 120 mm. The frame rate was set to 65 fps. The participants wore an all-plastic UltraFit headset (Articulate Instruments Ltd.; see Spreafico et al. (2018) for more detail) to stabilize the probe under the chin during imaging of the midsagittal tongue profile (Wrench and Scobbie 2016).
Acoustic recordings were synchronized with the EMA and ultrasound image data by means of the WaveFront software (NDI) and the synchronization unit of the Micro system (Articulate Instruments), respectively.

Statistical analysis
For quantitative results, the articulatory and acoustic data are analyzed using generalized additive mixed modeling (GAMM) analysis (Wood 2017 [2006]). Our analysis is primarily based on the procedures and suggestions provided in Wieling (2018) as well as in Sóskuthy (2021), since the trajectories of the EMA sensors (as well as the tongue contours in ultrasound imaging and the formants) are nonlinear in nature.

EMA data
Regarding EMA experiments, the head-corrected data were z-transformed for subsequent GAMM analysis. We used the R package mgcv (Wood 2019) for model fitting, and models were constructed with the bam() function. For each model, "Sound" (e.g., the er-suffix vs. the /ɚ/ phoneme) was included as the main effect, and the measurement of interest was specified as the dependent variable (i.e., z-transformed positions for each EMA sensor). The models included a by-word smooth function through time to investigate articulatory changes over time, and a random smooth to account for variation between all seven SWM speakers. See also Figure 5 for a visual summary of the GAMM models fitted for EMA sensor trajectories in Section 3.2.1.
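For illustration, the z-transformation mentioned above can be sketched as follows. This is a minimal Python sketch and not the custom MATLAB scripts used in the study; the function name, the sample values, and the choice of the population standard deviation are assumptions, and the study does not state whether the transformation was applied per speaker, per sensor, or both.

```python
from statistics import mean, pstdev

def z_transform(values):
    """Return z-scores, (x - mean) / SD, for one series of sensor positions."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumption; sample SD is also defensible)
    if sigma == 0:
        raise ValueError("cannot z-transform a constant series")
    return [(v - mu) / sigma for v in values]

# Hypothetical head-corrected tongue-tip heights (mm) for one speaker
tt_height_mm = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
tt_height_z = z_transform(tt_height_mm)
```

After this step, each series has a mean of 0 and a standard deviation of 1, so that sensor positions become comparable across speakers with different vocal tract sizes.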

Ultrasound data
The ultrasound data were analyzed with the help of Articulate Assistant Advanced (AAA) software. We extracted the tongue contours at the first quartiles (25 %), midpoints (50 %), and the third quartiles (75 %) of an acoustically defined rime using the default 42 point positions exported by AAA for each tongue contour. Following Mielke's (2015) suggestion, the extracted tongue contours were transformed into polar coordinates using AAA software. Again, we tested these predictions using Generalized Additive Mixed modeling (GAMMs; Sóskuthy 2017; Wieling 2018; Wood 2017 [2006]), with the help of the R script in Heyne et al. (2019), which we adapted to our data. We ran various models to identify the best-fitting one (e.g., no random effects, random effects, multiple predictors including Type (i.e., er-suffixed vs. unsuffixed, different vowels, etc.)). The model we adopted is summarized below. We modeled one variable, DIST (the distance of the fitted tongue contour point from the origin), based on the following predictor variables:
main effect of Sound (e.g., unsuffixed vs. er-suffixed; er-suffix attached to stem /a/ vs. er-suffix attached to stem /an/; er-suffix vs. the rhotic schwa phoneme /ɚ/, etc.)
smooth term for theta (the angle in relation to the origin)
smooth term for theta by the interaction of Type and Vowel
random by-subject smooths for theta by Vowel
The tongue contours at the first quartiles, midpoints, and the third quartiles of a rime are compared using the GAMM analysis. See also Figure 6 for a visual summary of the GAMM models fitted for ultrasound splines in Section 3.2.2.
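The geometry behind the two modeled quantities, theta and DIST, can be sketched as below. In the study the conversion was carried out in AAA; this Python sketch only shows the underlying Cartesian-to-polar step, and the function name, the origin, and the sample points are hypothetical.

```python
import math

def to_polar(points, origin=(0.0, 0.0)):
    """Convert (x, y) contour points to (theta, dist) pairs relative to an origin.

    theta is the angle in radians measured from the positive x-axis;
    dist is the Euclidean distance of the point from the origin.
    """
    polar = []
    for x, y in points:
        dx, dy = x - origin[0], y - origin[1]
        polar.append((math.atan2(dy, dx), math.hypot(dx, dy)))
    return polar

# Two hypothetical contour points (mm); a real AAA export has 42 per contour
contour_polar = to_polar([(3.0, 4.0), (0.0, 2.0)])
```

Expressing a contour as distance over angle is what allows a smooth over theta to describe the roughly fan-shaped tongue surface.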

Acoustic data
The acoustic data were analyzed using Praat (Boersma and Weenink 2007, version 6.0.30). Formant values for F1, F2, and F3 in the sonorous rimes were extracted using Praat scripts developed in the Phonetics Lab at National Tsing Hua University. The formant values were subsequently normalized using Labov's method, as in the Atlas of North American English (ANAE). Labov's ANAE method uses logarithmic means to normalize the formant values. Unlike Nearey's methods, ANAE is speaker-extrinsic in that it computes a single grand mean for all speakers included in this study, thereby preserving sociolinguistic variation (see Thomas and Kendall 2007 for more detail and references cited). Comparisons of the formant values were conducted using Generalized Additive Mixed modeling (Wood 2017 [2006]) as well. See Section 2.5.1 for the analytical procedures.
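The log-mean logic of this normalization can be sketched as follows. This is an illustrative Python sketch rather than the procedure actually run in the study: it assumes that the grand log mean G is pooled over all tokens of all speakers, that the speaker log mean S is computed over the same formants, and that each formant is rescaled by exp(G − S). The function name, the data layout, and the restriction to (F1, F2) pairs are assumptions.

```python
import math

def anae_normalize(formants_by_speaker):
    """Rescale formants per speaker by exp(G - S).

    formants_by_speaker: {speaker: [(F1, F2), ...]} with values in Hz.
    G is the grand mean of log formants over all speakers (speaker-extrinsic);
    S is the mean of log formants for the individual speaker.
    """
    def log_mean(tokens):
        logs = [math.log(f) for token in tokens for f in token]
        return sum(logs) / len(logs)

    pooled = [tok for toks in formants_by_speaker.values() for tok in toks]
    grand = log_mean(pooled)
    normalized = {}
    for speaker, tokens in formants_by_speaker.items():
        scale = math.exp(grand - log_mean(tokens))  # speaker-specific factor
        normalized[speaker] = [tuple(f * scale for f in tok) for tok in tokens]
    return normalized

# Hypothetical data: speaker B's formants are uniformly 20 % higher than A's
raw = {"A": [(500.0, 1500.0), (300.0, 2200.0)],
       "B": [(600.0, 1800.0), (360.0, 2640.0)]}
norm = anae_normalize(raw)
```

In this toy case the uniform 20 % difference between the two speakers is removed, so that their normalized values coincide, while differences between vowels within a speaker are kept.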

Bunched configurations and the Tongue Retroflexion Angle (RA)
The first research question (i) is whether the er-suffix is produced with both retroflexion and bunching variants. There are no cases where an obvious Tongue Tip (TT) gesture is identified through visual inspection of the articulatory data.² However, we did find two distinct subtypes of the er-suffix. Consider now Figures 2 and 3, where the two distinct subtypes are illustrated. For ease of visual comparison, the temporal changes of the tongue configurations of the er-suffix are represented as solid lines, which refer to the different (acoustically determined) deciles of a sonorous rime, whereby the blue line refers to the onset of an er-suffixed rime (t1, the first decile of the rime), the brown line the offset (t10, the last decile of the rime), and so on. The positions of each EMA sensor are averaged over six repetitions for each target word and connected using a cubic spline. Dorsum-up bunched er's involve (mild) tongue retraction, while dorsum-down bunched er's feature a considerably more convex tongue body followed by (some) tongue retraction, especially in the presence of a prenuclear glide (Figure 3). According to our data, speakers F01, F03, F05 and F07 belong to Type A and F02, F04 and F06 to Type B. The present discrepancy cannot be ascribed to sub-dialectal differences since, for example, subject F01 is from Yichang, Hubei, whereas subject F03 is from Chengdu, Sichuan, and the two cities are approximately 860 km apart as the crow flies. On the other hand, speakers F01, F04, F05 and F06 are all from Yichang, Hubei, but only speakers F04 and F06 may be classified as Type B.
As a further step, the EMA data for bunching are quantitatively analyzed by means of the Tongue Retroflexion Angle (RA), proposed in Tiede et al. (2019). Precisely, the RA is subtended by the extension of lines between TD:TB and TB:TT, as illustrated in Figure 4. A bunched tongue posture (in red) is defined by a positive RA (measured CW), and a retroflex tongue configuration (in blue) by a negative RA (measured CCW).
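One way to compute such a signed angle from three midsagittal sensor positions is sketched below. This is not the implementation of Tiede et al. (2019); it is a minimal Python sketch that assumes each sensor is an (x, z) pair with x increasing toward the lips and z increasing upward, so that a clockwise turn from the extended TD:TB line to the TB:TT line comes out positive. The function name and coordinates are hypothetical.

```python
import math

def retroflexion_angle(td, tb, tt):
    """Signed angle in degrees between the extended TD->TB line and TB->TT.

    td, tb, tt: (x, z) positions of the tongue dorsum, body and tip sensors.
    Assumes x increases toward the lips and z increases upward, so that
    positive = clockwise = tip below the extended line (bunched),
    negative = counterclockwise = tip above it (retroflex).
    """
    v1 = (tb[0] - td[0], tb[1] - td[1])      # TD -> TB
    v2 = (tt[0] - tb[0], tt[1] - tb[1])      # TB -> TT
    cross = v1[0] * v2[1] - v1[1] * v2[0]    # > 0 for a counterclockwise turn
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return -math.degrees(math.atan2(cross, dot))

# Hypothetical sensor positions (mm)
ra_bunched = retroflexion_angle((0.0, 0.0), (10.0, 0.0), (20.0, -10.0))
ra_retroflex = retroflexion_angle((0.0, 0.0), (10.0, 0.0), (20.0, 10.0))
```

If the recording coordinate system has the speaker facing the opposite direction, the sign of the result would need to be flipped.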
The RA (Tongue Retroflexion Angle) values of the rhotic schwa phoneme and the er-suffixes attached to the six monophthongal stems {i, y, e, a, o, u} were calculated. The RA values are obtained at the offset of an er-suffix to minimize the potential impact from the gliding motions by the high vocoids (see Figure 3). For ease of discussion, we arbitrarily define the two subtypes in Figures 2 and 3 as (a) Type A: a "slightly bunched" tongue posture (whose RA is positive and is smaller than or equal to 15°) and (b) Type B: a "typically bunched" tongue posture (whose RA is greater than 15°). Consider now Tables 3 and 4, where darkly shaded cells indicate more tokens of Type B (typically bunched) and lightly shaded cells indicate more tokens of Type A (slightly bunched). The tallies of the three categories of the RA values are represented as, for example, (0:6:0), meaning (0 tokens for Retroflex [≤0°]: 6 tokens for Slightly Bunched [>0° and ≤15°]: 0 tokens for Typically Bunched [>15°]).
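The three-way categorization and the tally notation can be made concrete with a short sketch. The thresholds (0° and 15°) and the (Retroflex:Slightly Bunched:Typically Bunched) order come from the text; the function names and the RA values are hypothetical.

```python
def classify_ra(ra_deg):
    """Assign an RA value (degrees) to one of the three categories."""
    if ra_deg <= 0:
        return "retroflex"
    if ra_deg <= 15:
        return "slightly bunched"   # Type A
    return "typically bunched"      # Type B

def tally(ra_values):
    """Return the counts as '(retroflex:slightly bunched:typically bunched)'."""
    counts = {"retroflex": 0, "slightly bunched": 0, "typically bunched": 0}
    for ra in ra_values:
        counts[classify_ra(ra)] += 1
    return (f"({counts['retroflex']}:{counts['slightly bunched']}:"
            f"{counts['typically bunched']})")

# Six hypothetical repetitions of one word by one speaker
example_tally = tally([3.2, 7.5, 10.0, 12.1, 14.9, 15.0])
```

With these six values, all of which fall in the interval above 0° and up to 15°, the tally reads (0:6:0).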
From the measurements of the Tongue Retroflexion Angle (RA), our finding is that there is not a single instance of a retroflex er-suffix or a retroflex rhotic schwa phoneme (i.e., RA ≤ 0°) across all the participants. In sum, only bunched tongue postures were observed in this study, as far as the two rhotic schwas are concerned.

The er-suffix: "within-group" comparison
We now test whether these er-suffixes differ in tongue movements/postures, namely whether (in)complete neutralization takes place in the production of these er-suffixes (i.e., research question (ii)). The results are presented in this order: EMA, ultrasound, and acoustic data.

"Within-group" comparison: EMA results
Regarding the EMA results, the pair-wise comparisons are based on the four variants of the er-suffix (see Table 5 for a complete list). The trajectories of the sensors for the Tongue Tip (TT), the Tongue Body (TB) and the Tongue Dorsum (TD) are compared along the horizontal (x) and vertical (z) dimensions, by means of the Generalized Additive Mixed Model analysis (GAMM; see Section 2.5.1). We used the R package itsadug (van Rij et al. 2017) for visualizing the resulting patterns. Consider now Figure 5,³ where the trajectories of TDx (Tongue Dorsum-longitudinal) and TBz (Tongue Body-vertical) of the er-suffixes in [tu.twɚ] 'cheek' and [toŋ.twɚ] 'bare to the waist' are compared.
A summary of GAMM results in the lingual articulators is given in Table 5. Note that two check signs (√√) mean that the two er-suffixes significantly differ along a certain dimension (Horizontal or Vertical) of a given EMA sensor (e.g., Tongue Tip, TT) throughout at least 80 % of the entire rime (see the lower panel of Figure 5), while one check sign (√) means that the two trajectories are significantly different throughout at least 50 % of the entire rime. A cell is left blank when there is no difference, or when the difference spans less than 50 % of the entire rime.
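The coding scheme for the check signs can be sketched as follows. The 80 % and 50 % thresholds are taken from the text; the sketch assumes that the extent of a difference is measured as the share of evenly spaced time points at which the two trajectories differ significantly (not necessarily contiguously), which the text does not spell out. The function name and the input format are hypothetical.

```python
def difference_mark(significant_flags):
    """Map per-time-point significance flags to the table notation.

    significant_flags: sequence of booleans, one per evenly spaced time
    point of the rime (True = the two trajectories differ significantly).
    """
    if not significant_flags:
        raise ValueError("at least one time point is required")
    share = sum(significant_flags) / len(significant_flags)
    if share >= 0.8:
        return "√√"
    if share >= 0.5:
        return "√"
    return ""  # blank cell: no difference, or less than 50 % of the rime

# Hypothetical outcome: significant at 8 of 10 time points
mark = difference_mark([True] * 8 + [False] * 2)
```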
As seen in Table 5, the er-suffixes are not articulatorily indistinguishable in a pair-wise comparison (i.e., 15 out of 27 pairs show significant differences in at least 50 % of the rime). No EMA sensor trajectory differs significantly across all the pair-wise comparisons, however. In other words, there is no consistent "within-group" difference among the er-suffixes attached to different stems, as far as the EMA data are concerned.
Table : Summary of GAMM results in the lingual articulators: the er-suffixes (TT = Tongue Tip, TB = Tongue Body, TD = Tongue Dorsum, x = front-back, z = up-down; √√ = significant difference greater than  % of the rime; √ =  %- % of the rime; blank = no difference or less than  % of the entire rime; see the lower panel of Figure ).

"Within-group" comparisons: ultrasound results
The ultrasound data were collected concurrently with the EMA (NDI Wave) data. For "co-referencing" purposes, the ultrasound data are used to observe holistic midsagittal tongue shapes. To begin, below is a sample illustration of how the ultrasound data are displayed in a polar scatter plot. In Figure 6, the red solid line refers to the Type 1 tongue shape (here, the unsuffixed stems), while the blue dotted line indicates the Type 2 tongue shape (here, the er-suffixed stems). Both were extracted from the midpoints of an acoustically defined rime. The thinner dotted lines of each color indicate the region of 95 % confidence, and an area where the background is shaded gray is where there is a statistically significant difference between the positions (or, region of significance, as produced by itsadug). In Figures 7 and 8, the comparisons of the four representative pairs (where C = {p, pʰ, t, k, tɕʰ}, if available) at the first quartiles (25 %), the midpoints (50 %), and the third quartiles (75 %) of the rime are illustrated.
⁴ R syntax for the model of Figure 6: bam(DIST ~ Type.Vowel + s(theta, bs = "cr", k = 10) + s(theta, bs = "cr", k = 10, by = Type.Vowel) + s(theta, subject, bs = "fs", k = 10, m = 1, by = Vowel), data = df_gam, AR.start = df_gam$start, rho = rho, discrete = TRUE, nthreads = ncores), where DIST is the distance of the fitted tongue contour point from the origin (of the polar coordinate), and theta is the angle in relation to the origin. The variable Type.Vowel encodes the interaction of Type (i.e., unsuffixed vs. er-suffixed) and Vowel (i.e., a, i, etc.). The random smooth, which uses subject ID, serves as a contour adjustment to model within-speaker variation; AR.start is applied to tell the model that the 42 points of each frame make up one tongue spline, while k is the number of knots, which controls the degree of non-linearity in the smooth.

We can see from Figures 7 and 8 that there is no significant difference between these pairs across all speakers, suggesting that these er-suffixes have similar tongue contours at the first quartiles (25 %), the midpoints (50 %), and the third quartiles (75 %) of an acoustically defined rime. Finally, the same conclusion may be made for the other pairs. See Appendix D for the full array of the polar scatter plots.
The present results are thus consistent with previous studies (e.g., Yang 2002, Zheng 1987, among others), namely that there are only four variants of the er-suffix: {ɚ, jɚ, wɚ, ɥɚ} in SWM, even though it is fair to say that there is a substantial degree of incomplete neutralization both in acoustics and articulation.

3.3 The er-suffix and the rhotic schwa phoneme /ɚ/
Recall from Section 1.2 that SWM also has a rhotic schwa phoneme (/ɚ/), whose distribution is highly restricted, making it a marginal phoneme. Impressionistically speaking, the er-suffix and the rhotic schwa phoneme are not perceptibly distinct. In this section, we compare the following pairs to see if the two rhotic schwas differ in acoustics and articulation (Table 7). These pairs are produced in commensurable environments since the final syllable is prosodically non-prominent in SWM (see Section 4.3). In most cases, labial onsets are used, as it is assumed that labial onsets trigger the least coarticulatory carryover effects on the following vowels, especially with respect to lingual movement.

Comparing the two schwas: EMA results
Regarding the EMA results, a summary of GAMM results in the lingual articulators is given in Table 8. See Section 2.5.1 for analytical procedures and fn. 3 for the model adopted in this study.
The present GAMM results of the EMA recordings indicate that the rhotic schwa phoneme and the er-suffix mostly differ in the vertical dimension of the Tongue Dorsum (TD) sensor, with the rhotic schwa phoneme being higher than the er-suffix in this regard (not shown here; see Appendix C for the plots of the GAMM results of the EMA experiments).

Comparing the two schwas: ultrasound results
The polar scatter plots of the er-suffix versus the rhotic schwa phoneme are illustrated in Figure 9. See Section 2.5.2 for analytical procedures and fn. 4 for the model adopted in this study.
As shown, the er-suffix and the rhotic schwa phoneme /ɚ/ have significantly different tongue postures both at the midpoints (50 %) and at the third quartiles (75 %) in all three pairs across all ten speakers.

Next consider Table 9, in which the er-suffix and the vowel phoneme /ɚ/ are compared with respect to formant values. See Sections 2.5.1 and 2.5.3 for the analytical procedures of GAMM analysis and fn. 3 for the model adopted in this study.
The acoustic results thus suggest that the /ɚ/ phoneme is acoustically different from the er-suffix along the F1 dimension in all three pairs across all 10 speakers. It is equally remarkable that the rhotic schwa phoneme /ɚ/ has higher formant values across the board (not shown here; see Appendix E for the plots of the GAMM results).

Interim summary: the er-suffix versus the rhotic schwa phoneme
In sum, the phonetic differences between the er-suffix and the vowel phoneme /ɚ/ can be recapitulated as follows:
- The rhotic schwa phoneme /ɚ/ is usually higher along the vertical dimension of the Tongue Dorsum (TDz) sensor (EMA results).
- The rhotic schwa phoneme /ɚ/ and the er-suffix have significantly different tongue shapes both at the midpoints (50 %) and the third quartiles (75 %) of an acoustically defined rime (ultrasound results).
- The rhotic schwa phoneme /ɚ/ has higher F1 values across the board (acoustic results).

Non-lingual components of the rhotics: lip rounding and pharyngealization
In this section, we move on to two more articulatory characteristics of the rhotic sounds reported in the literature, which will be addressed in turn below.

Tongue root retraction to the pharynx
Rhotic vowels are not produced with tongue root retraction to the pharynx (see Hussain and Mielke (2021) for a recent survey). Recall, however, that Lee and Zee (2014: 386) report that "the tongue body is retracted towards the pharynx" during er-suffixation in Beijing Mandarin (see also Xing 2021). For this reason, it is necessary to examine whether tongue root retraction occurs in the production of the er-suffix and the rhotic schwa phoneme in SWM. In this section, the ultrasound data are used to observe the posterior portion of the tongue dorsum, which cannot be reliably captured by flesh-point tracking systems like EMA, whose sensor placement is restricted to the anterior oral tract.
We compare the tongue postures between the first quartiles (25 %) and the third quartiles (75 %) of the same er-suffixes and rhotic schwas using ultrasound imaging data. In Figures 10 and 11, the blue dashed lines refer to the tongue postures at the first quartiles (25 %) of the rime, and the red lines represent the tongue postures at the third quartiles (75 %). Note further that we did not include the results of the rising diphthongs (i.e., {jɚ, wɚ, ɥɚ}) here. The representative data are provided in Figure 9, where the er-suffixed forms are {[Ca.Cɚ]; [Co.Cɚ]; [Ce.Cɚ]; [Cai.Cɚ]} (where C = {p, pʰ, t, k, tɕʰ}, if available). See Appendix D.7 for the polarscatter plots of {[Cei.Cɚ]; [Cen.Cɚ]; [Can.Cɚ]}, whereby similar observations may be made. We can see in Figure 10 that there is no obvious tongue root retraction in the left-hand halves of the polarscatter plots. It is also remarkable that the tongue postures differ significantly between the first and third quartiles, suggesting that the er-suffix is, to some extent, diphthongized (see Jiang et al. 2019 for an identical finding regarding the rhotic schwa phoneme in Northeastern Mandarin). The same observations hold true for the rhotic schwa phoneme. Consider now Figure 11.

Table 9: The er-suffix versus the rhotic schwa phoneme: GAMM results (√√ = the two rhotic vowels differ along a certain dimension (F1, F2, or F3) throughout […] % or more of the entire rime; √ = […] %-[…] % of the entire rime; blank = less than […] % of the entire rime).
In sum, we conclude that the two rhotic schwas do not involve pharyngealization, unlike their counterparts in Beijing Mandarin (Lee and Zee 2014; Xing 2021).Moreover, SWM and Northeastern Mandarin are similar in that the rhotic schwas are both diphthongized.

Lip rounding
Lip rounding is one of the key components in English rhotic sounds, especially in prevocalic positions (see King and Ferragne 2020 for a recent update and references cited therein), although Hussain and Mielke (2021: 22) remark that "vowel rhoticity does not entail lip rounding." For the sake of thoroughness, the empirical issue to be addressed in this section is whether lip protrusion can be found in the two rhotic schwas. We compare the trajectories of the Upper Lip (UL) and Lower Lip (LL) sensors along the longitudinal (front-back) direction between the monosyllabic stems and the er-suffix as well as the rhotic schwa phoneme. It is generally acknowledged that the advancement of the UL sensor corresponds to lip protrusion (Farnetani 1999; Westbury and Hashi 1997, among others), while LL may be confounded with jaw movement (see, e.g., Fletcher and Harrington 1999). In Tables 10 and 11, the comparisons for both the UL and LL sensors are provided, again for the sake of thoroughness. See also Appendix C for all the GAMM results.
We can see from Table 10 that only three pairs differ along the longitudinal dimension of Upper Lip (ULx), suggesting that the er-suffix does not frequently involve lip protrusion (4 out of 9 pairs under comparison, counting {[po] vs. [po.pɚ]}, where the absence of a difference means that this particular er-suffixed form also involves lip rounding; see also Table 11). The differences are more robust along the longitudinal dimension of Lower Lip (LLx), but as mentioned earlier, LLx movements may well be a passive consequence of jaw lowering. Consequently, the er-suffix may not be described as a rounded vowel.
In the same vein, we compare the trajectories of ULx and LLx between the monosyllabic stem [po] 'thin' and the rhotic schwa phoneme /ɚ/ since the mid rounded vowel /o/ is closest to the rhotic schwa phoneme in SWM.Consider now Table 11.
We can see from Table 11 that the rhotic schwa phoneme and the mid rounded vowel do not differ along the longitudinal dimension of either Upper Lip (ULx) or Lower Lip (LLx), suggesting that the rhotic schwa phoneme is not different from the rounded vowel /o/ with respect to lip protrusion. We may thus conclude that, unlike the er-suffix (Table 10), the /ɚ/ phoneme may be transcribed as a rounded/labialized rhotic schwa in SWM.

Pairs that show no difference: po versus po.pɚ / te versus te.tɚ / ta versus ke.tɚ

Summary
In this section, we have presented the results of the acoustic and articulatory experiments on the er-suffix and the rhotic schwa phoneme in SWM. To recapitulate, our principal findings are itemized as follows:
- Both the er-suffix and the rhotic schwa phoneme are invariably produced with a bunched tongue configuration (i.e., Tongue Retroflexion Angle > 0°; see Tables 3 and 4). This study attests not a single instance of retroflex/tip-up rhotic schwas.
- The er-suffixes may be different when attached to different stems, articulatorily (EMA/ultrasound), acoustically (F1/F2/F3 values), or both. However, no consistent difference may be found in either acoustics or articulation. In other words, there are four variants of the er-suffix: {ɚ, jɚ, wɚ, ɥɚ}.
- There is no significant tongue root retraction in the production of the two rhotic schwas.
- The er-suffix does not involve (consistent) lip protrusion, and the rhotic schwa phoneme may be described as a rounded rhotic schwa.
- The er-suffix and the rhotic schwa phoneme are both diphthongized (irrespective of the instances of the diphthongs in {jɚ, wɚ, ɥɚ}).

Discussion
There are two principal findings in this study. First, our quantitatively based results show that no retroflex versions of the two rhotic schwas were found. Second, the er-suffix and the /ɚ/ phoneme differ in acoustics and articulation, even though the two rhotic schwas are perceptibly indistinguishable. Finally, we close the discussion with a note on the diachrony, synchrony, and typology of er-suffixation across Sinitic languages.
4.1 Whence comes the articulatory uniformity in the production of the two rhotic schwas?
It is well established in previous articulatory studies that there is intra- and inter-speaker variation in the tongue shape of the consonantal /ɹ/ in English (Delattre and Freeman 1968, et seq.). In the syllabic context, Mielke et al. (2016), among others, note that bunching is more frequently found in /ɚ/ than in onset /ɹ/ in American English: most of the speakers (23 out of 27) produced /ɚ/ with a bunched tongue posture. In addition, Mielke (2015) finds a very similar rate of bunching (6 out of 7) in Canadian French rhotic vowels. More recently, Hussain and Mielke (2021) report that Kalasha rhotic vowels (i.e., {i˞, ĩ˞, e˞, ẽ˞, a˞, ã˞, o˞, õ˞, u˞, ũ˞}) are bunched for the four speakers in their ultrasound study. In Hussain and Mielke (2022: 12), the emergence of the rhotic vowels in Kalasha is hypothesized as a reflex of the diachronic loss of the retroflex approximant coda /-ɻ/, via the following evolutionary path (where "*" indicates reconstructed forms): *Vɖ (Old Indo-Aryan) → *V(ɻ)ɖ (Kalasha) → *Vɻ (Kalasha) → V˞ (Kalasha). That being the case, it may not be surprising that the rhotic vowels are consistently produced with a bunched tongue shape in Kalasha, because the synchronic rhotic vowels share an identical source.⁵ In the same vein, Hussain and Mielke (2022) further entertain the possibility that the emergence of the rhotic schwa in Canadian French can be attributed to the gradual exaggeration of the lowered F3 as the result of an increase of bunching among the front rounded vowels.
Returning to SWM, the two rhotic schwas seem to be no exception to this pattern (i.e., bunching as the dominant tongue shape). Indeed, one of our principal findings is that the two rhotic schwas are produced exclusively with the bunching of the tongue body, since not a single retroflex token is identified in our quantitatively attained results (i.e., Tiede et al.'s 2019 Tongue Retroflexion Angle; see Tables 3 and 4), at least in the present study.⁶

In the limited cohort of languages discussed so far, the rhotic vowels tend to favor the bunching of the tongue body. As a matter of fact, the correlation between (syllabic) rhotic vowels and the bunched tongue shape dates back to Uldall (1958), according to Mielke et al. (2016). This cross-linguistic preference is further strengthened by our study of the two rhotic schwas in SWM, a Sinitic language. It is thus tempting to seek a unified analysis. For example, Mielke et al. (2016) propose an OT-style constraint *CODA ɻ to penalize retroflexion in coda position, and this constraint could be motivated by a putative preference for larger anterior gestures in onset position. Mielke et al. (2016: 128) further remark that "[r]etroflexion is more frequent in contexts that do not place conflicting demands on the tongue tip, such as word boundaries, labial consonants, back vowels, and /l/" (see also Heyne et al. (2020) for similar results for (non-rhotic) New Zealand English; cf. the biomechanical modeling of rhotic variation set forth by Stavness et al. (2012)). On the other hand, Scobbie et al. (2015) report that it is extremely rare for Scottish English speakers to have one particular pattern of /ɹ/ allophony, namely bunched (B) onsets and retroflexed (R) codas, while the other patterns (RR, BB, RB) are more or less evenly distributed in the corpora. Scobbie et al. (2015) speculate that the retroflexed shape is inherently more rhotic ("stronger") than a bunched one. That being the case, the retroflexed tongue shape might be more compatible with "strong" onsets.

These accounts cannot be carried over to the case of the rhotic schwas in SWM, however. For one thing, the two rhotic schwas both occupy the nuclear position, not the "weak" coda position. For another, contextual segmental effects have been largely, if not entirely, excluded by our experimental design (see Appendix A for the wordlist). Here we offer one plausible explanation, based on Maddieson's (1995: 574) proposal of gestural economy, according to which "there is [a tendency] to be economical in the number and nature of the distinct articulatory gestures used to construct an inventory of contrastive sounds, and it is this (rather than a more abstract featural analysis) that underlies the observed system symmetry" (see also Bybee (2001) for a similar account). In essence, gestural economy is analogous to Clements's (2003: 287) principle of feature economy (namely, "languages tend to maximise the ratio of sounds over features") operating at the phonetic level. By the same token, the likely reason why no rhotic schwa with the retroflexed tongue shape is found in this study is that the Tongue Tip gesture is not used in the production of vowels in SWM. In other words, if the acoustic target can be reliably achieved via the bunching of the tongue body, there seems to be no need to add an extra gesture to the repertoire. Note further that there is no other rhotic or r-colored sound in the phoneme inventory; in particular, recall that the rhotic approximant onset /ɹ̺/, the "retroflex" apical vowel, and the "retroflex" sibilants of Beijing Mandarin have already been lost in contemporary SWM (see Section 1.2). We should also acknowledge the likely role of morphological differences in hard palate shape, including parasagittal shape, which may favor bunching as a strategy for achieving lowered F3. Finally, as previously noted, it is possible that we did not sample enough speakers of SWM to actually observe an instance of retroflex /ɚ/ production. In sum, it remains to be seen how and why the retroflexed tongue shapes seem cross-linguistically dispreferred in postvocalic and syllabic contexts.

4.2 Morphologically-induced contrast preservation
The other major finding in this study is that the two rhotic schwas differ in acoustics and articulation, as summarized in Section 3.5. At first glance, these differences might be simply treated as a consequence of "contrast preservation" in phonological mappings (Kiparsky 1973; Martinet 1967 [1961]; Trubetzkoy 1971 [1939]) as well as in phonetic implementation (Flemming 2004), even though it must be emphasized that the contrast between the two rhotic schwas is perceptually inconspicuous, albeit articulatorily distinct. Further scrutiny, however, reveals that the relation between the er-suffix and the rhotic schwa phoneme is interesting in that the two rhotic vowels are always in complementary distribution. Specifically, recall that the rhotic schwa phoneme cannot be combined with any syllable margin (i.e., onset and/or coda), and the er-suffix cannot stand alone (i.e., an onset is obligatory for the er-suffix). Taken together, it is obvious that the two rhotic vowels in question never endanger a contrast. That being the case, the functional motivation behind contrast preservation seems irrelevant here. It is thus quite puzzling why the reuse of phonetic targets or individual gestures across multiple speech sounds is not invoked, as has been amply documented in the literature (Chodroff and Wilson 2017; Chodroff and Wilson 2022; Faytak 2018; Fruehwald 2017; Guy and Hinskens 2016; Keating 2003; Lindblom 1983; Maddieson 1995; Ménard et al. 2008), especially when contrast preservation (or phonological distinctiveness) is apparently not at issue here.

Finally, our experimental design excluded or minimized the effects of other potential confounds as well. In particular, the target syllables under comparison are all in final position. In SWM, it has been confirmed that the final syllable is prosodically weak in a disyllabic window (see Section 4.3 below for more detail). Therefore, it is fair to say that the er-suffix and the rhotic schwa phoneme were compared in commensurable contexts. One remaining possibility we can think of is that the grammar (or the module of phonetic implementation) strives to distinguish between contentive and functional morphemes in phonetic realization. The present finding is thus reminiscent of Plag et al.'s (2017) findings, according to which the morphemic /s/ and /z/, for example, are significantly different from the non-morphemic /s/ and /z/ in terms of acoustic realization in English. We agree with Plag et al. (2017) that morphologically-induced phonetic variation of this sort cannot be adequately explained by either phonological theory or extant psycholinguistic models. For now, we leave the exact mechanisms for future studies, remarking only that morphologically-driven homophony avoidance in articulation, albeit imperceptible, has not been reported elsewhere, especially when contrast preservation is not at issue.

4.3 Diachrony, synchrony and typology of er-suffixation
The er-suffix in SWM can be regarded as a half-grammaticalized suffix when compared with its cognate suffix in Northern Mandarin. As mentioned in Section 1.1, the er-suffix may be realized as either a part of a diphthong or a floating feature, resulting in a rhotacized vowel in Beijing and Northeastern Mandarin (cf. umlaut in German; see, e.g., Trommer 2021 for an updated discussion). By contrast, as witnessed in Section 3, our data have confirmed that the er-suffix in SWM is invariably a rhotic schwa, inducing the process of rime usurpation. One possible explanation is that the "segmental contents" of the er-suffix are not completely lost in SWM, or, in terms of the mainstream framework of syllable weight, the er-suffix is underlyingly moraic. From a cross-linguistic/dialectal perspective, the diachronic evolution of the er-suffix may be sketched in (1), following Lin's (2004) terms:

(1) Full-segment as a separable affix ➝ Full-segment incorporated into the root ➝ Feature-sized Affix

At one extreme is the er-suffix in Hangzhou Chinese (a dialect of Wu Chinese, which has been extensively influenced by Pre-modern Mandarin ever since the Qing dynasty), for example. In Hangzhou Chinese, the er-suffix remains a separable, full-segment suffix and is transcribed as a retroflex lateral, according to Yue and Hu's (2019) experimental results. At the other extreme, by contrast, are affixes that have lost their segmental contents and have fully grammaticalized into a floating feature.
The er-suffix in Jiyuan Chinese (a dialect of Zhongyuan or "Central Plains" Mandarin) is a case in point (see Lin 2004 and references cited therein). Similarly, Lee's (2005) and Jiang et al.'s (2019) EMA results confirm that certain output forms of er-suffixation are a rhotacized rime in Beijing and Northeastern Mandarin, respectively (e.g., /u˞/). The SWM er-suffix appears to represent a transitional stage between the full-sized, standalone er-suffix and the feature-sized er-suffix. That being the case, we propose that the rime usurpation phenomenon may be attributed to the fact that disyllabic words are prototypically trochaic in SWM, because the full-toned final syllable is significantly shorter than the initial syllable in duration. Liu et al. (2022) report (n = 6) that the initial syllables are 1.5 and 1.35 times longer than the final syllables for disyllabic compound and monomorphemic words, respectively, in Chengdu Chinese (i.e., the representative variety of SWM; see Qin 2015 for similar results), while the ratio is 0.9 for Standard Chinese (based on data from the same group of Chengdu Chinese speakers). That is, the final syllable is slightly longer than the initial syllable in Standard Chinese, probably due to the effect of phrase-final lengthening. Liu et al. (2022) also remark that Chengdu Chinese is more likely to be a "stress-timed" language than Standard Chinese because the mean nPVI (normalized Pairwise Variability Index) is significantly greater in Chengdu Chinese than in Standard Chinese: 55.3 and 35.4, respectively. The same result may be found among tri-syllabic compound words. Recall that the er-suffix must be attached to a polysyllabic stem in SWM. It follows that suffixing a rhotic vowel leads to an "oversized" rime in final position, hence rime usurpation.
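As a rough illustration of the two duration measures cited above, the sketch below computes an initial-to-final duration ratio and the nPVI using the commonly used definition (100 times the mean, over adjacent interval pairs, of the absolute duration difference divided by the pair's mean duration). Whether Liu et al. (2022) computed the index over exactly these intervals is an assumption here, and the function names and duration values are hypothetical rather than taken from their data.

```python
def npvi(durations):
    """Normalized Pairwise Variability Index for a sequence of
    interval durations (any consistent unit)."""
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    terms = [
        abs(a - b) / ((a + b) / 2.0)  # difference normalized by pair mean
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)

def initial_to_final_ratio(initial, final):
    """Duration of the initial syllable divided by that of the final one."""
    return initial / final

if __name__ == "__main__":
    # Hypothetical syllable durations in seconds, not measured data.
    trochaic_word = [0.30, 0.20]
    even_word = [0.25, 0.25]
    print(initial_to_final_ratio(*trochaic_word))
    print(npvi(trochaic_word))
    print(npvi(even_word))
```

Because each pairwise difference is divided by the local mean, the index is insensitive to overall speech rate, which is why a larger value can be read as greater durational contrast between neighbouring syllables.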
below, the RA values are positive across the board in the current data.

Figure 4: A schematic illustration of the measurement of Tongue Retroflexion Angle (RA).
Table: Tallies of the three categories of the rhotic schwa phoneme (retroflex : slightly bunched : typically bunched; light gray cells: more tokens of slightly bunched [ɚ], dark gray cells: more tokens of typically bunched vowel phoneme [ɚ]).

Figure 5: GAMM models of EMA sensor trajectories of the er-suffixes in [tu.twɚ] 'cheek' (in red: ERsuffixed1) and [toŋ.twɚ] 'bare to the waist' (in blue: ERsuffixed2), where TB = Tongue Body, TD = Tongue Dorsum, x = longitudinal dimension/front-back, z = vertical dimension/up-down. Upper panel: shaded bands represent the point-wise 95 % confidence interval. Lower panel: when the shaded point-wise 95 % confidence interval does not overlap with the x-axis (i.e., the value is significantly different from zero), this is indicated by a red line on the x-axis (and vertical dotted lines). Results are based on the data from 7 speakers.

Figure 6: The fitted smoothing splines at the midpoints (50 %) for all tokens of the unsuffixed stems /o/ (o.NR) and its er-suffixed forms (o.R).Gray area indicates positions with a statistically significant difference.The speakers (n = 10) are facing right.

Table: The four variants of the er-suffix in Southwestern Mandarin.

Table: Pairs to be compared: some representative examples.

Table: Tallies of the three categories of er-suffix (retroflex : slightly bunched : typically bunched; light gray cells: more tokens of slightly bunched er-suffix, dark gray cells: more tokens of typically bunched er-suffix).

Table 8: Summary of GAMM results in the lingual articulators: the er-suffix versus the rhotic schwa phoneme /ɚ/ (TT = Tongue Tip, TB = Tongue Body, TD = Tongue Dorsum, x = front-back, z = up-down; √√ = difference greater than […] % of the rime; √ = greater than […] %; blank = no difference or less than […] % of the entire rime).

Table 10: Summary of GAMM results in the labial articulators: monosyllabic stem vowels versus the er-suffixed stems (UL = Upper Lip, LL = Lower Lip, x = front-back; √√ = difference greater than […] % of the rime; √ = greater than […] %; blank = no difference or less than […] % of the entire rime).