Published by De Gruyter Mouton, May 26, 2019

The Goldilocks Zone of Perceptual Learning

  • Molly Babel, Michael McAuliffe, Carolyn Norton, Brianne Senior and Charlotte Vaughn
From the journal Phonetica


Background/Aims: Lexically guided perceptual learning in speech is the updating of linguistic categories based on novel input that is disambiguated by the structure of a recognized lexical item. We test the range of variation that allows for perceptual learning by presenting listeners with items that vary from subtle within-category variation to fully remapped cross-category variation.

Methods: Experiment 1 uses a lexically guided perceptual learning paradigm with words containing noncanonical /s/ realizations drawn from s/ʃ continua, corresponding to “typical,” “ambiguous,” “atypical,” and “remapped” steps. Perceptual learning is tested in an s/ʃ categorization task. Experiment 2 addresses listener sensitivity to variation in the exposure items using AX discrimination tasks.

Results: Listeners in Experiment 1 showed perceptual learning with the maximally ambiguous tokens. Performance of listeners in Experiment 2 suggests that the tokens that elicited the most perceptual learning were not perceptually salient on their own.

Conclusion: These results demonstrate that perceptual learning is enhanced with maximally ambiguous stimuli. Excessively atypical pronunciations show attenuated perceptual learning, while typical pronunciations show no evidence of perceptual learning. AX discrimination illustrates that the maximally ambiguous stimuli are not perceptually unique. Together, these results suggest that perceptual learning relies on an interplay between confidence in phonetic and lexical predictions and category typicality.


*Molly Babel, Department of Linguistics, University of British Columbia, 2613 West Mall, Totem Field Studios, Vancouver, BC V6T 1Z4 (Canada)


1 Allen, J. S., & Miller, J. L. (2004). Listener sensitivity to individual talker differences in voice-onset-time. The Journal of the Acoustical Society of America, 115(6), 3171–3183. doi:10.1121/1.1701898

2 Baese-Berk, M. M., Bradlow, A. R., & Wright, B. A. (2013). Accent-independent adaptation to foreign accented speech. The Journal of the Acoustical Society of America, 133(3), EL174–EL180. doi:10.1121/1.4789864

3 Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1–48. doi:10.18637/jss.v067.i01

4 Bradlow, A. R., & Alexander, J. A. (2007). Semantic and phonetic enhancements for speech-in-noise recognition by native and non-native listeners. The Journal of the Acoustical Society of America, 121(4), 2339–2349. doi:10.1121/1.2642103

5 Bradlow, A. R., & Bent, T. (2008). Perceptual adaptation to non-native speech. Cognition, 106(2), 707–729. doi:10.1016/j.cognition.2007.04.005

6 Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181–204. doi:10.1017/S0140525X12000477

7 Clarke, C. M., & Garrett, M. F. (2004). Rapid adaptation to foreign-accented English. The Journal of the Acoustical Society of America, 116(6), 3647–3658. doi:10.1121/1.1815131

8 Clarke-Davidson, C. M., Luce, P. A., & Sawusch, J. R. (2008). Does perceptual learning in speech reflect changes in phonetic category representation or decision bias? Attention, Perception & Psychophysics, 70(4), 604–618. doi:10.3758/PP.70.4.604

9 Cooper, A., & Bradlow, A. R. (2016). Linguistically guided adaptation to foreign-accented speech. The Journal of the Acoustical Society of America, 140(5), EL378–EL384. doi:10.1121/1.4966585

10 Cutler, A. (2012). Native Listening: Language Experience and the Recognition of Spoken Words. Cambridge: MIT Press. doi:10.7551/mitpress/9012.001.0001

11 Cutler, A., Mehler, J., Norris, D., & Segui, J. (1987). Phoneme identification and the lexicon. Cognitive Psychology, 19(2), 141–177. doi:10.1016/0010-0285(87)90010-7

12 Ganong, W. F., III. (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6(1), 110–125. doi:10.1037/0096-1523.6.1.110

13 Hay, J. B., Pierrehumbert, J. B., Walker, A. J., & LaShell, P. (2015). Tracking word frequency effects through 130 years of sound change. Cognition, 139, 83–91. doi:10.1016/j.cognition.2015.02.012

14 Holt, R. F., & Bent, T. (2017). Children’s use of semantic context in perception of foreign-accented speech. Journal of Speech, Language, and Hearing Research, 60(1), 223–230. doi:10.1044/2016_JSLHR-H-16-0014

15 Jesse, A., & McQueen, J. M. (2011). Positional effects in the lexical retuning of speech perception. Psychonomic Bulletin & Review, 18(5), 943–950. doi:10.3758/s13423-011-0129-2

16 Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T., & Banno, H. (2008, March). TANDEM-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 3933–3936). doi:10.1109/ICASSP.2008.4518514

17 Kleinschmidt, D. F., & Jaeger, T. F. (2015). Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel. Psychological Review, 122(2), 148–203. doi:10.1037/a0038695

18 Knoblauch, K. (2014). psyphy: Functions for analyzing psychophysical data in R. R package version 0.1-9.

19 Kraljic, T., & Samuel, A. G. (2005). Perceptual learning for speech: Is there a return to normal? Cognitive Psychology, 51(2), 141–178. doi:10.1016/j.cogpsych.2005.05.001

20 Kraljic, T., & Samuel, A. G. (2006). Generalization in perceptual learning for speech. Psychonomic Bulletin & Review, 13(2), 262–268. doi:10.3758/BF03193841

21 Kraljic, T., & Samuel, A. G. (2007). Perceptual adjustments to multiple speakers. Journal of Memory and Language, 56(1), 1–15. doi:10.1016/j.jml.2006.07.010

22 Kraljic, T., Brennan, S. E., & Samuel, A. G. (2008a). Accommodating variation: Dialects, idiolects, and speech processing. Cognition, 107(1), 54–81. doi:10.1016/j.cognition.2007.07.013

23 Kraljic, T., Samuel, A. G., & Brennan, S. E. (2008b). First impressions and last resorts: How listeners adjust to speaker variability. Psychological Science, 19(4), 332–338. doi:10.1111/j.1467-9280.2008.02090.x

24 Kuperberg, G. R., & Jaeger, T. F. (2016). What do we mean by prediction in language comprehension? Language, Cognition and Neuroscience, 31(1), 32–59. doi:10.1080/23273798.2015.1102299

25 Lenth, R. (2016). Least-Squares Means: The R Package lsmeans. Journal of Statistical Software, 69(1), 1–33. doi:10.18637/jss.v069.i01

26 Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54(5), 358–368. doi:10.1037/h0044417

27 Macmillan, N. A., & Creelman, C. D. (2004). Detection theory: A user's guide. New York: Psychology Press. doi:10.4324/9781410611147

28 Maye, J., Aslin, R. N., & Tanenhaus, M. K. (2008). The weckud wetch of the wast: Lexical adaptation to a novel accent. Cognitive Science, 32(3), 543–562. doi:10.1080/03640210802035357

29 McAuliffe, M. (2015). Attention and salience in lexically-guided perceptual learning (Doctoral dissertation, University of British Columbia). cIRcle: UBC’s Digital Repository: Electronic Theses and Dissertations (ETDs) 2008+.

30 McAuliffe, M., & Babel, M. (2016). Stimulus-directed attention attenuates lexically-guided perceptual learning. The Journal of the Acoustical Society of America, 140(3), 1727–1738. doi:10.1121/1.4962529

31 McClelland, J. L., & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18(1), 1–86. doi:10.1016/0010-0285(86)90015-0

32 McClelland, J. L., Mirman, D., & Holt, L. L. (2006). Are there interactive processes in speech perception? Trends in Cognitive Sciences, 10(8), 363–369. doi:10.1016/j.tics.2006.06.007

33 Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23(3), 299–325. doi:10.1017/S0140525X00003241

34 Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204–238. doi:10.1016/S0010-0285(03)00006-9

35 Norris, D., McQueen, J. M., & Cutler, A. (2016). Prediction, Bayesian inference and feedback in speech recognition. Language, Cognition and Neuroscience, 31(1), 4–18. doi:10.1080/23273798.2015.1081703

36 Pisoni, D. B., & Tash, J. (1974). Reaction times to comparisons within and across phonetic categories. Attention, Perception & Psychophysics, 15(2), 285–290. doi:10.3758/BF03213946

37 Pitt, M. A., & Samuel, A. G. (1993). An empirical and meta-analytic evaluation of the phoneme identification task. Journal of Experimental Psychology: Human Perception and Performance, 19(4), 699–725. doi:10.1037/0096-1523.19.4.699

38 Psychology Software Tools, Inc. (2012). E-Prime 2.0 [Computer software]. Retrieved from http://www.pstnet.com

39 R Core Team (2017). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.

40 Reinisch, E., Weber, A., & Mitterer, H. (2013). Listeners retune phoneme categories across languages. Journal of Experimental Psychology: Human Perception and Performance, 39(1), 75–86. doi:10.1037/a0027979

41 Reinisch, E., Wozny, D. R., Mitterer, H., & Holt, L. L. (2014). Phonetic category recalibration: What are the categories? Journal of Phonetics, 45, 91–105. doi:10.1016/j.wocn.2014.04.002

42 Scharenborg, O., & Janse, E. (2013). Comparing lexically guided perceptual learning in younger and older listeners. Attention, Perception & Psychophysics, 75(3), 525–536. doi:10.3758/s13414-013-0422-4

43 Scharenborg, O., Weber, A., & Janse, E. (2015). The role of attentional abilities in lexically guided perceptual learning by older listeners. Attention, Perception & Psychophysics, 77(2), 493–507. doi:10.3758/s13414-014-0792-2

44 Sohoglu, E., & Davis, M. H. (2016). Perceptual learning of degraded speech by minimizing prediction error. Proceedings of the National Academy of Sciences of the United States of America, 113(12), E1747–E1756. doi:10.1073/pnas.1523266113

45 Sóskuthy, M. (2015). Understanding change through stability: A computational study of sound change actuation. Lingua, 163, 40–60. doi:10.1016/j.lingua.2015.05.010

46 Sumner, M. (2011). The role of variation in the perception of accented speech. Cognition, 119(1), 131–136. doi:10.1016/j.cognition.2010.10.018

47 Theodore, R. M., Myers, E. B., & Lomibao, J. A. (2015). Talker-specific influences on phonetic category structure. The Journal of the Acoustical Society of America, 138(2), 1068–1078. doi:10.1121/1.4927489

48 Vaughn, C., & Kendall, T. (2018). Listener sensitivity to probabilistic conditioning of sociolinguistic variables: The case of (ING). Journal of Memory and Language, 103, 58–73. doi:10.1016/j.jml.2018.07.006

49 Vroomen, J., van Linden, S., Keetels, M., De Gelder, B., & Bertelson, P. (2004). Selective adaptation and recalibration of auditory speech by lipread information: Dissipation. Speech Communication, 44(1), 55–61. doi:10.1016/j.specom.2004.03.009

50 Warren, R. M. (1970). Perceptual restoration of missing speech sounds. Science, 167(3917), 392–393. doi:10.1126/science.167.3917.392

51 Weatherholtz, K. (2015). Perceptual learning of systemic cross-category vowel variation (Unpublished doctoral dissertation). The Ohio State University, Columbus, OH.

52 Wedel, A. (2012). Lexical contrast maintenance and the organization of sublexical contrast systems. Language and Cognition, 4(4), 319–355. doi:10.1515/langcog-2012-0018

53 Witteman, M. J., Weber, A., & McQueen, J. M. (2013). Foreign accent strength and listener familiarity with an accent codetermine speed of perceptual adaptation. Attention, Perception & Psychophysics, 75(3), 537–556. doi:10.3758/s13414-012-0404-y

54 Xie, X., Theodore, R. M., & Myers, E. B. (2017). More than a boundary shift: Perceptual adaptation to foreign-accented speech reshapes the internal structure of phonetic categories. Journal of Experimental Psychology: Human Perception and Performance, 43(1), 206–217. doi:10.1037/xhp0000285

  1. Formula used: accuracy ~ exposure item type × category typicality + (1 + exposure item type | subject) + (1 + category typicality | word).

  2. Formula used: accuracy ~ step × category typicality + (1 + step | subject) + (1 + step × category typicality | item).

  3. A reviewer points out that many of these excised CV sequences are themselves words (e.g., /si/ and /ʃi/ excised from the canonical and remapped end points of galaxy form the words see and she). Including fricative-rhotic sequences, which form words like sure and sir, 9 of the 20 CV sequences mapped to real CV words. Because they were excised from word-medial positions in multisyllabic words, these items retain coarticulation from the surrounding context and are considerably shorter (mean: 275 ms, SD: 54 ms) than their real-word equivalents (for example, the test items sigh and shy were 635 and 622 ms, respectively). This fact and the monotony of the experiment make it less likely that listeners processed these items as fully word-like (Cutler et al., 1987).
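The model formulas in footnotes 1 and 2 use lme4-style notation, and it may help to see the first one written out term by term. The block below is a sketch under assumptions rather than the authors' stated model: it assumes a binary accuracy outcome with a logit link (the footnote gives only the formula, not the link function), and it writes each predictor as a single coded variable for readability, although a factor with several levels would contribute several contrast columns. The symbols $E_i$, $T_i$, $u$, $v$, and $\Sigma$ are introduced here for illustration and do not appear in the article.

```latex
% Footnote 1 written out, assuming a binary outcome and a logit link.
% E_i: exposure item type; T_i: category typicality (one coded predictor each).
% s[i] and w[i]: the subject and the word associated with trial i.
\begin{align*}
\Pr(\text{accuracy}_i = 1) &= \operatorname{logit}^{-1}(\eta_i) \\
\eta_i &= \beta_0 + \beta_E E_i + \beta_T T_i + \beta_{ET} E_i T_i \\
       &\quad + u_{0,s[i]} + u_{E,s[i]} E_i \\
       &\quad + v_{0,w[i]} + v_{T,w[i]} T_i \\
(u_{0,s}, u_{E,s}) &\sim \mathcal{N}(\mathbf{0}, \Sigma_{\text{subject}}) \\
(v_{0,w}, v_{T,w}) &\sim \mathcal{N}(\mathbf{0}, \Sigma_{\text{word}})
\end{align*}
```

The second line holds the fixed effects (both main effects and their interaction), the third line the by-subject random intercept and slope, and the fourth the by-word random intercept and slope. Footnote 2 has the same shape, with step in place of exposure item type, a by-subject slope for step, and by-item slopes for step, category typicality, and their interaction.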

Received: 2017-04-03
Accepted: 2018-10-29
Published Online: 2019-05-26
Published in Print: 2019-05-01

© 2019 S. Karger AG, Basel
