Abstract
It has long been noted that language production seems to reflect a correlation between message redundancy and signal reduction. More frequent words and contextually predictable instances of words, for example, tend to be produced with shorter and less clear signals. The same tendency is observed in the language code (e.g. the phonological lexicon), where more frequent words and words that are typically contextually predictable tend to have fewer segments or syllables. Average predictability in context (informativity) also seems to be an important factor in understanding phonological alternations. What has received little attention so far is the relation between the various information-theoretic indices, such as frequency, contextual predictability, and informativity. Although each of these indices has been associated with different theories about the source of the redundancy-reduction link, the indices tend to be highly correlated in natural language, making it difficult to tease apart their effects. We present a computational approach to this problem. We measure the correlations between frequency, predictability, and informativity, and assess when these correlations are likely to create spurious (null or non-null) effects depending on, for example, the amount of data available to the researcher.
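To make the three indices concrete, the following is a minimal sketch of how they might be estimated from a tokenized corpus. It is not the procedure used in the paper. It assumes that the context of a word is simply the preceding word (a bigram model), that probabilities are unsmoothed relative frequencies, and that all indices are expressed in bits. The function name `information_indices` and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def information_indices(tokens):
    """Estimate frequency, contextual predictability, and informativity.

    Assumption: context = the immediately preceding word (bigram model),
    with unsmoothed maximum-likelihood estimates.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # How often each word occurs as a preceding context.
    contexts = Counter(tokens[:-1])
    total = len(tokens)

    # Frequency index: -log2 P(w), one value per word type.
    freq_surprisal = {w: -math.log2(n / total) for w, n in unigrams.items()}

    # Contextual predictability: -log2 P(w | c), one value per (c, w) pair.
    bigram_surprisal = {
        (c, w): -math.log2(n / contexts[c]) for (c, w), n in bigrams.items()
    }

    # Informativity: the average of -log2 P(w | c) over all tokens of w
    # that have a preceding context, i.e. -sum_c P(c | w) * log2 P(w | c).
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (c, w), n in bigrams.items():
        sums[w] += n * bigram_surprisal[(c, w)]
        counts[w] += n
    informativity = {w: sums[w] / counts[w] for w in counts}

    return freq_surprisal, bigram_surprisal, informativity


# Toy corpus, for illustration only.
tokens = "a b a b a c".split()
freq, pred, info = information_indices(tokens)
```

In this toy corpus, `c` is both rare and unpredictable after `a`, so it scores high on all three indices at once. This is the kind of correlation between indices that makes their individual effects hard to separate in natural language data.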
Acknowledgement
We would like to thank Emily Gleason and Elinor Amit for their feedback.
©2018 Walter de Gruyter GmbH, Berlin/Boston