Published by De Gruyter Mouton September 13, 2018

The interdependence of frequency, predictability, and informativity in the segmental domain

  • Uriel Cohen Priva and T. Florian Jaeger
From the journal Linguistics Vanguard


It has long been noted that language production seems to reflect a correlation between message redundancy and signal reduction. More frequent words and contextually predictable instances of words, for example, tend to be produced with shorter and less clear signals. The same tendency is observed in the language code (e.g. the phonological lexicon), where more frequent words and words that are typically contextually predictable tend to have fewer segments or syllables. Average predictability in context (informativity) also seems to be an important factor in understanding phonological alternations. What has received little attention so far is the relation between various information-theoretic indices – such as frequency, contextual predictability, and informativity. Although each of these indices has been associated with different theories about the source of the redundancy-reduction link, different indices tend to be highly correlated in natural language, making it difficult to tease apart their effects. We present a computational approach to this problem. We assess the correlations between frequency, predictability, and informativity, and assess when these correlations are likely to create spurious (null or non-null) effects depending on, for example, the amount of data available to the researcher.
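To make the three indices concrete, the following is a minimal sketch of how they might be computed for segments, not the authors' implementation. It assumes that the context of a segment is just the single preceding segment, that probabilities are estimated by raw relative counts with no smoothing, and that informativity is the average surprisal of a segment over all of its occurrences. The function name `segment_indices` and the toy input are hypothetical.

```python
import math
from collections import Counter, defaultdict


def segment_indices(segments):
    """Return (neg_log_freq, informativity) dictionaries keyed by segment.

    Assumption: the context of a segment is the single preceding segment,
    and all probabilities are unsmoothed relative counts.
    """
    total = len(segments)
    unigrams = Counter(segments)
    bigrams = Counter(zip(segments, segments[1:]))
    # How often each segment occurs as a context (i.e. is followed by something).
    context_counts = Counter(segments[:-1])

    # Frequency expressed as -log2 P(segment): larger values mean rarer segments.
    neg_log_freq = {s: -math.log2(c / total) for s, c in unigrams.items()}

    # Predictability is token-level: P(segment | previous segment).
    # Informativity is type-level: the mean of -log2 P(segment | context)
    # over every occurrence of the segment that has a context.
    surprisal_sum = defaultdict(float)
    token_count = defaultdict(int)
    for (prev, seg), n in bigrams.items():
        p = n / context_counts[prev]
        surprisal_sum[seg] += n * -math.log2(p)
        token_count[seg] += n

    informativity = {s: surprisal_sum[s] / token_count[s] for s in token_count}
    return neg_log_freq, informativity


# Toy usage on a hypothetical four-segment sequence.
neg_log_freq, informativity = segment_indices(list("abac"))
```

Because both indices are derived from the same counts, they tend to covary: a rare segment has few contexts in which it can be highly predictable. This shared dependence is the kind of correlation the abstract describes as making the effects hard to tease apart.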


We would like to thank Emily Gleason and Elinor Amit for their feedback.



Received: 2017-03-23
Accepted: 2018-06-07
Published Online: 2018-09-13

©2018 Walter de Gruyter GmbH, Berlin/Boston
