
Folia Linguistica

Acta Societatis Linguisticae Europaeae

Editor-in-Chief: Fischer, Olga / Norde, Muriel

Volume 50, Issue 2


The cognitive plausibility of statistical classification models: Comparing textual and behavioral evidence

Jane Klavan / Dagmar Divjak
  • School of Languages & Cultures, University of Sheffield, Jessop West, 1 Upper Hanover Street, Sheffield S3 7RA, UK
Published Online: 2016-11-08 | DOI: https://doi.org/10.1515/flin-2016-0014


Usage-based linguistics abounds with studies that use statistical classification models to analyze either textual corpus data or behavioral experimental data. Yet, before we can draw conclusions from statistical models of empirical data that we can feed back into cognitive linguistic theory, we need to assess whether the text-based models are cognitively plausible and whether the behavior-based models are linguistically accurate. In this paper, we review four case studies that evaluate statistical classification models of richly annotated linguistic data by explicitly comparing the performance of a corpus-based model to the behavior of native speakers. The data come from four different languages (Arabic, English, Estonian, and Russian) and pertain to both lexical and syntactic near-synonymy. We show that behavioral evidence is needed in order to fine-tune and improve statistical models built on data from a corpus. We argue that methodological pluralism is the key to a cognitively realistic linguistic theory.

Keywords: statistical modeling; near-synonymy; corpus linguistics; (psycho)linguistic experiments



About the article

Received: 2015-06-01

Revised: 2015-11-17

Revised: 2016-02-29

Accepted: 2016-05-31

Published Online: 2016-11-08

Published in Print: 2016-11-01

Citation Information: Folia Linguistica, Volume 50, Issue 2, Pages 355–384, ISSN (Online) 1614-7308, ISSN (Print) 0165-4004, DOI: https://doi.org/10.1515/flin-2016-0014.


©2016 by De Gruyter Mouton.

