The Linguistic Review

Editor-in-Chief: van der Hulst, Harry

Volume 35, Issue 3


Colorless green ideas do sleep furiously: gradient acceptability and the nature of the grammar

Jon Sprouse / Beracah Yankama / Sagar Indurkhya / Sandiway Fong / Robert C. Berwick
Published Online: 2018-09-11 | DOI: https://doi.org/10.1515/tlr-2018-0005


In their recent paper, Lau, Clark, and Lappin (henceforth LCL) explore the idea that the probability of the occurrence of word strings can form the basis of an adequate theory of grammar (Lau, Jey H., Alexander Clark & Shalom Lappin. 2017. Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cognitive Science 41(5):1201–1241). To make their case, they correlate the output of several probabilistic models, trained solely on naturally occurring sentences, with the gradient acceptability judgments that humans report for ungrammatical sentences derived from roundtrip machine translation errors. In this paper, we first explore the logic of LCL's argument, both in the choice of evaluation metric (gradient acceptability) and in the choice of test data set (machine translation errors on random sentences from a corpus). We then present our own series of studies, intended to allow for a better comparison between LCL’s models and existing grammatical theories. We evaluate two of LCL’s probabilistic models (a trigram model and a recurrent neural network) against three data sets (taken from journal articles, a textbook, and Chomsky’s famous colorless-green-ideas sentence), using three evaluation metrics (LCL’s gradience metric, a categorical version of that metric, and the experimental-logic metric used in the syntax literature). Our results suggest that there are real, measurable cost-benefit tradeoffs inherent in LCL’s models across the three evaluation metrics. The gain in explanation of gradience (between 13% and 31% of gradience) is offset by losses on the other two metrics: a 43%–49% loss in coverage on the categorical metric of explaining acceptability, and a 12%–35% loss in explaining experimentally defined phenomena. This suggests that anyone wishing to pursue LCL’s models as competitors with existing syntactic theories must either be satisfied with this tradeoff or modify the models to capture the phenomena that are not currently captured.
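
The general shape of the gradience evaluation described above (score sentences with a probabilistic model trained on naturally occurring text, then correlate those scores with human ratings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy corpus, the add-one smoothing, the length-normalized log probability, the invented ratings, and the function names `train_trigram`, `mean_logprob`, and `pearson` are hypothetical and are not taken from LCL's models or from the studies reported here.

```python
import math
from collections import Counter

PAD = ["<s>", "<s>"]  # two start symbols so the first word has a trigram context

def train_trigram(sentences):
    """Count trigrams and their bigram contexts; return counts and vocabulary size."""
    tri, bi, vocab = Counter(), Counter(), set()
    for s in sentences:
        toks = PAD + s.split() + ["</s>"]
        vocab.update(toks)
        for i in range(2, len(toks)):
            tri[tuple(toks[i - 2:i + 1])] += 1
            bi[tuple(toks[i - 2:i])] += 1
    return tri, bi, len(vocab)

def mean_logprob(sentence, tri, bi, v):
    """Length-normalized log probability under an add-one smoothed trigram model.
    (The normalization is an illustrative choice, not necessarily LCL's measure.)"""
    toks = PAD + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(2, len(toks)):
        num = tri[tuple(toks[i - 2:i + 1])] + 1
        den = bi[tuple(toks[i - 2:i])] + v
        total += math.log(num / den)
    return total / (len(toks) - 2)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy training corpus (hypothetical, stands in for naturally occurring text).
corpus = ["the dog sleeps", "the cat sleeps", "a dog barks"]
tri, bi, v = train_trigram(corpus)

# Test items with invented acceptability ratings, for illustration only.
items = ["the dog sleeps", "the cat barks", "sleeps dog the"]
ratings = [6.5, 5.0, 1.5]

scores = [mean_logprob(s, tri, bi, v) for s in items]
print(pearson(scores, ratings))
```

A categorical evaluation would instead ask only whether the model orders an acceptable sentence above its unacceptable counterpart, which is a different question from how closely the scores track the size of the rating differences.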

Keywords: acceptability; grammaticality; probability; gradience; n-grams; recurrent neural networks


  • Adger, David. 2003. Core syntax. Oxford: Oxford University Press.

  • Bock, Kathryn & Carol A Miller. 1991. Broken agreement. Cognitive Psychology 23:45–93.

  • Bresnan, Joan. 2007. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Sam Featherston & Wolfgang Sternefeld (eds.), Roots: Linguistics in search of its evidential base. Studies in Generative Grammar, 77–96. Berlin and New York: Mouton de Gruyter.

  • Chomsky, Noam. 1955/1975. The logical structure of linguistic theory. New York: Springer.

  • Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions on Information Theory 2(3):113–124.

  • Chomsky, Noam. 1986. Knowledge of language: Its nature, origins, and use. New York: Praeger.

  • Chomsky, Noam & George A Miller. 1963. Introduction to the formal analysis of natural languages. In R. D. Luce, R. R. Bush & E. Galanter (eds.), Handbook of mathematical psychology, vol. 2, 269–321. Amsterdam: Wiley.

  • Collins, Chris & Edward Stabler. 2016. A formalization of minimalist syntax. Syntax 19:43–78.

  • Elo, Arpad. 1978. The rating of chessplayers, past and present. New York: Arco Press.

  • Featherston, Sam. 2005. The decathlon model of empirical syntax. In M. Reis & S. Kepser (eds.), Linguistic evidence: Empirical, theoretical, and computational perspectives, 187–208. Berlin: Mouton de Gruyter.

  • Fodor, Jerry A & Zenon Pylyshyn. 1988. Connectionism and cognitive architecture: A critical analysis. Cognition 28:3–71.

  • Hunter, Tim & Chris Dyer. 2013. Distributions on Minimalist grammar derivations. Proceedings of the 13th Meeting on the Mathematics of Language.

  • Keller, Frank. 2000. Gradience in grammar: Experimental and computational aspects of degrees of grammaticality. Edinburgh: University of Edinburgh dissertation.

  • Lau, Jey H., Alexander Clark & Shalom Lappin. 2014. Measuring gradience in speakers’ grammaticality judgements. Proceedings of the 36th Annual Conference of the Cognitive Science Society, Quebec City, July.

  • Lau, Jey H., Alexander Clark & Shalom Lappin. 2015. Unsupervised prediction of acceptability judgements. Proceedings of the 53rd Annual Conference of the Association of Computational Linguistics, Beijing, July.

  • Lau, Jey H., Alexander Clark & Shalom Lappin. 2017. Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cognitive Science 41(5):1201–1241.

  • Mikolov, Tomas. 2012. Statistical Language Models Based on Neural Networks. Brno: Brno University of Technology dissertation.

  • Pauls, Adam & Dan Klein. 2012. Large-scale syntactic language modeling with treelets. In Proceedings of the 50th annual meeting of the association for computational linguistics: Long papers-volume 1, 959–968. Stroudsburg PA, USA: Association for Computational Linguistics.

  • Pereira, Fernando. 2000. Formal grammar and information theory: together again? Philosophical Transactions of the Royal Society 358(1769):1239–1253. doi:10.1098/rsta.2000.0583.

  • Prince, Alan & Paul Smolensky. 1991. Connectionism and harmony theory in linguistics. Report CU-CS-600-92. Computer Science Department, University of Colorado at Boulder.

  • Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative grammar. RuCCS Technical Report 2, Rutgers University. Piscataway, NJ: Rutgers University Center for Cognitive Science.

  • Smolensky, Paul. 1988. The constituent structure of mental states: A reply to Fodor and Pylyshyn. The Southern Journal of Philosophy 26:137–161.

  • Smolensky, Paul & Geraldine Legendre. 2006. The harmonic mind. Cambridge, MA: MIT Press.

  • Sorace, Antonella & Frank Keller. 2005. Gradience in linguistic data. Lingua 115:1497–1524.

  • Sprouse, Jon & Diogo Almeida. 2012. Assessing the reliability of textbook data in syntax: Adger’s core syntax. Journal of Linguistics 48:609–652.

  • Sprouse, Jon, Carson T Schütze & Diogo Almeida. 2013. A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001–2010. Lingua 134:219–248.

  • Townsend, David J. & Thomas G Bever. 2001. Sentence comprehension: The integration of habits and rules. Cambridge, MA: MIT Press.

  • Xiang, Ming, Brian Dillon & Colin Phillips. 2009. Illusory licensing effects across dependency types: ERP evidence. Brain and Language 108:40–55.

About the article

Published Online: 2018-09-11

Published in Print: 2018-09-25

Citation Information: The Linguistic Review, Volume 35, Issue 3, Pages 575–599, ISSN (Online) 1613-3676, ISSN (Print) 0167-6318, DOI: https://doi.org/10.1515/tlr-2018-0005.

© 2018 Walter de Gruyter GmbH, Berlin/Boston.
