Published by De Gruyter Mouton, September 11, 2018

Colorless green ideas do sleep furiously: gradient acceptability and the nature of the grammar

Jon Sprouse, Beracah Yankama, Sagar Indurkhya, Sandiway Fong and Robert C. Berwick
From the journal The Linguistic Review


In their recent paper, Lau, Clark, and Lappin (LCL) explore the idea that the probability of the occurrence of word strings can form the basis of an adequate theory of grammar (Lau, Jey H., Alexander Clark & Shalom Lappin. 2017. Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cognitive Science 41(5):1201–1241). To make their case, they present the results of correlating the output of several probabilistic models trained solely on naturally occurring sentences with the gradient acceptability judgments that humans report for ungrammatical sentences derived from roundtrip machine translation errors. In this paper, we first explore the logic of LCL's argument, both in the choice of evaluation metric (gradient acceptability) and in the choice of test data set (machine translation errors on random sentences from a corpus). We then present our own series of studies intended to allow for a better comparison between LCL's models and existing grammatical theories. We evaluate two of LCL's probabilistic models (trigrams and a recurrent neural network) against three data sets (taken from journal articles, a textbook, and Chomsky's famous colorless-green-ideas sentence), using three evaluation metrics (LCL's gradience metric, a categorical version of the metric, and the experimental-logic metric used in the syntax literature). Our results suggest that there are very real, measurable cost-benefit tradeoffs inherent in LCL's models across the three evaluation metrics. The gain in explanation of gradience (between 13% and 31% of gradience) is offset by losses in the other two metrics: a 43%–49% loss in coverage based on a categorical metric of explaining acceptability, and a loss of 12%–35% in explaining experimentally defined phenomena. This suggests that anyone wishing to pursue LCL's models as competitors to existing syntactic theories must either be satisfied with this tradeoff or modify the models to capture the phenomena that are not currently captured.


We would like to thank two anonymous reviewers for immensely helpful comments on an earlier draft. We would also like to thank audiences at GLOW 39, NELS 46, the University of Massachusetts, the University of Southern California, the City University of New York, and the Massachusetts Institute of Technology for stimulating discussions at various stages of this project. Finally, we would like to thank Jey Han Lau, Alexander Clark, and Shalom Lappin for opening this conversation, and for making the code for their models publicly accessible. All errors remain our own. This material is based upon work supported by the National Science Foundation under Grant No. BCS-1347115 to JS.


Adger, David. 2003. Core syntax. Oxford: Oxford University Press.

Bock, Kathryn & Carol A. Miller. 1991. Broken agreement. Cognitive Psychology 23:45–93. doi:10.1016/0010-0285(91)90003-7.

Bresnan, Joan. 2007. Is syntactic knowledge probabilistic? Experiments with the English dative alternation. In Sam Featherston & Wolfgang Sternefeld (eds.), Roots: Linguistics in search of its evidential base. Studies in Generative Grammar, 77–96. Berlin and New York: Mouton de Gruyter.

Chomsky, Noam. 1955/1975. The logical structure of linguistic theory. New York: Springer.

Chomsky, Noam. 1956. Three models for the description of language. IRE Transactions on Information Theory 2(3):113–124. doi:10.1109/TIT.1956.1056813.

Chomsky, Noam. 1986. Knowledge of language: Its nature, origins, and use. New York: Praeger.

Collins, Chris & Edward Stabler. 2016. A formalization of minimalist syntax. Syntax 19:43–78. doi:10.1111/synt.12117.

Elo, Arpad. 1978. The rating of chessplayers, past and present. New York: Arco Press.

Featherston, Sam. 2005. The decathlon model of empirical syntax. In M. Reis & S. Kepser (eds.), Linguistic evidence: Empirical, theoretical, and computational perspectives, 187–208. Berlin: Mouton de Gruyter. doi:10.1515/9783110197549.187.

Fodor, Jerry A. & Zenon Pylyshyn. 1988. Connectionism and cognitive architecture: A critical analysis. Cognition 28:3–71. doi:10.1016/0010-0277(88)90031-5.

Hunter, Tim & Chris Dyer. 2013. Distributions on Minimalist grammar derivations. Proceedings of the 13th Meeting on the Mathematics of Language.

Keller, Frank. 2000. Gradience in grammar: Experimental and computational aspects of degrees of grammaticality. Edinburgh: University of Edinburgh dissertation.

Lau, Jey H., Alexander Clark & Shalom Lappin. 2014. Measuring gradience in speakers’ grammaticality judgements. Proceedings of the 36th Annual Conference of the Cognitive Science Society, Quebec City, July.

Lau, Jey H., Alexander Clark & Shalom Lappin. 2015. Unsupervised prediction of acceptability judgements. Proceedings of the 53rd Annual Conference of the Association of Computational Linguistics, Beijing, July. doi:10.3115/v1/P15-1156.

Lau, Jey H., Alexander Clark & Shalom Lappin. 2017. Grammaticality, acceptability, and probability: A probabilistic view of linguistic knowledge. Cognitive Science 41(5):1201–1241. doi:10.1111/cogs.12414.

Mikolov, Tomas. 2012. Statistical Language Models Based on Neural Networks. Brno: Brno University of Technology dissertation.

Chomsky, Noam & George A. Miller. 1963. Introduction to the formal analysis of natural languages. In R. D. Luce, R. R. Bush & E. Galanter (eds.), Handbook of mathematical psychology, vol. 2, 269–321. Amsterdam: Wiley.

Pauls, Adam & Dan Klein. 2012. Large-scale syntactic language modeling with treelets. In Proceedings of the 50th annual meeting of the association for computational linguistics: Long papers-volume 1, 959–968. Stroudsburg PA, USA: Association for Computational Linguistics.

Pereira, Fernando. 2000. Formal grammar and information theory: together again? Philosophical Transactions of the Royal Society 358(1769):1239–1253. doi:10.1098/rsta.2000.0583.

Prince, Alan & Paul Smolensky. 1991. Connectionism and harmony theory in linguistics. Report CU-CS-600-92. Computer Science Department, University of Colorado at Boulder.

Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative grammar. RuCCS Technical Report 2, Rutgers University. Piscataway, NJ: Rutgers University Center for Cognitive Science. doi:10.1002/9780470756171.ch1.

Smolensky, Paul. 1988. The constituent structure of mental states: A reply to Fodor and Pylyshyn. The Southern Journal of Philosophy 26:137–161. doi:10.1007/978-94-011-3524-5_13.

Smolensky, Paul & Geraldine Legendre. 2006. The harmonic mind. Cambridge, MA: MIT Press.

Sorace, Antonella & Frank Keller. 2005. Gradience in linguistic data. Lingua 115:1497–1524. doi:10.1016/j.lingua.2004.07.002.

Sprouse, Jon & Diogo Almeida. 2012. Assessing the reliability of textbook data in syntax: Adger’s core syntax. Journal of Linguistics 48:609–652. doi:10.1017/S0022226712000011.

Sprouse, Jon, Carson T. Schütze & Diogo Almeida. 2013. A comparison of informal and formal acceptability judgments using a random sample from Linguistic Inquiry 2001-2010. Lingua 134:219–248. doi:10.1016/j.lingua.2013.07.002.

Townsend, David J. & Thomas G. Bever. 2001. Sentence comprehension: The integration of habits and rules. Cambridge, MA: MIT Press. doi:10.7551/mitpress/6184.001.0001.

Xiang, Ming, Brian Dillon & Colin Phillips. 2009. Illusory licensing effects across dependency types: ERP evidence. Brain and Language 108:40–55. doi:10.1016/j.bandl.2008.10.002.

Published Online: 2018-09-11
Published in Print: 2018-09-25

© 2018 Walter de Gruyter GmbH, Berlin/Boston
