
Exposure and emergence in usage-based grammar: computational experiments in 35 languages

  • Jonathan Dunn
From the journal Cognitive Linguistics

Abstract

This paper uses computational experiments to explore the role of exposure in the emergence of construction grammars. While usage-based grammars are hypothesized to depend on a learner’s exposure to actual language use, the mechanisms of such exposure have been studied for only a few constructions in isolation. This paper experiments with (i) the growth rate of the constructicon, (ii) the convergence rate of grammars exposed to independent registers, and (iii) the rate at which constructions are forgotten when they have not been recently observed. These experiments show that the lexicon grows more quickly than the grammar and that the growth rate of the grammar is not dependent on the growth rate of the lexicon. At the same time, register-specific grammars converge onto more similar constructions as the amount of exposure increases. This means that the influence of specific registers becomes less important as exposure increases. Finally, the rate at which constructions are forgotten when they have not been recently observed mirrors the growth rate of the constructicon. This paper thus presents a computational model of usage-based grammar that includes both the emergence and the unentrenchment of constructions.


Corresponding author: Jonathan Dunn, Department of Linguistics and the New Zealand Institute for Language, Brain and Behaviour, University of Canterbury, Christchurch, New Zealand, E-mail:

Appendix 1: Glossary of computational terms

Potential Construction. Given an observed string, the learner could hypothesize many competing structural analyses. From the perspective of CxG, these structures could involve boundaries, slot types, and slot constraints. Part of modelling the emergence of such structures is to hypothesize what possible constructions are being observed in a particular set of input.

Hypothesis Space. A constructicon in this model is a set of constructions, with each construction a sequence of slot-constraints. Given a specific language, the set of possible constructions depends both on the formation of categories (e.g., what semantic or syntactic categories are recognized) as well as the usage observed in the corpora. Richer representations like CxG have a larger hypothesis space because they contain more potential structures.

ΔP. Within the CxG model, the distribution of sequences is quantified using association measures, specifically ΔP, which captures the probability of an outcome given a cue. In this case, the cue is a particular linguistic sequence and the outcome is the following sequence. For example, sequences with many possible following items will have lower values.

Skip-Gram Negative-Sampling. The semantic domains for each language are formed using word embeddings trained using the sgns method (within the fastText framework). This method essentially trains a logistic regression classifier to predict the most likely context words for each target word in a corpus. Work in NLP has shown that such embeddings are closely related to matrices of association measures.

Slot-Constraint. Constructions are based on symbolic constraints, in which an utterance is produced by some series of specific slot-constraints. A slot in this sense is a segmentation (here, word-level segmentation) and a constraint is the type of category used to formulate the generalization.

Slot-Fillers. In this model, there are three types of slot-fillers: lexical, syntactic, and semantic. While CxG could formulate joint semantic-syntactic constraints, the constructions learned in this paper show that implicit relationships between constraints emerge during learning.

Minimum Description Length (MDL). Given a grammar, we evaluate its quality by measuring how well it fits or describes a particular test corpus. For language models, the measure most often used is perplexity, under which a better model provides a more compact description of the data. MDL refines perplexity by taking model complexity into account as well. The resulting trade-off is a balance between memory (i.e., storing all possible constructions) and computation (i.e., relying on fully compositional phrase structure rules).

Starting Node. Within the beam-search algorithm for finding potential constructions, each hypothesized structure must have boundaries. The starting node refers to the beginning of such a structure, the point from which the search begins.

Candidate Stack. From each starting node, a very large number of potential representations (potential constructions) is possible. Rather than maintaining all such candidates, we store the best options in short-term memory and then evaluate and prune these candidates using beam search.


Received: 2021-09-24
Accepted: 2022-11-04
Published Online: 2022-12-09
Published in Print: 2022-11-25

© 2022 Walter de Gruyter GmbH, Berlin/Boston
