Barend Beekhuizen and Rens Bod 3 Automating construction work: Data-Oriented Parsing and constructivist accounts of language acquisition 1 Introduction Our world is filled with a vast array of objects and their relations and properties. Human infants face the magnificent task of processing experiences with the outside world in such a way that they can later on respond in an adequate manner when similar, but non-identical experiences present themselves. We can call this processing “learning” and an important question studied throughout the cognitive sciences is

Constructions at work or at rest? RENS BOD* Abstract We question whether Adele Goldberg fulfills her self-declared goal in “Constructions at Work”, i.e. to develop a usage-based theory that “can produce an open-ended number of novel utterances based on a finite amount of input”. We point out converging trends in computational linguistics that suggest formalizations of Construction Grammar. In particular, we go into recent developments in Data-Oriented Parsing, such as U-DOP and LFG-DOP, that produce an unlimited number of new utterances based on a finite

- pacity is needed. This article shows that this common wisdom is wrong. It starts out by reviewing an exemplar-based syntactic model, known as Data-Oriented Parsing, or DOP, which operates on a corpus of phrase-structure trees. While this model is productive, it is inadequate from the point of grammatical productivity. We therefore extend it to the more sophisticated linguistic representations proposed by Lexical-Functional Grammar theory, resulting in the model known as LFG-DOP, which does allow for meta-linguistic judgments of acceptability. We show how DOP deals

Table of contents Ronny Boogaart, Timothy Colleman, and Gijsbert Rutten 1 Constructions all the way everywhere: Four new directions in constructionist research 1 I Methodological advances Natalia Levshina and Kris Heylen 2 A radically data-driven Construction Grammar: Experiments with Dutch causative constructions 17 Barend Beekhuizen and Rens Bod 3 Automating construction work: Data-Oriented Parsing and constructivist accounts of language acquisition 47 II Construction morphology Geert Booij and Matthias Hüning 4 Affixoids and constructional idioms 77 Alan Scott

diffusion 181, 191–202 constructional idioms 77–105 constructional networks 1–2, 4, 6, 141–175, 239–246, 265–266, 325, 338, 347, 356 constructionalization 239–243, 353 contrastive 8–9, 259–267, 275–276, 278 conventionality 193, 252–281 Data-Oriented Parsing 47–72 debonding 82, 87 deflection 110, 118, 159, 166 degeneracy 141–143, 159, 163, 167, 170, 172–175 degrammaticalization 82, 227 degree modifiers: see intensifiers degree of clause integration: see clause integration dialogue 322, 338–350 – dialogical construction 312–313 distributional hypothesis 22 ditransitive 144

.4 Denwood, 23.4 Ritter, 23.4 Cyran Consonant clusters Dental-posterior 23.4 Ritter Consonant clusters Homogeneous voicing 23.4 Ritter Consonant clusters Non-homogeneous voicing 23.4 Ritter Constraints (in phonology) 23.4 Van der Hulst Control 23.2 Lightfoot Coordination 23.1 Hirata Copy and delete 23.2 Franks Cue-based acquisition 23.2 Lightfoot CVCV 23.4 Cyran Cyclicity 23.4 Denwood Data-oriented parsing 23.3 Bod Declarative Phonology 23.4 Van der Hulst Dependency Phonology (DP) 23.4 Van der Hulst Dependent 23.4 Van der Hulst, 23.4 Ritter, 23.4 Denwood Domain final 23

work. Suppose, for instance, that Rens Bod’s “DOP” (data-oriented parsing) theory were broadly on the right lines (see e.g., Bod and Scha 1996, Bod, Scha, and Sima’an 2003). It is not my concern here whether DOP will ultimately prove to be correct or incorrect (I would not presume to predict), but it surely cannot be rejected out of hand as conceptually untenable? If not, that is one alternative to generative grammar as a model of human language behaviour, which seems to meet all the requirements for consideration as a serious scientific theory, but which has

sentence production. Cognition 31: 163–186. Bock, Kathryn and Helga Loebell (1990). Framing sentences. Cognition 31: 1–39. Bod, Rens (1998). Beyond Grammar: An Experience-Based Theory of Language. Stanford: CSLI Publications. Bod, Rens (2000). The storage vs. computation of three-word sentences. In Proceedings of AMLaP 2000. Bod, Rens (2001). Sentence memory: the storage vs. computation of frequent sentences. In Proceedings CUNY-2001 Conference on Sentence Processing, Philadelphia, PA. Bod, Rens and Ronald Kaplan (2003). A data-oriented parsing model for lexical-functional grammar. In

rule with a probability estimate (Suppes 1972), reflecting the actual incidence of each rule in a parsed corpus. Although this is not the only approach to probabilistic grammar, it is the simplest probabilistic extension of a relatively straightforward encoding of metrical phonology using a context-free grammar. Other methods currently being investigated in the hope of even better results employ a stochastic "strip grammar", as in Coleman and Pierrehumbert (1997), or data-oriented parsing, as in Bod (1998). Since we have already debugged the preceding grammar by

as mini-constructions (cf. Dąbrowska 2009) rather than assuming autonomous syntactic representations. Secondly, drawing on the extensive literature on multiword chunks, it proposes that the units that are combined are typically larger than simple lexical items. In this respect, the recycling model resembles “Data-Oriented Parsing” (DOP), a computational approach developed by Rens Bod and his colleagues (Scha et al. 1999; Bod 2006, 2009). Like the approach outlined here, DOP is based on the principle that people store analyzed fragments of previously