Barend Beekhuizen and Rens Bod
3 Automating construction work:
Data-Oriented Parsing and constructivist
accounts of language acquisition
Our world is filled with a vast array of objects and their relations and properties.
Human infants face the magnificent task of processing experiences with the outside
world in such a way that they can later respond adequately when similar but
non-identical experiences present themselves. We can call this processing
“learning”, and an important question studied throughout the cognitive
sciences is
Constructions at work or at rest?
We question whether Adele Goldberg fulfills her self-declared goal in
“Constructions at Work”, i.e. to develop a usage-based theory that “can produce
an open-ended number of novel utterances based on a finite amount of
input”. We point out converging trends in computational linguistics that
suggest formalizations of Construction Grammar. In particular, we go
into recent developments in Data-Oriented Parsing, such as U-DOP and
LFG-DOP, that produce an unlimited number of new utterances based on
pacity is needed. This article shows that this common wisdom is wrong. It starts
out by reviewing an exemplar-based syntactic model, known as Data-Oriented
Parsing, or DOP, which operates on a corpus of phrase-structure trees. While
this model is productive, it is inadequate from the point of view of grammatical
productivity. We therefore extend it to the more sophisticated linguistic representations
proposed by Lexical-Functional Grammar theory, resulting in the model known
as LFG-DOP, which does allow for meta-linguistic judgments of acceptability.
We show how DOP deals
Table of contents
Ronny Boogaart, Timothy Colleman, and Gijsbert Rutten
1 Constructions all the way everywhere: Four new directions in
constructionist research
I Methodological advances
Natalia Levshina and Kris Heylen
2 A radically data-driven Construction Grammar: Experiments with Dutch
causative constructions
Barend Beekhuizen and Rens Bod
3 Automating construction work: Data-Oriented Parsing and constructivist
accounts of language acquisition
II Construction morphology
Geert Booij and Matthias Hüning
4 Affixoids and constructional idioms
work. Suppose, for instance, that Rens Bod’s
“DOP” (data-oriented parsing) theory were broadly on the right lines
(see e.g. Bod and Scha 1996; Bod, Scha, and Sima’an 2003). It is not
my concern here whether DOP will ultimately prove to be correct or
incorrect (I would not presume to predict), but it surely cannot be rejected
out of hand as conceptually untenable? If not, that is one alternative
to generative grammar as a model of human language behaviour,
which seems to meet all the requirements for consideration as a serious
scientific theory, but which has
sentence production. Cognition 31: 163–186.
Bock, Kathryn and Helga Loebell (1990). Framing sentences. Cognition 35: 1–39.
Bod, Rens (1998). Beyond Grammar: An Experience-Based Theory of Language. Stanford: CSLI
— (2000). The storage vs. computation of three-word sentences. In Proceedings of AMLaP 2000.
— (2001). Sentence memory: the storage vs. computation of frequent sentences. In Proceedings
CUNY-2001 Conference on Sentence Processing, Philadelphia, PA.
Bod, Rens and Ronald Kaplan (2003). A data-oriented parsing model for lexical-functional gram-
rule with a probability estimate (Suppes 1972), reflecting the actual incidence
of each rule in a parsed corpus. Although this is not the only approach to
probabilistic grammar, it is the simplest probabilistic extension of a relatively
straightforward encoding of metrical phonology using a context-free grammar. Other
methods currently being investigated in the hope of even better results employ
a stochastic "strip grammar", as in Coleman and Pierrehumbert (1997), or
data-oriented parsing, as in Bod (1998).
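The relative-frequency estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the work cited: the toy treebank, its labels, and the names `TREEBANK`, `rules`, and `estimate` are all hypothetical, and a plain count-based estimate with no smoothing is assumed.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy parsed corpus. Each tree is a tuple (label, child, ...),
# where a child is either another tree or a plain string (a terminal).
TREEBANK = [
    ("S", ("NP", "she"), ("VP", ("V", "sleeps"))),
    ("S", ("NP", "he"), ("VP", ("V", "sees"), ("NP", "her"))),
]


def rules(tree):
    """Yield every context-free rule (lhs, rhs) used in a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate(treebank):
    """Estimate P(rule) as count(rule) / count(all rules with the same lhs)."""
    counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {r: Fraction(n, lhs_totals[r[0]]) for r, n in counts.items()}


PROBS = estimate(TREEBANK)

if __name__ == "__main__":
    for (lhs, rhs), p in sorted(PROBS.items()):
        print(f"{lhs} -> {' '.join(rhs)}  [{p}]")
```

Exact fractions are used instead of floats so that the probabilities of all rules sharing a left-hand side visibly sum to one. A realistic corpus would normally also need some treatment of unseen rules, which this sketch leaves out.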
Since we have already debugged the preceding grammar by
as mini-constructions (cf. Dąbrowska 2009) rather than assuming autonomous
syntactic representations. Secondly, drawing on the extensive literature
on multiword chunks, it proposes that the units that are combined are typically
larger than simple lexical items.
In this respect, the recycling model resembles “Data-Oriented Parsing”
(DOP), a computational approach developed by Rens Bod and his colleagues
(Scha et al. 1999; Bod 2006, 2009). Like the approach outlined here, DOP is based
on the principle that people store analyzed fragments of previously