Ingo Plag, Gero Kunter, Sabine Lappe
December 18, 2007
This paper tests three factors that have been held to be responsible for the variable stress behavior of noun-noun constructs in English: argument structure, semantics, and analogy. In a large-scale investigation of some 4500 compounds extracted from the CELEX lexical database (Baayen et al. 1995), we show that traditional claims about noun-noun stress cannot be upheld. Argument structure plays a role only with synthetic compounds ending in the agentive suffix -er. The semantic categories and relations assumed in the literature to trigger rightward stress do not show the expected effects. As an alternative to the rule-based approaches, the data were modeled computationally and probabilistically using a memory-based analogical algorithm (TiMBL 5.1) and logistic regression, respectively. Both the probabilistic models and the analogical algorithm predict stress assignment more accurately than any of the rules proposed in the literature. Furthermore, the results of the analogical modeling suggest that the left and right constituents are the most important factors in compound stress assignment. This is in line with recent findings on the semi-regular behavior of compounds in other languages.