October 16, 2009
Abstract
This article describes a method for calculating the ‘dependency distance’ between the words in a text – i.e. the number of words that separate each word from the word on which it depends syntactically – and reports the results of applying this method to a Chinese treebank. This study shows that Chinese dependencies tend strongly to be governor-final and that the mean dependency distance of words is much higher for Chinese than for other languages that have been studied, including English, German and Japanese. It is unclear whether this difference means that Chinese is syntactically more difficult to process.
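To make the measure concrete, the following is a minimal sketch of how a mean dependency distance could be computed for one sentence. It is not the article's own implementation. It assumes one common convention: each word carries the 1-based position of its governor (0 for the root), and the distance is the governor's position minus the dependent's position, so that adjacent words have a distance of 1 and a positive value means the governor follows its dependent. Under the abstract's literal wording (words that separate the pair), the value would be one less. The function names and the example sentence are illustrative only.

```python
from typing import List


def dependency_distances(heads: List[int]) -> List[int]:
    """Return signed dependency distances for one sentence.

    heads[i] is the 1-based position of the governor of word i + 1;
    0 marks the root, which has no governor and is skipped.
    Assumed convention: distance = governor position - dependent position.
    """
    distances = []
    for position, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root depends on nothing
        distances.append(head - position)
    return distances


def mean_dependency_distance(heads: List[int]) -> float:
    """Mean of the absolute distances over all non-root words."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(abs(d) for d in distances) / len(distances)


def governor_final_share(heads: List[int]) -> float:
    """Share of dependencies in which the governor follows the dependent."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(1 for d in distances if d > 0) / len(distances)


# Illustrative sentence: "The dog chased the cat"
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
example_heads = [2, 3, 0, 5, 3]
print(dependency_distances(example_heads))
print(mean_dependency_distance(example_heads))
print(governor_final_share(example_heads))
```

Averaging the absolute values over a whole treebank rather than a single sentence would give a corpus-level figure of the kind the abstract compares across languages, and the sign of each distance is what separates governor-final from governor-initial dependencies.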
October 16, 2009
Abstract
It is desirable to combine corpus linguistic and psycholinguistic methods and data types to illuminate the phenomenon of word co-occurrence. However, it is not only the cases where parallels between corpus linguistic and psycholinguistic findings are found that prove significant. In this study, it is shown that there are fewer parallels, and far more differences, between corpus collocations and word association responses than has been previously assumed. For instance, word association responses are typically semantically related to the stimulus word and belong to the lexical word classes, neither of which is typically true of collocates of the same node word in a corpus. The conclusion drawn is therefore that the word association task does not reflect authentic language production, but should rather be seen as tapping into the semantic information of the mental lexicon only.
October 16, 2009
Abstract
This paper investigates the prominence patterns of nominal triconstituent compounds in English. The standard assumption for such NNN compounds is that the branching direction is responsible for stress assignment. In left-branching compounds, i.e. those of the structure [[NN] N], the leftmost noun is assigned highest prominence, whereas in right-branching compounds, i.e. [N [NN]], the second noun is the most prominent one (the so-called ‘Lexical Category Prominence Rule’, e.g. Liberman and Prince, Linguistic Inquiry 8: 249–336, 1977). This assumption has hardly ever been tested empirically in detail. Using acoustic data from several hundred pertinent compounds from the Boston University Radio Speech Corpus, we found that the predictions of the Lexical Category Prominence Rule are borne out for the majority of the data. However, a considerable number of compounds do not behave as predicted and violate the Lexical Category Prominence Rule. The analysis of the aberrant cases shows that prominence assignment to triconstituent compounds is also governed by factors other than branching. These factors are suggested to be the same as those responsible for the assignment of leftward vs. rightward stress to biconstituent compounds.
October 16, 2009
Abstract
In this paper, we describe some theoretical and practical issues raised during the construction of the Basque Dependency Treebank (BDT), the syntactic annotation of EPEC (Reference Corpus for the Processing of Basque). EPEC is a 300,000-word corpus of standard written Basque whose purpose is to serve as a training corpus for the development and improvement of several NLP (Natural Language Processing) tools for Basque. BDT will be the first corpus for the Basque language tagged at the syntactic level. We also present the dependency-based annotation hierarchy that we have established for the syntactic tagging. Decisions made during the design of the annotation hierarchy are based on the description of Basque grammar made by Euskaltzaindia (Academy for the Basque Language). When describing dependency relations, we consider lexical units as syntactic heads. This will open up a way for us to work with semantics.
October 16, 2009