This paper explores the relation between the familiarity of grammatical subjects in Mandarin Chinese and syntactic distance. We propose two hypotheses: (1) contextually given subjects are more likely than contextually new subjects to be used with long intervening adverbials; and (2) subjects with higher word frequency are more likely than those with lower word frequency to be followed by long adverbials. Data from two Mandarin Chinese treebanks support the first hypothesis but not the second. Cognitively, this is probably because contextual givenness, which reflects familiarity, may lessen the effect of locality: it raises the activation level (the accessibility) of the subject and so renders the subject less susceptible to the memory decay caused by the adverbials intervening between it and the predicate verb. Subjects are usually the starting point of a sentence, which has a default given–new information structure. Therefore, when choosing a subject while organizing a sentence, we are concerned predominantly with its information status (contextual givenness) relative to the previous context, which may partly account for the observed lack of a relation between word frequency and the use of adverbials. A sentence is structured according to the information status of its subject, not the subject's word frequency.
This paper applies the notion of the linguistic motif to the linear arrangement of dependency distance (DD) in Indo-European languages and examines its implications for language typology. We introduce a series of DD-motifs whose values decrease, increase, or remain equal in magnitude. We first describe the frequency distribution of DD-motifs and observe a preference for decreasing DD-motifs in human languages. We then investigate the role of DD-motifs in controlling syntactic complexity. The results show that serializing DD values in the same order of magnitude can restrict structural complexity to some extent, and that it may be a useful means of achieving DD minimization in natural languages. Finally, we explore the value of DD-motifs in language typology. Our classification experiments reveal that adding the harmonic property and DD-motifs to dependency direction can improve the classification results.
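As a rough illustration of the idea, the sketch below computes DD values from head positions and groups them into runs of decreasing, increasing, or equal values. It rests on assumptions not stated in the abstract: DD is taken as the absolute difference between the linear positions of a word and its head, the root is skipped, and a greedy left-to-right segmentation into maximal same-trend runs stands in for the paper's motif definition, which may differ. The function names and the sample inputs are hypothetical.

```python
def dependency_distances(heads):
    """Return absolute dependency distances for one sentence.

    Assumption: heads[i] is the 1-based position of the head of word i+1,
    and 0 marks the root, which has no dependency distance.
    """
    return [abs(h - (i + 1)) for i, h in enumerate(heads) if h != 0]


def trend(a, b):
    """Label the step from a to b as increasing, decreasing, or equal."""
    if b > a:
        return "increasing"
    if b < a:
        return "decreasing"
    return "equal"


def trend_runs(dds):
    """Greedily split a DD sequence into maximal runs sharing one trend.

    This is an illustrative segmentation, not necessarily the paper's.
    A trailing value with no successor is labeled "single".
    """
    runs = []
    i, n = 0, len(dds)
    while i < n:
        if i == n - 1:
            runs.append(("single", [dds[i]]))
            break
        t = trend(dds[i], dds[i + 1])
        j = i + 1
        # Extend the run while the next step keeps the same trend.
        while j + 1 < n and trend(dds[j], dds[j + 1]) == t:
            j += 1
        runs.append((t, dds[i:j + 1]))
        i = j + 1
    return runs


# Hypothetical head list for a five-word sentence (third word is the root).
sample_heads = [2, 3, 0, 5, 3]
sample_dds = dependency_distances(sample_heads)
sample_runs = trend_runs(sample_dds)
```

Counting how often each trend label occurs across a treebank would give a frequency distribution of the kind the abstract describes, under the same assumptions.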
Generalized valency patterns cover both the obligatory arguments and the optional adjuncts of valency carriers in authentic texts. Generalized valency is defined as the number of all dependents of a valency carrier. With a motif defined as the longest sequence of non-decreasing values, this paper examines the motifs of generalized valencies and their lengths in two news-genre dependency treebanks, one Chinese and one English. Both are found to follow the right-truncated modified Zipf-Alekseev distribution. In addition, the Hyperpoisson model captures the interrelation between motif lengths and length frequencies. These findings validate valency motifs as basic language entities and as results of a diversification process.
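The motif definition above can be sketched in a few lines of code. The example assumes that a text has already been reduced to a sequence of generalized valencies (one dependent count per word, in text order) and that motifs are formed by cutting the sequence wherever a value is smaller than the one before it. The function name and the sample sequence are hypothetical, and no distribution fitting is attempted here.

```python
from collections import Counter


def segment_motifs(values):
    """Split a sequence into maximal runs of non-decreasing values."""
    motifs = []
    current = []
    for v in values:
        # A drop in value closes the current motif and starts a new one.
        if current and v < current[-1]:
            motifs.append(tuple(current))
            current = []
        current.append(v)
    if current:
        motifs.append(tuple(current))
    return motifs


# Hypothetical valency sequence: number of dependents of each word.
valencies = [2, 3, 1, 1, 4, 0, 2]
motifs = segment_motifs(valencies)

# Frequency of each motif and of each motif length.
motif_counts = Counter(motifs)
length_counts = Counter(len(m) for m in motifs)
```

The two counters correspond to the quantities the abstract models: motif frequencies and the frequencies of motif lengths.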