Head and dependent marking and dependency length in possessive noun phrases: a typological study of morphological and syntactic complexity

Abstract The interaction of morphosyntactic features has been of great interest in research on linguistic complexity. In this paper we approach such interactions in possessive noun phrases. First, we study the interaction of head marking and dependent marking in this domain with typological feature data and with multilingual corpus data. The data suggest a clear inverse relationship between head and dependent marking in possessive noun phrases in terms of complexity. This result provides evidence for complexity trade-offs and illustrates the productive integration of typological and corpus-based approaches. Second, we explore whether zero versus overt morphological marking, as a measure of morphological complexity, affects dependency length, as a measure of syntactic complexity. Data from multilingual corpora suggest that there is no cross-linguistic trend between these measures in possessive noun phrases.

Figure: The 24 areas of the AUTOTYP on a world map (Bickel et al. 2022; used under CC-BY 4.0 license).
We first ran the models using the lme4 package (Bates et al. 2015) in R (R Core Team 2022). However, this resulted either in singular fits or in models that did not converge. For this reason, we used generalized linear mixed effects modeling with maximum penalized likelihood via the package blme in R (Chung et al. 2013). This package makes it possible to use a prior and estimate with posterior modes without going into fully Bayesian inference with posterior means (or medians) from simulations; this improves convergence and computational efficiency and pulls the correlation terms between random intercepts and slopes away from perfect correlation (cf. Dorie 2014: 93-94). For priors we used the weak (default) priors of blme (Chung et al. 2015). The models did not converge with the default optimizers, and we therefore used BOBYQA as the optimizer in all models; this resulted in model convergence in all situations. The model specification in the lme4 notation was as in (1):

(1) head_marking ~ dependent_marking + (1+dependent_marking|area) + (1|stock/language)

P-values were drawn with a likelihood ratio test and validated with a parametric bootstrap method using 2,000 simulations (Halekoh and Højsgaard 2014). This method returns the fraction of simulated likelihood ratio test values that are larger than or equal to the observed likelihood ratio test value. The model's explanatory power was computed separately for the whole model (conditional R²) and for the fixed effects alone (marginal R²) via the package MuMIn (Barton 2020). The algorithm is based on Nakagawa and Schielzeth (2013) and has been further developed by Johnson (2014) and by Nakagawa, Johnson and Schielzeth (2017). All graphics in both case studies were created in R using the packages cowplot (Wilke 2020), ggplot2 (Wickham 2016), scales (Wickham and Seidel 2020), and sjPlot (Lüdecke 2018).
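The final step of the parametric bootstrap is simply the proportion of simulated likelihood ratio statistics at least as large as the observed one. A minimal Python sketch of that step (the analysis itself was run in R; the function name and inputs here are illustrative):

```python
def bootstrap_p_value(observed_lrt, simulated_lrts):
    """Fraction of simulated LRT statistics >= the observed statistic.

    `observed_lrt` is the likelihood ratio statistic from the fitted
    model; `simulated_lrts` are statistics from refits to data simulated
    under the null model (2,000 draws in the study).
    """
    n_extreme = sum(1 for s in simulated_lrts if s >= observed_lrt)
    return n_extreme / len(simulated_lrts)
```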
Table S1a presents the model's coefficients and Table S1b its random structure. The variances of the random terms are not too close to zero, and the correlation between the random intercept and the random slope over area is not close to perfect either. The choice of the response and the predictor variables was diachronically, and thus plausibly also causally, motivated: there is a diachronic trend of dependent marking developing into head marking but not the other way round (Nichols 1986). Synchronically the choice is somewhat arbitrary: either variable could have been selected as the response. Accordingly, we built a competing model by selecting dependent marking as the response and head marking as a predictor, keeping the random structure the same. The results were almost identical: head marking had a significant negative effect on dependent marking (coefficient = −2.96 ± 0.36; χ²(1) = 41.4; p < 0.001). Adding head marking to the null model lowered the Akaike Information Criterion (AIC) by 39, just slightly less than the corresponding reduction in the main model (40). The model's marginal R² was 0.30, slightly lower than the marginal R² of 0.31 in the main model. This result suggests that there is a strong mutual relationship between head and dependent marking. Because of the diachronic relationship between the two, we focus on reporting the results for head marking as the response variable.
In these models we accounted for language-internal variation in the morphological marking of possessive constructions. As discussed in Section 2 of the article, the datapoints were thus constructions with different patterns of morphological marking. To contrast this with a different way of counting the datapoints, we also modeled the data by focusing only on the default morphological pattern of each language, so that each language contributed only one datapoint. For this model we used the column IsDefaultLocusOfMarking in the file LocusOfMarkingPerMicrorelation.csv to select only those patterns analyzed as default in the AUTOTYP database (value TRUE). The models were otherwise built in the same way: head marking as the response, dependent marking as a predictor, a random intercept for stocks, and random slopes for dependent marking over areas. It was not possible to model random slopes for dependent marking over stocks because the number of parameters would have exceeded the number of datapoints.
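The selection of default patterns can be sketched as a simple filter (illustrative Python, not the authors' R workflow; the column name IsDefaultLocusOfMarking comes from the AUTOTYP file named in the text, while the boolean encoding of the values is an assumption):

```python
def default_patterns(rows):
    """Keep only marking patterns flagged as a language's default.

    Each row is a dict-like record read from the AUTOTYP file
    LocusOfMarkingPerMicrorelation.csv; rows with
    IsDefaultLocusOfMarking == True are retained, so that each
    language contributes only its default pattern.
    """
    return [r for r in rows if r["IsDefaultLocusOfMarking"]]
```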
When modeling languages as datapoints, dependent marking had a significant negative effect on head marking (coefficient = −3.22 ± 0.61; χ²(1) = 22.8; p < 0.001). Adding dependent marking to the null model lowered the Akaike Information Criterion (AIC) by 21. The model's marginal R² was 0.28, slightly lower than in the main model (0.31). The parametric bootstrap p-value was 0.0013 (using 2,000 simulations), which validates the statistical significance of the result. This further suggests that there is a strong mutual relationship between head and dependent marking regardless of whether language-internal variation is accounted for or not.

Case study 2
The data for the corpus-based analysis come from Sinnemäki and Haakana (2020). They drafted short descriptions of how possession is expressed in the sample languages, matched those descriptions with the Universal Dependencies annotation, and finally used the annotation to detect possessive constructions and their morphological marking in the corpus data. They delimited the analysis to constructions in which the possessed was a full noun, that is, either a common or a proper noun (the UPOS tags NOUN and PROPN, respectively). The analysis was further delimited to constructions in which the possessor was a personal possessive pronoun (for example, my house), a common noun (for example, the house of men), or a proper noun (for example, John's house); demonstrative and other pronouns were excluded. The processed data contain roughly 725,000 datapoints, averaging roughly 16,000 datapoints per language. For some languages, the data include possessive NPs with conjoined possessors (for example, the house of Tim and Linda). These constructions were excluded from the study on dependency length, however, because conjoined possessors were not studied in all sample languages and including them could have biased the results.
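The delimitation criteria above can be sketched as a predicate over UD annotations (a Python sketch with assumptions: the study's actual detection was language-specific, and identifying possessive personal pronouns via the UD feature Poss=Yes on PRON tokens is our illustration, not a detail stated in the text):

```python
HEAD_TAGS = {"NOUN", "PROPN"}       # the possessed must be a full noun
POSSESSOR_TAGS = {"NOUN", "PROPN"}  # nominal possessors

def qualifies(possessed_upos, possessor_upos, possessor_feats):
    """Return True if a possessive NP falls within the study's scope:
    the possessed is a common or proper noun, and the possessor is a
    common noun, a proper noun, or a possessive personal pronoun
    (demonstrative and other pronouns are excluded)."""
    if possessed_upos not in HEAD_TAGS:
        return False
    if possessor_upos in POSSESSOR_TAGS:
        return True
    return possessor_upos == "PRON" and possessor_feats.get("Poss") == "Yes"
```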
For evaluating the degree of head vs. dependent marking, we used the following steps. First, we recorded the presence vs. absence of the four different types of morphological marking in the identified possessive noun phrases. In addition to these four types, we distinguished two subtypes of head marking: those in which the dependent was present in the construction as an independent syntactic constituent, as in (2a), and those in which it was absent, as in (2b). Constructions without a realized possessee, such as the predicative possessive pronouns mine and yours, were excluded from the study.
(2b) gâam'a
3SG.house
'his house' (Miller 1965: 177)

Second, we counted the frequency of each morphological type, labelled here as N_subtype, where subtype is one of the following: dep = dependent marking; head = head marking; double = double marking; zero = zero marking; head0 = head marking with no dependent as a separate syntactic constituent. Third, the degree (Deg) of dependent marking was calculated as the proportion of constructions with any type of dependent marking, as in (3):

(3) Deg(dep) = (N_dep + N_double) / (N_dep + N_head + N_double + N_head0 + N_zero)

The degree of head marking was calculated as the proportion of constructions with any type of head marking, as in (4):

(4) Deg(head) = (N_head + N_head0 + N_double) / (N_dep + N_head + N_double + N_head0 + N_zero)

We measured dependency length as the distance between the syntactic head and its dependent in a construction in terms of the number of intervening words. In the CoNLL-U format of the Universal Dependencies treebanks, each word in a sentence receives a unique running word index, or ID. For calculating dependency length, we subtracted the running word indices of the head and the dependent from each other in each construction in each sentence, as in Liu, Hudson and Feng (2009), see (5):

(5) dependency length = |ID(head) − ID(dependent)|

For modelling the relationship between head and dependent marking, we used generalized zero-inflated gamma linear mixed effects regression in R with the package glmmTMB (Brooks et al. 2017). This package enables modeling zero-inflated gamma distributions. For the link function we used the log link, following Bolker et al.'s (2021) recommendation.
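The proportions in (3)-(4) and the distance in (5) can be sketched as follows (an illustrative Python sketch; the study's pipeline was run in R and over UD treebanks, and the function names are ours):

```python
def degrees(n):
    """Degrees of dependent and head marking from type counts, as in
    (3) and (4). `n` maps the subtypes dep, head, double, head0, and
    zero to their frequencies; double marking counts toward both
    degrees."""
    total = n["dep"] + n["head"] + n["double"] + n["head0"] + n["zero"]
    deg_dep = (n["dep"] + n["double"]) / total
    deg_head = (n["head"] + n["head0"] + n["double"]) / total
    return deg_dep, deg_head

def dependency_length(head_id, dep_id):
    """Dependency length as in (5): the absolute difference of the
    CoNLL-U running word indices of head and dependent (adjacent
    words thus get length 1 under this definition)."""
    return abs(head_id - dep_id)
```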
As for the model's fixed terms, head marking was modeled as the response and dependent marking as a predictor. We used the genealogical affiliation of the sample languages and the geographical area where they are spoken as the grouping structure to adjust the coefficients for the intercept (the former in the conditional model, the latter in the zero-inflated model; see (6); other model specifications led to convergence problems). Genealogical affiliation was modeled using the highest level of classification in AUTOTYP, namely stocks (see Bickel et al. 2022). As for geographical area, we followed the AUTOTYP and classified languages into the 24 areas in which they are primarily spoken (see Bickel et al. 2022), as in case study 1. We tried modeling random slopes as well, but the models did not converge. For the optimizer we used BFGS, because it pulled the random variances furthest away from zero. The model specification in the lme4 notation was as in (6):

(6) head_marking ~ dependent_marking + (1|stock), ziformula = ~ 1 + (1|area)

P-values were drawn with a likelihood ratio test. The model's goodness of fit was estimated with AIC. It was not possible to compute the model's explanatory power via the package MuMIn (Barton 2020), because zero-inflated models are not yet implemented in that package.
Table S1c presents the coefficients for the conditional and zero-inflation models, and Table S1d presents the random structure of the conditional model. The variances of the random terms did not approach zero. In the conditional model, the degree of dependent marking had a significant negative effect on the degree of head marking.

For modelling the relationship between dependency length and morphological marking, we used generalized linear mixed effects regression for each language separately, following Yadav et al. (2020). As for the model's fixed terms, dependency length was modeled as the response and morphological marking as a binary predictor with the values "no", referring to zero marking (the reference level), and "yes", referring to dependent marking, head marking, and double marking. We focused only on those sample languages with at least some language-internal variation in zero vs. overt marking. In some languages there were only a handful of possessive noun phrases with zero marking, while in most sample languages almost all possessive noun phrases were dependent-marked. We therefore selected languages in which the share of possessive noun phrases with zero marking was 1% or greater and their number was 30 or more. This resulted in a sample of 16 languages.
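The language-selection criterion above can be sketched as a simple predicate (illustrative Python; the function name and inputs are ours):

```python
def eligible(n_zero, n_total):
    """Language-selection criterion from the text: zero-marked
    possessive NPs must make up at least 1% of the language's
    possessive NPs and number at least 30."""
    return n_total > 0 and n_zero / n_total >= 0.01 and n_zero >= 30
```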
Since dependency length is count data, we modelled its distribution with Poisson regression (following Yadav et al. 2020). For modelling we used generalized linear mixed effects regression with maximum penalized likelihood via the package blme in R (Chung et al. 2013), as in case study 1. With this package all the models converged, the variances were pulled away from zero, the computation was faster, and the p-values were typically slightly more conservative than with the packages lme4 (Bates et al. 2015) or glmmTMB (Brooks et al. 2017). The models were checked for overdispersion and corrected where needed by including an observation-level random intercept. Sentence length was additionally included as a random intercept. The model specification in the lme4 notation was as in (7):

(7) dependency_length ~ morph_marking + (1|sentence_length) + (1|obs) # where needed

Table S1e below shows the results for model m.incl and Table S1f for model m.excl. For each language the tables provide the following information: the average dependency length difference (dependency length of zero-marked dependencies subtracted from that of overt-marked dependencies), the modelled coefficient, the standard error, the z-value, the p-value (drawn with a likelihood ratio test), the variance of the random intercept, and the standard deviation of the random intercept.
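One common heuristic for the overdispersion check mentioned above is the Pearson dispersion ratio; the text does not specify the exact diagnostic used, so this Python sketch is illustrative only:

```python
def dispersion_ratio(observed, fitted, n_params):
    """Pearson-based overdispersion check for a Poisson model: the sum
    of squared Pearson residuals divided by the residual degrees of
    freedom. Under a Poisson model the variance equals the mean, so
    ratios clearly above 1 suggest overdispersion and motivate a
    correction such as an observation-level random intercept."""
    pearson_chi2 = sum((o - f) ** 2 / f for o, f in zip(observed, fitted))
    return pearson_chi2 / (len(observed) - n_params)
```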