BY 4.0 license Open Access Published by De Gruyter Mouton December 21, 2020

Role-reference associations and the explanation of argument coding splits

Martin Haspelmath
From the journal Linguistics


Argument coding splits such as differential (= split) object marking and split ergative marking have long been known to be universal tendencies, but the generalizations have not been formulated in their full generality before. In particular, ditransitive constructions have rarely been taken into account, and scenario splits have often been treated separately. Here I argue that all these patterns can be understood in terms of the usual association of role rank (highly ranked A and R, low-ranked P and T) and referential prominence (locuphoric person, animacy, definiteness, etc.). At the most general level, the role-reference association universal says that deviations from usual associations of role rank and referential prominence tend to be coded by longer grammatical forms. In other words, A and R tend to be referentially prominent in language use, while P and T are less prominent, and when less usual associations need to be expressed, languages often require special coding by means of additional flags (case-markers and adpositions) or additional verbal voice coding (e.g., inverse or passive markers). I argue that role-reference associations are an instance of the even more general pattern of form-frequency correspondences, and that the resulting coding asymmetries can all be explained by frequency-based predictability and coding efficiency.

1 Overview

In many languages, the coding of core arguments depends not only on their semantic or syntactic role, but also on their referential prominence in one way or another, i.e., on animacy, definiteness, person prominence, and so on. Since the 1970s, it has been widely recognized that such prominence-conditioned splits in argument coding are crosslinguistically regular in a way that is surprising but apparently robust. Authors such as Silverstein (1976), Comrie (1978), Moravcsik (1978b), and Dixon (1979) focused on regularities found in splits in ergative flagging (case or adpositional marking), and Moravcsik (1978a), Bossong (1985, 1998), and Lazard (2001) demonstrated striking crosslinguistic parallels in splits in accusative flagging (well-known as “differential object marking”). Referential prominence has become widely known by names such as “animacy hierarchy” or “empathy hierarchy”. The present paper builds on this work and proposes that these generalizations, plus quite a few others, can be subsumed under a single overarching generalization, under the heading of role-reference associations. At the end of the paper, I will propose an explanation in terms of frequency-based coding efficiency.

We typically find coding contrasts as in (1) and (2), where the (b) examples show overt coding of an argument, and the (a) examples show zero coding of an argument with the same role. In (1), the variably coded argument type is the transitive subject or A-argument, so this is called split A coding (or split ergativity), and in (2) it is the object or P-argument, so we have split P coding (or split accusative coding, or differential object marking). Crucially, the variably coded argument has different referential-prominence properties in the (a) and (b) examples: first person versus third person in (1), indefinite versus definite in (2).

(1) split ergative (A) coding: Kham (Trans-Himalayan; Watters 2002: 67)
a. ‘I killed a leopard.’ (first person A)
b. ‘He killed a leopard.’ (third person A)
(2) split accusative (P) coding: Sakha (Turkic; Baker 2015: 4–5)
a. ‘Masha ate porridge quickly.’ (indefinite P)
b. ‘Masha ate the porridge quickly.’ (definite P)

In addition, argument coding splits may depend on the referential-prominence properties within a scenario. Thus, in some languages a special construction is required in a monotransitive construction (A/P scenario) when both the A-argument and the P-argument are third person (aliophoric), i.e., in a 3 > 3 scenario. An example comes from Teop, where the object marker ben- is not used in the 1 > 3 scenario in (3a), but is required in the 3 > 3 scenario in (3b).

(3) Teop (Oceanic; Mosel 2007: 10)
a. Enaa paa dee ma = u e guu.
1sg TAM carry dir = imm art pig
‘I have brought a pig.’ (1 > 3 scenario)
b. A beiko tenaa paa asun = u ben-e guu.
art child my TAM kill = imm obj-art pig
‘My child has killed the pig.’ (3 > 3 scenario, special flag required)

Similarly, in English ditransitive constructions, referential prominence within the R/T scenario is relevant for argument marking, specifically the full nominal or person-form status: The special Dative preposition to on the R (the recipient argument) is optional in most scenarios (as illustrated in (4a) and (4b)), but obligatory when the T (the theme argument) is a person form and the R is a full nominal (i.e., in an “N > pers” scenario), as seen in (4c).

(4) split R marking conditioned by full nominal or person-form status: English
a. She gave [Kim]R [the money]T. (≈ She gave [the money]T to [Kim]R.)
(N > N scenario)
b. She gave [him]R [it]T. (≈ She gave [it]T to [him]R.)
(pers > pers scenario)
c. *She gave [Kim]R [it]T. (OK: She gave [it]T to [Kim]R.)
(N > pers scenario, special flag required)

The present paper provides an overview of such coding splits in core arguments (A, P, T, R) and argues that a wide variety of splits, both in monotransitive and in ditransitive constructions, are best understood as special cases of the high-level generalization in (5).

(5) The role-reference association universal (Universal 1)
Deviations from usual associations of role rank and referential prominence tend to be coded by longer grammatical forms if the coding is asymmetric.

Most of the ideas in this paper are not original, and I base myself on the results of typological research since the 1970s. However, the generalizations of this paper have never been formulated in their full generality, and the predictions made by earlier authors have often not been fully explicit. While the widespread occurrence of differential object marking has become widely known, and special person-role interactions in ditransitives (“PCC effects”) are also often discussed in the specialized literature (e.g., Haspelmath 2004), these two are rarely seen together, and quite a few other patterns that can be subsumed under the role-reference association universal are typically treated in rather different terms. This paper makes the ambitious proposal that by considering all these constructions together, we have the chance of unifying a wide variety of phenomena under a single generalization, the role-reference association universal.

In addition to this primary goal, I will also discuss the explanation of the generalizations, arguing that the role-reference association universal is a special case of a still more general pattern, the form-frequency correspondence universal of Haspelmath (2021). The basic idea is that additional coding elements such as the accusative marker in (2b) or the dative marker in (4c) are required when they are least predictable and hence needed the most, i.e., that argument coding splits reflect the functional need for efficient grammatical coding. Across a range of different situations, arguments with a higher-ranked role (the transitive A-argument and the ditransitive R-argument) are referentially prominent in the usual case (or in other words, the most frequent case), and arguments with a lower-ranked role (the transitive P-argument and the ditransitive T-argument) usually exhibit lower referential prominence. In an efficient system, special coding by longer forms needs to be used only when a construction deviates from these usual associations.
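The efficiency argument can be made concrete with a small arithmetic sketch. The function name, the marker length, and the 80% proportion below are all invented for illustration; they are not taken from the paper or its Appendix. The sketch only shows that, if one association is the more frequent one, placing the special flag on the rarer association costs less coding material on average.

```python
def expected_marker_cost(p_usual, marker_length, mark_usual):
    """Average amount of extra coding per clause spent on a special flag.

    p_usual       -- proportion of clauses with the usual role-reference association
    marker_length -- length of the special flag (the other type is zero-coded)
    mark_usual    -- True if the flag is placed on the usual association,
                     False if it is placed on the deviation
    """
    if not 0.0 <= p_usual <= 1.0:
        raise ValueError("p_usual must be a proportion between 0 and 1")
    share_marked = p_usual if mark_usual else 1.0 - p_usual
    return share_marked * marker_length


# Hypothetical proportion (an assumption, not corpus data):
# 80% of P-arguments are non-prominent, i.e., the usual case.
P_USUAL = 0.8
cost_if_usual_marked = expected_marker_cost(P_USUAL, marker_length=2, mark_usual=True)
cost_if_deviation_marked = expected_marker_cost(P_USUAL, marker_length=2, mark_usual=False)

# Flagging only the deviation is the cheaper system on average.
print(cost_if_deviation_marked < cost_if_usual_marked)
```

Any proportion above one half gives the same ordering, so the sketch does not depend on the particular figure chosen.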

2 Role rank and referential prominence

The primary observation of this paper is that special grammatical coding occurs in non-usual, unexpected situations – when role rank and referential prominence do not go together as they do most frequently. The usual associations between role rank and referential prominence are themselves universal, and can be summarized as in (6), a claim about frequency patterns in language use.

(6) Usual role-reference associations (Universal 2)
Arguments with higher-ranked roles tend to be more referentially prominent, and vice versa.

This fundamental observation is not an abstract aspect of the language system (like the notion of “harmonic alignment” of Aissen (2003), which ultimately inspired it), but a concrete claim about universal discourse frequencies. In this paper I do not focus on documenting the discourse frequencies, but the reader should keep in mind that whenever I say, for example, that “A shows a greater tendency to be definite than P”, I mean that we find more definite A-arguments than definite P-arguments in all representative texts in all languages. The figures in the Appendix should make it very plausible that Universal 2 is correct, but testing this claim more thoroughly is a topic for future comparative corpus research.

Role rank is defined very simply as in (7) (see Haspelmath 2011, 2015 for precise definitions of the comparative role-types A, P, T and R).

(7) role rank:
In monotransitive constructions, the A (agent) is ranked higher than the P (patient).
In ditransitive constructions, the R (recipient) is ranked higher than the T (theme).

Regarding the notion of “role rank”, it is of course an interesting question to ask why A should pattern with R, and why P should pattern with T, but in this paper, I will not make any attempt at further generalization. The main reason for having a notion of role rank is to make the formulation of the universals in (5) and (6) simple and intuitive. Alternatively, one could replace (6) by saying that A and R tend to be more referentially prominent than P and T, respectively. For the present paper, it makes no difference, so the notion of role rank is not crucial.[1] (Since many readers will be curious, I will make a few remarks below in Section 11.2, but it should be kept in mind that these considerations are not essential for this paper.)

Referential prominence is defined by the scales of inherent prominence and discourse prominence in (8).

(8) scales of referential prominence
a. inherent prominence[2]
person scale: locuphoric (first/second) > aliophoric (third person)
(full) nominality scale: person form (independent or index) > full nominal
animacy scale: human (> animal) > inanimate
b. discourse prominence
definiteness scale: definite (> specific indefinite) > indefinite nonspecific
givenness scale: discourse-given > discourse-new
focus scale: background > focus

Here one could ask what these different kinds of semantic and discourse-pragmatic notions have in common that justifies subsuming them under a single notion of “referential prominence”, but again I will not give an answer here. The main empirical observation in (5) can be stated in a maximally simple way if we have a general notion of referential prominence, but alternatively it could be stated individually for each of the scales in (8), with no loss of explanatory success. In the approach advocated here, explanation does not rely on the simplicity of the formulations, and I use simple formulations mainly because they are easier to remember.

In the earlier literature, the scales in (8a) were often subsumed under general notions such as “extended animacy” or “empathy”, but it has long been recognized that definiteness patterns much like animacy in split P flagging, so discourse prominence plays a role similar to that of inherent prominence. To reflect this, other terms like “individuation” and “indexability” have also been used. The use of the term (referential) prominence in connection with split argument coding seems to have been introduced by Aissen (1999, 2003), inspired by phonological terminology, but it is by now well-established (e.g., Bornkessel-Schlesewsky and Schlesewsky 2009; Lockwood and Macaulay 2012; Malchukov 2008).[3]

It is also important to be aware that the notions in (7) and (8a) and (8b) are intended as comparative concepts for crosslinguistic comparison (as is usual in the Greenbergian tradition of comparative linguistics), not as necessarily corresponding closely to language-particular categories, let alone to any innate categories of a grammar blueprint. Categories often differ substantially across languages, but the concepts in (7) and (8) are meant to be applicable to all languages equally.

3 Usual associations and universals of coding splits

For a maximally general formulation of the universals of argument coding that were illustrated by (1)–(4), we need to distinguish two types of associations (Section 3.1) and two types of coding splits (Section 3.2). When we consider both monotransitive and ditransitive constructions, this yields four different types of argument coding splits, summarized in Table 1 at the end of this section.

Table 1: Two types of argument coding splits, and two construction types.

Monotransitive construction (A > P, agent and patient)
  Single-argument split (only the coded argument is relevant): split P flagging (ex. 2) (“differential object marking”), split A flagging (ex. 1) (Section 4)
  Scenario split (the arguments in a scenario are relevant): scenario-based special A or P flagging (ex. 3) (Sections 6 and 8)

Ditransitive construction (R > T, recipient and theme)
  Single-argument split (only the coded argument is relevant): split R flagging (e.g., Wolof ci + vs. zero, ex. 31), split T flagging (Section 5)
  Scenario split (the arguments in a scenario are relevant): scenario-based special R or T flagging (ex. 4) (Sections 7 and 8)

3.1 Two types of usual associations

To make Universal 2 (in (6) above) more specific, the most straightforward possibility is to consider the arguments individually, regardless of what happens elsewhere in the clause, and to state their usual associations (or association tendencies) with the referential prominence scales:

(9) single-argument association tendencies
a. the A and the R tend to be referentially prominent
b. the P and T tend to be referentially non-prominent

For example, clauses such as ‘The dog found a bone’ and ‘She gave the boy a key’, with definite A and R, and indefinite P and T, are more usual than ‘A rock hit the hiker’ or ‘She gave a boy the key’.

On the other hand, we can look at both arguments within a two-argument scenario simultaneously and compare their referential prominence values. In monotransitive constructions, the scenario consists of the A and the P, and in ditransitive constructions, the scenario consists of the R and the T.[4]

(10) scenario association tendencies
a. the A tends to be referentially more prominent than the P
b. the R tends to be referentially more prominent than the T

Because of these tendencies, we can distinguish three kinds of scenarios:

(11) a. downstream scenario (most usual):
when A/R is referentially more prominent than P/T
b. upstream scenario (least usual):
when A/R is referentially less prominent than P/T
c. balanced scenario (intermediate):
when A/R and P/T are equally prominent

The downstream scenarios (e.g., ‘I caught a rabbit’) are the most usual ones, and the upstream scenarios are the least usual (e.g., ‘A dog bit you’), with balanced scenarios in between. The “upstream/downstream” metaphor is introduced here because the most common scenarios are expressed in the “easiest” way (like swimming downstream), while the least common scenarios can be thought of as more “difficult” (like swimming upstream) and thus need more “coding energy”.
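The three-way distinction between downstream, upstream and balanced scenarios can be sketched as a simple comparison of positions on a prominence scale. The dictionary layout, the function names and the numeric ranks are illustrative assumptions; only the scale values and their ordering come from the scales of referential prominence given above.

```python
# Prominence scales, each listed from most to least prominent.
# Encoding them as ordered lists is an assumption made for this sketch.
SCALES = {
    "person": ["locuphoric", "aliophoric"],
    "nominality": ["person form", "full nominal"],
    "animacy": ["human", "animal", "inanimate"],
    "definiteness": ["definite", "specific indefinite", "indefinite nonspecific"],
    "givenness": ["discourse-given", "discourse-new"],
    "focus": ["background", "focus"],
}


def rank(scale, value):
    """Return 0 for the most prominent value on a scale, higher numbers for less prominent ones."""
    return SCALES[scale].index(value)


def classify_scenario(scale, higher_role_value, lower_role_value):
    """Classify an A/P or R/T scenario on a single scale.

    higher_role_value -- scale value of the A or R argument
    lower_role_value  -- scale value of the P or T argument
    """
    high = rank(scale, higher_role_value)
    low = rank(scale, lower_role_value)
    if high < low:
        return "downstream"  # A/R more prominent than P/T
    if high > low:
        return "upstream"    # A/R less prominent than P/T
    return "balanced"        # equally prominent


# 'I caught a rabbit': locuphoric A, aliophoric P
print(classify_scenario("person", "locuphoric", "aliophoric"))
# 'A dog bit you': aliophoric A, locuphoric P
print(classify_scenario("person", "aliophoric", "locuphoric"))
# a 3 > 3 scenario: both arguments aliophoric
print(classify_scenario("person", "aliophoric", "aliophoric"))
```

The sketch treats one scale at a time; how several scales interact within one language is left open here.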

3.2 Two types of splits: Single-argument splits and scenario splits

Corresponding to single-argument associations and scenario associations, we have two types of argument coding splits:[5] single-argument splits (Sections 4 and 5) and scenario splits (Sections 6 and 7).

A single-argument split is defined in (12), and the corresponding universal is formulated in (13) (which is restricted to flagging and does not make claims about indexing, cf. Note 6).

(12) single-argument coding split:
A single-argument coding split is an argument coding split for which only the referential prominence of the coded argument is relevant.
(13) The single-argument flagging universal (Universal 3)[6]
If a language has an asymmetric single-argument flagging split depending on some prominence scale, then the coding is longer for prominent P/T-arguments or for non-prominent A/R-arguments.

For example, prominent P-arguments have special longer flags in languages with differential object marking (e.g., Spanish, where specific human Ps get the accusative preposition a), and non-prominent R-arguments have special flags in languages with split R flagging (e.g., in English, where full nominal Rs get the dative preposition to, cf. Section 5.1.2).
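As a rough sketch, the prediction of the single-argument flagging universal can be written as a check on a described split. The function names are hypothetical, and measuring "longer coding" by the character length of the flag is a simplifying assumption of this sketch, not a definition from the paper. Symmetric splits return None because the universal makes no claim about them.

```python
HIGHER_ROLES = {"A", "R"}
LOWER_ROLES = {"P", "T"}


def expects_longer_coding(role, prominent):
    """Does the universal expect the longer form on this kind of argument?"""
    if role in LOWER_ROLES:
        return prominent        # prominent P/T arguments get the longer form
    if role in HIGHER_ROLES:
        return not prominent    # non-prominent A/R arguments get the longer form
    raise ValueError(f"unknown role: {role!r}")


def conforms_to_universal_3(role, flag_on_prominent, flag_on_nonprominent):
    """Return True or False for an asymmetric split, None for a symmetric one.

    Flags are given as strings; the empty string stands for zero coding.
    """
    len_prominent = len(flag_on_prominent)
    len_nonprominent = len(flag_on_nonprominent)
    if len_prominent == len_nonprominent:
        return None  # not asymmetric, so the universal does not apply
    longer_on_prominent = len_prominent > len_nonprominent
    return longer_on_prominent == expects_longer_coding(role, prominent=True)


# Spanish: specific human (prominent) Ps take the preposition a, other Ps are zero-coded.
print(conforms_to_universal_3("P", flag_on_prominent="a", flag_on_nonprominent=""))
# Wolof: indefinite (non-prominent) Rs take ci, definite Rs are zero-coded.
print(conforms_to_universal_3("R", flag_on_prominent="", flag_on_nonprominent="ci"))
```

A hypothetical language that flagged only definite Rs would make the second call return False, which is the kind of pattern the universal predicts to be unattested or marginal.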

This universal makes use of the notion of an asymmetric split, i.e., a coding split in which one of the coding types involves longer coding (also called special coding). In most cases, this means overt coding contrasting with zero coding, but we will also see examples of shorter and longer overt coding.[7]

The universal in (13) is a more general formulation that subsumes the famous split P flagging universal in (14).[8] An illustration was given in (2) above, and more will be said in Section 4.1.

(14) Split P flagging (“differential object case-marking”, Universal 4)
If a language has an asymmetric split in P flagging depending on some prominence scale, then the special flag is used on the prominent P-argument.

The more general formulation in (13) also includes split A flagging (“differential subject marking”), as illustrated above in (1) and treated further in Section 4.2, as well as split R and split T flagging (treated in Section 5).

The other type of coding split is called scenario split because it is based on the scenario associations. A scenario split is defined in a general way in (15), and the most general scenario universal is given in (16).

(15) scenario split:
A scenario split is an argument coding split for which the referential prominence of the arguments in a scenario (A-P, or R-T) is relevant, not only the referential prominence of the coded argument.[9]
(16) The scenario universal (Universal 5)
If a language has an asymmetric scenario split, then the coding is longest for upstream scenarios, shortest for downstream scenarios, and intermediate for balanced scenarios.

For example, in (3) from Teop, the downstream scenario (locuphoric > aliophoric, in (3a)) does not have the special coding with the Object marker ben-, and in (4) from English, the upstream scenario (N > pers, in (4c)) requires the Dative preposition to (recall from (11b) that an upstream scenario is one where R is less prominent than T, and from (8a) that full nominals are less prominent than person forms). The balanced scenarios (N > N, pers > pers) do not require this special coding, and neither do the downstream scenarios (an example of a downstream scenario in English is She gave [him]R [the money]T, pers > N).[10]
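The English pattern just described can be sketched as a small function over the nominality scale. The names and the numeric ranks are assumptions made for illustration; the function only states when to is required, and says nothing about the scenarios in which it is optional.

```python
# Nominality scale: person forms are more prominent than full nominals.
# Lower number = more prominent (an encoding chosen for this sketch).
NOMINALITY_RANK = {"person form": 0, "full nominal": 1}


def scenario_type(r_form, t_form):
    """Classify an R/T scenario as downstream, upstream or balanced."""
    r_rank = NOMINALITY_RANK[r_form]
    t_rank = NOMINALITY_RANK[t_form]
    if r_rank < t_rank:
        return "downstream"
    if r_rank > t_rank:
        return "upstream"
    return "balanced"


def to_is_required(r_form, t_form):
    """In the English data discussed above, to is obligatory only in the upstream scenario."""
    return scenario_type(r_form, t_form) == "upstream"


examples = [
    ("She gave Kim the money", "full nominal", "full nominal"),  # N > N
    ("She gave him it", "person form", "person form"),           # pers > pers
    ("*She gave Kim it", "full nominal", "person form"),         # N > pers
    ("She gave him the money", "person form", "full nominal"),   # pers > N
]
for sentence, r_form, t_form in examples:
    print(sentence, scenario_type(r_form, t_form), to_is_required(r_form, t_form))
```

Only the N > pers row comes out as upstream, matching the one scenario in which the flagless construction is ruled out.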

The next four sections (Sections 4–7) are devoted to further illustration (including some discussion) of the two types of argument coding splits, both in monotransitive constructions and in ditransitive constructions (see Table 1 above). Within each type of coding split and for both construction types, I will illustrate how various prominence scales condition coding splits. Not all prominence dimensions are found to be relevant (equally frequently) with all the coding split types and both construction types, for unclear reasons, but most of the expected phenomena are attested in the literature. Crucially, constructions that are not expected and ruled out by the role-reference association universal are not attested (or found only marginally). In Section 11, I will discuss the explanation of these strikingly general patterns, and I will propose a correspondingly general explanation.

In addition to single-argument splits and scenario splits, I will also discuss obligatory and optional verbal voice constructions (Sections 9 and 10). I will end up formulating 12 additional universals (summarized in Figure 1 in Section 11 below), all of which are ultimately special cases of Universal 1.

Figure 1: Taxonomy of universal claims in this paper.


4 Single-argument splits in monotransitive constructions

The discussion of monotransitive single-argument splits is organized in this section by P splits and A splits. We begin with P splits, because this type of split is best known.

4.1 Split P flagging (or differential object marking)

As mentioned earlier, split P flagging, traditionally known as “differential object marking”, has been widely discussed in the earlier literature, so there is no need to dwell on it much in this paper (see Seržant and Witzlack-Makarevich 2018 for a recent overview). It is particularly often conditioned by animacy and definiteness, but sometimes by givenness and nominality, and occasionally by person (focus-conditioned split P flagging seems to be very rare, and I currently lack examples).

The universal nature of the phenomenon, formulated as Universal 4 in (14) above, has generally been taken for granted. Aissen (1999: 673) counts it “among the most robust generalizations in syntactic markedness”, and much of the literature focuses on the precise factors in particular languages as well as on possible explanations, not on providing evidence for or against the universal. Bossong (1985: 3–8) makes it very clear that he regards the phenomenon as an implicational universal (first described systematically by Thompson 1912), and even Filimonova (2005), who documents a few counterexamples, does not challenge its status as a universal tendency. Bickel et al. (2015) seem to suggest that the dependence of split P and split A flagging on referential prominence is not a universal tendency, but Schmidtke-Bode and Levshina (2018) show that Bickel et al.’s data can be interpreted rather differently, as providing support for Universal 4.[11]

The following subsections illustrate five subtypes of split P flagging, with conditioning by different prominence scales.

4.1.1 Animacy-conditioned split P flagging

This subtype is particularly well-known from many languages of South Asia and Southeast Asia (see, e.g., LaPolla 1992 for Trans-Himalayan languages), but it is also found in many other languages in different parts of the world, e.g., in Romance languages such as Spanish, or Sardinian as illustrated in (17) (where the crucial feature is humanness).

(17) Nuorese Sardinian (Bossong 1991: 148)
a. ‘He killed Salvatore.’
b. ‘He killed the wolf.’

4.1.2 Definiteness-conditioned split P flagging

This was illustrated above in (2) from Sakha and is also well-known from Turkish and Hebrew. Definiteness/specificity is also one of the relevant factors in Spanish (e.g., García García 2007), as well as in Punjabi, illustrated in (18).

(18) Punjabi (Indic; Bhatia 1993: 172–174)
a. ‘Look at a book.’
b. ‘Look at the book.’

4.1.3 Nominality-conditioned split P flagging

Some languages have P flagging only on personal pronouns, but not on full nominals. A well-known example of such a language is English (he vs. him, she vs. her, etc.), and many pidgins and creoles are of this sort as well (Haspelmath and APiCS Consortium 2013). Moreover, there are many languages throughout the world that lack nominal flagging, but many of these distinguish between A-arguments and P-arguments in their person indexes (Siewierska 2005), so they are like English in that they have an A versus P distinction for person forms but not for full nominals.[12]

4.1.4 Givenness-conditioned split P flagging

It has been known since Thompson (1912) that split P flagging is sometimes conditioned by givenness (or “topicality”). Dalrymple and Nikolaeva (2011: Chapter 6) discuss a number of relevant cases in some detail, e.g., Persian.

(19) Persian (Dalrymple and Nikolaeva 2011: 108–112)
a. ‘I bought the book.’
b. ‘I ate an apple.’ (accusative flag is not allowed on nontopical P)
c. ‘Who saw a car?’ (accusative flag is required on topical P)

4.1.5 Person-conditioned split P flagging

In Abruzzese (an Italo-Romance variety), special P-flagging by the preposition a occurs only with locuphoric (first and second) personal pronouns (D’Alessandro 2017: 8).

(20) Abruzzese (dialect of Arielli)
a. ‘I have seen myself/you.’
b. ‘We have seen us/you.’
c. (‘I have seen Maria/them.’)

Similarly, in Yindjibarndi, accusative marking of P is obligatory with locuphoric personal pronouns, but optional for all third person arguments (Wordick 1982: 76).

4.2 Split A flagging

Since Silverstein (1976), split A flagging (or split ergativity, i.e., ergative flagging restricted to lower-prominence A) has often been seen as the mirror image of split P flagging. Kiparsky (2008: Section 2.3) argues that the pattern represents a “true universal”, i.e., not merely a typological generalization. By now it is well-known that split A flagging (also called “differential subject marking”,[13] cf. de Hoop and de Swart 2009) is rarely conditioned by animacy or definiteness, and much more often by person or nominality. Because in quantitative terms, the observed coding phenomena are not an exact mirror image of what we find in P flagging, some authors have questioned the proposal that they should be understood in an analogous way (Fauconnier and Verstraete 2014). However, accusative and ergative systems are not fully symmetrical in other ways either, but even so it has often proved useful to consider them as mirror images, and here I hypothesize that a very similar generalization applies to ergative systems. Universal 6 in (21) is completely parallel to Universal 4 in (14) above.

(21) Split A flagging (“differential subject marking”, Universal 6)
If a language has an asymmetric split in A flagging depending on some prominence scale, then the special flag is used on the nonprominent A-argument.

In other words, if a language has ergative marking but not on all kinds of arguments, then it tends to have it for the less prominent arguments, in particular for full-nominal arguments, for aliophoric (third-person) arguments, and for focused arguments (see Dixon 1994: Section 4.2).[14]

4.2.1 Person-conditioned split A flagging

Systems in which ergative marking is restricted in that it does not occur on locuphoric person forms are found at least in Australia, South Asia, and in two families of the Caucasus (Kartvelian, Nakh-Dagestanian). In (22), we see the singular paradigm of Godoberi personal pronouns which lacks ergative markers in the locuphoric forms, but not in the aliophoric forms.

(22) Godoberi (Nakh-Dagestanian; Kibrik (ed.) 1996: 36, 42)

4.2.2 Nominality-conditioned split A flagging

Systems in which ergative marking is restricted to full nominals and does not occur on person forms are well-known from Australian languages, and illustrated in (23).

(23) a. ‘We will go.’ (no flag on S-argument)
b. ‘We will see the woman.’ (no flag on person-form A-argument)
c. ‘The man will see the woman.’ (ergative flag on full nominal A-argument)
(Pama-Nyungan; cf. Dixon 1980: 287–289)

4.2.3 Animacy-conditioned split A flagging

In Mangarrayi, only neuter-gender (inanimate) nominals take ergative case (cf. 24a), but masculine or feminine nouns do not (cf. 24b). So there exist cases of animacy-conditioned split A marking, but as Malchukov (2008: 206–207) notes, they seem to be quite rare (for reasons that I do not understand at the moment).

(24) Mangarrayi (northern Australia; Merlan 1982: 61)
a. n.erg-water submerge 3sg > 1sg-aux-pst.punct
‘Water covered/submerged me.’ (ergative prefix on inanimate A)
b. show 3sg > 2sg-aux-pst.punct f.nom-old.woman
‘Did the old woman show you (to him)?’ (no ergative marking on animate A)

4.2.4 Focus-conditioned split A flagging

There are quite a few languages in different parts of the world in which the ergative marker occurs only when the A-argument is focused (see McGregor 2010). An example comes from Central Tibetan, which uses its ergative flag -ki’ only when the A is focused.

(25) Central Tibetan (Tournadre 1995: 264)
a. ‘He prepares the meals.’ (no flag on topical A-argument)
b. ‘HE prepares the meals.’ (ergative flag on focused A-argument)

5 Single-argument splits in ditransitive constructions

Split argument coding has been much less discussed in ditransitive constructions,[15] but as noted by Haspelmath (2007) and Kittilä (2008), such phenomena do occur, and by now there are quite a few languages where such splits have been described (see also a 2012 special issue of Linguistic Discovery: van Lier (2012)).

The situation in English is well-known, but it has almost always been discussed as an alternation (between the Double-Object construction and the Prepositional Dative construction), depending on the verb and information structure. However, as we saw in (4c) (*She gave Kim it), there is also a coding split: When the T (theme) is a personal pronoun and the R (recipient) is a full nominal, only the Prepositional Dative construction (She gave it to Kim) is possible. Clearly, this is a kind of split R flagging, of the scenario type.[16]

The present section illustrates single-argument splits, organized by R splits (Section 5.1) and T splits (Section 5.2). The relevant universals, given in (26) and (27), are completely parallel to those about split A and P flagging seen earlier in (21) and (14).

(26) Split R flagging (Universal 7)
If a language has an asymmetric split in R flagging depending on some prominence scale, then the special flag is used on the nonprominent R-argument.
(27) Split T flagging (Universal 8)
If a language has an asymmetric split in T flagging depending on some prominence scale, then the special flag is used on the prominent T-argument.

5.1 Split R flagging

Split R flagging is much more common than split T flagging, just as generally R flagging is more common than T flagging in ditransitives, so we begin with this.

5.1.1 Person-conditioned split R-flagging

In the domain of clitic object pronouns on verbs, French has special R flagging only on aliophoric R arguments, but not on locuphoric R-arguments. A different (and more traditional) way of putting it is that there is no Accusative/Dative distinction with first/second person atonic pronouns. The forms are given in (28). As noted in Haspelmath (2007: Section 3.1), similar systems are found in languages elsewhere in the world (Africa, the Caucasus, New Guinea), so there is little doubt that the French pattern is not an accident.

(28) French person clitics
       T (acc)    R (dat)           T (acc)    R (dat)
3sg    le, la     lui        3pl    les        leur

As expected, the Dative forms are (somewhat) longer than the Accusative forms (lui vs. le/la, and leur vs. les), though no special Dative marker can be discerned.[17]

5.1.2 Nominality-conditioned split R flagging

A language that has a special dative flag (preposition ta+) only with full nominal R-arguments is Northeastern Neo-Aramaic.

(29) Northeastern Neo-Aramaic of Telkepe
(full nominal R)
‘He gave money to a certain poor person.’ (= Coghill’s 11b)
‘He gave it to a poor person.’ (= 14b)
(person-form R)
‘He gave him a present.’ (= 14c)
‘He gave them to you.’ (= 16b)
(Coghill 2010: 226–228)

A very similar situation is found in Bulgarian (Hauge 1999 [1976]), where the Dative preposition na is obligatory with full nominals, while clitic pronouns have different forms for Accusative and Dative (e.g., 1sg me/mi, 2sg te/ti, 3sg.m go/mu). An even better known language with this pattern is French, where full nominals require the preposition à+, while person forms are bound to the verb, as in Neo-Aramaic.

5.1.3 Animacy-conditioned split R-flagging

In Yakkha, a Kiranti language of Nepal, R-arguments are in the Locative case when inanimate, but otherwise in the (zero-coded) Absolutive case.

(30) Yakkha
a. ka nniŋda photo-ci ham-biʔ-meʔ-nenin = ha
1sg[nom] 2pl[nom] photo-pl[nom] distribute-ben-npst-1>2pl = pl
‘I distribute the photos to you all.’
b. sarkar = ŋa yaŋ tenten = be
government = erg money[nom] villages = loc
ŋ-hapsu-bi-ci = ha
‘The government distributed the money to the villages.’
(Schackow 2012: 161–162)

5.1.4 Definiteness-conditioned split R-flagging

In Wolof, a dative preposition ci is required on the R when it is indefinite.

a. Jox naa xale bu jigéén ji benn velo.
‘I gave the girl a bicycle.’
b. *Jox naa benn xale bu jigéén velo bi.
‘I gave a girl the bicycle.’
c. Jox naa velo bi ci benn xale bu jigéen.
‘I gave the bicycle to a girl.’
(Atlantic; Becher 2005: 19)

A similar restriction has been reported for Kinyarwanda (Kimenyi 1980: 59–60).

5.2 Split T flagging

Split T (theme) flagging is quite rare, just as T flagging in general is uncommon in ditransitive constructions (English has a T preposition only with a few verbs, e.g., provide [someone]R [with something]T). It seems that it is mostly found in languages of West Africa.

5.2.1 Nominality-conditioned split T flagging

In Ewe, a serial-verb flag is required preceding T if T is a personal pronoun, as can be seen in (32).

Ewe (Kwa)
‘Kosi gave the money to the girl.’ (no flag on nominal T)
(‘Kosi gave it to Ami.’) (person-form T, flagless construction ungrammatical)
‘Kosi gave it to Ami.’ (lit. ‘Kosi took it, gave-to Ami’)
(person-form T flagged with auxiliary tsɔ́ ‘take’)
(Essegbey 2010: 182–183)

The element tsɔ́ that is obligatorily used in (32c) looks like a verb (in a kind of serial verb construction), but from a comparative perspective it is a kind of flag. What exactly its categorial status is in Ewe is irrelevant – all that matters for my universal claim in (13) is that there is “longer coding”, and this is no doubt the case in (32c).

5.2.2 Definiteness-conditioned split T flagging

In Akan, another Kwa language of West Africa, the T argument must be indefinite in a simple double object construction, as in (33a). (33b) with the definite article on the T is ungrammatical, and a construction with a special T-marking serial verb must be used instead (lit. ‘take’).

‘Kofi gave the child a chicken.’
(‘Kofi gave the child the chicken.’)
‘Kofi gave the chicken to the child.’
(Osam 1996: 63–64)

Thus, Akan is like Ewe except that the use of the serial-verb flag depends on the definiteness of the T, not on its person-form status.

5.2.3 Person-conditioned split T coding

I have found only a single case of this: In Georgian, locuphoric personal pronouns cannot be used in their ordinary form in T role, as seen in (34b). Instead, a reinforced form (šeni tavi lit. ‘your self’) must be used.

‘Vano compared Anzori to Givi.’
(‘Vano compared you to Givi.’)
‘Vano compared you to Givi.’
(Harris 1981: 48–49)

This situation is even less like the special-flagging patterns seen elsewhere, but again, Universal 3 only requires “longer coding”, and this is what the special form šeni tavi seems to achieve here.

6 Scenario splits in monotransitive constructions

Scenario splits in monotransitive constructions have sometimes been called “global case marking” (Malchukov 2008: 213; Silverstein 1976: 134), or “global case splits” (Bárány 2017; Georgi 2012; Keine 2010: Chapter 6). They are not common, but they have been found in languages all over the world, and they all obey Universal 5, repeated here.

The scenario universal (Universal 5)
If a language has an asymmetric scenario split, then the coding is longest for
upstream scenarios, shortest for downstream scenarios, and intermediate for
balanced scenarios.
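The three scenario types that Universal 5 refers to can be made explicit in a minimal Python sketch. The numeric prominence ranks, the names `classify` and `conforms_to_universal_5`, and the coding lengths in the usage example are illustrative assumptions rather than part of any cited description. The check also reads "intermediate" loosely, so that balanced scenarios may pattern with either extreme, as happens in two-way splits.

```python
from enum import Enum


class Scenario(Enum):
    DOWNSTREAM = "downstream"
    BALANCED = "balanced"
    UPSTREAM = "upstream"


def classify(high_role_rank: int, low_role_rank: int) -> Scenario:
    """Classify a scenario from two prominence ranks (larger = more prominent).

    high_role_rank: prominence of the highly ranked role (A or R).
    low_role_rank:  prominence of the low-ranked role (P or T).
    """
    if high_role_rank > low_role_rank:
        return Scenario.DOWNSTREAM   # the usual association
    if high_role_rank < low_role_rank:
        return Scenario.UPSTREAM     # the least usual association
    return Scenario.BALANCED


def conforms_to_universal_5(length: dict) -> bool:
    """Check coding lengths (e.g., number of flags) against Universal 5.

    Assumed reading: upstream >= balanced >= downstream, and the split is
    asymmetric, i.e., upstream is strictly longer than downstream.
    """
    up = length[Scenario.UPSTREAM]
    bal = length[Scenario.BALANCED]
    down = length[Scenario.DOWNSTREAM]
    return up >= bal >= down and up > down


# Hypothetical split: one extra flag outside downstream scenarios.
hypothetical = {
    Scenario.DOWNSTREAM: 0,
    Scenario.BALANCED: 1,
    Scenario.UPSTREAM: 1,
}
print(classify(1, 0).value)                   # locuphoric A, aliophoric P
print(conforms_to_universal_5(hypothetical))
```

A pattern with the lengths reversed (a flag only in downstream scenarios) would fail the check, which is the kind of system the universal predicts to be unattested.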

6.1 Person-conditioned special P flagging

In Kolyma Yukaghir, a special accusative flag is required on the P when the A is aliophoric (cf. Keine 2010: Section 6.3), i.e., when the construction is not person-downstream.

Kolyma Yukaghir (Russian Far East)
‘My father has killed your husband.’
(A is third person, special accusative flag)
‘I killed a deer.’ (A is locuphoric, no P flagging)
(Maslova 2003: 89; 10)

A completely parallel situation is found in Teop, an Oceanic language (illustrated above in (3)). Yurok (an Algic language of California) is similar as well, requiring an accusative suffix if the A is aliophoric and the P is locuphoric (Robins 1958: 21; see Keine 2010: Section 6.2; Georgi 2012).[18]

6.2 Person-conditioned special A flagging

In Sahaptin, a special ergative flag is required on the A when the P is locuphoric (cf. Keine 2010: Section 6.1), i.e., when the construction is person-upstream.

Sahaptin (Sahaptian; Pacific Northwest)
a. ku=š i-q’ínun-a tílaaki-nɨm
‘And the woman saw me.’ (P is locuphoric, special ergative flag)
‘And the boy saw the woman.’ (P is third person, no A flagging)
(Rude 2009: 13–14)

6.3 Definiteness-conditioned special A flagging

Eastern Khanty has an ergative flag on the A-argument when the P-argument is specific (see Baker 2015: 128), i.e., when the construction is not definiteness-downstream.

Eastern Khanty (Uralic)
‘I went to pick berries with my younger sister.’
(P is nonspecific, no A flagging)
‘We put them (pots of berries) beside a big tree.’
(P is specific, special ergative flag)
(Gulya 1966: 135)

Baker (2015: 128) also cites Ika (a Chibchan language of Colombia), and Kalin (2017: Section 3.4) cites Niuean, both of which have been described as having very similar systems.

6.4 Animacy-conditioned special P flagging

A similar phenomenon from a less exotic language is Spanish differential object marking that depends on the (in-)animacy of the A: As discussed by García García (2007: 64), Spanish uses the preposition a on the P also in cases such as (38), where the A is inanimate (i.e., when the construction is not animacy-downstream).

‘In this recipe, milk can replace the egg.’

7 Scenario splits in ditransitive constructions

Scenario splits in ditransitive constructions have become quite famous since the 1990s, but the discussion has almost exclusively centred on person-conditioned special R-coding (Section 7.1), and has almost exclusively taken place in the generative literature. The fact that these scenario splits are just a special case of a larger generalization has not been noted in this literature.

7.1 Special R coding conditioned by person of T

This kind of coding split covers most of what has become widely known in the literature as “person-case constraint” (PCC). For example, in Bulgarian, when T is third person (aliophoric), the R can be a clitic pronoun, but the special dative preposition na is required on the R when the T is a locuphoric clitic pronoun (see 39b–c).

‘I recommend her to them.’ (3 > 3, balanced)
‘I recommend you to them.’ (3 > 2, upstream)
‘I recommend you to them.’
(Hauge 1999 [1976])

As was pointed out in Haspelmath (2004), the term “person-case constraint” is a misnomer, because very similar phenomena are found in languages that have no case at all, like Shambala.

Shambala (Bantu)
‘S/he has brought him/her to me.’ (1 > 3, downstream)
‘S/he has brought me to him/her.’ (3 > 1, upstream)
‘S/he has brought me to him/her.’
(Duranti 1979: 36)

For constructions like these, I had earlier discussed the universal in (41), based on the “person-case constraint” of Bonet (1994) and subsequent work.

Ditransitive Person-Role Constraint (Universal 9a)
Combinations of bound person forms (indexes) with the roles R and T are
disfavoured if the T index is first or second person and the R index is third person.

In the present context, this can be reformulated as follows:[19]

Ditransitive person-role universal (Universal 9b)
If T is locuphoric and R is aliophoric (i.e., if T is higher on the person scale than R), a language may require a longer construction (not involving person indexes), while (short) person indexes are always allowed when the R is locuphoric and the T is aliophoric.

A scenario with locuphoric R and aliophoric T is a downstream scenario, so we expect it to be expressed by short forms. Universals 9a and 9b are thus merely special cases of Universal 5 in (16) (the fact that the person-role universal was not fully general was already noted in Haspelmath 2004: Sections 6.2–6.4, and the present paper expands on those observations).

7.2 Special R coding conditioned by nominality of T

In many varieties of English (especially American, it seems), the R cannot be coded in the simplest way when the T is a person form rather than a full nominal. In these varieties, (44a) is unacceptable (*Pat showed him it), not only (44b), which seems to be unacceptable in all varieties of English (see (4c) above).

a.Kim showed me his house.
(pers > N, downstream)
b.Lee showed her brother her new house.
(N > N, balanced)
a. *Pat showed him it. – OK: Pat showed it to him.
(pers > pers, balanced)
b. *Pat showed his wife it. – OK: Pat showed it to his wife.
(N > pers, upstream)

In these varieties of English, special coding of R (with the preposition to) is thus conditioned by a high position of T on the nominality scale: If T is a personal pronoun, R must be coded in a special way.

7.3 Special T coding conditioned by nominality of R

While the use of a special R marker (like na+ in Bulgarian and kwa+ in Shambala) to code the person-upstream scenarios 3 > 1 and 3 > 2 is perhaps the most widespread pattern, some languages use special focal forms for T arguments when the R is a person form.

For example, Modern Greek has a set of Genitive (i.e., dative) and Accusative proclitics used in downstream and aliophoric balanced scenarios, as seen in (45a). But these proclitics cannot be used in upstream scenarios, as seen in (45b).

‘He gave it to him.’ (3 > 3, person-balanced)
(‘He gave me to him.’) (3 > 1, person-upstream)
‘He gave me to him.’ (Lit. ‘He gave me to him.’)

The form eména in (45c) does not contain a special flag, but is simply the independent personal pronoun. However, it is longer and thus conforms to the general prediction of longer coding for less usual situations.

7.4 Special R coding conditioned by animacy of T

In Icelandic, the preposition fyrir is required on the R when the T is animate, according to Siewierska and van Lier (2013: 194).

‘He introduced this type of fiction to me.’ (1 > 3, downstream)
‘I will introduce you to her.’ (3 > 2, upstream)

Both constructions have accusative flagging for the T and dative flagging for the R, but the dative flag is the (shorter) Dative case form in the downstream scenario, while the dative flag is a (longer) preposition in the upstream scenario.

8 Relative scenario splits

So far, we have seen scenario splits where only the referential prominence of the coargument is relevant for the coding of an argument (monadic scenario splits, cf. n. 18), and also scenario splits where the prominence features of both arguments are relevant (dyadic scenario splits).

Here I briefly discuss a third type of scenario split, called relative scenario split: By this I mean situations where the coding of an argument is determined by the relation between the prominence levels of the two arguments. Such scenario splits are not possible with binary prominence scales as in (8), but only with ternary scales (and scales with even more members). For example, Kashmiri has the rule in (47b), based on the person scale in (47a) (as described by Wali and Koul 1994: Section 2.4; see also Nichols 2001).

a.first > second > third
b.If the A is higher than the P on the person scale (47a),
i.e., in a person-downstream scenario, the P has Absolutive (null) case,
but if it is not higher (in balanced and upstream scenarios),
it has Dative case.

This is illustrated by the examples in (48). In (48a) and (48b), we see downstream scenarios with Absolutive-marked (and thus zero-marked) objects, and in (48c) and (48d) we see an upstream and a balanced scenario, with Dative-marked objects.

‘I am teaching you.’ (1 > 2, downstream)
‘You are teaching him.’ (2 > 3, downstream)
‘He is teaching you.’ (3 > 2, upstream)
‘He is teaching him.’ (3 > 3, balanced)
(Wali and Koul 1994: 976–977)

Kashmiri is thus similar to Yukaghir, Teop and Yurok (as seen in Section 6.1) in that it exhibits person-conditioned special P flagging (by Dative suffixes), but the condition crucially makes reference to the relation between the scale level of the A and the P, not merely to the position of the A and/or the P on the scale.[20]
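The rule in (47b) is simple enough to be stated as a short function. The following Python sketch is only an illustration of the rule as formulated above; the numeric ranks and the name `kashmiri_p_case` are assumptions introduced here, and the sketch says nothing about other factors that may affect Kashmiri case marking.

```python
# Person scale (47a): first > second > third, encoded as larger = higher.
PERSON_RANK = {1: 3, 2: 2, 3: 1}


def kashmiri_p_case(a_person: int, p_person: int) -> str:
    """Return the case of the P according to rule (47b).

    The P is Absolutive (zero-coded) only if the A is strictly higher than
    the P on the person scale; otherwise (balanced or upstream) it is Dative.
    """
    if PERSON_RANK[a_person] > PERSON_RANK[p_person]:
        return "absolutive"
    return "dative"


# The four scenarios illustrated in (48a)-(48d).
for a, p in [(1, 2), (2, 3), (3, 2), (3, 3)]:
    print(f"{a} > {p}: {kashmiri_p_case(a, p)}")
```

The comparison is between the two ranks rather than a test on either argument alone, which is what makes the split relative in the sense defined above.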

Another, more famous case is Fore (a Goroka language of Papua New Guinea), where according to Scott (1978), ergative flagging is used only if the P is higher than the A on the ternary animacy scale (human > animal > inanimate). Fore has been discussed widely (Donohue and Donohue 1997; Foley 1986: 173; Malchukov 2008: 212–213), and another language that keeps being cited is Awtuw (a Sepik language of Papua New Guinea), where according to Feldman (1986), accusative flagging occurs only if the P is not lower than the A in animacy (de Swart 2006: 253; Malchukov 2008: 212). But it seems that such languages are rare, as few further cases seem to have come up.

In the ditransitive domain, I am aware of only one language which needs to be described in this way. According to Creissels and Kouadio (2010: 176), one of the ditransitive constructions in Baule (a Kwa language of Côte d’Ivoire, fairly closely related to Akan) must be described by the rule in (49b), based on the scale in (49a).

a.personal pronoun > proper name > common noun
b.R and T can be unflagged only if R is higher than T on the scale in (49a),
but in balanced and upstream scenarios, the T must be flagged
by the serial verb .

This rule is illustrated by the examples in (50)–(53).

‘Kouakou showed your house to Kofi.’ (prop > common, downstream)
‘Kouakou showed me Akissi.’ (pers > prop, downstream)
(‘Kouakou showed me to Akissi.’) (prop > pers, upstream)
Kouakou take-pfv 1sg show-pfv Akissi
‘Kouakou showed me to Akissi.’
(‘Kouakou showed them to me.’) (pers > pers, balanced)
Kouakou take-pfv 3pl show-pfv 1sg
‘Kouakou showed them to me.’
(‘Kouakou showed Akissi to Kofi.’) (prop > prop, balanced)
‘Kouakou showed Akissi to Kofi.’
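The Baule rule in (49b) can be sketched in the same way. The Python function below is a hedged illustration of the rule as stated here, not an analysis of Baule grammar; the numeric ranks and the name `baule_t_needs_flag` are assumptions made for this sketch.

```python
# Nominality scale (49a), encoded as larger = higher.
NOMINALITY_RANK = {
    "personal pronoun": 3,
    "proper name": 2,
    "common noun": 1,
}


def baule_t_needs_flag(r_type: str, t_type: str) -> bool:
    """Return True if the T must be flagged by the serial verb ('take').

    Per rule (49b), R and T can both be unflagged only if R is strictly
    higher than T on the scale; balanced and upstream scenarios require
    the flag on T.
    """
    return not NOMINALITY_RANK[r_type] > NOMINALITY_RANK[t_type]


# Scenario types corresponding to examples (50)-(53), given as (R, T).
scenarios = [
    ("proper name", "common noun"),            # downstream
    ("personal pronoun", "proper name"),       # downstream
    ("proper name", "personal pronoun"),       # upstream
    ("personal pronoun", "personal pronoun"),  # balanced
    ("proper name", "proper name"),            # balanced
]
for r, t in scenarios:
    print(f"R={r}, T={t}: flag required = {baule_t_needs_flag(r, t)}")
```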

Even though there are few attested cases of relative scenario splits, I hypothesize that they exemplify the broader generalization in (54). This is merely a special case of the scenario universal that we saw earlier, and in fact the prediction is exactly the same.

The relative scenario universal (Universal 10)
If a language has an asymmetric relative scenario split, then the coding tends to be longest for upstream scenarios, shortest for downstream scenarios, and intermediate for balanced scenarios.

9 Argument coding versus verbal voice coding

In the examples that we saw so far, the non-usual associations show additional coding on the verb’s arguments. However, the role-reference association universal in (5) is formulated in more general terms, without specific reference to argument coding. This is because the special coding can also be verb coding, or more specifically, verbal voice coding.[21]

In monotransitive constructions, verb-coded splits in non-usual situations are not uncommonly conditioned by person. A number of languages use the basic verb form in person-downstream scenarios, but a specially marked verb form in upstream scenarios. These markers are generally called inverse markers (Jacques and Antonov 2014; Zúñiga 2006). An example comes from Itonama, a language of lowland Bolivia, which has an inverse prefix k’i-.

‘you (f) see him/her’ (2 > 3, downstream)
‘he hit you (f) in the face’ (3 > 2, upstream)
(Crevels 2010: 680, 682)

Verb coding is very rare in ditransitive constructions, but there is at least one case in Makassarese, an Austronesian language of Sulawesi (Indonesia). In this language, the verb sare ‘give’ is the only underived ditransitive verb. It occurs as such only with an indefinite T, as seen in (56a). If the T is a definite nominal or a person form, the Applicative suffix -ang is required (cf. (56b) and (56c)) (Jukes 2006: 341).

‘I’ll give you some money.’
‘I’ll give you my money.’
‘I’ll give it to you.’

The Itonama example is similar to a scenario split, and the Makassarese example is similar to a single-argument split. These constructions are not argument coding splits, of course (because the relevant arguments are always coded in the same way), but the patterns are clearly closely related, so I decided to include this brief discussion in this paper.[22]

I know of no extensive comparative studies of inverse patterns, but all the evidence that I have seen is compatible with the generalization in (57).

The inverse universal (Universal 11)
If a language uses different verb forms for downstream and upstream scenarios, i.e., an inverse form and a direct form, and the verb coding is asymmetric, then the inverse form tends to be longer than the direct form.

Note that the most famous inverse/direct pattern, as known from Algonquian languages, shows both overt inverse and overt direct marking (symmetric coding). This pattern is consistent with Universal 11, though it does not provide strong evidence for it, because the direct markers are not always shorter than the inverse markers.

10 Alternations

10.1 Classical passive, antipassive and dative alternations

Another widespread phenomenon in languages, closely related to argument coding splits, is argument coding alternations. An alternation is a situation where two different coding patterns can be used alongside each other, with roughly the same meaning. Well-known examples of coding alternations are passives and antipassives for monotransitives, and dative alternations for ditransitives.

passive alternation in English
a.The woman sold the house.
b.The house was sold by the woman.
antipassive alternation in Chukchi
mother-erg shirt-abs sew-3sg > 3sg.aor
‘The mother sewed the shirt.’
‘The mother sewed the shirt.’
(Kozinsky et al. 1988: 667)
dative alternation in English
a.The girl gave the boy the pen.
b.The girl gave the pen to the boy.

There is a great variety of such argument coding alternations, but when we single out those that show asymmetric coding, we can generalize over them in a way that has apparently not been done before. By asymmetric coding, I mean a situation where either (i) one of the alternates has special verb coding, as is normally the case in passives (Haspelmath 1990), and also in the antipassive construction in (59), or (ii) the argument flagging in one of the alternates is clearly shorter. The latter is the case in the English Dative alternation, where the Double Object construction (in 60a) shows no preposition, while the Prepositional Dative construction has an extra dative preposition. Many alternations are asymmetric in both ways at the same time: Thus, the English passive alternation has special verb coding (the passive auxiliary be plus the Past Participle form of the verb), and in addition the argument flagging is longer (the preposition by on the agent argument).

Given this notion of an asymmetrically coded alternation, we can formulate the generalization in (61).

The alternation universal (Universal 12)
In an asymmetric argument coding alternation, the longer alternant tends
to be used in situations that deviate from the usual associations
of roles and referential prominence.

Here the most relevant subtype of referential prominence is topicality or givenness. For passives, which are by definition asymmetric, this means that they tend to be used when the A is not given/topical, and/or when the P is not new information. We can formulate this as a universal:

The passive universal (Universal 13)
If a passive alternation is sensitive to givenness, then the passive alternant tends to be used when the original A is not given information and/or the original P is not new information.

That this is indeed the case has been known for quite some time (e.g., Shibatani 1985; Siewierska 1984), although I am not aware of earlier formulations that are as general as Universal 13.

For dative alternations, which are generally (and almost by definition) asymmetrical, we can likewise say that the longer alternant occurs when unexpectedly the R is not given/topical, and/or when the T is not new information.

The dative alternation universal (Universal 14)
If a dative alternation is sensitive to givenness, then the dative alternant tends to be used when the R is not given information and/or the T is not new information.

For English, this is well established (e.g., Collins 1995; Thompson 1990), and the situation in related languages is not very different (see van der Beek 2004 for Dutch, for example). However, dative alternations are not very common in the world’s languages (Siewierska 1998), and I am not aware of in-depth studies of dative alternations in non-European languages. Thus, the available evidence for Universal 14 is currently slim, but there is no counterevidence either, and both 13 and 14 are special cases of Universal 12, so I would like to claim that it is indeed a universal generalization.

Universal 12, in turn, is evidently a special case of Universal 1, the general role-reference association universal.

10.2 Splitting alternations

In addition to coding splits and coding alternations, we also find an intermediate phenomenon that provides further confirmation for the present approach: Some construction pairs alternate under some conditions, but are in complementary distribution in other conditions. I call these situations splitting alternations.

For example, in Lummi (a Salishan language; Jelinek and Demers 1983), the ordinary Active construction is used only when the scenario is nominality-downstream (as in 64a) or nominality-balanced (as in 64b). When the scenario is upstream, the Passive construction (with the verb suffix -ŋ, and the Oblique preposition on the A) is obligatory (see 64c).

‘He knows the man.’ (NOT: ‘The man knows him.’) (pers > N, downstream)
‘He knows it.’ (pers > pers, balanced)
‘He is known by the man.’ (= The man knows him) (N > pers, upstream)

For the scenarios that we have seen so far, the Active-Passive pair can be seen as a split pattern that shows both an argument coding split (because the A is sometimes Oblique-marked) and verb coding (because the non-usual situation requires the verb marker -ŋ). However, when both the A and the P are full nominals, either the Active or the Passive can be used:

‘The man knows the boy.’ (N > N, balanced)
‘The boy is known by the man.’ (N > N, balanced)

A very similar situation is described for Northern Tiwa of Picurís by Nichols (2001).

In the domain of ditransitive constructions, we find a parallel case in Koyra Chiini (a Songhay language of Mali; Heath 1999). On the one hand, the Postpositional Dative construction (with the postposition +se) is the only possibility in not maximally usual situations, i.e., in nominality-upstream and nominality-balanced scenarios, as in (66a)–(66c). The shorter Double Object construction (66d) is possible only in nominality-downstream patterns.

Koyra Chiini Songhay
‘I gave it to the woman.’ (= Heath’s 445b)
(N > pers, nominality-upstream)
‘I gave the woman some water.’ (= 445d)
(N > N, balanced)
‘You give it to them.’ (= 449b)
(pers > pers, balanced)
‘You give them some money.’ (= 447b)
(pers > N, downstream)
(Heath 1999: Section 9.1.2)

However, in the nominality-downstream scenario, the Double Object construction is not obligatory, but either construction is possible:

‘We tell them to give us a piece.’ (= Heath’s 448)
‘anyone who can give us information about it’ (= 447a)

Heath (1999) has not investigated the differences between these two constructions, and Jelinek and Demers (1983) do not tell us about the usage differences between the Lummi Active and Passive either, but I suspect that further investigation will show that they differ in usage not unlike the corresponding English constructions, as suggested by the universals in Section 10.1.

Since the English Dative Alternation is also an optional alternation in some cases (cf. (60)) but a grammatically required split in others (cf. (4)), it is also an example of a splitting alternation.

11 Possible explanations

11.1 Summary of the generalizations

I have reviewed substantial evidence for the role-reference association universal that was introduced in the first section of this paper and is repeated here:

The role-reference association universal (Universal 1)
Deviations from usual associations of roles and referential prominence tend to be coded by longer grammatical forms if the coding is asymmetric.

The subsequent Universals 3–14 are all special instances of this super-universal. Figure 1 diagrams their taxonomic relationships.

The next question is what explains this universal tendency. In this section, I mention three possible explanations: frequency-based coding efficiency (Section 11.2), disambiguation (Section 11.3), and innate biocognitive-representational constraints of a grammar blueprint (Section 11.4). I will argue for frequency-based coding efficiency and against the other two explanations.

11.2 Frequency-based coding efficiency

The first explanation, which I argue for here, invokes a functional-adaptive constraint: Languages tend to have efficient coding systems, with zero or short coding for more frequently occurring meanings and functions, and overt and long coding for more rarely occurring functions. Through piecemeal adaptation in language use, languages come to have (or restore) efficient patterns. This explanation accounts for a very large number of other coding asymmetries, as summarized in the universal in (68).

The grammatical form-frequency correspondence universal
When two grammatical patterns that differ minimally in meaning (i.e., patterns that form a semantic opposition) occur with significantly different frequencies, the less frequent pattern tends to be overtly coded (or coded with more coding material), while the more frequent pattern tends to be zero-coded (or coded with less coding material), if the coding is asymmetric.

This accounts for coding asymmetries in pairs like singular/plural, present/future, affirmative/negative, allative/ablative, and many others. They were often called “markedness asymmetries” in the past (Croft 2003: Chapter 4; Greenberg 1966), but it is now clear that the formal patterns are due to frequency-based coding efficiency (Haspelmath 2008; 2021). The explanation is the same as for length differences in lexical forms (Zipf 1935): frequently expressed meanings are more predictable and can therefore be expressed by shorter forms.
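The efficiency reasoning can be illustrated with a small calculation. The Python sketch below compares the average amount of coding per clause under an efficient and an anti-efficient system; the frequencies are invented purely for illustration and are not corpus counts, and the names used are assumptions of this sketch.

```python
def expected_length(freq: dict, length: dict) -> float:
    """Average coding length per clause, weighted by pattern frequency."""
    total = sum(freq.values())
    return sum(freq[k] * length[k] for k in freq) / total


# Hypothetical counts per 100 clauses (invented for illustration only).
freq = {"usual": 80, "non_usual": 20}

# Efficient system: extra marker only on the rarer, non-usual association.
efficient = {"usual": 0, "non_usual": 1}
# Anti-efficient system: extra marker only on the frequent association.
anti_efficient = {"usual": 1, "non_usual": 0}

print(expected_length(freq, efficient))
print(expected_length(freq, anti_efficient))
```

Both systems distinguish the two patterns equally well, but whenever the usual association is the more frequent one, the efficient system needs less coding material on average.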

For role-reference associations, this kind of explanation has long been advocated, for example by Caldwell (1856: 276), whose early remarks on special P flagging deserve to be quoted again here (cf. Filimonova 2005: 78):

[…] the principle that it is more natural for rational beings to act than to be acted upon; and hence when they do happen to be acted upon – when the nouns by which they are denoted are to be taken objectively – it becomes necessary, in order to avoid misapprehension, to suffix to them the objective case-sign.

It is of course unclear what is meant by “natural” and a vague appeal to “nature” is not explanatory by itself. However, if we replace “natural” by “frequent”, then the quotation is identical to what I claim here: Frequency asymmetries lead to differences in predictability, and grammatical coding is more efficient if more predictable meanings get less coding, and vice versa. A very similar formulation is found in Comrie (1989: 128), still using “natural” but evidently meaning “frequent”:

[…] it has been noted that in actual discourse there is a strong tendency for the information flow from A to P to correlate with an information flow from more to less animate and from more to less definite. In other words, the most natural kind of transitive construction is one where the A is high in animacy and definiteness, and the P is lower in animacy and definiteness; and any deviation from this pattern leads to a more marked construction. … the construction which is more marked in terms of the direction of information flow should also be more marked formally, i.e., we would expect languages to have some special device to indicate that the A is low in animacy or definiteness or that the P is high in animacy or definiteness[…] (Comrie 1989: 128).

Comrie also uses the “markedness” terminology, which does not lend itself to a clear causal explanation, but if “marked formally” is replaced by “coded overtly” and “marked in terms of information flow” is replaced by “non-usual (= rare) association of role and referential prominence”, then this is exactly the explanation that I am proposing. Perhaps the clearest statement in the earlier literature that appeals to frequency of use in explaining a subset of the phenomena discussed here is Dahl and Fraurud (1996) (as well as Dahl 2008).

Importantly, the functional view of role-reference association effects has no problem explaining some of the features of the crosslinguistic patterns that are otherwise hard to account for:

  1. The patterns that we find are implicational universals, not unrestricted universals. In general, implicational patterns suggest functional motivations rather than innate representational constraints.

  2. We often find optionality of coding around the cut-off points, as has been described, for example, in Spanish (e.g., von Heusinger and Kaiser 2005; for optional case-marking more generally, see Lestrade 2013).

  3. Asymmetric coding splits (Sections 4–9) and asymmetric coding alternations (Section 10) can be seen to fall under the same larger generalization, even though their status as grammatical rules is very different. The idea that “soft constraints mirror hard constraints” (Bresnan et al. 2001) is much more readily explained in a functional view of the origins of these grammatical patterns than if one attributes crosslinguistic regularities to an innate grammar blueprint.

  4. What matters for the crosslinguistic generalizations is the overt markers, not the abstract patterns. For example, while abstractly, many Australian languages exhibit tripartite case alignment patterns (Baker 2015: Section 1.2.2), these patterns show realizations by overt markers that are fully in agreement with the predictions of frequency-based coding efficiency. This is not explained by generative theories of abstract case (as is admitted by Baker 2015: 23, Note 12).

Finally, let me make a few remarks on the causes of the associations of role rank and referential prominence that we observe empirically. These associations are of course no surprise: Agent and recipient arguments have a strong tendency to be animate (and therefore definite), because humans are primarily interested in actions carried out by humans, as well as in transfer events with human recipients. This is what leads us to say that agents and recipients have a “high role rank”. I am somewhat hesitant to say that this is a real explanation, because one might justly ask what leads speakers to classify agents and recipients as coherent role classes to begin with (it could be that these are innate categories of a grammar blueprint). But regardless of the explanation of the associations, they are empirically testable, because linguists generally agree that agent roles and recipient roles can be identified in languages (and texts) independently of their referential prominence properties. For the efficiency explanation of form-frequency correspondences to go through, it is not necessary that we have an explanation of the usage frequencies (Haspelmath 2021).

11.3 Ambiguity avoidance

So far, I have not mentioned ambiguity avoidance as an explanatory factor, although intuitively, this makes a lot of sense: Languages serve to convey speakers’ thoughts to their interlocutors, so their expressions should not be ambiguous. And indeed, the classic cases of split P flagging (differential object marking) help the hearer to distinguish between the A and the P, in that they provide special marking precisely for those P-arguments that are more like typical A-arguments (in that they are definite and/or animate). The same reasoning applies to split A flagging, as well as to split R and T flagging, and also to the scenario splits.

So why am I not saying that the explanation of the grammatical patterns lies in ambiguity avoidance,[23] and instead appeal to a more abstract principle of efficient coding and predictability? There are three reasons for this.

First of all, ambiguity avoidance can be seen as merely a special case of efficient coding. If grammatical markers are preferentially used when the grammatical meaning is least predictable, then it follows that they should tend to be used when there is a danger of ambiguity, because ambiguity is an extreme form of lack of predictability. For differential object marking (DOM), this point was made by Newmeyer (2005: Section 4.9.2) (and endorsed by Hawkins 2014: 194).

All that one needs to adopt is the well-established hypothesis that within a given domain, more frequent combinations of features require less coding than less frequent ones. There is no need to appeal to ambiguity reduction to explain the phenomenon of DOM (Newmeyer 2005: Section 4.9.2).
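The link between frequency, predictability, and coding can be made concrete with a minimal sketch. The counts below are invented purely for illustration (they are not corpus figures from any of the cited studies), and the function names are hypothetical. The sketch only shows that the rarer value of a feature has the higher surprisal, which is where efficient coding would lead us to expect overt marking.

```python
import math

# Hypothetical counts of P-arguments by definiteness (illustrative only, not corpus data).
P_COUNTS = {"indefinite": 800, "definite": 200}

def surprisal_bits(counts):
    """Return the surprisal (-log2 of relative frequency) of each category."""
    total = sum(counts.values())
    return {category: -math.log2(n / total) for category, n in counts.items()}

def expected_marked(counts):
    """Return the category that efficient coding predicts to be overtly flagged:
    the least frequent, and hence least predictable, one."""
    s = surprisal_bits(counts)
    return max(s, key=s.get)

print(surprisal_bits(P_COUNTS))
print(expected_marked(P_COUNTS))
```

With these assumed counts, the definite P is the less predictable value and is therefore the one expected to carry the flag, as in the classic cases of differential object marking.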

Second, and more importantly, if ambiguity avoidance were the primary explanation, we would expect that languages can use diverse coding means to ensure unambiguous interpretation, including anti-efficient coding. For example, one could imagine languages in which topical agents are always ergative-marked and indefinite patients are always accusative-marked, but focused agents and definite patients are zero-coded. Such a language would have very little potential ambiguity (because only clauses with focused agents and definite patients would lack flagging, and these clauses are very rare), but this type is completely unattested. By contrast, languages of the mirror-image type, which are actually widely attested, show at least as much potential ambiguity (when the patient is indefinite and has no accusative marking). This shows that efficiency of coding is crucial for understanding limits on the world-wide distribution of languages.
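The comparison between the two language types can be sketched numerically. The probabilities below are assumptions chosen only to reflect the qualitative claim that agents are usually topical and patients are often indefinite; they are not measured values, and the independence of A and P properties is a simplification. The function name `profile` is hypothetical.

```python
# Hypothetical usage probabilities (illustrative assumptions, not corpus figures).
P_TOPICAL_A = 0.9      # agents are usually topical
P_INDEFINITE_P = 0.7   # patients are often indefinite

def profile(flag_usual):
    """Return (expected flags per clause, share of clauses with no flag at all).

    flag_usual=True  -> anti-efficient type: the usual values (topical A, indefinite P) are flagged.
    flag_usual=False -> attested type: the unusual values (focused A, definite P) are flagged.
    A and P properties are assumed to be independent, which is a simplification.
    """
    p_flag_a = P_TOPICAL_A if flag_usual else 1 - P_TOPICAL_A
    p_flag_p = P_INDEFINITE_P if flag_usual else 1 - P_INDEFINITE_P
    expected_flags = p_flag_a + p_flag_p
    unflagged = (1 - p_flag_a) * (1 - p_flag_p)
    return expected_flags, unflagged

for name, usual in [("anti-efficient", True), ("attested", False)]:
    flags, bare = profile(usual)
    print(f"{name}: {flags:.2f} flags per clause, {bare:.2f} of clauses unflagged")
```

Under these assumed values, the anti-efficient type uses about 1.6 flags per clause and leaves only about 3% of clauses entirely unflagged, whereas the attested type uses about 0.4 flags per clause and leaves about 63% unflagged. The attested type is thus cheaper to code but not less prone to potential ambiguity, which is the pattern that an efficiency account predicts and an ambiguity-avoidance account does not.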

Third, there are cases where a longer expression is required (or is possible) for the less frequently used meanings even though there would seem to be nothing wrong with the more regular, shorter expression. For example, in ditransitive person-role combinations, languages sometimes ban perfectly regular and non-ambiguous forms, as we saw in (45) from Modern Greek, repeated here.

(45) a. ‘He gave it to him.’
     b. ‘He gave me to him.’

(45b) is perfectly clear and unambiguous, but the language still requires the longer form Tu éðose eména ((45c) in Section 7.3 above). A similar argument on the basis of experimental results is made by Diessel (2019: Section 11.5.2).

Of course, all cases of asymmetric alternations are precisely of this kind: A sentence such as A girl saw the stork is perfectly unambiguous, but languages still often prefer more complex forms (with additional flagging and/or additional verb coding) such as The stork was seen by a girl. This is explained by the preference to have additional coding for less predictable forms, even if these forms are no less ambiguous than the simpler forms.

11.4 Biocognitive-representational constraints of a grammar blueprint

Over the years, there have been a large number of proposals in generative grammar for how to account for various aspects of the phenomena that are here subsumed under role-reference associations. Since split P flagging (differential object marking) is particularly prominent, this has been treated frequently, but it seems that no consensus has emerged (Kalin (2018) surveys quite a few different recent approaches). Likewise, for person-role interactions (cf. Section 7.1), there are many competing ideas (see Tucker 2013: 254–276 for an overview). There is also some generative literature on other scenario splits, but most of this work does not establish clear links with single-argument splits.[24] What most of the generative papers share is that they talk about particular languages, but at the same time seem to propose highly general analyses that are motivated by many other considerations. This kind of work is thus very hard to compare to the present proposal, which focuses on claims about universal tendencies in the world’s languages.

There is really only one line of research in the generative tradition that is similar in generality to my proposal, and this is Aissen’s (1999, 2003) work, which has been widely cited. Aissen uses optimality-theoretic (OT) notation to express the universal tendency that role rank and referential prominence go hand in hand. There had been earlier hints in the literature about such role-reference connections, as in the following quotation from Farkas and Kazazis (1980).

[I]n the Rumanian clitic system, the case hierarchy [Ethical > Goal > Theme] and the personal hierarchy [1 > 2 > 3] are not supposed to conflict. Where there is no conflict […], the string is grammatical. Where there is strong conflict […], the sequence is unacceptable […] (Farkas and Kazazis 1980: 78; see Haspelmath 2004: Section 2.6).

But Aissen was the first to incorporate the prominence scales known from the functional-typological literature into generative grammar, in an attempt to provide a truly general explanation for all languages.

However, Aissen’s papers diverge from the standard generative approach in that they do not really make crucial reference to representational constraints of a grammar blueprint (“universal grammar”). On the one hand, Aissen’s work is generative in that it appeals to universal constraints in order to simultaneously describe individual languages and explain the gaps in attested languages. Her complex system of fixed constraint subhierarchies with constraint conjunction, and with intercalated constraints that penalize overt case-marking, derives the implicational universals nicely. But on the other hand, she does not really claim that these constraints are specific innate biocognitive mechanisms (as part of a grammar blueprint). Instead, she connects her approach with functionalist ideas about markedness and iconicity:

The effect of local conjunction here is to link markedness of content (expressed by the markedness subhierarchy) to markedness of expression (expressed by *Ø). That content and expression are linked in this way is a fundamental idea of markedness theory (Greenberg 1966; Jakobson 1939). In the domain of Differential Object Marking, this is expressed formally through the constraints [shown immediately above]. Thus they are iconicity constraints: they favor morphological marks for marked configurations (Aissen 2003: 448).

Another quotation makes it particularly clear that for Aissen, the optimality-theoretic machinery is primarily a notational device:

“OT provides a way … to reconcile the underlying impulse of generative grammar to model syntax in a precise and rigorous fashion with a conception of DOM which is based on prominence scales. The purpose … is to develop an approach … that is formal and at the same time expresses the functional-typological understanding of DOM” (Aissen 2003: 439).

Even though the orthodox position seems to be that all constraints (plus the OT architecture) are innate, Aissen is not alone in linking OT notation with functional considerations. Whatever one thinks of the machinery used in her papers, it seems clear that there is no argument against my functional-adaptive view of role-reference associations here.

12 Concluding remarks

In this paper, I have surveyed a substantial number of asymmetric argument coding patterns, in particular flagging (case and adpositional marking) patterns, both in monotransitive and in ditransitive constructions. I have argued that they can all be subsumed under a single very general universal, which I call the role-reference association universal: Deviations from usual associations of role rank and referential prominence tend to be coded by longer grammatical forms (Universal 1 in (5)).

I have also argued (in Section 11.2) that the role-reference association universal can be subsumed under the even more general form-frequency correspondence universal, and that alternative explanations in terms of disambiguation or biocognitive-representational constraints are much less plausible. The form-frequency correspondences themselves are explained by the functional-adaptive force of frequency-based efficient grammatical coding.

While some of the coding splits and other phenomena that I discussed here are very widespread in the world’s languages (especially split P flagging, or “differential object marking”, but probably also passive constructions and their correlation with topicality), others are uncommon, and some appear to be very rare. Nevertheless, I believe that they provide important support for my generalizations and explanations, because they are very specific patterns that are unlikely to have arisen by chance. Especially for the relative scenario splits of Section 8, nobody would suggest that they could be chance developments, and the systematicity of the scenario splits in Sections 6 and 7 has also been beyond question among researchers who have taken a closer look at them. This is despite the fact that they are not common, except for person-role interactions in ditransitives (Section 7.1). In addition to the great specificity of the rules for the splits, an additional argument for their non-accidental nature is the fact that these splits are attested in scattered languages around the world. And given that detecting such patterns requires a fairly sophisticated description, it is quite possible that more such patterns will be found as more and better descriptions become available. Finally, by stating the universals clearly (not only the overarching Universal 1, but also Universals 3 through 14, summarized in Section 11.1), I have provided a challenge for skeptical colleagues to provide counterevidence. For well-known generalizations, if little counterevidence comes to light (cf. the few counterexamples collected by Filimonova 2005), this can by itself be taken as support for the generalizations.

Finally, I should emphasize that the crosslinguistic generalizations that I stated and explained here are in a rather indirect relationship with language-particular systems. Most discussions in linguistics concern language-particular systems, and these need not make any reference to the crosslinguistic generalizations, let alone to the explanations. It is possible, for example, that Grimshaw’s (2001) account of ditransitive person-role interactions in French clitic pronouns is correct for French, or that Baker’s (2015: Section 4.1) account of Accusative case in Sakha is correct for Sakha.[25] I am not making any claims about the systems of particular languages. What I am challenging is the idea that such language-particular accounts can be directly extended to other languages, i.e., that different languages somehow share parts of their systems. Languages exhibit similarities in their argument coding patterns which can be formulated in terms of comparative concepts, but this does not imply anything about their language-particular systems (let alone the mental grammars of their speakers). Nevertheless, if I am right, then if we want to understand (and not merely describe or analyze) the peculiarities of language-particular systems, it is necessary to look at the crosslinguistic regularities to which they correspond, because very promising explanations are available at the crosslinguistic level.

Special abbreviations
DOM: differential object marking
R: recipient of typical ditransitive clause
sec: secundative (T-marking flag)
T: theme of typical ditransitive clause
TAM: tense, aspect, mood
Abbreviations also found in the Leipzig Glossing Rules
A: agent of typical transitive clause
P: patient of typical transitive clause

Corresponding author: Martin Haspelmath, MPI-EVA Leipzig, Leipzig University, Deutscher Platz 6, 04103 Leipzig, Germany, E-mail:


The support of the European Research Council (ERC Advanced Grant 670985, Grammatical Universals) is gratefully acknowledged. I also thank several reviewers and many colleagues who provided me with comments on or discussed with me the ideas presented in this paper, in particular Alexandra Aikhenvald, András Bárány, John Beavers, Balthasar Bickel, Bernard Comrie, Denis Creissels, Sonia Cristofaro, William Croft, Östen Dahl, Peter de Swart, Scott DeLancey, Holger Diessel, R.M.W. Dixon, Volker Gast, Laura Kalin, Seppo Kittilä, Randy LaPolla, Andrej Malchukov, William McGregor, Edith Moravcsik, Masayoshi Shibatani, Jenneke van der Wal, Eva van Lier, Alena Witzlack-Makarevich, and Fernando Zúñiga. I am sure I have forgotten someone, for which I apologize here. Finally, I want to highlight the important role played by my Leipzig project colleagues Katarzyna Janic, Natalia Levshina, Susanne Maria Michaelis, Karsten Schmidtke-Bode, and Ilja Seržant in helping me understand the universal grammatical patterns discussed in this paper.

Appendix: Some illustrative frequency figures

This appendix gives some illustrative figures from the earlier literature to show that it is very plausible that Universal 2 (Section 2) is true: “Arguments with higher-ranked roles (A, R) tend to be more referentially prominent than arguments with lower-ranked roles (P, T), and vice versa.” These figures are only from seven different languages, as frequency figures are not often given in the literature, and I did not do any corpus research myself. But they should suffice for initial plausibility.

Animacy of A and P

Swedish (Dahl and Fraurud 1996: 51)
Movima (Bolivia; Haude 2014: 9)

Full nominality of A and P

Vera’a (Vanuatu; Haig and Schnell 2016: 599)[26]
(columns: person form; full nominal; total)

Definiteness of A and P

English (Jäger 2007: 80, Table 3)

Person of A and P

English (Jäger 2007: 80, Table 3)
(columns: locuphoric (1/2); aliophoric (3); total)

Animacy of R and T

Tahitian (Snyder 2003: 80, 329)
English (Thompson 1990: 243)

Full nominality of R and T

English (Thompson 1990: 244)
(columns: person form; full nominal; total)

Givenness of R and T

Finnish (Kaiser 2002: 9, 81)

Person of R and T

German (Haspelmath 2004: 35)
(columns: locuphoric (1/2); aliophoric (3); total)


References

Aissen, Judith. 1999. Markedness and subject choice in optimality theory. Natural Language & Linguistic Theory 17. 673–711. https://doi.org/10.1023/A:1006335629372

Aissen, Judith. 2003. Differential object marking: Iconicity versus economy. Natural Language & Linguistic Theory 21(3). 435–483. https://doi.org/10.1023/A:1024109008573

Baker, Mark C. 2015. Case. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781107295186

Bárány, András. 2017. Person, case, and agreement: The morphosyntax of inverse agreement and global case splits. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780198804185.003.0001

Becher, Jutta. 2005. Ditransitive Verben und ihre Objekte im Wolof: Positionsregeln und Kombinierbarkeit. Hamburger Afrikanistische Arbeitspapiere (HAAP) 3. 13–27.

Bhatia, Tej K. 1993. Punjabi: A cognitive-descriptive grammar. London: Routledge.

Bickel, Balthasar. 1995. In the vestibule of meaning: Transitivity inversion as a morphological phenomenon. Studies in Language 19(1). 73–127.

Bickel, Balthasar, Alena Witzlack-Makarevich & Taras Zakharko. 2015. Typological evidence against universal effects of referential scales on case alignment. In Ina Bornkessel-Schlesewsky, Andrej Malchukov & Marc D. Richards (eds.), Scales and hierarchies: A cross-disciplinary perspective, 7–43. Berlin & Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110344134.7

Bonet, Eulalia. 1994. The person-case constraint: A morphological approach. In Heidi Harley & Colin Phillips (eds.), The morphology-syntax connection, 33–52. (MIT Working Papers in Linguistics 22). Cambridge, MA: MIT Press.

Bornkessel-Schlesewsky, Ina & Matthias Schlesewsky. 2009. The role of prominence information in the real-time comprehension of transitive constructions: A cross-linguistic approach. Language and Linguistics Compass 3(1). 19–58.

Bossong, Georg. 1985. Differenzielle Objektmarkierung in den neuiranischen Sprachen. Tübingen: Narr.

Bossong, Georg. 1991. Differential object marking in Romance and beyond. In Douglas Kibbee & Dieter Wanner (eds.), New analyses in Romance linguistics, 143–170. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/cilt.69.14bos

Bossong, Georg. 1998. Le marquage différentiel de l’objet dans les langues d’Europe. In Jack Feuillet (ed.), Actance et valence dans les langues de l’Europe, 193–258. Berlin & New York: Mouton de Gruyter. https://doi.org/10.1515/9783110804485.193

Bresnan, Joan, Shipra Dingare & Christopher D. Manning. 2001. Soft constraints mirror hard constraints: Voice and person in English and Lummi. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG 01 conference. Stanford, CA: CSLI Publications.

Caldwell, Robert. 1856. A comparative grammar of the Dravidian or South-Indian family of languages. London: Harrison.

Coghill, Eleanor. 2010. Ditransitive constructions in the Neo-Aramaic dialect of Telkepe. In Andrej L. Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in ditransitive constructions: A comparative handbook, 221–242. Berlin & New York: De Gruyter Mouton. https://doi.org/10.1515/9783110220377.221

Collins, Peter. 1995. The indirect object construction in English: An informational approach. Linguistics 33(1). 35–50.

Comrie, Bernard. 1978. Ergativity. In Winfred P. Lehmann (ed.), Syntactic typology: Studies in the phenomenology of language, 329–394. Austin, TX: University of Texas Press. https://doi.org/10.1075/tilar.9.02com

Comrie, Bernard. 1989. Language universals and linguistic typology: Syntax and morphology. Oxford: Blackwell.

Creissels, Denis & Jérémie Kouadio. 2010. Ditransitive constructions in Baule. In Andrej L. Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in ditransitive constructions: A comparative handbook, 166–189. Berlin & New York: De Gruyter Mouton. https://doi.org/10.1515/9783110220377.166

Crevels, Mily. 2010. Ditransitives in Itonama. In Andrej L. Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in ditransitive constructions: A comparative handbook, 678–709. Berlin & New York: De Gruyter Mouton. https://doi.org/10.1515/9783110220377.678

Croft, William. 2003. Typology and universals, 2nd edn. Cambridge: Cambridge University Press.

D’Alessandro, Roberta. 2017. When you have too many features: Auxiliaries, agreement and clitics in Italian varieties. Glossa 2(1). 1–36.

Dahl, Östen. 2008. Animacy and egophoricity: Grammar, ontology and phylogeny. Lingua 118(2). 141–150.

Dahl, Östen & Kari Fraurud. 1996. Animacy in grammar and discourse. In Thorstein Fretheim & Jeanette K. Gundel (eds.), Reference and referent accessibility, 47–64. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/pbns.38.04dah

Dalrymple, Mary & Irina Nikolaeva. 2011. Objects and information structure. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511993473

de Hoop, Helen & Peter de Swart (eds.). 2009. Differential subject marking. (Studies in natural language and linguistic theory 72). Dordrecht: Springer. https://doi.org/10.1007/978-1-4020-6497-5

de Swart, Peter. 2006. Case markedness. In Leonid Kulikov, Andrej Malchukov & Peter de Swart (eds.), Case, valency and transitivity (Studies in language companion series 77), 249–267. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.77.16swa

DeLancey, Scott. 1981. An interpretation of split ergativity and related patterns. Language 57. 626–657.

Diessel, Holger. 2019. The grammar network. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108671040

Dixon, Robert M. W. 1979. Ergativity. Language 55. 59–138.

Dixon, Robert M. W. 1980. The languages of Australia. Cambridge: Cambridge University Press.

Dixon, Robert M. W. 1994. Ergativity. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511611896

Donohue, Cathryn & Mark Donohue. 1997. Fore case marking. Language and Linguistics in Melanesia 28. 69–98.

Duranti, Alessandro. 1979. Object clitic pronouns in Bantu and the topicality hierarchy. Studies in African Linguistics 10(1). 31–45.

Essegbey, James. 2010. Inherent complement verbs and the basic double object construction in Gbe. In Enoch Oladé Aboh & James Essegbey (eds.), Topics in Kwa syntax, 177–193. Dordrecht: Springer. https://doi.org/10.1007/978-90-481-3189-1_8

Farkas, Donka & Kostas Kazazis. 1980. Clitic pronouns and topicality in Rumanian. Chicago Linguistic Society 16. 75–82.

Fauconnier, Stefanie & Jean-Christophe Verstraete. 2014. A and O as each other’s mirror image? Problems with markedness reversal. Linguistic Typology 18(1). 3–49.

Feldman, Harry. 1986. A grammar of Awtuw. Canberra: Australian National University.

Filimonova, Elena. 2005. The noun phrase hierarchy and relational marking: Problems and counterevidence. Linguistic Typology 9(1). 77–113.

Foley, William A. 1986. The Papuan languages of New Guinea. Cambridge: Cambridge University Press.

García García, Marco. 2007. Differential object marking with inanimate objects. In Georg A. Kaiser & Manuel Leonetti (eds.), Proceedings of the workshop “definiteness, specificity and animacy in Ibero-Romance languages,” (Arbeitspapier 122), 63–84. Konstanz: University of Konstanz. Available at: .

Georgi, Doreen. 2012. A local derivation of global case splits. In Artemis Alexiadou, Tibor Kiss & Gereon Müller (eds.), Local modelling of non-local dependencies in syntax (Linguistische Arbeiten 547), 305–336. Tübingen: Niemeyer. https://doi.org/10.1515/9783110294774.305

Gerwin, Johanna. 2014. Ditransitives in British English dialects. Berlin & Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110352320

Greenberg, Joseph H. 1966. Language universals: With special reference to feature hierarchies. The Hague: Mouton.

Grimshaw, Jane. 2001. Optimal clitic positions and the lexicon in Romance clitic systems. In Géraldine Legendre, Jane Grimshaw & Sten Vikner (eds.), Optimality-theoretic syntax, 205–240. Cambridge, MA: MIT Press.

Grossman, Eitan. 2015. No case before the verb, obligatory case after the verb in Coptic. In Eitan Grossman, Martin Haspelmath & Tonio Sebastian Richter (eds.), Egyptian-Coptic linguistics in typological perspective, 203–225. Berlin & Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110346510.203

Gulya, János. 1966. Eastern Ostyak chrestomathy. Bloomington, IN: University of Indiana Press.

Haig, Geoffrey & Stefan Schnell. 2016. The discourse basis of ergativity revisited. Language 92(3). 591–618.

Harris, Alice C. 1981. Georgian syntax: A study in relational grammar. Cambridge: Cambridge University Press.

Haspelmath, Martin. 1990. The grammaticization of passive morphology. Studies in Language 14(1). 25–72.

Haspelmath, Martin. 2004. Explaining the ditransitive person-role constraint: A usage-based approach. Constructions 2. Available at: .

Haspelmath, Martin. 2007. Ditransitive alignment splits and inverse alignment. Functions of Language 14(1). 79–102.

Haspelmath, Martin. 2008. Frequency versus iconicity in explaining grammatical asymmetries. Cognitive Linguistics 19(1). 1–33.

Haspelmath, Martin. 2011. On S, A, P, T, and R as comparative concepts for alignment typology. Linguistic Typology 15(3). 535–567.

Haspelmath, Martin. 2013. Argument indexing: A conceptual framework for the syntax of bound person forms. In Dik Bakker & Martin Haspelmath (eds.), Languages across boundaries: Studies in memory of Anna Siewierska, 197–226. Berlin & Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110331127.197

Haspelmath, Martin & the APiCS Consortium. 2013. Alignment of case marking of personal pronouns. In Susanne Maria Michaelis, Philippe Maurer, Martin Haspelmath & Magnus Huber (eds.), The atlas of pidgin and creole language structures. Oxford: Oxford University Press.

Haspelmath, Martin. 2015. Ditransitive constructions. Annual Review of Linguistics 1. 19–41.

Haspelmath, Martin. 2018. Review of Mark C. Baker (2015) Case. Studies in Language 42(2). 474–486.

Haspelmath, Martin. 2019a. Flagging and indexing, and head and dependent marking. Te Reo 62(1). 93–115.

Haspelmath, Martin. 2019b. Can cross-linguistic regularities be explained by constraints on change? In Karsten Schmidtke-Bode, Natalia Levshina, Susanne Maria Michaelis & Ilja A. Seržant (eds.), Explanation in typology: Diachronic sources, functional motivations and the nature of the evidence (Conceptual foundations of language science 3), 1–25. Berlin: Language Science Press. Available at: .

Haspelmath, Martin. 2019c. Differential place marking and differential object marking. Language Typology and Universals (STUF) 72(3). 313–334.

Haspelmath, Martin. 2021. Explaining grammatical coding asymmetries: Form-frequency correspondences and predictability. Journal of Linguistics, forthcoming. https://doi.org/10.1017/S0022226720000535

Haude, Katharina. 2014. Animacy and inverse in Movima: A corpus study. Anthropological Linguistics 56(3). 294–314.

Hauge, Kjetil Rå. 1999 [1976]. The word order of predicate clitics in Bulgarian. Journal of Slavic Linguistics 7(1). 89–137.

Hawkins, John A. 2014. Cross-linguistic variation and efficiency. New York: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199664993.001.0001

Heath, Jeffrey. 1999. A grammar of Koyra Chiini: The Songhay of Timbuktu. Berlin & New York: Mouton de Gruyter. https://doi.org/10.1515/9783110804850

Helmbrecht, Johannes, Lukas Denk, Sarah Thanner & Ilenia Tonetti. 2018. Morphosyntactic coding of proper names and its implications for the animacy hierarchy. In Sonia Cristofaro & Fernando Zúñiga (eds.), Typological hierarchies in synchrony and diachrony, 377–401. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/tsl.121.11hel

Iemmolo, Giorgio. 2013. Symmetric and asymmetric alternations in direct object encoding. STUF – Language Typology and Universals 66(4). 378–403.

Jacques, Guillaume & Anton Antonov. 2014. Direct/inverse systems. Language and Linguistics Compass 8(7). 301–318.

Jäger, Gerhard. 2007. Evolutionary game theory and typology: A case study. Language 83(1). 74–109.

Jakobson, Roman. 1939. Signe zéro. In Mélanges linguistiques offerts à Charles Bally, 143–152. Geneva: Georg & Cie. https://doi.org/10.1515/9783110873269.211

Jelinek, Eloise & Richard A. Demers. 1983. The agent hierarchy and voice in some Coast Salish languages. International Journal of American Linguistics 49(2). 167–185.

Jukes, Anthony. 2006. Makassarese (basa Mangkasara’): A description of an Austronesian language of South Sulawesi. Melbourne: University of Melbourne Dissertation.

Kaiser, Elsi. 2002. The syntax-pragmatics interface and Finnish ditransitive verbs. In Marjo van Koppen, Erica Thrift, Erik Jan van der Torre & Malte Zimmermann (eds.), Proceedings of the 9th annual conference of the Student Organization of Linguistics in Europe (ConSOLE IX). Leiden: Leiden University. Available at: .

Kalin, Laura. 2017. Dropping the F-bomb: An argument for valued features as derivational time-bombs. In Proceedings of the Northeastern Linguistic Society (NELS) 47, vol. 2, 119–132. Amherst, MA: GLSA University of Massachusetts.

Kalin, Laura. 2018. Licensing and differential object marking: The view from Neo-Aramaic. Syntax 21(2). 112–159.

Keine, Stefan. 2010. Case and agreement from fringe to core: A minimalist approach. Berlin & New York: De Gruyter. https://doi.org/10.1515/9783110234404

Kibrik, Alexandr (ed.). 1996. Godoberi. Munich: Lincom Europa.

Kimenyi, Alexandre. 1980. A relational grammar of Kinyarwanda. Berkeley, CA: University of California Press.

Kiparsky, Paul. 2008. Universals constrain change; change results in typological generalizations. In Jeff Good (ed.), Linguistic universals and language change, 23–53. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199298495.003.0002

Kittilä, Seppo. 2008. Animacy effects on differential goal marking. Linguistic Typology 12. 245–268.

Kozinsky, Isaac, Vladimir P. Nedjalkov & Maria S. Polinskaja. 1988. Antipassive in Chukchee: Oblique object, object incorporation, zero object. In Masayoshi Shibatani (ed.), Passive and voice, 651–706. Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/tsl.16.21koz

LaPolla, Randy J. 1992. Anti-ergative marking in Tibeto-Burman. Linguistics in the Tibeto-Burman Area 15(1). 1–9.

Lazard, Gilbert. 2001. Le marquage différentiel de l’objet. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals: An international handbook, vol. 2, 873–885. Berlin & New York: Walter de Gruyter.

Lestrade, Sander. 2013. The optional use of morphological case. Linguistic Discovery 11(1). 84–104.

Lockwood, Hunter T. & Monica Macaulay. 2012. Prominence hierarchies. Language and Linguistics Compass 6(7). 431–446.

Malchukov, Andrej. 2008. Animacy and asymmetries in differential case marking. Lingua 118. 203–221.

Maslova, Elena. 2003. A grammar of Kolyma Yukaghir. (Mouton grammar library 27). Berlin & New York: Mouton de Gruyter. https://doi.org/10.1515/9783110197174

McGregor, William B. 2010. Optional ergative case marking systems in a typological-semiotic perspective. Lingua 120(7). 1610–1636.

Merlan, Francesca. 1982. Mangarayi. Amsterdam: North-Holland.

Moravcsik, Edith A. 1978a. On the case marking of objects. In Joseph H. Greenberg (ed.), Universals of human language, vol. 4: Syntax, 249–289. Stanford, CA: Stanford University Press.

Moravcsik, Edith A. 1978b. On the distribution of ergative and accusative patterns. Lingua 45. 233–279.

Mosel, Ulrike. 2007. Ditransitivity and valency change in Teop: A corpus based approach. Tidsskrift for Sprogforskning 5(1). 1–40.

Newmeyer, Frederick J. 2005. Possible and probable languages: A generative perspective on linguistic typology. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199274338.001.0001

Nichols, Lynn. 2001. The syntactic basis of referential hierarchy phenomena: Clues from languages with and without morphological case. Lingua 111(4). 515–537.

Osam, Kweku E. 1996. The object relation in Akan. Afrika und Übersee Band 79. 57–83.

Robins, Robert H. 1958. The Yurok language: Grammar, texts, lexicon. Berkeley & Los Angeles: University of California Press.Search in Google Scholar

Rude, Noel. 2009. Transitivity in Sahaptin. Northwest Journal of Linguistics 3(3). 1–37.Search in Google Scholar

Schackow, Diana. 2012. Referential hierarchy effects in Yakkha three-argument constructions. Linguistic Discovery 10(3). 148–173. in Google Scholar

Schmidtke-Bode, Karsten & Natalia Levshina. 2018. Reassessing scale effects on differential case marking: Methodological, conceptual and theoretical issues in the quest for a universal. In Ilja A. Seržant & Alena Witzlack-Makarevich (eds.), Diachronic typology of differential argument marking. (Studies in diversity linguistics 19), 509–537. Berlin: Language Science Press.

Scott, Graham. 1978. The Fore language of Papua New Guinea. (Pacific linguistics: Series B 47). Canberra: Australian National University.

Seržant, Ilja & Alena Witzlack-Makarevich. 2018. Differential argument marking: Patterns of variation. In Ilja Seržant & Alena Witzlack-Makarevich (eds.), The diachronic typology of differential argument marking (Studies in diversity linguistics 19), 1–40. Berlin: Language Science Press.

Shibatani, Masayoshi. 1985. Passives and related constructions: A prototype analysis. Language 61(4). 821–848.

Shibatani, Masayoshi. 2006. On the conceptual framework for voice phenomena. Linguistics 44(2). 217–269.

Siewierska, Anna. 1984. The passive: A contrastive linguistic analysis. London: Croom Helm.

Siewierska, Anna. 1998. Languages with and without objects: The functional grammar approach. Languages in Contrast 1(2). 173–190.

Siewierska, Anna. 2005. Alignment of verbal person marking. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), The world atlas of language structures, 406–409. Oxford: Oxford University Press.

Siewierska, Anna & Eva van Lier. 2013. Introduce: Encoding a non-prototypical three-participant event across Europe. In Elly van Gelderen, Michela Cennamo & Jóhanna Barðdal (eds.), Argument structure in flux: The Naples-Capri papers, 169–200. (Studies in Language Companion Series 131). Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/slcs.131.07sie

Silverstein, Michael. 1976. Hierarchy of features and ergativity. In Robert M. W. Dixon (ed.), Grammatical categories in Australian languages, 112–171. Canberra: Australian National University. https://doi.org/10.1515/9783110871661-008

Sinnemäki, Kaius. 2014. A typological perspective on differential object marking. Linguistics 52(2). 281–313.

Snyder, Kieran. 2003. The relationship between form and function in ditransitive constructions. Philadelphia: University of Pennsylvania Dissertation.

Thompson, Alexander. 1912. Beiträge zur Kasuslehre IV. Indogermanische Forschungen 30. 65–79. https://doi.org/10.1515/9783110242706.65

Thompson, Sandra A. 1990. Information flow and “dative shift” in English. In Jerrold Edmondson, Katherine Feagin & Peter Mühlhäusler (eds.), Development and diversity: Linguistic variation across time and space, 239–252. Dallas, TX: Summer Institute of Linguistics.

Tournadre, Nicolas. 1995. Tibetan ergativity and the trajectory model. In Yoshio Nishi, James A. Matisoff & Yasuhiko Nagano (eds.), New horizons in Tibeto-Burman morphosyntax, 261–275. Osaka: National Museum of Ethnology.

Tsunoda, Tasaku. 1981. Split case-marking patterns in verb-types and tense/aspect/mood. Linguistics 19. 389–438. https://doi.org/10.1515/ling.1981.19.5-6.389

Tucker, Matthew Alan. 2013. Building verbs in Maltese. Santa Cruz: University of California Santa Cruz Dissertation.

van der Beek, Leonoor. 2004. Argument order alternations in Dutch. In Miriam Butt & Tracy Holloway King (eds.), Proceedings of the LFG04 conference. Stanford, CA: CSLI Publications.

van Lier, Eva. 2012. Referential effects on the expression of three-participant events across languages: An introduction in memory of Anna Siewierska. Linguistic Discovery 10(3). 1–16.

von Heusinger, Klaus & Georg A. Kaiser. 2005. The evolution of differential object marking in Spanish. In Klaus von Heusinger, Georg A. Kaiser & Elisabeth Starke (eds.), Proceedings of the workshop “specificity and the evolution of nominal determination systems in Romance” (Arbeitspapier 119), 33–70. Konstanz: Universität Konstanz.

Wali, Kashi & Ashok Kumar Koul. 1994. Kashmiri clitics: The role of case and CASE. Linguistics 32(6). 969–994.

Watters, David E. 2002. A grammar of Kham. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486883

Witzlack-Makarevich, Alena, Taras Zakharko, Lennart Bierkandt, Fernando Zúñiga & Balthasar Bickel. 2016. Decomposing hierarchical alignment: Co-arguments as conditions on alignment and the limits of referential hierarchies as explanations in verb agreement. Linguistics 54(3). 531–561.

Wordick, Frank J. F. 1982. The Yindjibarndi language. Canberra: Australian National University.

Zipf, George Kingsley. 1935. The psycho-biology of language: An introduction to dynamic philology. Cambridge, MA: MIT Press.

Zúñiga, Fernando. 2006. Deixis and alignment: Inverse systems in indigenous languages of the Americas (Typological studies in language 70). Amsterdam & Philadelphia: John Benjamins. https://doi.org/10.1075/tsl.70

Published Online: 2020-12-21
Published in Print: 2021-01-27

© 2020 Martin Haspelmath, published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.