Evidence Based on a dynamic source: Database support for a theory of transitive reciprocals

Ken Safir
  • Department of Linguistics, Rutgers University, 18 Seminary Place, New Brunswick, NJ 08901
Published Online: 2015-06-10 | DOI: https://doi.org/10.1515/lingvan-2014-1016


This article offers supplementary material which is provided at the end of the article.

Keywords: linguistic database; reciprocal; transitive reciprocal

1 Introduction

The novelty of this document is that the empirical support for the predictions it examines, predictions about the distribution and interpretation of transitive reciprocal constructions, will be different each time it is read. The evidence will change because this paper will only provide parameters for a search of the Afranaph Database (ongoing) and two other databases, and as these databases grow and change over time, the search results returned today will be different from the results returned by the same search executed months or years from now (now = the moment that this paper has been submitted, which is 10/07/2014). Reversing the normal priorities of linguistic research, the proposal we present about the nature of reciprocal constructions in natural language (which is more broadly and specifically defended by Safir and Selvanathan (in preparation)) is secondary (a) to our demonstration of the methodology we employ to support our claims and (b) to the lessons we draw from it about the evaluation of evidence for research in linguistics in the digital age.

2 Theory and predictions

The phenomenon under study is the nature and distribution of reciprocal constructions. Safir and Selvanathan (in preparation), henceforth S&S, argue against a compositional interpretation of transitive reciprocals based on the decomposition of the morphology of argument anaphors. Proponents of argument anaphor decompositional theories, the most prominent such account being that of Heim et al. (1991) (HLM), claim that reciprocal meaning is retrievable directly from the direct object anaphor. S&S argue that HLM’s account is wrong even for English, as others have shown (e.g., Williams 1991). Moreover, it appears unnecessary to appeal to such decomposition, 1 in light of the typological argument made by S&S, which evokes, on the one hand, the polysemy of argument position anaphors (e.g., anaphors filling a direct object or indirect object position with the same properties as full lexical non-anaphoric DPs that are direct or indirect objects) and, on the other, the co-occurrence, in some constructions in some languages, of a reciprocal affix in addition to an argument position anaphor. Many works have noted the prevalence of polysemous verbal affixes that can signal reciprocal interpretations of sentences amongst other meanings (e.g., Lichtenberk 1985; Geniušienė 1987; Kemmer 1993), 2 although there are also verb affixes that are exclusively reciprocal. The same observation is not often made for argument anaphors, that is, that there are languages where a direct object anaphor with a plural antecedent is always potentially ambiguous between reflexive and reciprocal readings (but see Heine 1999, whose examples are also from African languages). S&S argue that local reciprocal semantics is always imposed by a verbal affix (overt or covert), rather than whatever is in object position, and therefore they expect constructions where the reciprocal affix and the argument position anaphor co-occur.

If a local argument position anaphor has no reciprocal-related morphology and if the resulting interpretation is ambiguous between reflexive and reciprocal readings, the source of reciprocal meaning must be sought elsewhere, i.e., it is not in the argument anaphor itself, at least not all the time, if ever. S&S argue that for local reciprocity, reciprocal meaning always emerges from a verbal affix that is not always visible (does not always have a morphological exponent). In languages like English, the reciprocal shape each other just arises by shape concord with a covert reciprocal marker on the verb (a kind of agreement relation matching reciprocity on an affix and on an argument anaphor). The Afranaph Database will be exploited to show that argument position anaphor polysemy for reflexivity/reciprocity is widespread, that the forms involved have certain consistent properties, and that in several cases the polysemous argument position anaphor co-occurs with a verbal affix that serves to enforce reciprocal readings. Finally there are cases where unambiguous argument anaphors appear to be accompanied by overt verbal affixes, where it is the affixes that are crucial for the reciprocal reading (the presence of the argument position anaphor is optional).

A full defense of the theory just proposed would take us beyond the methodological questions to be addressed here. Instead, it seems best to examine a set of predictions made by the theory and see whether there is database-accessible evidence to support them. The main claim of the theory at issue here is (1).

1) If a reciprocally interpreted sentence is a transitive construction, then the direct object anaphor plays no role in determining reciprocal meaning.

Apparent reciprocal objects are only possible when the reciprocal agrees in shape with a reciprocal marker on the verb (RCM), but since the RCM is covert in some languages, it looks as if the object anaphor is contributing the meaning (and to an addressee, it is thus evidence that the RCM is present, even if not pronounced).

To test the theory, we first limit our scrutiny to transitive sentences that have reciprocal readings. That means there must be some reason to believe that the sentences are syntactically transitive (even in cases where the direct object is silent) and it must be the case that the interpretation is (at least) reciprocal. No prediction is made by the theory for intransitive reciprocals, which receive a different account (and can only be set aside if it can be shown by other tests that the structure is indeed intransitive). 3 Since in many languages, the overt RCM is indistinct from the overt reflexive affix (RFM) (e.g., Imbabura Quechua – see Cole 1982, cited in Maslova and Nedjalkov 2013, or French se) we consider only languages where the overt RCM is distinct from the overt RFM (unless neither of them is visible). Most of the southern and eastern Bantu languages, for example, have overtly distinct RFM and RCM. An argument anaphor ‘agrees in shape’ 4 in this account if it shows specific reciprocal morphology, as opposed to non-specific anaphor morphology (e.g., the same morphology is used for the reflexive). In the gloss schemas below, intended as illustrative, ANA-REC is employed as the typical agreeing argument position anaphor idiom and ANA is an object anaphor unmarked for reflexive or reciprocal reading. The gloss sentences do not correspond to possible English sentences, except for (2b). The S&S theory predicts the following typology, where all of the readings are at least reciprocal, i.e., “The men praise each other”, and the strike-through means the morpheme is silent.

2a) Overt argument anaphor agrees in reciprocity with overt reciprocal affix

  ‘The men praise-RCM ANA-REC’ (unambiguously reciprocal)

b) Overt argument anaphor agrees in reciprocity with covert reciprocal affix

  ‘The men praise-R̶C̶M̶ ANA-REC’ (unambiguously reciprocal)

c) Covert argument anaphor with overt reciprocal affix

  ‘The men praise-RCM A̶N̶A̶’ (unambiguously reciprocal)

d) Covert argument anaphor with covert reciprocal affix (ineffable) 5

e) Overt argument anaphor does not agree in reciprocity with overt reciprocal affix

  ‘The men V-RCM ANA’ (unambiguously reciprocal)

f) Overt anaphor does not agree in reciprocity with covert reciprocal affix

  ‘The men V-R̶C̶M̶ ANA’ (ambiguous between reflexive and reciprocal)

The predictions in (2) are all positive predictions, that is, they are confirmed if, for each of the strategies (2a-f), excluding (2d), there is at least one language where a given strategy is used to form transitive reciprocal sentences. Crucial for the theory are predictions in (2a), (2e), and (2f), which other theories do not predict. Cases like (2a) are attested if the overt reciprocal anaphor can be shown to be a direct object, even though the reciprocal marker appears on the verb. These cases overtly show the two markers working in tandem. Cases like (2e) are those where the object position is filled by an anaphor, the language has no special form of reciprocal argument anaphor, but the RCM is on the verb. These cases show both the teamwork and that the reciprocal interpretation is uniquely contributed by the verbal affix. Finally, cases like (2f) don’t have overt reciprocal marking of any sort, but a plural anaphor can be interpreted as reflexive or reciprocal. The difference in interpretation, we propose, is whether or not the covert RCM is present, since only the RCM contributes the semantics. Cases (2e) and (2f) do not appear to have any viable account if reciprocal meaning is confined to the argument anaphor, rather than to the affix that is on the verb, since no reciprocal meaning is evidenced by the direct object. There is also one negative prediction that we will briefly explore, and it is the following:

3) In a language that distinguishes overt RFM and RCM, the RFM cannot co-occur with an argument position anaphor that has reciprocal agreeing shape (and interpretation).

Our theory predicts that all reciprocal argument anaphors arise by concord/agreement with the RCM. If the RFM is overt and distinct from the RCM, then there is no source for reciprocal morphology on the argument anaphor.
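The typology in (2) can also be restated as a small decision procedure. The Python sketch below is only an illustration of that restatement: the names `Construction`, `anaphor_overt`, `anaphor_agrees`, `rcm_overt` and `classify` are labels invented here and are not categories or functions of the Afranaph Database.

```python
from dataclasses import dataclass
from typing import Tuple

RECIPROCAL = "unambiguously reciprocal"
AMBIGUOUS = "ambiguous between reflexive and reciprocal"
INEFFABLE = "ineffable"


@dataclass(frozen=True)
class Construction:
    """A transitive construction with a reciprocal reading (hypothetical labels)."""
    anaphor_overt: bool   # is the argument position anaphor pronounced?
    anaphor_agrees: bool  # does it show specifically reciprocal shape (ANA-REC)?
    rcm_overt: bool       # is the reciprocal marker on the verb pronounced?


def classify(c: Construction) -> Tuple[str, str]:
    """Return the case in (2) and the reading predicted for it."""
    if not c.anaphor_overt:
        # Agreement is irrelevant when the anaphor is silent.
        return ("2c", RECIPROCAL) if c.rcm_overt else ("2d", INEFFABLE)
    if c.anaphor_agrees:
        return ("2a", RECIPROCAL) if c.rcm_overt else ("2b", RECIPROCAL)
    return ("2e", RECIPROCAL) if c.rcm_overt else ("2f", AMBIGUOUS)


# English 'each other': overt agreeing anaphor, covert RCM, i.e. case (2b).
print(classify(Construction(anaphor_overt=True, anaphor_agrees=True, rcm_overt=False)))
```

On this restatement, only the combination of a non-agreeing overt anaphor with a covert RCM yields ambiguity, which is the situation described in (2f).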

3 Evidence from a dynamic source

The predictions of the last section could be investigated by consulting a variety of sources but it is the purpose of this essay to show how such questions can be investigated using a particular database resource, namely, the Afranaph Database, designed for the study of African languages, which is freely available online from the Afranaph website http://www.africananaphora.rutgers.edu/. Researchers of the Afranaph Project prepare a proposal to study theoretical questions in linguistics on which African language data may be brought to bear. The researchers formulate a questionnaire that is designed as an exploratory document intended to unearth the major generalizations that are relevant to the research goal. The questionnaire is sent to native-speaker linguist consultants recruited by Afranaph who then complete the questionnaire, including a great deal of elicited sentence data and commentary about the data. The consultants provide morphophonological analysis and glossing to the best of their understanding. Follow-up interaction with the project researchers involves clarifying the data, eliciting more, and working on secondary documents, such as data summaries or research papers. In principle, all the data elicited is posted on the database, often in stages. As new consultants are recruited, new language data is added, and as new research projects spawn new questionnaires, the data from the new questionnaires is incorporated into the database, mostly with the search tags associated with the project that collects the data, but sometimes with the tags of other projects as well. Periodic data-policing and reanalysis sometimes recategorizes or alters data based on better understanding. The organization of the data in the database also changes as the research purposes of the different projects evolve, and the changes in the organization of the database then inevitably affect the search parameters.
In short, the database is constantly expanding and evolving to suit the needs of researchers, so as the data is refined, the results of searches and even the nature of searches can change over time. The search results presented as evidence in this paper will be subject to these changes and while it is the author’s hope that future searches of the database will continue to support and enhance his conclusions, only time will tell if that hope will be realized.
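One way to make this kind of change visible is to keep a dated record of what a search returned and compare it with a later run of the same search. The sketch below is a minimal illustration under stated assumptions: the `Snapshot` class and `compare` function are invented here, the IDs and dates are placeholders rather than real database content, and nothing in it reflects a facility offered by the Afranaph site itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Snapshot:
    """The result IDs returned by one run of a fixed search (hypothetical record)."""
    taken_on: date
    result_ids: FrozenSet[int]


def compare(earlier: Snapshot, later: Snapshot) -> Dict[str, List[int]]:
    """Report which results appeared, disappeared, or persisted between two runs."""
    return {
        "added": sorted(later.result_ids - earlier.result_ids),
        "removed": sorted(earlier.result_ids - later.result_ids),
        "retained": sorted(earlier.result_ids & later.result_ids),
    }


# Placeholder IDs and dates, used only to show the shape of the comparison.
first = Snapshot(date(2000, 1, 1), frozenset({101, 102, 103}))
second = Snapshot(date(2001, 1, 1), frozenset({102, 103, 104, 105}))
print(compare(first, second))
# {'added': [104, 105], 'removed': [101], 'retained': [102, 103]}
```

Results listed under "removed" would correspond to data that has been reclassified or reanalyzed, and results under "added" to data entered since the earlier search.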

We begin our search for relevant evidence by going to the Afranaph homepage http://www.africananaphora.rutgers.edu/ and clicking on ‘Afranaph Database’. At the left of the Database page, amongst the gray boxes, select ‘Full window view’ and then click on ‘Enhanced Search’ (which may have a new name soon, perhaps ‘Anaphora Search’). Three search entities are available, namely ‘Languages’, ‘Sentences’ and ‘Anaphoric markers’, which means that a search for these entities returns a set of entities of the type designated. Anaphoric markers are morphemes or morpheme combinations that are dependent for their referential value on an antecedent, and in this database, they are treated as entities to which properties are ascribed based on preliminary analysis of the Afranaph data. The properties ascribed to the anaphoric markers (for each anaphoric marker in every language in the database) are entered by Afranaph staff who follow, as best as possible, the Database Property Attributions guidelines that are provided on the site (http://www.africananaphora.rutgers.edu/attributions-hiddenmenu-187?task=view&id=140). These property attributions can be revised if new information or better analysis reveals that a property attribution should be changed, and periodically, new anaphoric marker properties are added because they permit useful searches. And, of course, as additional languages are added, the anaphoric markers in those languages are populated with properties, as the pace of analysis allows. For all these reasons, searches for anaphoric markers may produce different results over time.

So at this point we are in the database, ‘Enhanced Search’ has been clicked and at the top of the search page, the circle for anaphoric markers has been selected. The searchable properties are then of three types, corresponding to the entities that can be searched. The simplest way to call up relevant cases for the predictions in (2) is to search for all anaphoric markers that are interpreted as reciprocal and that can fill an argument position. To do this, open the drop-down for ‘Anaphoric marker’ by clicking on the plus sign, and the properties will appear. Scroll down to ‘exponent position’ and click on ‘argument position marker’. Then scroll down to the ‘Readings’ section and click on ‘Reciprocal’. Finally, click on the gray ‘search’ button at the bottom of the page. At least thirty anaphoric markers from across the Afranaph languages will appear, all of which involve an argument anaphor and a reciprocal reading.
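The logic of this search is a conjunction of two property checks on each anaphoric marker. The Python sketch below models that logic over a local list of records; it is an assumption-laden illustration, not a description of how the database is implemented or queried. The `Marker` class, the `search_markers` function, the sample records and the value ‘verbal affix’ are all hypothetical; only the strings ‘argument position marker’ and ‘Reciprocal’ are taken from the search page as described above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Marker:
    """A locally stored anaphoric marker record (hypothetical structure)."""
    name: str
    language: str
    exponent_position: str
    readings: FrozenSet[str]


def search_markers(markers: List[Marker], exponent_position: str, reading: str) -> List[Marker]:
    """Keep markers that have the given exponent position AND the given reading."""
    return [
        m for m in markers
        if m.exponent_position == exponent_position and reading in m.readings
    ]


# Placeholder records, invented for illustration; they are not Afranaph data.
sample = [
    Marker("marker-A", "Language-1", "argument position marker",
           frozenset({"Reciprocal", "Reflexive"})),
    Marker("marker-B", "Language-1", "verbal affix", frozenset({"Reciprocal"})),
    Marker("marker-C", "Language-2", "argument position marker",
           frozenset({"Reflexive"})),
]

hits = search_markers(sample, "argument position marker", "Reciprocal")
print([m.name for m in hits])  # ['marker-A']
```

Because both conditions must hold, a marker that is reciprocal but not in argument position, or in argument position but not reciprocal, is excluded from the results.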

The name of the marker and three or four words describing it appear along with four sections of properties on the far right. For more information about any of the markers, click on any of the four sections for that marker. For most of the markers, not all of the sections have been populated with tagged properties.

To see the sentences in the database that have been used as support for the properties ascribed to any particular marker, go to ‘Enhanced Search’, set the search for sentence entities, open the ‘Language’ drop-down and click on the language the marker is from. Then open the ‘Anaphoric Marker’ dropdown and enter the name of the marker. Then search. A set of example sentences, acceptable and unacceptable, will be returned in the search, any one of which can be explored further by clicking on ‘Details’ for that sentence. For example, look for sentences with Ibibio’s Body Pron anaphor, which returns about 180 sentences. We consider it especially important that Afranaph Database users can examine the evidence on which our analyses and our property ascriptions are based.

Now we are ready to look for evidence pertaining to the predictions in (2). First, let us confirm that (2f), which predicts the existence of polysemous reflexive/reciprocal unmarked anaphors, is sustainable. To do this, return to Enhanced Search, set the entity to anaphoric markers, open ‘Anaphoric Marker’, set the search for argument position anaphors again, but this time under ‘Readings’, click on ‘more than one anaphoric reading’, then search. The search should return 18 markers or more. Of these, most are of the kind pertinent to the prediction in (2f). Pronoun-BODY in Ga, for example, is indeed an argument anaphor that is ambiguous between reciprocal and reflexive readings.

Id: 7005, Strategy: pronoun-BODY, Language: Ga
They pinched themselves/They pinched each other.

A search for the sentences with this marker will show over 150 examples, not all of them with plural antecedents, so not all showing the ambiguity, but ID7005 is typical of reflexive/reciprocal ambiguity. Most of these sentences are unacceptable without a direct object, so it is safe to assume that the anaphor counts as the direct object. Many other markers returned by the marker search are equally clear cut – (2f) is confirmed.

Turning to (2a) and (2e), we are interested in contexts where two markers, one on the verb, one in argument position, appear together in sentences with a reciprocal reading. The Afranaph Database is designed to find such cases, which are a subset of ‘combination anaphors’. Combination anaphors are those that have two parts, each of which can appear independently. To search for these, open Enhanced Search and set the entity search at the top of the page for ‘Anaphoric Marker’. Then open the ‘Anaphoric Marker’ dropdown and select ‘Argument Position Marker’, which is an option under ‘Exponent position’. Next, under ‘Exponent complexity type’, click on ‘Combination Marker’. Then scroll down to ‘Readings’ (near the bottom), set ‘Reciprocal’, then click on the ‘Search’ button at the bottom of the page. The search should return at least 10 markers from a total of 5 languages (see Screenshot #1 in the Supplementary material [online]). All but one of these constructions 6 involve the joint appearance of an affix on the verb and what appears to be an argument position anaphor. By clicking on ‘Marker Shape’ for Lubukusu ‘RCM + reciprocal phrase’, we see examples with both an RCM and a phrase, where the reciprocal phrase is babeene khu beene (the ba- prefix following khu is deleted by a regular rule) and a comment indicates that it is not clear whether the reciprocal phrase is in object position or is, instead, an adjunct. If the argument anaphor is not in an argument position, prediction (2a) is undermined, so it is then an analytic question as to whether or not the reciprocal phrase in Lubukusu is, in fact, in an argument position.

At this point, we come up against the limitations of the Database information because some close analytical work is necessary to determine what the diagnostics are for argument positions in Lubukusu and whether the distribution of babeene khu beene is consistent with those diagnostics. This has been done for Lubukusu in two papers, including the Lubukusu Anaphora Sketch (Safir and Sikuku 2011) in the Case File for Lubukusu on the Afranaph site (which was preliminary work) and in Baker, Safir and Sikuku (2011), also available on the site as Technical Report #7 (see also S&S). The example provided under ‘Marker shape’ for RCM + reciprocal phrase, reproduced below, is part of that evidence. 7

Id: 3763, Strategy: RCM + Reciprocal Phrase, Language: Lubukusu
(The) boys expect each other to win.

That the reciprocal phrase corresponding to the agent of -khil- can be represented on -komb- with an object marker is evidence that the reciprocal phrase is in an argument position. (Notice that all sentence examples have unique ID numbers and can be located, if you know the ID number, by entering it in Simple Search.) Moreover, a search for all sentences with RCM + reciprocal phrase in Lubukusu produces over thirty examples, including ID 5161, where an adjunct is not allowed to intervene between the verb + RCM and a reciprocal phrase, as is generally the case for direct objects in Lubukusu. These remarks are only to show how the view that the reciprocal phrase is in an argument position can be justified, a demonstration that is not available for all of the markers designated as combination anaphors and one that further research must establish, with attention to the argument position diagnostics in each relevant language.

The upshot of the discussion so far is that we have found confirmation for predictions (2a) and (2f), and so now we turn to prediction (2e), the cases where an RCM appears on the verb, there is no special reciprocal form for the argument position anaphor, and the reading is unambiguously reciprocal. Returning to the search for combination markers that can be interpreted reciprocally, we see among them RCM + Pronoun-MIEN in Bulu. The sample sentence given under Marker Shape, ID5325, illustrates the construction.

Id: 5325, Strategy: RCM + Pronoun-MIEN, Language: Bulu
“The women saw each other”

To see that there is no dedicated reciprocal argument anaphor in Bulu, click on ‘Browse’ amongst the boxes on the left-hand side of the page, then on ‘Bulu’, then on ‘Details’. The Bulu page comes up with five anaphoric strategies listed, none of which consists of an argument anaphor that is uniquely reciprocal. If the RCM is missing, the sentence is interpreted reflexively. Thus (2e) is now confirmed, in that it is possible to have an overt RCM with an argument anaphor that does not distinguish reciprocal and reflexive readings. Other anaphoric constructions that support this conclusion include RCM + PRN + ORN.OBL-mbɔ̂ŋ, also in Bulu, RCM-se- + BODY in Limbum, and RCM + Agr-eene in Lubukusu. 8

Finally we come to the negative prediction in (3), repeated below:

3) In a language that distinguishes RFM and RCM, 9 the RFM cannot co-occur with an argument position anaphor that has reciprocal agreeing shape (and interpretation).

To search for such sentences, go to Enhanced Search, and in the ‘Sentence’ dropdown, enter ‘RFM’ on the Gloss line, ‘each other’ in the Translation line, 10 and then open the Anaphoric Marker dropdown and click on ‘Argument position marker’ for the ‘Exponent Position’ section. Then search (see Screenshot #2 in the Supplementary material [online]). When the first draft of this paper was written, at least 9 examples came up, all from Lubukusu. The first eight have both the RFM and RCM present and the first six of these do not have the reflexive argument anaphor, consistent with the prediction if it is interpreted to mean that reciprocal argument anaphors agree in shape with their RCM antecedents, if there is one. ID1533 seems to permit the reflexive argument anaphor, contrary to prediction, but then the argument anaphor may have the option to agree with the RFM, which is also present. ID1553 is closer to a counterexample, though the AGR-eene is embedded in a PP, which also may be a factor to consider. Further analysis will be required. The status of (3) seems to be in question for the Lubukusu data, but for a few rare cases that are underanalyzed. In the months intervening between the review of this paper and this revised draft, evidence from Chaha was entered that may challenge (3), in that the Chaha examples show RFM with an argument anaphor and the sentence is interpreted as reciprocal. This would suggest that the argument is contributing reciprocal meaning, contrary to prediction. The prediction requires, however, that Chaha should be a language where RFM and RCM are distinguished, and according to the analysis on the Chaha browse page in the database, RCM in Chaha is restricted to inherent reciprocal readings. Thus it is not certain that RFM is restricted to reflexive readings, as it always appears in combination, except in some inherent reflexive contexts. The jury is still out pending further analysis of Chaha.
Let’s see what this prediction looks like in a few years when the database is searched the same way!
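The sentence search for potential counterexamples to (3) combines a check on the gloss line with a check on the translation line, and the results can then be divided according to whether an RCM is also present. The following sketch models that reasoning over locally stored records. It rests on several assumptions: the `Sentence` class, the helper functions and the schematic glosses are invented for illustration, the records are not Afranaph data, and matching gloss labels by splitting on spaces and hyphens is a simplification that may differ from how the database matches text.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Sentence:
    """A locally stored sentence record (hypothetical structure)."""
    sentence_id: int
    language: str
    gloss: str
    translation: str


def has_label(gloss: str, label: str) -> bool:
    """True if the label occurs as a whole unit in the gloss line."""
    return label in re.split(r"[\s\-]+", gloss)


def candidates_for_3(sentences: List[Sentence]) -> Tuple[List[int], List[int]]:
    """Find sentences glossed with RFM and translated with 'each other'.

    Returns two ID lists: those that also contain an RCM, and those that do not.
    """
    hits = [
        s for s in sentences
        if has_label(s.gloss, "RFM") and "each other" in s.translation.lower()
    ]
    with_rcm = [s.sentence_id for s in hits if has_label(s.gloss, "RCM")]
    without_rcm = [s.sentence_id for s in hits if not has_label(s.gloss, "RCM")]
    return with_rcm, without_rcm


# Schematic placeholder records, invented for illustration only.
sample = [
    Sentence(1, "Language-1", "men RFM-RCM-praise", "The men praise each other"),
    Sentence(2, "Language-1", "men RFM-praise ANA", "The men praise themselves"),
    Sentence(3, "Language-2", "men RFM-praise ANA-REC", "The men praise each other"),
]

print(candidates_for_3(sample))  # ([1], [3])
```

Sentences in the first list have an RCM that could be the source of reciprocal shape, so only sentences in the second list would call for closer analysis as possible challenges to (3).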

The existence of polysemous argument position anaphors in languages not in the Afranaph Database is not studied here, but it is useful to report that at least one other typological database, the TDS typology of reciprocal markers (http://languagelink.let.uu.nl/burs/), 11 also reports such cases, which can be found by searching for the entity type ‘reciprocal markers’, selecting “NP (one of the coindexed NP positions)” for exponent position, and selecting ‘yes’ for reflexive under polysemy types. Although 16 forms come up from 14 languages (out of a sample of 110 languages), not all the cases show that the form acts as a direct or indirect object by comparison with non-anaphoric (in)direct objects (i.e., evidence that the anaphoric marker is in argument position), or if the language is SOV, there is not always clear evidence about what is an object and what is a subject if there is only one argument before the verb. These cases are frequently based on publications, third party reports or very limited elicitations, so in many cases it is not easy to verify the analytic conclusions on the basis of the primary (sentence and morphological) evidence provided. Information on combination anaphors is not provided. The primary goal of the TDS Typology is breadth rather than Afranaph-style depth and there is just enough evidence in many cases to determine whether a language is a good place to look in more detail for a reciprocal of a particular type, just as I am doing now. 12 The World Atlas of Language Structures http://wals.info/ is another online database resource, 13 but it is not directly useful for the predictions made here, as it does not distinguish affix polysemy from argument position polysemy 14 and rarely provides more than a few examples supporting any conclusion. Still, it is potentially very useful, because in most cases there is a pertinent reference to a richer source.
The strategy of the Afranaph Database is different and relies on much more detailed evidence that supports the property ascriptions, all made available online, but, on the other hand, the class of languages covered by Afranaph is geographically limited and opportunistic, 15 and in that sense it is best suited to existence arguments.

To conclude this section, it appears that the key positive predictions of the Safir and Selvanathan theory are supported by results gathered from the Afranaph Database, and the one negative prediction appears likely to be true, but there are some interesting potential counterexamples that deserve closer analysis.

4 Looking ahead

Most of the supporting evidence for the proposals of S&S is in the form of positive evidence, that is, we make predictions about phenomena that will be found and then we find them. These predictions may fail in the future because the positive results may be reclassified as phenomena of another sort, and in that case, our positive result will disappear. However, this is unlikely, since most of the evidence that appears in the database is the output of an initial round of analysis, some of it intensive. We report what we have found. The same cannot be said of negative results, which most often stem from what we may not have yet entered, even if we have found it, and here the weakness of Afranaph shows: it is relatively narrow in both the range and genealogical classes of the languages it covers. Moreover, the Afranaph Database, like all of the databases mentioned, is a growing work in progress. That means that data is coming in which has not been recorded yet and may not be recorded in the database simply due to delays in our process. A negative claim may turn out to be false when the next piece of data appears, but evidence for a positive one means that supporting evidence has already been recorded, and the more of it there is, the less likely it is that reclassification will remove all the positive cases (though it could happen, whether evidence is from a database or not). In the latter respect I have taken few chances. Almost all my predictions are positive predictions about what will be found. While at this point it appears that our negative predictions are confirmed, we are at the mercy of the next piece of data, as the Chaha data that arrived between drafts makes clear, and this too is not a problem unique to databases. Given the way that these databases are fleshed out with new data, however, the chance of false negatives is perhaps higher, especially if one relies only on what is in the database, ignoring what is in the literature.

Faced with the immense diversity of the world’s existing languages, most of which linguists know very little about, we must ask how we can test our hypotheses about the possible range of linguistic variation against the facts as we understand them. We will never know about all the languages that have ever been spoken or that ever will be. On the other hand, the existing languages provide a rich guide to what we know to be possible, rare and unattested, and insofar as linguistic theories accord with what we know at a given point in time, our confidence that we are on the right track has some justification, even if we turn out to be wrong in detail, or even in conception, several years down the road. One of the best indications that a science is making progress, however, is that the questions it generates lead to the discovery of empirical patterns that would not likely have been discovered otherwise. We believe that the patterns uncovered by the research reported here have this character.


The author would like to acknowledge the support of NSF BCS-1324404 and the advice and commentary of Jeff Good, Larry Hyman and Naga Selvanathan.

I would like to thank all of the consultants who helped us develop rich datasets for the languages mentioned in searches here. They are Derib Ado Jekale (Amharic), Pius Akumbu (Babanki), Pius Tamanji (Bafut), Paul Bassong (Basa’a), Nancy Kula (Bemba), Noureddine Elouazizi (Berber-Tarifyt), Cyrill Ondoua Engon (Bulu), Sylvester Ron Simango (Cinsenga), Gabriel Djomeni (Fe’efe’e), Akua Campbell (Ga), Enoch Aboh (Gungbe), Willie Willie (Ibibio), Rose Letsholo (Ikalanga), Mamadou Bassene (Jóola Eegimaa), Philip Mutaka (Kinande), Juvenal Ndayiragije (Kirundi), Ongaye Oda (Konso), Francis Ndi Wepngong (Limbum), Alex Iwara (Lokaa), Justine Sikuku (Lubukusu), Edmond Biloa (Tuki), Rose Aziza (Urhobo), Khadi Tamba (Wolof), and Oluseye Adesola (Yoruba).


