Theoretical Linguistics

An Open Peer Review Journal

Editor-in-Chief: Krifka, Manfred

Ed. by Gärtner, Hans-Martin

Online
ISSN
1613-4060
Volume 42, Issue 1-2 (Jul 2016)

Formal monkey linguistics

Philippe Schlenker
• Corresponding author
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• Department of Linguistics, New York University, New York, NY, USA
/ Emmanuel Chemla
• Laboratoire de Sciences Cognitives et Psycholinguistique (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
/ Anne M. Schel
/ James Fuller
• Department of Ecology, Evolution and Environmental Biology, Columbia University, New York, NY, USA
• New York Consortium in Evolutionary Primatology (NYCEP)
• BCC, City University of New York, NY, USA
/ Jean-Pierre Gautier
/ Jeremy Kuhn
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
/ Dunja Veselinović
/ Kate Arnold
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• School of Psychology and Neuroscience, University of St Andrews, St Mary’s Quad, St Andrews, Fife, UK
/ Cristiane Cäsar
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• School of Psychology and Neuroscience, University of St Andrews, St Mary’s Quad, St Andrews, Fife, UK
• Instituto de Ciências da Natureza, Universidade Federal de Alfenas, Alfenas, Brazil
• Instituto de Pesquisa Bicho do Mato, Belo Horizonte, Brazil
/ Sumir Keenan
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• School of Psychology and Neuroscience, University of St Andrews, St Mary’s Quad, St Andrews, Fife, UK
/ Alban Lemasson
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• Université de Rennes 1, Ethologie animale et humaine (UMR 6552 – CNRS), Station Biologique, Paimpont, France
/ Karim Ouattara
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• Laboratory of Zoology and Animal Biology, University Félix Houphouet Boigny, Abidjan, Ivory Coast
/ Robin Ryder
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• Centre de Recherche en Mathématiques de la Décision, CNRS, UMR 7534, Université Paris-Dauphine, PSL Research University, Paris, France
/ Klaus Zuberbühler
• Institut Jean-Nicod (ENS, EHESS, CNRS), Département d’Etudes Cognitives, Ecole Normale Supérieure, PSL Research University, 29 rue d’Ulm, 75011 Paris, France
• School of Psychology and Neuroscience, University of St Andrews, St Mary’s Quad, St Andrews, Fife, UK
• Centre for Cognitive Science, University of Neuchâtel, Neuchâtel, Switzerland
Published Online: 2016-07-05 | DOI: https://doi.org/10.1515/tl-2016-0001

Abstract

We argue that rich data gathered in experimental primatology in the last 40 years can benefit from analytical methods used in contemporary linguistics. Focusing on the syntactic and especially semantic side, we suggest that these methods could help clarify five questions: (i) what morphology and syntax, if any, do monkey calls have? (ii) what is the ‘lexical meaning’ of individual calls? (iii) how are the meanings of individual calls combined? (iv) how do calls or call sequences compete with each other when several are appropriate in a given situation? (v) how did the form and meaning of calls evolve? We address these questions in five case studies pertaining to cercopithecines (Putty-nosed monkeys, Blue monkeys, and Campbell’s monkeys), colobinae (Guereza monkeys and King Colobus monkeys), and New World monkeys (Titi monkeys). The morphology mostly involves simple calls, but in at least one case (Campbell’s -oo) we find a root-suffix structure, possibly with a compositional semantics. The syntax is in all clear cases simple and finite-state. With respect to meaning, nearly all cases of call concatenation can be analyzed as conjunction. But a key question concerns the division of labor between semantics, pragmatics and the environmental context (‘world’ knowledge and context change). An apparent case of dialectal variation in the semantics (Campbell’s krak) can arguably be analyzed away if one posits sufficiently powerful mechanisms of competition among calls, akin to scalar implicatures. An apparent case of non-compositionality (Putty-nosed pyow-hack sequences) can be analyzed away if one further posits a pragmatic principle of ‘urgency’, whereby threat-related calls must come early in sequences (another potential case of non-compositionality – Colobus snort-roar sequences – might justify assigning non-compositional meanings to complex calls, but results are tentative). 
Finally, rich Titi sequences in which two calls are re-arranged in complex ways so as to reflect information about both predator identity and location are argued not to involve a complex syntax/semantics interface, but rather a fine-grained interaction between simple call meanings and the environmental context. With respect to call evolution, we suggest that the remarkable preservation of call form and function over millions of years should make it possible to lay the groundwork for an evolutionary monkey linguistics, which we illustrate with cercopithecine booms, and with a comparative analysis of Blue monkey and Putty-nosed monkey repertoires. Throughout, we aim to compare possible theories rather than to fully adjudicate between them, and our claims are correspondingly modest. But we hope that our methods could lay the groundwork for a formal monkey linguistics combining data from primatology with formal techniques from linguistics (from which it does not follow that the calls under study share non-trivial properties, let alone an evolutionary history, with human language).

1 Introduction

We argue that rich data gathered in experimental primatology in the last 40 years can benefit from the type of analytical methods used in contemporary linguistics. Focusing on the syntactic and especially semantic side, we argue that these methods could help clarify five questions: (i) what morphology and syntax, if any, do monkey calls have? (ii) what is the ‘lexical meaning’ of individual calls? (iii) how are the meanings of individual calls combined? (iv) how do calls or call sequences compete with each other when several are appropriate in a given situation? (v) how did the form and meaning of calls evolve?

We address these questions in five case studies pertaining to cercopithecines (Putty-nosed monkeys, Blue monkeys, and Campbell’s monkeys), colobinae (Guereza monkeys and King Colobus monkeys), and New World monkeys (Titi monkeys). As we will see, the morphology mostly involves simple calls, but in at least one case (Campbell’s -oo) we find a root-suffix structure, possibly with a compositional semantics. The syntax is in all clear cases simple and finite-state, and our findings will be extremely modest in this area (see Kershenbaum et al. 2014a, 2014b for a far more ambitious program in animal syntax). With respect to meaning, nearly all cases of call concatenation can be analyzed without positing any non-trivial semantic operation – each call can be analyzed as a separate utterance. But a key question concerns the division of labor among semantics, pragmatics and properties of the environment (‘world knowledge’ and context change). An apparent case of dialectal variation in the semantics (Campbell’s krak) can arguably be analyzed away if one posits sufficiently powerful mechanisms of competition among calls, akin to scalar implicatures. An apparent case of non-compositionality (Putty-nosed pyow-hack sequences) can be analyzed away if one further posits a pragmatic principle of ‘urgency’, whereby threat-related calls must come early in sequences. On the other hand, another potential case of non-compositionality (Colobus snort-roar sequences) might justify assigning non-compositional meanings to complex calls, but results are rather tentative. Finally, rich Titi monkey sequences in which two calls are re-arranged in complex ways so as to reflect information about both predator identity and location are argued not to involve a complex syntax/semantics interface, but rather a fine-grained interaction between simple call meanings and a complex ecology. 
With respect to call evolution, we suggest that the remarkable preservation of call form and function over millions of years should make it possible to lay the groundwork for an evolutionary monkey linguistics, which we illustrate with cercopithecine booms, and with a comparative analysis of Blue monkey and Putty-nosed monkey repertoires.

Throughout, we aim to compare possible theories rather than to fully adjudicate between them, and our claims are correspondingly modest. But we hope to convince the reader that the simple methods we apply to monkey calls make it possible to study them in greater detail than has been the case heretofore, and lead to new research questions as well as new predictions that might help decide among competing theories. We thus believe that these methods should lay the groundwork for a formal monkey linguistics combining data from primatology with formal methods from linguistics. We emphasize that our general methodological claims do not entail that the calls under study share non-trivial properties, let alone an evolutionary history, with human language; on the other hand, we think the precise approach we advocate is a precondition to a meaningful comparison.

In what follows, we freely apply linguistic terminology to monkey calls, with the belief – motivated below – that they can and should be studied as formal languages with a sound system, a lexicon, a morphology, a syntax, a semantics, and a pragmatics. Descriptively, whenever possible we use the term sentence to refer to a sequence of calls preceded and followed by a longer-than-normal pause; we use the term discourse to refer to the series of sentences triggered by an external event (for instance an eagle shriek or a leopard growl). 1 While we could say that a sentence is ‘adequate’ or ‘inadequate’ in a certain situation, we prefer to employ traditional logical terminology and use the terms ‘true’ and ‘false’, which are more familiar to linguists. These terminological moves are just intended to facilitate the discussion; crucially, they should not be taken to imply that the formal properties of monkey languages are similar to those of human language – in the cases we study below, for the most part they just aren’t.

1.1 Monkey calls

Observations and field experiments in primatology have yielded detailed information about vocal communication in primates in general and monkeys in particular (see for instance Zuberbühler 2003, 2009 for surveys). These pertain to the inventory, use, structure, and sometimes phylogeny and ontogeny of various calls, in particular alert calls 2 – with rare cases of apparent dialectal variation (Schlenker et al. 2014). Naturalistic observations make it possible to establish statistical correlations between (i) properties of the situations, such as the presence of predators or encounters between monkey groups, and (ii) calls used in those situations. Field experiments have typically been of two types. In trigger-to-call experiments, the presence of a disturbance is simulated, for instance by way of playback of leopard growls or eagle shrieks, or through the presence of a model leopard or eagle; in call-to-behavior experiments, alert calls are played back and the monkeys’ responses are assessed. Note that since calls may be viewed as ‘triggers’ or as ‘behaviors’, these two categories are not mutually exclusive; as a result, there is some overlap between the categories in (1)b and (1)c below.

(1) Types of generalizations

a. Naturalistic observations: correlations between (i) properties of the situation, and (ii) calls used in that situation.

b. Trigger-to-call experiments: causal generalizations of the form: if (i) the situation has property P, then (ii) sequences S1, …, Sn of calls may/must be used.

c. Call-to-behavior experiments: causal generalizations of the form: if (i) a sequence S of calls is used, then (ii) the target subjects behave as if the situation had property P.

For each of the species studied here, we will show that the patterns resulting from these studies show non-trivial properties from a morphological, syntactic, semantic, or pragmatic point of view. Let us briefly highlight some of these findings here.

The calls of male Campbell’s monkeys display patterns that pertain to lexical semantics, morphological composition, and pragmatic competition. Male Campbell’s monkeys were argued in Ouattara et al. (2009a, 2009b) to have at least three roots 3 (= boom, krak, hok) and one suffix (= -oo) that can be appended to both krak and hok. 4 The choice of call, and the optional suffixation of -oo, were shown to have systematic effects on the contexts in which the calls can be used: the authors argued that krak primarily has leopard-related uses, that hok primarily has eagle-related uses, and that krak-oo is used as a general alert call. Further refinements were offered in later work. Adding to the picture, Schlenker et al. (2014) discussed comparative data (due to Arnold and Keenan) suggesting that there is apparent dialectal variation between the Tai forest and Tiwai island: although krak has a primarily leopard-related meaning in Tai (where leopards exist), the same call is used as a general alert call on Tiwai, where leopards have been absent for decades. Schlenker et al. (2014) developed several possible analyses, one of which posited that krak was a general alert call on both sites, hok was related to non-terrestrial disturbances, boom involved non-predation situations, and -oo had a (compositional) attenuative function. The apparent dialectal variation was handled in this theory by positing that rules of competition among calls give rise to an optional enrichment of krak into something akin to krak and not hok and not krakoo; the resulting meaning was that of a ‘serious terrestrial disturbance’, which could single out leopards in Tai but would be essentially useless on Tiwai.
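The competition-based enrichment just described can be made concrete with a toy model. The sketch below is only an illustration under strong simplifying assumptions: situations are reduced to two hypothetical Boolean features (`serious`, `terrestrial`), the literal meanings are rough stand-ins for the analysis summarized above, and the function names (`attenuate`, `enrich`) are ours, not the authors'.

```python
from itertools import product

# Toy situations with two illustrative features (an assumption of this sketch).
SITUATIONS = [
    {"serious": s, "terrestrial": t}
    for s, t in product([True, False], repeat=2)
]

def krak(w):
    """Literal meaning: general alert, true in every alert situation."""
    return True

def hok(w):
    """Literal meaning: non-terrestrial disturbance."""
    return not w["terrestrial"]

def attenuate(meaning):
    """Toy version of the suffix -oo: restrict a meaning to non-serious cases."""
    def attenuated(w):
        return meaning(w) and not w["serious"]
    return attenuated

krak_oo = attenuate(krak)  # compositional: root meaning plus suffix meaning

def enrich(meaning, competitors):
    """Strengthen a meaning by negating each competing call."""
    def enriched(w):
        return meaning(w) and not any(c(w) for c in competitors)
    return enriched

# 'krak and not hok and not krak-oo'
enriched_krak = enrich(krak, [hok, krak_oo])
matches = [w for w in SITUATIONS if enriched_krak(w)]
print(matches)
```

In this toy model, the only situation that survives enrichment is the serious terrestrial one, which mirrors the 'serious terrestrial disturbance' reading mentioned above.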

On the syntactic side, we will focus on a collection of interesting distributional patterns in the call sequences of Putty-nosed monkeys, Black-and-White Colobus monkeys, and Titi monkeys. In general, the syntax of primate call sequences is not well understood; for Campbell’s monkeys, for example, categorical distributional generalizations are hard to come by. 5 This is in sharp contrast with the syntactic generalizations that have been offered in the literature on birdsongs: these have led to sophisticated analyses that make it possible to ask where birdsongs are in the Chomsky hierarchy of formal languages (Berwick et al. 2011). The primate literature hasn’t reached anything like this level yet. Nevertheless, the cases discussed below show non-trivial syntactic patterns, where calls are structurally combined in some way, with a potentially compositional semantics. Combinations may in principle be of two kinds: phonological, in which case it makes no sense to seek a compositional meaning (since the parts combined – e. g. /n/ + /o/ in no – may have no meaning in the first place); or morphological/syntactic – in which case rules of semantic composition may be sought.

In a series of articles, Arnold and Zuberbühler have investigated how Putty-nosed monkeys combine two calls, pyow (=P) and hack (=H; see Arnold and Zuberbühler 2012). Pyows function as general alert calls, whereas hacks are usually related to aerial predators. On the syntactic side, these two calls may be combined in several ways, but it’s not the case that anything goes: for example, alternating sequences of single pyows and single hacks (e. g. PHPHPHPHPH) are not found. Similarly, one finds sequences made of a few pyows followed by a few hacks (e. g. PPPHHH), called ‘pyow-hack sequences’; but the opposite pattern – a few hacks immediately followed by a few pyows – isn’t common. Descriptively, these are relatively clear syntactic patterns – whatever their source may be. On the semantic side, pyow-hack sequences were shown in field experiments to trigger group movement. This immediately raises a non-trivial question: is the meaning of pyow-hack sequences somehow derived from the meaning of their parts? Arnold and Zuberbühler (2012) give a negative answer, but it is clear that one’s theoretical assumptions about the lexical meaning of the calls and the rules of combination might radically change the picture – which is precisely what we will argue below.
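To illustrate what a simple finite-state description of such patterns might look like, here is a minimal sketch of a recognizer for pyow-hack sequences, written as an explicit transition table. It is an idealization: the state names and the function `is_pyow_hack` are hypothetical, the bounds on 'a few' calls are left open, and the recognizer treats as categorical a pattern that the text describes only as a tendency.

```python
# Transition table of a small finite-state recognizer (illustrative only).
TRANSITIONS = {
    ("start", "P"): "pyows",
    ("pyows", "P"): "pyows",
    ("pyows", "H"): "hacks",
    ("hacks", "H"): "hacks",
}

def is_pyow_hack(sequence):
    """Accept strings made of one or more P followed by one or more H."""
    state = "start"
    for call in sequence:
        state = TRANSITIONS.get((state, call))
        if state is None:  # no transition defined: reject
            return False
    return state == "hacks"

for example in ["PPPHHH", "PHPHPHPHPH", "HHHPPP"]:
    print(example, is_pyow_hack(example))
```

Under these assumptions, PPPHHH is accepted, while the alternating sequence and the hack-first sequence are rejected.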

Black-and-White Colobus monkey calls studied by Schel et al. (2009) also display interesting patterns, with three kinds of sequences: individual snorts (henceforth s), roaring sequences, made of a series of roars (henceforth r+), and snort-roar sequences, made of a single snort immediately followed by a roaring sequence (henceforth sr+). Here too, the syntax is apparently constrained. Descriptively, snorts are never repeated without pause; if they appear in a sentence, it is always at the beginning; and snort-roar sentences are quite rare at the very beginning of discourses. There are also interesting semantic generalizations to be obtained. While snorts and roars occur in every context, snorts given singly only occur in ground predator (including chimpanzee) contexts; and while repetitions of roars occur in every context (though they are less common in leopard-related situations), when they occur at the beginning of discourses they are indicative of eagles. An obvious challenge is to account for these fine-grained semantic generalizations: is it possible to give a unified meaning to snorts on the one hand, and to roars on the other, or should we posit that in some cases the smallest units that convey information are sequences of several calls (so that the elementary calls play the role of phonemes)? While the patterns are very different from those we sketched for Putty-nosed monkeys, the foundational issues they raise are of the same nature. As we will see below, clear syntactic and semantic patterns also emerge in the Titi monkeys of South America, which have led to fascinating generalizations in recent work by Cäsar et al. (2012a, 2012b, 2013) – a point to which we return below.
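The three sentence types s, r+ and sr+ can be read directly as regular expressions. The following sketch, which assumes a hypothetical transcription of sentences as strings over the letters `s` and `r`, classifies a sentence by type; the labels and the function `classify` are ours.

```python
import re

# One regular expression per sentence type described above.
PATTERNS = {
    "snort": re.compile(r"s"),
    "roaring sequence": re.compile(r"r+"),
    "snort-roar sequence": re.compile(r"sr+"),
}

def classify(sentence):
    """Return the sentence type, or None if the string fits no type."""
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(sentence):
            return label
    return None

for example in ["s", "rrr", "srrr", "ss", "rs"]:
    print(example, classify(example))
```

Strings such as `ss` (a repeated snort) or `rs` (a snort that is not sentence-initial) fall outside all three types, in line with the descriptive generalizations above.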

While these data pertain to the form and meaning of monkey calls, their ontogeny (development from infanthood to adulthood) and phylogeny (evolutionary development) can be studied as well. In their pioneering study of monkey alert calls, Seyfarth and Cheney (1980) showed by way of field experiments that vervet monkeys use three alarm calls that carry information about the presence of eagles, snakes and leopards respectively. But they also investigated the development of these calls from infants to adults, and showed that “as infants grow older they sharpen the association between predator species and the type of alarm call”. For instance, the ‘eagle’ call is initially triggered by all sorts of birds, and gradually acquires a much narrower use related to dangerous eagles. These findings can presumably be interpreted in at least two ways: it might be that the meaning of the calls is initially broad and becomes narrower over time; alternatively, it might be that the meaning remains fixed and is broad (‘bird call’), but that the monkeys gradually learn that only predators are worth calling attention to.

The phylogeny of calls was traditionally studied indirectly. When DNA data were less prevalent and complete than they are today, some prominent primatologists used phonetic similarities among the calls of different species to help reconstruct their phylogenetic trees (see for instance Gautier 1988) – a methodology that has fallen out of favor in more recent years. But it is striking that the form and sometimes the function of some calls is remarkably well preserved over millions of years. To give but two examples: booms are found in various points of the phylogenetic tree of cercopithecines. This is compatible with the view (among others) that they were present in the most recent common ancestor of all cercopithecines, who according to the DNA study of Guschanski et al. (2013) lived more than 10 million years ago (see the detailed tree at http://sysbio.oxfordjournals.org/content/62/4/539/F1.expansion.html); we will remain agnostic below concerning such ancient periods, but we will make claims about what happened 2–3 million years ago. 6 Furthermore, as far as we know booms generally have to do with non-predation situations, hence part of their function might go way back as well. More recently, Putty-nosed and Blue monkeys separated approximately 2.5 million years ago (Guschanski et al. 2013), but they have very similar pyows and hacks. The prospect of an evolutionary linguistics of monkey calls is tantalizing, and we will come back to it below.

1.2 Initial questions

While the data are rich, their interpretation – especially with respect to their cognitive status – has been a matter of debate.

First, should the calls really be analyzed as communicative/‘linguistic’ signals? While monkeys usually react appropriately to the calls of conspecifics in call-to-behavior experiments ((1)c above), alert calls also seem to have a role in deterring predators, for instance by letting them know that they have been detected (e. g. Caro 2005; Zuberbühler 2009). Furthermore, the fact that monkeys extract information from the calls of conspecifics doesn’t necessarily place these in a different category from other regularities of nature. Hornbills (a brightly colored bird) were shown in field experiments to react appropriately to some calls of Diana monkeys: they discriminated between eagle Diana calls, which were indicative of a direct threat for them, and leopard calls, which were not (Rainey et al. 2004a, 2004b; Zuberbühler 2009). This certainly indicates that hornbills associate Diana calls to the presence of predators, but maybe they do so by the same mechanisms that allow them to associate eagle shrieks to the presence of eagles – there might be nothing specifically ‘linguistic’ about this inference. In this piece, we will take the view that calls have conditions of use that must be described and modeled irrespective of the mechanisms involved; it is by studying the generalizations and possible theories that one might come to the conclusion that these calls involve more or less specific abilities.

Second, is the production of calls in any way voluntary and/or attuned to the audience? Chimpanzee alarm calls were shown to be triggered differentially depending on the belief state of the audience; in an experiment discussed in Crockford et al. 2012, “chimpanzees were more likely to alarm call in response to a snake in the presence of unaware group members than in the presence of aware group members”. This rules out a low-level mechanism whereby the presence of a threat automatically triggers the production of the call. But we know of no such results in the monkeys under study here, which leaves open the possibility that call production is a fairly automatic process. In the present piece, we will take the position that call use gives rise to interesting semantic problems irrespective of whether call production is voluntary.

Third, should one analyze the proximate mechanisms by which calls are triggered (or understood), or should one study their evolutionary function? As pointed out in Fuller 2013, the two notions might in principle diverge. To give an example we will revisit below, the hacks of Putty-nosed monkeys might be triggered by situations of high arousal, as Arnold has emphasized (e. g. Arnold and Zuberbühler 2013). But it might be because these are strongly correlated with the presence of eagles that the call was selected: the evolutionary function might be that of an eagle call, even if the proximate mechanism and some of the uses are not eagle-related. Similarly, it could be that a call was selected because it had a variety of positive effects on conspecifics, and that for this reason searching for the function of the call is misguided. Here, we will take the position that a good understanding of the proximate mechanisms by which calls are produced and interpreted will help address more ambitious questions about call function in the future.

Fourth, when calls communicate information to conspecifics, what is their semantic content? One important distinction is whether calls directly encode information about a predator type, or about properties of the threat (level, directional origin, etc), or a combination of both. In the first case (information about predator types), ethologists often say that the calls are ‘functionally referential’. As will be seen below, whether calls make reference to types of objects or to arousal levels they might give rise to is sometimes hard to answer; for instance, if Putty-nosed hacks are triggered by high arousal, and it turns out that the latter is mostly or exclusively associated with the presence of eagles, should we say that hacks are functionally referential? These fine-grained distinctions are sometimes empirically important, but they are hard to draw. Fortunately, in several cases the structure of the informativity relations among calls matters more than their fine-grained lexical content, and for this reason some of our analyses will not require that we take a stand on such issues. More generally, in most cases the general form of our theories will not be dependent on the distinction between ‘referential’ and ‘non-referential’ meanings for calls in the primatologist’s sense, and we will correspondingly de-emphasize this question.

1.3 The phylogenetic landscape

In the present study, we will be primarily concerned with Old World and New World monkeys. As illustrated in (2), in some estimates the most recent common ancestor of Old World monkeys and apes (and humans) lived more than 30 million years ago. The most recent common ancestor of New World monkeys and apes (or for that matter New World monkeys and humans) lived more than 40 million years ago. These distances alone explain why we are currently agnostic about the relation between these monkey languages and human language. (Furthermore, when similarities are found among systems that are that distant, they may well be due to convergent evolution rather than common descent.) Of the species discussed in this paper, Putty-nosed monkeys, Blue monkeys and Campbell’s monkeys are all part of a subfamily of Old World monkeys called ‘cercopithecines’ (technically, ‘cercopithecini’, included in (2) under the larger family of ‘cercopithecinae’ 7). Putty-nosed monkeys and Blue monkeys have a most recent common ancestor that lived approximately 2.5 million years ago. In turn, their most recent common ancestor with Campbell’s monkeys lived approximately 7.5 million years ago (Guschanski et al. 2013). We will also be concerned with Black-and-White Colobus monkeys, which are part of another family of Old World monkeys, called ‘colobinae’. Colobus monkeys and cercopithecines probably have a most recent common ancestor that lived – very approximately – 18 million years ago (Perelman et al. 2011). Finally, we will be interested in the calls of Titi monkeys living in South America; since they are New World monkeys, their most recent common ancestor with cercopithecines and Colobus monkeys lived more than 40 million years ago. We will remind the reader of these phylogenetic facts when the relative proximity or distance among the species is relevant.

(2) Primate phylogeny

Source: Perelman et al. 2011. Figure drawn with Lucie Ravaux’s help.

The next section describes our formal approach, and later sections illustrate it on several case studies.

2.1.1 A two-step approach

We believe that the questions introduced in Section 1.2 are both important and difficult, but that much can be gained by adopting a two-step approach: one should start by setting up explicit theories of the form and truth conditions of monkey sequences; one should then ask how the relevant rules are cognitively implemented.

(3) A two-step approach

a. A detailed formal theory should be developed to predict in detail

i) the form of monkey sequences, i.e. which sequences are permissible in monkey productions (in linguistic parlance, this requires a phonological, morphological and/or syntactic analysis);

ii) the meaning of monkey sequences, understood as the conditions under which sequences of alert calls are applicable or ‘true’. The theory should in particular be constrained by naturalistic observations, trigger-to-call and call-to-behavior experiments ((1) above).

b. The cognitive implementation of rules coming out of (a) should be evaluated based on a combination of behavioral and/or neurological data (when available), and theory-internal considerations (e.g. the potential decision to establish a division of labor between ‘literal meaning’ and ‘pragmatics’).

To be clear, we believe that the ultimate goal is to have a detailed theory of what goes on in the monkeys’ minds when they produce or perceive calls. But we think that this goal will be better achieved once we have a detailed theory of the form and use of monkey calls. In fact, we believe that there are two relevant lessons to be gained from the history of formal linguistics. One is that much empirical and conceptual precision can be gained by treating natural communicative systems as interpreted formal languages. The second is that once a formal approach is in place, it can naturally be integrated into a cognitive framework in which precise questions can be asked about the division of labor among different modules.

A perennial question in the literature on animal communication is whether the communicative system of species X can ‘really’ be called a ‘language’. But from the perspective of contemporary linguistics, human language itself is treated as a formal language of a particular kind. And it takes extraordinarily little for something to count as a formal language. An example is given in (4), where we treat the expressions 0 (‘zero’), s0 (‘the successor of zero’), ss0 (‘the successor of the successor of zero’), etc. as a set of well-formed strings defined from a very simple lexicon with two atomic expressions, 0 and s. We can also endow this set of well-formed expressions with a compositional semantics, whereby 0 denotes the number zero, and for any well-formed expression E, sE denotes the number that follows what E denotes.

(4)

An example from arithmetic: {0, s0, ss0, sss0, …}

a.

Lexicon: {0, s}

b.

Syntax: 0 is an expression; if E is an expression, so is sE.

c.
 Semantics: 0 denotes zero; For all E, sE denotes the successor of what E denotes.
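
As a minimal sketch, the fragment in (4) can be written out as a recognizer for the syntax in (4)b and a recursive evaluator for the semantics in (4)c. The function names `is_expression` and `denotation` are illustrative choices, not part of the original fragment.

```python
def is_expression(string: str) -> bool:
    """Syntax (4)b: '0' is an expression; if E is an expression, so is 's' + E."""
    if string == "0":
        return True
    return string.startswith("s") and is_expression(string[1:])


def denotation(expression: str) -> int:
    """Semantics (4)c: '0' denotes zero; sE denotes the successor of what E denotes."""
    if not is_expression(expression):
        raise ValueError(f"not a well-formed expression: {expression!r}")
    if expression == "0":
        return 0
    return denotation(expression[1:]) + 1


print(is_expression("ss0"), is_expression("0s"))
print(denotation("sss0"))
```

The recognizer alone corresponds to the ‘set of well-formed expressions’ notion of language; adding the evaluator corresponds to the notion that also requires a semantics.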

Should we call the set of strings 0, s0, ss0, … a language? It depends on what we mean by ‘language’. If a ‘language’ is just a set of well-formed expressions, the answer is positive. If a ‘language’ is a set of well-formed expressions with a semantics, it is only with the addition of (4)c that the set qualifies. There are all sorts of other reasonable terminological choices: we could require that a ‘language’ have certain designated syntactic or semantic properties; or that it should have live users and possibly a pragmatics, etc. Turning to natural systems, the question ‘Does species X have language?’ is of no particular interest unless one has said what one means by ‘language’. With the minimal definition of formal language theory, one will be able to give a trivially positive answer in countless cases – which only highlights the fact that with this definition the (trivial) question ‘Does species X have language?’ should be replaced with the (hard) question ‘What are the formal properties of the language of species X?’. This is the approach we will take in this piece.

2.1.2 Lessons from formal linguistics

But isn’t such an approach overly formal? We would argue that even in the case of human language, where the existence of sophisticated cognitive abilities is uncontroversial, a precise characterization of the formal properties of a language is a good first step towards an analysis of its cognitive implementation. In syntax, various approaches broadly inspired by Chomsky’s work (e. g. 1957, 1965) developed a formal and a cognitive approach in tandem. In semantics, the model-theoretic tradition pioneered by Montague started out with an analysis of English as a purely formal language, with minimal commitments about its cognitive status (e. g. Montague 1970a, 1970b); later generations asked further questions about the cognitive reality and implementation of semantic knowledge – and fully built on results of the formal approach (e. g. Bott et al. 2011; Chemla et al. 2014a, 2014b).

The way in which formal models may yield insight into cognitive representations can be illustrated with the simple example of the meaning of the word or in English. One will quickly realize that in some cases or appears to be exclusive (S or S’ is true just in case exactly one of S, S’ is true), as in (5)a; while in others its meaning appears to be inclusive (S or S’ is true just in case at least one of S, S’ is true), as in (5)b.

(5)
a.

I will invite Ann or Mary.

⇒ usually gives rise to the inference that I will invite Ann or Mary but not both.

b.

I doubt that I’ll invite Ann or Mary.

⇒ usually equivalent to: I doubt that I’ll invite Ann or Mary or both (hence in particular it’s unlikely that I’ll invite both Ann and Mary).

But a deeper generalization can be stated: the sentence in (5)b, where the inclusive reading is dominant, is also one in which or appears in a ‘negative environment’, in a sense to be made precise. An elegant theory can be devised, in the spirit of Grice (1975): the meaning of or is inclusive, but a sentence with or automatically evokes (or ‘competes with’) the corresponding sentence with and. As a result, if one has reason to think that the speaker is maximally informative, one can infer the negation of the sentence with and in case the latter is strictly more informative than the sentence with or. The resulting mini-theory is stated in (6).

(6)
a.

Partial syntax: If S and S’ are well-formed clauses, so is [S or S’].

b.

Partial semantics: [S or S’] is true if and only if S is true or S’ is true or both are.

c.

Partial pragmatics: Suppose a speaker uttered a sentence of the form [… S or S’ …]. If [… S and S’ …] is strictly more informative than [… S or S’ …] and if the speaker was maximally informative, he was not in a position to assert [… S and S’ …].

While it has been considerably refined in recent years, this mini-theory has the virtue of explaining why the exclusive reading is dominant in (5)a while the inclusive one is dominant in the negative environment in (5)b: in (5)a, the sentence with and is more informative than the sentence with or; in (5)b, it’s the other way around: I doubt that I’ll invite Ann or Mary (or both) entails in particular that I doubt that I’ll invite both Ann and Mary. 8
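
The reversal of informativity between (5)a and (5)b can be checked mechanically. The sketch below enumerates the four truth-value combinations for ‘I invite Ann’ and ‘I invite Mary’ and tests entailment by brute force. As a simplifying assumption, plain negation stands in for ‘I doubt that’; all function names are illustrative.

```python
from itertools import product

# Each world fixes the truth of "I invite Ann" (a) and "I invite Mary" (m).
WORLDS = list(product([True, False], repeat=2))


def or_inclusive(a, m):
    return a or m  # literal meaning of 'or', as in (6)b


def and_both(a, m):
    return a and m


def positive(connective):
    """Plain environment, as in (5)a."""
    return lambda a, m: connective(a, m)


def negative(connective):
    """Negative environment; negation is a stand-in for 'I doubt that' (assumption)."""
    return lambda a, m: not connective(a, m)


def entails(p, q):
    return all(q(a, m) for a, m in WORLDS if p(a, m))


def strictly_more_informative(p, q):
    return entails(p, q) and not entails(q, p)


def strengthened(environment):
    """Apply (6)c: negate the 'and' sentence only if it is strictly more informative."""
    or_sentence, and_sentence = environment(or_inclusive), environment(and_both)
    if strictly_more_informative(and_sentence, or_sentence):
        return lambda a, m: or_sentence(a, m) and not and_sentence(a, m)
    return or_sentence


for name, environment in [("positive", positive), ("negative", negative)]:
    reading = strengthened(environment)
    print(name, [world for world in WORLDS if reading(*world)])
```

In the positive environment the strengthened reading excludes the world where both are invited; in the negative environment no strengthening applies, since there the sentence with or is the stronger one.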

In this brief discussion of human language, we were led on the basis of truth-conditional data to posit a division of labor between the literal, or ‘semantic’ meaning of sentences, and further ‘pragmatic’ inferences obtained by taking into account competition among possible utterances. Recent psycholinguistics has taken this question much further by showing that this division of labor is ‘psychologically real’ (see for instance Chemla et al. 2014a, 2014b). 9 For us, the key lesson is that detailed formal semantic studies can play a key role in delineating cognitive modules whose scientific reality can be further probed with other means.

2.2 Monkey fragments

Formal linguistics greatly benefited from the method of fragments, in which some subparts of the syntax and semantics of a language are explicitly defined, often in greatly simplified form – but with the idea that they can in principle be refined and extended to cover rich and sophisticated phenomena. As it happens, some of the monkey languages we will study are simple enough that a ‘fragment’ could aim to represent the entire language. But before we get into the complexity of actual data, it will prove helpful to develop a fragment for an imaginary language that shares some salient properties with the real monkey languages we will turn to shortly; this will allow us to make some useful notational points along the way.

Let us start with form. Our fictional language contains just two words, krak and hok, and they can be concatenated as pure series of kraks and pure series of hoks – and nothing else (this is meant as a much simplified version of the Campbell’s language we’ll study later). It can be defined as in (7), where we have used the ‘rewrite’ rules common in contemporary syntax (these rules are ‘context free’: the left-hand side does not need to specify in which contexts the rewriting is permissible).

(7)

Syntax

a.

Lexicon: Words are just: krak, hok

b.

Syntax: Sentences are of the form: krak+ (= arbitrary sequences of krak), hok+ (= arbitrary sequences of hok). The language is defined by the (right-linear) grammar:

S → K, H

K → krak K, krak

H → hok H, hok

A formal remark will prove useful later. While this very simple language can be defined by way of a context-free grammar, the latter is of a very special form, in which non-terminal nodes only appear at the far right of all productions, as in: krak K, hok H. Such ‘right-linear grammars’ can be shown to generate exactly the ‘regular languages’ (e. g. Hopcroft et al. 2001). These, in turn, can be defined by the simple operations in (8).

(8)

The regular languages based on a lexicon Lex are those that can be obtained from a-c:

• a.

The empty language Ø is a regular language.

• b.

For each a ∈ Lex, {a} is a regular language.

• c.

If A and B are regular languages, then A ∪ B (the union of A and B), AB (the set of strings obtained by concatenating any element from A with an element from B), and A* (the set of strings made of 0 or more concatenated occurrences of elements of A) are regular languages.

In our initial discussion of Black-and-White Colobus calls above, we defined languages by listing (or taking the union of) sublanguages involving singleton words, concatenation and repetition, with the patterns: s, sr+, r+. It is immediate that this is a regular language defined on the basis of the lexicon {s, r} by: s ∪ srr* ∪ rr*. 10 The only differences between this notation and that in (8) are that we listed patterns instead of using the sign ∪, and that we used + to define patterns with at least one occurrence of a (possibly) repeated symbol, whereas * allows for 0 occurrences too; this is the reason the pattern sr+ corresponds to srr* and the pattern r+ corresponds to rr*. Thus the languages we can define with our earlier operations are all regular languages. The languages we will discuss in this piece will all have this property, and hence we will henceforth use these simple operations rather than the more cumbersome formalism of context-free grammars.
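
Since these languages are regular, they can be illustrated with ordinary regular expressions. The sketch below uses Python's `re` module; it encodes the toy language in (7) with words separated by spaces (an encoding assumption), and checks by enumeration that the + and * notations for the Colobus patterns pick out the same strings up to a chosen length. The pattern and function names are illustrative.

```python
import re
from itertools import product

# Toy language (7): pure series of kraks or pure series of hoks.
TOY_SENTENCE = re.compile(r"krak( krak)*|hok( hok)*")

# Colobus patterns s, sr+, r+ and their rewriting as s ∪ srr* ∪ rr*.
COLOBUS_PLUS = re.compile(r"s|sr+|r+")
COLOBUS_STAR = re.compile(r"s|srr*|rr*")


def in_language(pattern, string):
    """A string belongs to the language iff the whole string matches the pattern."""
    return pattern.fullmatch(string) is not None


def notations_agree(max_length=6):
    """Compare the + and * formulations on every string over {s, r} up to max_length."""
    for length in range(max_length + 1):
        for symbols in product("sr", repeat=length):
            string = "".join(symbols)
            if in_language(COLOBUS_PLUS, string) != in_language(COLOBUS_STAR, string):
                return False
    return True


print(in_language(TOY_SENTENCE, "krak krak krak"))
print(in_language(TOY_SENTENCE, "krak hok"))
print(notations_agree())
```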

Let us now turn to the semantics. Our toy language treats krak as a general alert call, and hok as an eagle alarm call, as seen in (9)a. We use standard notations from model-theoretic semantics; for instance, 〚krak〛a is the semantic value of the word (=call) krak under an alarm parameter a. We take individual calls to be appropriate or inappropriate in a given context, and use the standard terms ‘true’ and ‘false’ to characterize this bipartition. The alarm parameter will ensure that repetitions of a call aren’t entirely vacuous, as each call will have the effect of raising the value of the alarm parameter.

(9)

Semantics

Abbreviation: iff stands for ‘if and only if’.

• a.

If a is an alarm parameter (≥0):

• 〚krak〛a=true iff there is an alert and the alarm level is ≥ a.

• 〚hok〛a=true iff there is an eagle and the alarm level is ≥ a.

• b.

If a is an alarm parameter (≥0), for any word w and string of words S,

• 〚wS〛a=true iff 〚w〛a=true and 〚S〛a+1=true.

• c.

Truth

• A sentence S is true (in a given situation) just in case 〚S〛0=true.

The lexical rules for krak and hok in (9)a are self-explanatory, except for the part that concerns the role of the alarm parameter, which will matter when several calls are combined. How should combinations be treated? In the simplest analysis, we could define a (conjunctive) semantics whereby the concatenation wS of a word w and a sentence S is true just in case w is true and S is true as well; in effect, this would correspond to the null hypothesis that each call separately contributes its informational content to a sequence, with a trivial rule of combination (=each call counts as a separate utterance). But this would come at a cost: krak, krak krak, and krak krak krak would all end up with the very same meaning. In the primate literature, it is sometimes asserted that the calling rate is an increasing function of the level of urgency of the threat (e. g. Lemasson et al. 2010 on Campbell’s monkeys). We capture this fact in a simplified way in (9)b by postulating that each call raises the value of an all-purpose alarm parameter. Thus wS evaluated under a value a of the alarm parameter is true only in case S is true under the raised alarm parameter a+1. And each elementary call is true under a value a of the alarm parameter just in case the alarm level in the situation is at least a. The definition of truth in (9)c just adds that the initial value of the alarm parameter is taken to be 0.

Let us illustrate this fragment with two examples, hok hok and krak krak krak, discussed in (10).

(10)

• a.

〚hok hok〛0=true

• iff 〚hok〛0=true and 〚hok〛1=true,

• iff there is an eagle and the alarm level is ≥ 0 and the alarm level is ≥ 1,

• iff there is an eagle and the alarm level is ≥ 1.

• b.

〚krak krak krak〛0=true

• iff 〚krak〛0=true and 〚krak krak〛1=true,

• iff 〚krak〛0=true and 〚krak〛1=true and 〚krak〛2=true,

• iff there is an alert and the alarm level is ≥ 0 and the alarm level is ≥ 1 and the alarm level is ≥ 2,

• iff there is an alert and the alarm level is ≥ 2.
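
The derivations in (10) can be reproduced with a small evaluator that follows the rules in (9) directly. This is a sketch only: the `Situation` record and its field names are assumptions introduced to represent what the rules in (9) refer to (whether there is an alert, whether there is an eagle, and the alarm level).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    alert: bool
    eagle: bool
    alarm_level: int


def word_value(word, a, situation):
    """Lexical rules (9)a, evaluated under alarm parameter a."""
    if word == "krak":
        return situation.alert and situation.alarm_level >= a
    if word == "hok":
        return situation.eagle and situation.alarm_level >= a
    raise ValueError(f"unknown word: {word!r}")


def sequence_value(words, a, situation):
    """Composition rule (9)b: wS is true under a iff w is true under a and S under a+1."""
    if not words:
        raise ValueError("a sentence needs at least one call")
    first, rest = words[0], words[1:]
    if not rest:
        return word_value(first, a, situation)
    return word_value(first, a, situation) and sequence_value(rest, a + 1, situation)


def is_true(sentence, situation):
    """Truth (9)c: evaluate with the alarm parameter initially set to 0."""
    return sequence_value(sentence.split(), 0, situation)


eagle_at_level_1 = Situation(alert=True, eagle=True, alarm_level=1)
print(is_true("hok hok", eagle_at_level_1))      # needs alarm level >= 1
print(is_true("hok hok hok", eagle_at_level_1))  # needs alarm level >= 2
```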

Our semantics in (9)a was set up in such a way that hok is strictly more informative than krak – at least on the natural assumption that eagles cause alerts, and that there are non-eagle-related situations that also do so. We will find this asymmetric pattern of entailment in all the natural systems to be discussed below, and it will be important to posit that when one call is strictly more informative than another, the most informative one is used whenever possible. This is in effect the very same rule of competition we stated above (in (6)c) to account for the ‘enrichment’ of the meaning of (inclusive) or by way of competition between or and and. Because this enrichment is based on differential patterns of informativity among competing sentences, we will label it the ‘Informativity Principle’. It is defined in two steps, as in (11)–(12): first, in (11) we define the competitors S’ of a sentence S – in essence, any sentence obtained from S by replacing kraks with hoks or hoks with kraks; second, in (12) we posit that if S is uttered and S’ is an alternative to S but is strictly more informative than S, one can infer that S’ is false. 11

(11)

Alternatives

Any sentence S’ is an alternative to a sentence S if S and S’ are both produced by the syntactic rules of the language, and S’ can be obtained from S by replacing any number of kraks with (the same number of) hoks and/or by replacing any number of hoks with kraks.

(12)

Informativity Principle

If a sentence S was uttered and if S’ is (i) an alternative to S, and (ii) strictly more informative than S (i.e. asymmetrically entails S), infer that S’ is false.

This simple pragmatic rule will already have an interesting consequence. Suppose that the main predators are leopards and eagles. While krak may have all sorts of uses in non-predatory contexts (because it is a general call), in most predatory ones it will single out leopards: if krak is uttered, (12) will trigger the inference that hok was not applicable, and hence that the situation is not eagle-related – hence if there is a predator, it must be a leopard. This logic will be used several times in the analyses we develop below.
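
The reasoning above can be sketched over a small finite set of situations. The four causes listed below and the range of alarm levels are illustrative assumptions; the literal meanings follow (9), the alternatives follow (11) (in this toy syntax, the only alternatives to a sequence of n kraks are itself and the sequence of n hoks), and strengthening follows (12).

```python
from itertools import product

# Assumed situation space: a cause of disturbance paired with an alarm level.
CAUSES = ["none", "non-predatory disturbance", "leopard", "eagle"]
SITUATIONS = list(product(CAUSES, range(4)))


def literal(sentence, situation):
    """Literal truth per (9): the i-th call is evaluated under alarm parameter i."""
    cause, level = situation

    def word_ok(word, a):
        applicable = (cause != "none") if word == "krak" else (cause == "eagle")
        return applicable and level >= a

    return all(word_ok(word, a) for a, word in enumerate(sentence.split()))


def alternatives(sentence):
    """Alternatives per (11): same length, all kraks or all hoks."""
    n = len(sentence.split())
    return [" ".join(["krak"] * n), " ".join(["hok"] * n)]


def entails(s1, s2):
    return all(literal(s2, sit) for sit in SITUATIONS if literal(s1, sit))


def strictly_more_informative(s1, s2):
    return entails(s1, s2) and not entails(s2, s1)


def strengthened(sentence, situation):
    """Literal truth plus the inferences licensed by the Informativity Principle (12)."""
    return literal(sentence, situation) and not any(
        literal(alt, situation)
        for alt in alternatives(sentence)
        if strictly_more_informative(alt, sentence)
    )


def predators_compatible_with(sentence):
    return {
        cause
        for cause, level in SITUATIONS
        if cause in ("leopard", "eagle") and strengthened(sentence, (cause, level))
    }


print(predators_compatible_with("krak"))
```

Under these assumptions, the strengthened meaning of krak remains compatible with non-predatory disturbances, but among predators it is compatible only with leopards.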

A cautionary note is in order at this point. We saw above that, depending on one’s notion of language, it may take very little for a species to have a ‘language’. Even in the domain of semantics, we argued that as soon as a signal has some informational content, we can define a semantics for it – though of course the exercise may be more or less interesting depending on the situation. But one might think that when one posits a pragmatic rule such as the Informativity Principle, things are different: doesn’t that require sophisticated abilities to represent other minds? Not really. Grice’s analysis of or and neo-Gricean modifications of it were based on the assumption that the speaker is cooperative and thus selects from a set of alternatives the one that will be most informative to the addressee. Stated in this way, the analysis makes reference to a theory of other minds. But the logic of informativity requires far less: similar results can be obtained by just assuming that the semantics makes available a relation ‘is strictly more informative than’ on some sentences that are alternatives to each other, and that some mechanism (deterministically or probabilistically) selects the most informative sentence that is true in the situation at hand; the competition principle can be stated both for the speaker and for the hearer, as is done in (13).

(13)

Informativity Principle without a theory of mind

Assume that the semantics yields a relation ‘is strictly more informative than’ on some sentences that are alternatives to each other. Underinformative sentences are prohibited by the following rules:

• Speaker: Do not utter S in a situation w if a strictly more informative alternative S’ is true in w.

• Hearer: If you hear S in a situation w, infer that every strictly more informative alternative S’ is false in w.
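
To make concrete that (13) involves no representation of other minds, the sketch below states both rules purely in terms of sets of situations. The extensions assigned to krak and hok are illustrative assumptions based on the toy language above; ‘strictly more informative’ is modelled as proper inclusion between extensions.

```python
# Assumed extensions: the set of situation types in which each call is literally true.
EXTENSION = {
    "krak": frozenset({"non-predatory disturbance", "leopard", "eagle"}),
    "hok": frozenset({"eagle"}),
}


def stronger_alternatives(sentence):
    """Alternatives whose extension is a proper subset of the sentence's extension."""
    return [alt for alt in EXTENSION if EXTENSION[alt] < EXTENSION[sentence]]


def speaker_may_utter(sentence, situation):
    """Speaker rule: S must be true, and no strictly more informative alternative is true."""
    return situation in EXTENSION[sentence] and not any(
        situation in EXTENSION[alt] for alt in stronger_alternatives(sentence)
    )


def hearer_infers(sentence):
    """Hearer rule: keep the situations where S is true and every stronger alternative is false."""
    remaining = set(EXTENSION[sentence])
    for alt in stronger_alternatives(sentence):
        remaining -= EXTENSION[alt]
    return remaining


print(speaker_may_utter("krak", "eagle"))
print(hearer_infers("krak"))
```

Both functions consult only the semantics and the situation at hand; neither takes the other party's beliefs as an input.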

Note that other pragmatic phenomena arguably do require a theory of mind. For instance, one does not usually go out uttering trivialities; while it might be informative to say ‘You are sitting on a broken chair’ (when this is not obvious), one doesn’t usually tell someone: ‘You are sitting on a chair’, as this is something the interlocutor is likely to know. In order to draw the distinction between trivial and non-trivial statements, the speaker must be able to represent the interlocutor’s belief state. 12 As mentioned in Section 1.2, data from Crockford et al. (2012) suggest that chimpanzees obey a version of the prohibition against trivial statements – which might be suggestive of a theory of other minds. But since no representation of other minds is required to implement the Informativity Principle, our analyses will be neutral on this issue.

3 Campbell’s monkeys: ‘Dialectal’ variation or call competition?

Here we will only summarize the main results of the analysis of Schlenker et al. (2014), which builds on Ouattara et al. (2009a, 2009b) as well as on original data collected by Arnold and Keenan. Our main goal is to highlight the theoretical mileage that one can get out of the interplay between a simple semantics and the Informativity Principle.

3.1 Introduction to Campbell’s semantics

Summarizing, the morphology can be defined as in (14) and (15). 13

(14)

Roots and affixes

a. Roots: boom, hok, krak

b. Bound affixes: -oo

(15)

Lexicon

• a.

Every root is a word.

• b.

If R is a root different from boom, R-oo is a word.

Of these calls, boom seems to have a regular syntax, as it usually appears as a single pair of calls at the beginning of sequences. It also seems to have a regular meaning of non-predation – a meaning which is clear enough to be understood by Diana monkeys, which live sympatrically with Campbell’s monkeys but do not themselves have booms (specifically, Zuberbühler 2002 shows that Diana monkeys stop being alarmed by Campbell’s predator sequences if these are prefixed with boom boom). The remaining (simple and complex) calls function as alert calls of various kinds. Of note for present purposes, Schlenker et al. (2014) analyzed data from two sites and found an apparent dialectal difference. In brief: data were collected in the Tai forest, where Campbell’s monkeys have as main predators leopards and eagles, and on Tiwai island, where eagles are present but leopards haven’t been seen for decades. Strikingly, what initially seems to be a leopard alarm in the Tai forest is used as a general call on Tiwai island.

A very preliminary description of call use can be given as follows:

(16)

Informal description of the lexical meanings

• a.

boom boom: ‘this is not a situation of predation’

• b.

hok-oo: ‘there is an alert upwards’

• c.

hok: ‘there is an eagle’

• d.

• e.

krak: (i) ‘there is a leopard’ (Tai); (ii) ‘there is an alert’ (Tiwai)

We will come back in Section 7.2 to the uses of boom in Campbell’s monkeys and in other cercopithecines. Here we focus on two of the main analytical findings of recent research, pertaining to the uses of the suffix -oo and of the root krak, which seems to be subject to dialectal variation. To facilitate the discussion, we immediately provide a meaning for boom and especially hok. Here I is the lexical interpretation function, M is a model (which one can think of as a site, e. g. the Tai forest or Tiwai island), and s is a situation of utterance.

(17)

Meaning of boom

IM,s(boom-boom)=1 iff there is a disturbance but no predator in s.

(18)

Meaning of hok (preliminary)

IM,s(hok)=1 iff there is an aerial predator in s.

3.2 The suffix -oo

Ouattara et al. (2009a) analyzed -oo as a suffix attaching to the roots different from boom. Kuhn et al. (2014) develop a mini-phonetic analysis that suggests that -oo can reasonably be analyzed as a suffix, and Schlenker et al. (2014) develop two possible analyses according to which -oo modifies in a regular way the meaning of the root it attaches to.

Phonetically, Kuhn et al. (2014) argue that the production of -oo requires increased articulatory effort, a finding consistent with its analysis as a separate, meaning-bearing morpheme. Two acoustic facts are noteworthy: first, -oo is preceded by a short pause (averaging 0.060 s), indicating an obstruction of airflow; second, -oo lacks higher formants that are present in the call stems (krak or hok), thus displaying a spectral feature characteristic of nasalization. In light of these facts, Kuhn et al. (2014) argue that the production of -oo involves changing the passage through which air is flowing. Such a change would require an additional articulatory gesture, making it implausible that -oo is an indirect effect of independent articulatory pressures and thus strengthening the robustness of a morphological analysis.

Turning to the semantics, Schlenker et al. (2014) relied on data collected by Ouattara in field experiments (2009a, 2009b) and in naturalistic observations to argue (in their initial theory) that -oo has the effect of broadening the meaning of the root it attaches to, somewhat like the suffix -ish in greenish. Consider the counts in (19): oo-modified forms (in hatched bars) occur in many environments in which unmodified forms (open bars) do not occur. In particular, krak doesn’t occur in Eagle, Inter-group, and Tree fall situations, but krak-oo does; and hok doesn’t occur in Inter-group situations, but hok-oo does. The converse situation (with an unmodified form occurring in an environment in which the modified form doesn’t occur) doesn’t arise here. 14

(19)

Distribution of modified and unmodified forms in the dataset used in Ouattara et al. 2009a, 2009b

In their initial theory, Schlenker et al. (2014) posited for -oo the semantics in (20).

(20)

Meaning of -oo (initial theory)

For any root R different from boom-boom, for any site M and situation of utterance s, IM, s(R–oo)=1 iff there is a disturbance in s that licenses the same attentional state as if IM, s(R)=1.

Putting together the lexical rule for hok and the rule for -oo, we can obtain a lexical meaning for hok-oo, as shown in (21):

(21)

IM, s(hok–oo)=1 iff there is a disturbance in s that licenses the same attentional state as if IM, s(hok)=1, iff there is a disturbance in s that licenses the same attentional state as if there is an aerial predator in s.

In other words, hok-oo warns the hearer to be in the sort of attentional state that one should be in for an aerial predator, but without providing the information that there is in fact an aerial predator. So this could be a message to pay attention to what is going on upwards without a commitment to the presence of an aerial predator (but without excluding it either). It is immediate given this definition that the meaning of hok-oo is broader than that of hok: the latter is only made true if an aerial predator is present; while hok-oo is made true in the same situations, but also in ones in which there is no predator.

It is clear that in this analysis R-oo has a weaker meaning than R; on the assumption (made throughout) that R has a propositional meaning, R will asymmetrically entail R-oo. If this analysis is on the right track (which in view of the considerations below isn’t clear), it will be impossible to analyze the meaning of R-oo as the conjunction of R and some hypothetical root -oo, as the latter analysis would entail that R-oo is stronger than R.

3.3 The meaning of krak

Schlenker et al. (2014) showed on the basis of comparable data collected in field experiments in Tai and on Tiwai that there was a major difference between the uses of the call krak on the two sites. In brief: krak was primarily used as a leopard call in Tai, but it was used as a general call on Tiwai. In particular, despite the fact that the main ecological difference between Tai and Tiwai pertains to the absence of leopards on the latter site, a significant difference was found in calling behavior to eagles: 15

(22)

Number of calls of different types in response to different types of playback stimuli (a) in Tai and (b) on Tiwai - aggregated version


Schlenker et al. (2014) went on to develop and compare two possible analyses of this ‘dialectal’ difference. The first model accounts for the difference between Tai and Tiwai by way of different lexical entries for krak. The second model gives the same underspecified entry to krak in both locations (= general alert call), but it makes use of a competition mechanism akin to scalar implicatures. In Tai, strengthening yields a meaning close to dangerous, terrestrial predator and turns out to single out leopards. On Tiwai, strengthening yields a nearly contradictory meaning due to the absence of ground predators, and only the unstrengthened meaning is used.

3.3.1 Dialectal differences in the use of krak?

The first theory explored in Schlenker et al. (2014) was that krak is genuinely subject to dialectal variation; the basic idea was that in Tai krak has a leopard-related meaning, whereas on Tiwai it is a general alert call. But a theory-internal problem and an empirical observation conspired to suggest that the real story is more complex.

First, on the assumption that the meaning of krak-oo is derived from the meanings of krak and -oo, it is unlikely that a leopard meaning for krak could yield a general meaning for krak-oo. To be concrete, given the lexical entry in (20), if krak is applicable just in case there is a leopard in the relevant situation, we would expect krak-oo to be used when one should pay attention to a threat coming from the ground, in accordance with the reasoning in (23), combined with the fact that Campbell’s monkeys are arboreal and leopards come from lower down (even when they climb).

(23)

Suppose that IM, s(krak)=1 iff there is a leopard in s. Then by the same reasoning as in (21), IM, s(krak–oo)=1 iff in s there is a disturbance that licenses the same attentional state as if there is a leopard in s.

Since leopards aren’t expected to fly, we would expect krak-oo not to occur in eagle alarms – which is an incorrect prediction: both in field experiments and in naturalistic observations, eagle alarms do trigger the use of krak-oo.

Second, even in the Tai forest there are some instances of krak that do not seem to be leopard-related. More specifically, on the assumption that in Tai krak has a leopard meaning and that hok has an eagle meaning, Schlenker et al. (2014) found significantly more ‘inappropriate’ uses of krak than of hok.

For both reasons, an analysis based on dialectal variation is more complex than meets the eye. While positing a general meaning for krak for Tiwai island would seem to be appropriate, in the Tai forest we would need to posit an ambiguity:

• a large number or possibly all instances of krak-oo should be derived from the same general-purpose krak as on Tiwai island, notated in (24)a as krak1;

• most (but not all) instances of the bare call krak should be realizations of the leopard-related root, notated in (24)b as krak2.

(24)

• a.

Tai and Tiwai:

• IM, s(krak1)=1 iff there is a disturbance in s.

• b.

Tai only

• IM, s(krak2)=1 iff there is a leopard in s.

3.3.2 Pragmatic strengthening in the use of krak?

These complexities led Schlenker et al. (2014) to explore an alternative analysis. In a nutshell, they proposed that pragmatic strengthening was responsible for the ‘leopard’ meaning of krak in the Tai forest. They noted that the strengthening operation would lead to a near-contradictory meaning on Tiwai island, and explained by a principle of ‘contradiction-avoidance’ the fact that on Tiwai this strengthening did not take place.

Crucially, in order to develop this analysis, Schlenker et al. (2014) had to revise the lexical entries discussed above. In a nutshell, they took krak-oo to refer to weak krak-licensing threats, and hok to function as a non-terrestrial alert call. The goal was to ensure that both would be more informative than krak, with the result that the Informativity Principle would enrich krak into krak and not krak-oo (hence a non-weak threat) and not hok (hence a terrestrial threat), as outlined in (25). This, in turn, was desirable in order to guarantee that the enriched meaning of krak could single out serious ground threats, and hence leopards. In (25) and subsequently, we use an underline (e. g. $\underline{\textit{krak}}$) to indicate the strengthened meaning of a call.

(25)

Desired result

$\underline{\textit{krak}}$ = krak and not krak-oo and not hok

= disturbance and non-weak and terrestrial

The asymmetric entailments among calls are represented in (26). They obtain given the lexical entries in (27), which for reasons we will come to shortly were not just relativized to a site M and a situation of utterance s, but also to a time of utterance t. Importantly, although krak-oo and hok-oo refer to weaker threats than krak and hok, they are logically stronger: if there is a weak krak-type threat, then a fortiori there is a krak-type threat. The derivation of the meanings of the complex calls is given in (28).

(26)
(27)

Lexical Semantics

For any site M, situation s and time t,

• a.

IM, s, t(krak)=1 iff at t the caller of s is alert to a disturbance.

• b.

IM, s, t(hok)=1 iff at t the caller of s is alert to a disturbance whose source is non-terrestrial.

• c.

IM, s, t(boom-boom)=1 iff at t the caller of s is alert to a disturbance but not of a predator.

• d.

for any root R except boom-boom,

IM, s, t(R-oo)=1 iff at t the caller of s is alert to a disturbance that licenses R and isn’t strong among disturbances that license R.

(28)

For any site M, situation s and time t,

• a.

IM, s, t(krak-oo)=1 iff at t the caller of s is alert to a disturbance that licenses krak and isn’t strong among disturbances that license krak,

• iff at t the caller of s is alert to a disturbance that isn’t strong among all disturbances.

• b.

IM, s, t(hok-oo)=1 iff at t the caller of s is alert to a disturbance that licenses hok and isn’t strong among disturbances that license hok,

• iff at t the caller of s is alert to a disturbance whose source is non-terrestrial, and which isn’t strong among those whose source is non-terrestrial.

The Informativity Principle is exactly as it was in our earlier discussions in (12), but we will apply it to individual calls rather than to entire sentences and discourses. The reason is in part practical: apart from the fixed position of boom, we have little understanding of the syntax of Campbell’s calls, and even the distinction between sentences and discourses (if it is applicable) is not at all obvious. So at this point we will take each simple call (boom-boom, krak, hok) and each complex call (krak-oo, hok-oo) to function as a sentence; and we will assume that all non-boom calls are alternatives to each other. The Informativity Principle in (12) applies to the (one-word) sentences and their alternatives defined in (29). Combining the literal meaning of a call and the further inferences obtained from the Informativity Principle, we get the ‘strengthened meaning’ defined in (30). Crucially, enrichment happens at the level of entire calls, not roots, with the effect that -oo modifies the literal (and general) meaning of krak, as is desired.

(29)

Alternatives

Sentences are single (simple or complex) calls, and krak, krak-oo, hok, hok-oo are all alternatives to each other.

(30)

Strengthened meanings

For every word w, we (abuse notation and) write as $\underline{w}$ the strengthened version of w, obtained by adding to the literal meaning of w the inferences obtained by applying the Informativity Principle. Its meaning is given by:

for every situation s and time t,

$[\![\underline{w}]\!]^{M,s,t} = 1$ iff $[\![w]\!]^{M,s,t} = 1$ and for all $w' \in \mathrm{Alt}(w)$, if $w'$ entails $w$, $[\![w']\!]^{M,s,t} = 0$

where Alt(w) is the set of alternatives of w.

With these rules in place, we do derive the desired result sketched in (25). In fact, not only does the strengthening of krak yield a ‘serious terrestrial disturbance’ meaning, but the strengthening of hok yields a ‘serious aerial disturbance’ meaning. Both appear to be appropriate for leopard and eagle situations respectively in the Tai forest.

(31)

Strengthening krak and hok

a.
 $[\![\mathbf{krak}]\!]^{M,s,t} = 1$ iff $[\![\text{krak}]\!]^{M,s,t} = 1$ and $[\![\text{krak-oo}]\!]^{M,s,t} = 0$ and $[\![\text{hok}]\!]^{M,s,t} = 0$ and $[\![\text{hok-oo}]\!]^{M,s,t} = 0$ (since krak-oo, hok and hok-oo all entail krak),
 $= 1$ iff $[\![\text{krak}]\!]^{M,s,t} = 1$ and $[\![\text{krak-oo}]\!]^{M,s,t} = 0$ and $[\![\text{hok}]\!]^{M,s,t} = 0$ (since hok is weaker than hok-oo),
 $= 1$ iff at t the caller of s is alert to a disturbance but not to one that is weak among all disturbances and not to one whose source is non-terrestrial,
 or roughly: $= 1$ iff at t the caller of s is alert to a terrestrial disturbance which is serious among all disturbances.
b.
 $[\![\mathbf{hok}]\!]^{M,s,t} = 1$ iff $[\![\text{hok}]\!]^{M,s,t} = 1$ and $[\![\text{hok-oo}]\!]^{M,s,t} = 0$ (since hok-oo entails hok),
 $= 1$ iff at t the caller of s is alert to a disturbance whose source is non-terrestrial but not to a disturbance that isn’t strong among all disturbances whose source is non-terrestrial,
 or roughly: $= 1$ iff at t the caller of s is alert to a serious aerial disturbance.
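The derivation in (31) can be replayed mechanically on a toy model. The following Python sketch is only an illustration under simplifying assumptions that are ours, not part of the analysis: situations are reduced to a (source, strength) pair, a single weak/serious scale stands in for both 'weak among all disturbances' and 'not strong among non-terrestrial disturbances', and entailment is computed by brute force over the four toy situations. The names `SITUATIONS`, `LITERAL`, `entails`, `strengthened` and `extension` are hypothetical. Entailment is taken to be strict so that a call does not compete with itself.

```python
from itertools import product

# Toy situations: (source of the disturbance, strength of the disturbance).
# This two-feature space is an assumption made for illustration only.
SITUATIONS = list(product(["terrestrial", "aerial"], ["weak", "serious"]))

# Literal meanings, paraphrased from the glosses in (31).
LITERAL = {
    "krak": lambda source, strength: True,                   # any disturbance
    "hok": lambda source, strength: source == "aerial",      # non-terrestrial source
    "krak-oo": lambda source, strength: strength == "weak",  # weak disturbance
    "hok-oo": lambda source, strength: source == "aerial" and strength == "weak",
}


def entails(w1, w2):
    """w1 entails w2 iff w2 is true in every toy situation in which w1 is true."""
    return all(LITERAL[w2](*s) for s in SITUATIONS if LITERAL[w1](*s))


def strengthened(w, situation):
    """Sketch of (30): w is literally true and every strictly stronger alternative is false."""
    if not LITERAL[w](*situation):
        return False
    stronger = [a for a in LITERAL if a != w and entails(a, w) and not entails(w, a)]
    return not any(LITERAL[a](*situation) for a in stronger)


def extension(w):
    """Toy situations in which the strengthened meaning of w is true."""
    return [s for s in SITUATIONS if strengthened(w, s)]


print(extension("krak"))  # [('terrestrial', 'serious')]
print(extension("hok"))   # [('aerial', 'serious')]
```

On this toy model, strengthened krak is left with serious terrestrial disturbances and strengthened hok with serious aerial ones, in line with the rough paraphrases in (31)a and (31)b.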

In order to explain why strengthening fails to apply on Tiwai island, Schlenker et al. (2014) rely on an assumption pertaining to the environment, combined with a rule of ‘strengthening avoidance’ in case the result of strengthening is (nearly) contradictory. The environmental assumption is that there are very few serious ground threats for Campbell’s monkeys on Tiwai island (for lack of leopards). The rule of strengthening avoidance is given in (32); it just states that strengthening should be avoided if it gives rise to a contradiction.

(32)

Strengthening application and strengthening avoidance

• a.

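For a given site M, if $[\![\mathbf{w}]\!]^{M,s,t} = 0$ for every situation s and every time t (i.e. the strengthened meaning of w is contradictory at that site), one should interpret an utterance of w without strengthening.

• b.

Otherwise, a word w should in most cases be interpreted as $[\![\mathbf{w}]\!]$, its strengthened meaning.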
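The interaction between (32) and the ecology of a site can be sketched as follows. This is a minimal Python illustration under assumptions of our own: situations are again toy (source, strength) pairs, times are ignored, the 'in most cases' hedge of (32)b is dropped, and Tiwai is idealized as having no serious ground threats at all (the text only says that there are very few). The names `choose_reading`, `TAI` and `TIWAI` are hypothetical.

```python
def choose_reading(literal, strengthened, site_situations):
    """Pick the reading of a call at a site, following the logic of (32).

    literal, strengthened: predicates over situations (the two candidate readings).
    site_situations: situations assumed to arise at the site (hypothetical data).
    """
    if not any(strengthened(s) for s in site_situations):
        return literal       # (32)a: strengthening is contradictory at this site
    return strengthened      # (32)b: otherwise use the strengthened reading


# Toy readings of krak over (source, strength) situations.
def krak_literal(situation):
    return True  # general alert


def krak_strengthened(situation):
    return situation == ("terrestrial", "serious")


TAI = [("terrestrial", "serious"), ("terrestrial", "weak"),
       ("aerial", "serious"), ("aerial", "weak")]
# Idealization: no serious ground threats at all on Tiwai (no leopards).
TIWAI = [s for s in TAI if s != ("terrestrial", "serious")]

tai_krak = choose_reading(krak_literal, krak_strengthened, TAI)
tiwai_krak = choose_reading(krak_literal, krak_strengthened, TIWAI)
print(tai_krak(("aerial", "serious")))    # False: strengthened reading at Tai
print(tiwai_krak(("aerial", "serious")))  # True: general alert reading on Tiwai
```

The same lexical entry and the same two rules yield different uses on the two sites, which is the point of the innateness-friendly account.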

In this way, the apparent dialectal variation across Tai and Tiwai is explained on the basis of a system which, for all we know, might be entirely innate. In the end, the lexical entries, but also the strengthening rule and the rule of strengthening avoidance could be exactly the same on both sites, though they produce different results because strengthening avoidance is triggered by properties of the site at large.

Finally, the analysis isn’t without its technical problems. The heart of the matter is that krak-oo occurred in all sorts of environments that cannot be taken to correspond to weak threats – for instance eagle alarms. In many cases, it is preceded by more specific alert calls. But the combination of, say, hok (on its strengthened meaning) with krak-oo should be predicted to be a quasi-contradiction. The solution is to posit that each call reflects the caller’s subjective state at the very moment at which it is uttered, with the auxiliary assumption that this subjective alarm state can easily change and typically decreases gradually after a threat is perceived. For this reason, the lexical semantics must be relativized not just to a site M and a situation of utterance s, but also to a time of utterance t. In effect, this means that each call counts as a separate utterance. If we wish to compute the overall semantic contribution of an entire sequence, we must take the conjunction of calls evaluated at different times, as is done in (33)b. We simultaneously ensure in our rules that longer sequences are indicative of a higher alarm level; in effect, the time parameter in this analysis takes over the function of the alarm parameter in our initial (lexicalist) theory; specifically, the equivalent of the ‘alarm’ level at time t can be computed by considering the value of t–time(s), where time(s) is the time of the situation s (and thus the time at the start of the sequence).

(33)

Compositional Semantics (with a time parameter replacing the alarm parameter)

For any site M, situation s (whose time of occurrence is time(s)), time t, word w, and string of words S,

• a.

$[\![w]\!]^{M,s,t} = 1$ iff $I_{M,s,t}(w) = 1$ and the alarm level is at least $t - \mathrm{time}(s)$. 16

• b.

$[\![wS]\!]^{M,s,t} = 1$ iff $[\![w]\!]^{M,s,t} = 1$ and $[\![S]\!]^{M,s,t+1} = 1$.
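The recursion in (33) can be made concrete with a short Python sketch. Everything beyond the two clauses of (33) is an assumption of ours: the alarm level is stored as a fixed attribute of the situation, the site M is left implicit, the empty string is treated as trivially true, and `toy_lexicon` (with its invented `calm_after` parameter) is a hypothetical stand-in for the lexical interpretation function, chosen only to mimic a subjective alarm state that subsides over time.

```python
def call_true(lexicon, s, t, w):
    """(33)a: w is lexically true at (s, t) and the alarm level is at least t - time(s)."""
    return lexicon(s, t, w) and s["alarm"] >= t - s["time"]


def sequence_true(lexicon, s, t, calls):
    """(33)b: the first call is evaluated at t, the rest of the string at t + 1."""
    if not calls:
        return True  # assumption: the empty string is trivially true
    return call_true(lexicon, s, t, calls[0]) and sequence_true(lexicon, s, t + 1, calls[1:])


def toy_lexicon(s, t, w):
    """Hypothetical lexical entries relativized to the time of utterance."""
    calm = t >= s["calm_after"]  # invented parameter: the caller's alarm has subsided
    if w == "krak":
        return True
    if w == "hok":
        return not calm
    if w == "krak-oo":
        return calm
    raise ValueError(f"unknown call: {w}")


eagle = {"time": 0, "alarm": 5, "calm_after": 2}
print(sequence_true(toy_lexicon, eagle, 0, ["hok", "hok", "krak-oo"]))  # True
print(sequence_true(toy_lexicon, eagle, 0, ["krak-oo", "hok"]))         # False
```

Because each call is evaluated at its own time, a hok followed by a krak-oo is no longer a quasi-contradiction in this sketch, and a sequence of n calls is only true if the alarm level is at least n - 1.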

3.4 Conclusions and perspectives

Three main conclusions can be drawn from the analysis of Campbell’s calls.

(i) Complex calls

First, -oo has a phonetics, a distribution and arguably a semantics that are consistent with a suffixal analysis. If this is correct, there are primitive elements of morphological complexity in male Campbell’s calls. Is this an isolated case?

Veselinović et al. (2014) suggest that further cases can be found. Following Candiotti et al. (2012), they show that the social calls of Diana monkey females arguably include complex calls as well. First, some of the calls are produced in what Candiotti et al. (2012) term merged association; this means that there is no discernible pause between the two calls. Thus in addition to individual calls L, H, R, A, the repertoire contains 2-call units LA, HA, and RA. Now of course this could be a low-level articulatory phenomenon, but Veselinović et al. argue that this is not so. One of their key arguments is syntactic: as is the case in all of the monkey languages we will consider, there are numerous patterns of repetition; but crucially, in some cases repetition targets a complex unit, such as LA LA LA LA etc. This suggests that there is some reality to the existence of such 2-call units. They argue that A functions both as a separate root and as a suffix, with a rule governing the distribution of the root A, which is always sequence-initial. While the authors do not exhibit a semantic difference between the root A and the affix A, they make an interesting evolutionary argument: female Campbell’s monkeys arguably have calls that are similar to and probably phylogenetically related to those of female Diana monkeys. Female Campbell’s monkeys have a counterpart of the A-suffixed LA call, as well as a counterpart of the L call, but they do not have the A call as an independent root – which might suggest that treating the two as different objects within Diana monkeys is not unreasonable. As we will see in Section 4, pyow-hack sequences have been analyzed by Arnold and Zuberbühler as complex calls of sorts, although there are alternative theoretical possibilities. And as we will see in Section 5, the snort-roar sequences of Black-and-White Colobus monkeys might also have to be analyzed as complex calls, with a non-compositional semantics. 
But these are admittedly controversial cases, and thus -oo-modified calls in male Campbell’s monkeys and the Candiotti/Veselinović cases in female Diana monkeys might provide stronger arguments for the existence of complex calls.

(ii) Variation

While krak is used differently in the Tai forest and on Tiwai island, positing that there is a dialectal difference between the two sites is just one of at least two possible theories. The existence of a bona fide dialectal difference would go against the claim – often taken as a null hypothesis – that monkey calls are entirely innate (see also Wheeler and Fischer 2012). We have nothing in principle against the possibility that their use can be in part learned, but in view of our analysis this conclusion is premature: a sophisticated mechanism of pragmatic enrichment might account for the appearance of dialectal variation without postulating that different meanings are acquired on the two sites (though pragmatic rules are applied differently because of the general principle of contradiction avoidance, which interacts with the ecology of the sites).

(iii) Semantics vs. Pragmatics

An essential theme in the debate between the two accounts of Campbell’s calls is the precise division of labor between semantics and pragmatics. The second theory made heavy use of the Informativity Principle, which produced interesting results because krak was enriched by competition with both hok and krak-oo. While Informativity will have less striking results in our other case studies, pragmatic principles will prove to be key to obtaining empirically adequate theories: quite generally, we will find that monkey languages include a general alert call and one or several more specific ones, notably raptor-related ones; but crucially the general alert is almost never given at the beginning of an eagle-triggered discourse. The Informativity Principle offers a very natural explanation of why this is so.

A further remark should be added from a comparative perspective. Initially, the Tai data suggested that krak is a ground predator call while hok is an aerial predator call. Upon closer inspection, theory-internal considerations relating to krak-oo, as well as the data from Tiwai, suggest that krak might have a general alert meaning in the end. This dovetails nicely with typological considerations arising from comparative studies of primate calls. As Wheeler and Fischer (2012) note, “across species it tends to be the call associated with terrestrial predators that is given in other contexts, whereas the call associated with aerial predators tends to be context-specific and meet the criteria of functional reference” (Wheeler and Fischer 2012: 200). While this just seems to be a tendency, it is interesting to note that the Campbell pattern, which might initially have appeared as an exception, confirms the generalization in the end.

4 Putty-nosed monkeys: non-compositionality or pragmatic enrichment?

For lack of understanding of the syntax, our analysis of Campbell’s sequences almost entirely took place at the call level. By contrast, male Putty-nosed monkey calls display simple syntactic patterns, as shown by Arnold and Zuberbühler (2006a, 2006b, 2008, 2012, 2013). So we will now be in a position to ask more precise questions about the syntax/semantics interface in a monkey language.

4.1 The puzzle of pyow-hack sequences

In a nutshell, the main theoretical problem is as follows. Male putty-nosed monkeys have two main alert calls, pyows (=P) and hacks (=H). 17 While pyows have a broad distribution suggestive of a general call, hacks are often indicative of eagles. Arnold and Zuberbühler showed that putty-nosed monkeys sometimes produce distinct pyow-hack sequences made of a small number of pyows followed by a small number of hacks (P+H+); and these were shown both in quantitative observational data and in field experiments to be predictive of group movement. Arnold and Zuberbühler claimed that pyow-hack sequences are syntactically combinatorial but not semantically compositional because the meaning of the sequences can’t be derived from the meanings of their component parts. From the perspective of our earlier discussions, it would make equal sense to say that pyow-hack sequences are phonologically complex but lexically simple; or that they are morphologically complex but receive a meaning at the whole word level. The reason Arnold and Zuberbühler do not use this terminology and speak instead of syntactically complex sequences interpreted as idioms is that pyow-hack sequences are relatively slow, with pauses between calls; and that they are not fully stereotyped: they involve a varying number of pyows followed by a varying number of hacks.

Of course this very observation suggests that the whole-sequence analysis of the meaning of pyow-hack sequences might be incorrect to begin with. In this section, we briefly compare two theories of this phenomenon (see Schlenker et al. 2016 for further details). One formalizes and modifies the non-compositional theory. The other presents a semantically compositional alternative, based on weak meanings for pyow (‘general alert’) and hack (roughly, ‘non-ground movement’), combined with pragmatic principles of competition. As in our discussion of Campbell’s monkeys, we make use of an ‘Informativity Principle’, whereby more informative sequences are preferred to less informative ones. But a crucial innovation is an ‘Urgency Principle’ which mandates that calls that provide information about the nature/location of a threat must come before calls that don’t. Semantically, pyow-hack sequences are compatible with any kind of situation involving (moving) aerial predators or (arboreal) movement of the monkeys themselves. But in the former situation, hacks provide information about the location of a threat, and hence should appear at the beginning of sequences. As a result, pyow-hack sequences can only be used for non-risk-related situations involving movement, hence a possible inference that they (often) involve group movement. While it is too early to adjudicate this debate, we will argue that a formal analysis of the competing theories should help produce new predictions to be tested in future field studies.

Some of the main generalizations are summarized in (34). Eagle responses are predominantly of two types: pure hack discourses, made only of hacks; and transitional discourses that start with hack sentences and at some point transition to series of pyow sentences. Leopard-related discourses are predominantly made of pyows. But they may also include sentences (illustrated in (35)) with a small number of pyows followed by a small number of hacks – called pyow-hack sequences in the literature (with our terminology, they are pyow-hack sentences, although in this case we will often use the more established terminology of pyow-hack sequences). A few instances of this pattern are also found in Eagle-related contexts.

(34)

Discourse and sentence types

Notation: P represents a pyow, H a hack. X+ refers to a repetition of call X and _ represents a pause.

a. Pyow discourses: P+_…_ P+ (e.g. leopard contexts)

b. Hack discourses: H+_…_H+ (e.g. eagle contexts)

c. Transitional discourses: H+_…_H+_P+_…_P+ (e.g. eagle contexts)

(35)

Pyow-hack sequences: …P+H+… (trigger group movement)

(these are sentences that include a small number of P’s and a small number of H’s, found within various discourse types)

Arnold and Zuberbühler convincingly established by observational data as well as field experiments that pyow-hack sequences are predictive of group movement. In particular, they established four important results, listed in (36).

(36)

Properties of pyow-hack sequences (Arnold and Zuberbühler 2006a, 2006b, 2008, 2012, 2013)

• a.

Natural pyow-hack sequences induce group movement far more than either pyow or hack sequences.

• b.

The same result extends (in weakened form) to synthetic pyow-hack sequences, put together from pure pyow and pure hack sequences.

• c.

Keeping the length constant, the precise composition of pyow-hack sequences does not seem to affect the distance travelled. In particular, comparable behavioral results were obtained with playbacks of PPPHHH, PHHHHH, and PPPPPH.

• d.

In naturally occurring discourses containing pyow-hack sequences, there were indications of a positive relationship between the number of ‘pyows’ and/or the total number of calls in a pyow-hack sequence and the distance travelled by the group.

4.2 A non-compositional analysis

While Arnold and Zuberbühler’s positions on the meaning of individual pyows and hacks have evolved over the years, they have consistently maintained that pyow-hack sequences are syntactically combinatorial without being semantically compositional, and they compared them to “idiomatic phrases such as kick the bucket, in which the meaning of the expression is not derived from the meaning of its constituent words but must be learned as a convention” (Arnold and Zuberbühler 2012). As mentioned, it would make good conceptual sense from Arnold and Zuberbühler’s perspective to treat pyow-hack sequences as being phonologically but not morphologically or syntactically composed of individual pyows and hacks – just like irate is made of syllables found in I and rate without thereby being composed of these words. But two properties of pyow-hack sequences are surprising for this analysis: first, they come in many non-stereotyped forms, as mentioned in (36)c; second, their time course is slow, with long pauses between calls. A morphological analysis would be faced with the same difficulties, hence Arnold and Zuberbühler talk of ‘syntactic’ combination in this case. But the key point is that the semantics treats pyow-hack sequences as unanalyzed units, as sketched in (37)-(38).

(37)

Sentential syntax

Putty-nosed sentences are generated by the following rules: P+, H+, P+H+

(38)

Non-compositional analysis of pyow-hack sequences (initial attempt)

• a.

〚P〛= 1 iff there is an alert.

• b.

〚H〛= 1 iff there is an aerial predator.

• c.

〚PH〛=〚PPH〛=〚PHH〛= … =〚PPPPPPH〛=〚PHHHHHH〛= 1 iff the group is moving.

• d.

Sentence-internal composition rule

If w is a word and S is a string of words, 〚wS〛= 1 iff 〚w〛=〚S〛= 1.

In Schlenker et al. (2016), it was noted that the non-compositional analysis can be improved upon in four respects.

• (i)

As stated, the rule in (38)c is more disjunctive than need be – it should more explicitly mention all sequences of the form $P^m H^n$ for $m, n \geq 1$.

• (ii)

The analysis in (38) gives rise to undesirable semantic ambiguities – e.g. a sentence PH can be treated as a pyow-hack sequence, interpreted by (38)c, or as the concatenation of P and H, interpreted by the other rules in (38).

• (iii)

The analysis fails to explain why we almost never find discourses that start with a series of pyows in response to an eagle stimulus: since pyows are general alert calls, they should be applicable in that context too.

• (iv)

The analysis fails to predict a meaning difference for sentences that are constructed on the same pattern – e.g. P+H+ – but contain a different total number of calls. However, the number of calls sometimes matters, as mentioned in (36)d.

Schlenker et al. (2016) propose a reformulation of the non-compositional analysis, one on which all repetitions (not just in pyow-hack sequences) are semantically ignored except for purposes of computing the general level of alarm – as is shown in (39).

(39)

Non-compositional analysis of pyow-hack sequences (revised attempt)

For any $n \geq 1$, $k \geq 1$ and $k < n$,

• a.

for any sentence S of the form $S = P^n$, $[\![S]\!] = 1$ iff there is an alert and the alarm level is at least $n$.

• b.

for any sentence S of the form $S = H^n$, $[\![S]\!] = 1$ iff there is a serious raptor-related alert and the alarm level is at least $n$.

• c.

for any sentence S of the form $S = P^k H^{n-k}$, $[\![S]\!] = 1$ iff the group is moving and the alarm level is at least $n$.
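The revised rules in (39) amount to a function from whole sentences to a content plus a minimal alarm level. The Python sketch below is a hypothetical illustration: the function name `interpret_sentence`, the encoding of sentences as strings over P and H, and the return format are our own choices, and the contents are simply the informal paraphrases of (39).

```python
import re


def interpret_sentence(sentence):
    """Return (content, minimal alarm level) for a sentence, following (39).

    Sentences are strings over 'P' (pyow) and 'H' (hack). Strings that are not
    of the form P+, H+ or P+H+ are not generated by (37) and yield None.
    """
    n = len(sentence)
    if re.fullmatch(r"P+", sentence):
        return ("alert", n)                          # (39)a
    if re.fullmatch(r"H+", sentence):
        return ("serious raptor-related alert", n)   # (39)b
    if re.fullmatch(r"P+H+", sentence):
        return ("group is moving", n)                # (39)c
    return None


print(interpret_sentence("PPPHHH"))  # ('group is moving', 6)
print(interpret_sentence("PHHHHH"))  # ('group is moving', 6)
print(interpret_sentence("PH"))      # ('group is moving', 2)
```

Each sentence receives exactly one interpretation, the internal composition of a pyow-hack sentence of fixed length is irrelevant, and the total number of calls is reflected in the alarm level.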

No rule is disjunctive; since entire sentences are interpreted, no ambiguities arise; and sentence length does provide information about the alarm level. This takes care of Problems (i), (ii) and (iv). To address Problem (iii), the Informativity Principle in (12) can once again be appealed to, combined with the natural definition of alternatives in (40).

(40)

Alternatives

Any sentence S’ is an alternative to a sentence S if S and S’ are both produced by the syntactic rules of the language, and S’ can be obtained from S by replacing any number of P’s with (the same number of) H’s and/or by replacing any number of H’s with P’s.
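The definition in (40) can be spelled out by brute force. In the Python sketch below, which is only an illustration, sentences are strings over P and H, the grammar is the one in (37), and a sentence counts as one of its own alternatives (the case in which no call is replaced). The names `GRAMMAR` and `alternatives` are hypothetical.

```python
import re
from itertools import product

# Sentence types generated by (37): P+, H+ and P+H+.
GRAMMAR = re.compile(r"P+|H+|P+H+")


def alternatives(sentence, grammar=GRAMMAR):
    """Sketch of (40): same-length grammatical strings obtained by swapping P's and H's."""
    if not grammar.fullmatch(sentence):
        raise ValueError(f"not a sentence of the language: {sentence!r}")
    # Replacing any number of P's by H's and of H's by P's can produce any
    # string over {P, H} of the same length; the grammar then filters them.
    candidates = ("".join(calls) for calls in product("PH", repeat=len(sentence)))
    return sorted(a for a in candidates if grammar.fullmatch(a))


print(alternatives("PPH"))  # ['HHH', 'PHH', 'PPH', 'PPP']
```

Since replacement preserves length, alternatives always carry the same alarm level as the original sentence under (39), so competition only concerns the content of the alert.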

On this view, then, pyows don’t occur at the beginning of eagle-triggered discourses because in such situations hacks are more informative. But it remains to explain why pyows can still be found towards the end of eagle-related discourses. Schlenker et al. (2016) posit that the level of alarm typically decays over time, as stated in (41).

(41)

Alarm Decay

The seriousness of an alarm usually decays over time.

4.3 A compositional alternative

Schlenker et al. (2016) argue that a compositional analysis of pyow-hack sequences can be developed, but requires heavy use of pragmatic principles. 18 Their analysis is in three steps.

First, they take hacks to have weak meanings, involving non-ground movement. Second, they take the meaning of some sentences to be enriched by two pragmatic mechanisms instead of just one. As before, the Informativity Principle explains why P+ sentences cannot appear at the beginning of eagle-related discourses. But in addition, they posit an Urgency Principle according to which, within any sentence, calls that provide information about the location of a threat should come before those that don’t. In eagle-related discourses, this will mandate that hacks should come first. A pyow-hack sequence will carry the literal (=semantic) meaning that there is some non-ground movement, but it will also trigger the inference that this is not a threat-related one, for if so the hacks should have come first. Third, they assume that world knowledge will yield the further inference that an alert that involves non-ground movement but no threat has a good chance of being related to group movement.

The sentential syntax and semantics are given in (42)-(43). The syntax now allows for sentences of type H+P+ because these serve as alternatives to P+H+ when the Urgency Principle is applied to the latter. This is a technical assumption, but it does not appear to be entirely wrong-headed: while H+P+ sentences are less common than P+H+, they might exist nonetheless. As for the semantics, it makes use of an alarm parameter, which is handled in the same way as in our analysis of Campbell’s calls.

(42)

Sentential syntax (revised)

Putty-nosed sentences are generated by the following rules:

P+, H+, P+H+, H+P+

(43)

Sentential semantics (compositional – with an alarm parameter)

For any alarm parameter $a \geq 0$,

• a.

$[\![P]\!]^{a} = 1$ iff there is an alert and the alarm level is at least $a$.

• b.

$[\![H]\!]^{a} = 1$ iff there is a serious non-ground movement-related alert and the alarm level is at least $a$.

• c.

If w is any call and S is any sequence, $[\![wS]\!]^{a} = 1$ iff $[\![w]\!]^{a} = 1$ and $[\![S]\!]^{a+1} = 1$.
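The compositional rules in (43) can be sketched as a small recursive evaluator in Python. The encoding is ours and purely illustrative: a situation is a dictionary with the hypothetical keys `alert`, `serious_nonground_movement` and `alarm`, sentences are strings over P and H, the empty string is treated as trivially true, and evaluation is assumed to start with the alarm parameter at 0.

```python
def call_true(call, situation, a):
    """(43)a-b: literal meaning of a single call relative to alarm parameter a."""
    alarmed_enough = situation["alarm"] >= a
    if call == "P":
        return situation["alert"] and alarmed_enough
    if call == "H":
        return (situation["alert"]
                and situation["serious_nonground_movement"]
                and alarmed_enough)
    raise ValueError(f"unknown call: {call}")


def sentence_true(sentence, situation, a=0):
    """(43)c: the first call is evaluated at a, the remainder at a + 1."""
    if not sentence:
        return True  # assumption: the empty string is trivially true
    return (call_true(sentence[0], situation, a)
            and sentence_true(sentence[1:], situation, a + 1))


aerial = {"alert": True, "serious_nonground_movement": True, "alarm": 2}
ground = {"alert": True, "serious_nonground_movement": False, "alarm": 2}

print(sentence_true("PPH", aerial))   # True
print(sentence_true("PPHH", aerial))  # False: the fourth call needs alarm level 3
print(sentence_true("PH", ground))    # False: H is not applicable
```

In this encoding H is true only where P is true, so H is at least as informative as P, and any sentence containing an H literally conveys that there is a serious non-ground movement-related alert, whatever the order of the calls.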

It bears mentioning that ‘movement-related’ in (43)b is vague and should probably be interpreted in terms of ‘impending movement’: this is useful for cases in which H is in the end indicative of group movement, but also for eagle-related cases, which need not involve a moving eagle 19 but might rather indicate an impending eagle attack. Note also that H is given a meaning of serious (non-ground movement-related) alert in order to explain why H outcompetes P at the beginning of eagle-related discourses but not at the end. The key assumption is that the interaction of Alarm Decay as in (41) and of the Informativity Principle is responsible for the difference. The competition with H-sentences guarantees that P-sentences come to trigger the inference that one is not faced with a serious non-ground movement-related alert (or else H would have been used). This is still compatible with the situation that arises at the end of eagle-related discourses, where the level of alarm has presumably diminished enough that H stops being applicable.

Now the key innovation of the analysis is the Urgency Principle in (44), combined with the assumption that the Informativity Principle takes into account (or ‘is fed by’) the Urgency Principle, as is stated in (45).

(44)

Urgency Principle

If a sentence S is triggered by a threat and contains calls that convey information about its nature or location, no call that conveys such information should be preceded by any call that doesn’t. As a result, if H’s provide information about a threat, they cannot follow any P’s.
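The ordering constraint in (44) can be checked with a few lines of Python. This is a hedged sketch: the function name `violates_urgency` and its parameters are hypothetical, and which calls count as conveying information about the nature or location of the threat is passed in explicitly, since on the present analysis this depends on the situation.

```python
def violates_urgency(sentence, informative_calls, threat_triggered):
    """Check a sentence (a string over 'P' and 'H') against the Urgency Principle in (44).

    informative_calls: calls that convey information about the nature or
        location of the threat in the situation at hand (e.g. {"H"}).
    threat_triggered: whether the sentence is triggered by a threat; if not,
        the principle does not apply.
    """
    if not threat_triggered:
        return False
    seen_uninformative = False
    for call in sentence:
        if call in informative_calls:
            if seen_uninformative:
                return True  # an informative call is preceded by an uninformative one
        else:
            seen_uninformative = True
    return False


print(violates_urgency("PPHH", {"H"}, threat_triggered=True))   # True
print(violates_urgency("HHPP", {"H"}, threat_triggered=True))   # False
print(violates_urgency("PPHH", {"H"}, threat_triggered=False))  # False
```

Under these assumptions, P+H+ is the only sentence type that can violate the principle, which is why it is the only one whose meaning gets enriched with the inference that no threat is involved.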

(45)

Informativity Principle (revised)

The Informativity Principle in (12) takes into account the information conveyed by the literal (=semantic) meaning of sentences, combined with the effects of the Urgency Principle.

The derivation of the use of pyow-hack sequences can then be given as in (46). The Urgency Principle guarantees that pyow-hack sequences won’t be used in predator-related situations, as stated in (46)a, and assumptions about world knowledge in (46)c can ensure that these sequences are mostly used in cases of group movement. Finally, since after application of the Urgency Principle H+ and H+P+ turn out to be less informative than P+H+ sequences, we can derive the result that in group movement situations only pyow-hack sequences can be used.

(46)

Derivation of the use of pyow-hack sequences

• a.

P+H+ sentences can be in violation of the Urgency Principle (i.e., in threat situations), and hence their literal meaning is enriched by it. As a result, they are only applicable in situations in which there is a serious non-ground movement-related alert but not one which is due to a threat.

• b.

Unlike P+H+ sentences, H+ and H+P+ cannot be in violation of the Urgency Principle, hence their meaning is not enriched by it. They are thus strictly less informative than P+H+ sentences, and the Informativity Principle guarantees that in situations in which there is a serious non-ground movement-related alert but not one which is due to a threat, only P+H+ can be used.

• c.

Assumption about World Knowledge: The most common situations in which there is a serious non-ground movement-related alert but not one which is due to a threat involve group movement.

The compositional analysis is arguably more explanatory than the non-compositional one, which simply stipulates the meaning of pyow-hack sequences. But the compositional analysis could also make different predictions, since it yields a much weaker meaning for pyow-hack sequences than ‘group movement’: world knowledge is needed to ‘bridge the gap’. Thus the present theory leads one to expect that pyow-hack sequences might be used for some non-threatening events that involve non-ground movement (but are ‘serious’ enough to license the use of the H call). One might for instance ask whether group encounters might trigger the use of pyow-hack sequences – if indeed such group encounters are not seen as threats.

4.4 A non-referential variant of the compositional alternative

Arnold and Zuberbühler (2013) significantly revise their earlier findings on the use of hacks; while they still treat pyows as being very general calls, they emphasize that hacks can be used in a variety of ‘high arousal’ contexts that do not involve aerial predators. We can formalize a version of their insights as in (47) (where the boldfaced parts highlight the changes with respect to (43); note that we have not modified our analysis of P, whose meaning is as general as before).

(47)

A non-referential semantics (compositional – with an alarm parameter)


For any alarm parameter $a \geq 0$,

• a.

$[\![P]\!]^{a} = 1$ iff there is an alert and the alarm level is at least $a$.

• b.

$[\![H]\!]^{a} = 1$ iff there is an alert causing high arousal and the alarm level is at least $a$.

• c.

If w is any call and S is any sequence, $[\![wS]\!]^{a} = 1$ iff $[\![w]\!]^{a} = 1$ and $[\![S]\!]^{a+1} = 1$.

It is immediate that, as before, H is strictly more informative than P, and thus the Informativity Principle could in principle apply to the new system in the same way as to the old one. On some assumptions, some of the same predictions could be derived as well. Formally, it might help to think of the non-referential analysis as replacing notions of ‘high movement’ in physical space with ‘high degree of emotional movement’ in internal cognition. On the additional assumption that ‘high arousal’ is often caused by things that are high in physical space – notably eagles and tree falls – we will get partly similar results to those we had before.

This exercise can be completed in two steps: first, we must connect ‘high arousal’ to environmental conditions that can be assessed in observation or in field experiments; second, we must revise the statement of the Urgency Principle (as noted, the Informativity Principle will apply in the same way to the new and to the old system).

(i) Assumptions connecting high arousal to the environment

We will assume that high arousal is caused by eagles but not by leopards, as the latter are less dangerous. In addition, high arousal might be caused by events in the monkeys’ immediate environment, which is usually arboreal. As a result, there will generally be a close correspondence between events that would have licensed H on the old and on the new theory.

(ii) Revision of the Urgency Principle

In our compositional (and referential) theory, the Urgency Principle prescribed that in a sentence triggered by a threat, calls that provide information about the nature/location of the threat must come before those that don’t. This idea won’t be applicable to the new, non-referential theory. But a different intuition might yield the same result: in case a hack is produced as a reaction to a threat, it is an emotive reaction and should thus come before other calls – and hence before pyows. This is stated in (48):

(48)

Urgency Principle (non-referential version)

If a sentence S is triggered by a threat, arousal-based calls in it must come before non-arousal based calls.

The intuition is that in a pyow-hack sequence, which is indicative of group movement, the speaker might be in a high arousal state (and thus produce a hack) because of his intention to move and/or because of an incoming event. By contrast, in eagle contexts a hack is produced as a reaction to a scary event and thus has to be produced first.

With these two assumptions, we can reproduce in an arousal-based system most of the results of our earlier compositional analysis, though of course there is now the possibility that hacks will be triggered by ground events or by events that don’t involve movement (though if our assumptions are correct this will be rare). While it is too early to fully adjudicate among these analyses, this exercise in theory comparison highlights an important methodological point: the precise semantic content of calls is often subject to much uncertainty, as different contents may interact with world knowledge and the environment to yield the same behavioral consequences. But our analyses are also based on more abstract notions, such as the entailment and competition relations among calls; and these may remain constant across theories that don’t quite posit the same contents at the lexical level. This is the reason we have de-emphasized the issue of the ‘referential’ nature of calls, and highlighted instead the division of labor between semantics and pragmatics and the role of the Informativity and of the Urgency Principle.

5 Colobus monkeys: complex calls?

In this section, we survey some interesting patterns in the calls of Black-and-White Colobus monkeys (‘Guerezas’) – and we will add some remarks on King Colobus monkeys (‘Polykomos’), which are closely related and have very similar calling patterns. While our data are preliminary, these patterns pose an interesting theoretical problem. Two call types, snort and roar, 20 can appear either singly, or together in sentences of the form snort-roar+, without pause (‘snort-roar sequences’, for short). But when we consider the use of these calls in various contexts, it seems that snort-roar sequences have the broadest distribution.

This poses a dilemma: in the semantics, if concatenation is interpreted as conjunction, snort-roar+ should be logically stronger than its component parts, and thus it should have a narrower distribution than single snorts and pure roar+ sequences. This is not what we find; in particular, snort-roar sequences have a much broader distribution than individual snorts. Given the general framework we have been adopting, this leaves three kinds of options open.

• (i)

One possibility is that the meaning of snort-roar sequences is not derived from the meaning of their component parts. Either snort-roar sequences should be taken as a different word from snort and roar – so that the issue of compositionality does not arise; or the complex unit should be interpreted non-compositionally, just as was proposed for pyow-hack sequences by Arnold and Zuberbühler. Given that there is no long pause between snort and roar in these cases, the first option would make excellent sense.

• (ii)

A second possibility is to derive the meaning of snort-roar sequences from the meaning of their component parts, but by a mechanism which is different from conjunction. Such an analysis bears similarity to our initial analysis of the -oo suffix of Campbell’s monkeys, which broadened the call meaning. For lack of constraints on such a mechanism, we will not explore this possibility further for the moment, although it might become a live option in the future.

• (iii)

A third option is to take the literal meaning of individual snorts and roars to be extremely weak, but to be enriched by pragmatic rules. This might allow us to preserve a mechanism of conjunctive combination for snort-roar sequences, while explaining why on the surface the distribution of individual snorts and roars is more constrained than that of snort-roar sequences. As we will see, it is difficult to get such an analysis to work, and in the end we will not have a good alternative to the treatment of snort-roar sequences as non-compositional complex calls.

5.1.1 Data

Let us concentrate for the moment on Guereza Colobus monkeys. We have data obtained in field experiments from two sites, Kaniyo Pabidi and Sonso. They differ primarily in that leopards are present in Kaniyo Pabidi but not in Sonso. However, this needn’t affect our discussions too much, for two reasons. First, we have no evidence of a linguistic or behavioral difference between the two sites (this might of course be because our data are not sufficiently rich, or because we haven’t analyzed them properly). Second, Schel and Zuberbühler (2009) argued that even in Sonso Colobus monkeys display an innate and appropriate reaction to leopard stimuli: although the leopard-naïve monkeys at Sonso showed slightly more ‘exploration’ behavior (i. e. they approached the leopard stimuli more often than the monkeys at Kaniyo Pabidi), both populations exhibited acoustically similar vocal anti-predator behavior in the presence of leopard stimuli – which might dampen the effect of any cognitive difference across the two sites.

Raw data for Kaniyo Pabidi are given in (49). Each box represents a discourse, i. e. a sequence produced as a reaction to a stimulus. Within each box, each line corresponds to a separate sentence, i. e. a sequence of calls separated by longer-than-normal pauses. Snorts are represented as black dots, and roars as red + signs. As can be seen, there are multiple series of roars, whereas snorts either appear singly, or at the beginning of snort-roar sequences. This graph is reproduced in a larger format in the Supplementary Materials (in (91)), together with entirely similar-looking Guereza Colobus data from Sonso (in (92)) as well as King Colobus data (in (93)).

(49)

Guereza data from Kaniyo Pabidi

Guereza Colobus monkey data from Kaniyo Pabidi. In each case, each box represents the response to one variant of a specific stimulus, as described at the top of the columns. The first line in the column header represents the predator type (e.g., Eagle, Leopard) and the second line the way it was induced, e.g., through a playback of its “shrieks” or “growls”, or through the playback of calls from Diana monkeys or Black and White Colobus monkeys (bwC) as produced in response to such a predator’s acoustic manifestation. For conciseness and legibility, no more than 10 groups (or sentences) of calls are represented, and no more than 15 calls within each of these groups/sentences are represented. Red + signs represent roars, black dots represent snorts.

5.1.2 Syntax

We will start with some generalizations about the syntax. There are just three types of sentences: individual snorts; snort-roar sequences, made of a single snort and a series of roars; and sentences made of roars only. No clear patterns emerge at the level of discourse syntax, except that when single snorts (i. e. snorts that don’t appear in a snort-roar sentence) appear, they do so at the beginning of discourses. The main generalizations are stated in (50), and can be encoded by the rules in (51).

(50)

Syntactic generalizations

• a.

Sentences are of three types: s, sr+, r+

• b.

In discourses, if individual snorts appear, they usually do so at the beginning.

(51)

Syntax

We write _ for intersentential pause.

• a.

Sentential syntax: sentences are of three types, which we divide into two categories:

• S=s

• S’=sr+, r+

• b.

Discourse syntax: discourses are of the form (S_)*(S’_)*, where * is the Kleene star.

The sentential syntax is self-explanatory. The discourse syntax just encodes the fact that if single snorts appear, they do so at the beginning of discourses.
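As a minimal sketch, the rules in (51) can be checked mechanically with a regular expression. The encoding is an illustrative assumption rather than part of the analysis: each call is written as one character (`s` for a snort, `r` for a roar), `_` marks an intersentential pause as in (51), and the function name `is_well_formed_discourse` is hypothetical.

```python
import re

# Sentence types from (51)a: S = s ; S' = sr+ or r+ (one or more roars).
# Discourse syntax from (51)b: (S_)*(S'_)*, with "_" as the pause symbol.
# "s?r+" covers both S' shapes: an optional snort followed by roars.
DISCOURSE = re.compile(r"(?:s_)*(?:s?r+_)*")

def is_well_formed_discourse(discourse):
    """Return True if the whole string has the form (S_)*(S'_)*."""
    return DISCOURSE.fullmatch(discourse) is not None

# Two single snorts, then a snort-roar sentence, then a roar-only sentence.
print(is_well_formed_discourse("s_s_srrr_rr_"))
# A single snort after a snort-roar sentence is excluded by (51)b.
print(is_well_formed_discourse("srr_s_"))
```

Under this encoding the empty string also counts as a (trivial) discourse, since both starred groups may be empty.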

As was the case in our other descriptions of monkey syntax, we remain agnostic about the ultimate explanation of the syntactic restrictions we find. In particular, it might be for articulatory reasons that snorts do not appear in series without pauses, and only appear at the beginning of discourses – possibly because they function like sneezes, which might require a fairly long recovery phase before the same sound can be produced again. Importantly, however, in snort-roar sentences, snorts are immediately followed by roars, possibly because the latter are produced very differently and the recovery phase might for this reason be much shorter. Since the acoustic coherence of snort-roar sentences will matter in our discussions, we provide quantitative data in (52); the crucial data point is in (52)b, where we see that when a snort is followed by a roar, the pause between them is on average extremely brief – which makes it plausible that they could be analyzed as a single unit. 21

(52)

• a.

Average time from snort to snort: 14,072ms

• b.

Average time from snort to roar: 95ms

• c.

Average time from roar to snort: 3,085ms

• d.

Average time from roar to roar: 1,182ms

5.1.3 The semantic puzzle

Let us come to the semantics. The generalizations are not as sharp as one might want, but they are nonetheless suggestive:

(53)

Semantic generalizations

• a.

Individual snorts are only given to terrestrial animals, whether predators or not.

• b.

Snort-roar sequences appear in every context.

• c.

Sentences made of roars only appear primarily – but not exclusively – in aerial predator contexts. They are strongly indicative of aerial predators if they appear at the beginning of discourses.

As mentioned at the outset, if concatenation is interpreted as conjunction, snort-roar sentences should have a stronger meaning than individual snorts and pure roar-sentences – and thus they should appear in a subset of the environments in which snorts appear, and of those in which roars appear; this is the opposite of what we find. We explore two main solutions: one is to take snort-roar sentences to have a non-compositional meaning, just like pyow-hack sequences in Arnold and Zuberbühler’s analysis; the other solution is based on very weak meanings combined with rules of pragmatic strengthening. In a nutshell, snort competes with snort-roar+ and is thus strengthened to: snort and not (snort-)roar+; and by the same logic roar+ is strengthened to roar+ and not snort(-roar+). As a result, snort-roar+ is not stronger than the strengthened meanings snort and roar+, which might solve the dilemma we started out with. But as we will see, the implementation of this general idea is unsatisfactory at this point, although it could be improved in future research. (For simplicity, we disregard the issue of sentence length and alarm parameters in the present discussion, although these should presumably play in the end the same kind of role as in our other monkey analyses.)

5.2 A complex call analysis of Colobus sequences

As mentioned before, it would have been simplest for Arnold and Zuberbühler to treat pyow-hack sequences as being phonologically but not syntactically made of pyows and hacks, as such an analysis would not even have raised the possibility that their meanings were compositionally derived. But the non-stereotyped character of pyow-hack sequences and, more importantly, the relatively long pauses found between their component parts, made this analysis rather implausible.

The situation is partly similar and partly different with Colobus snort-roar sentences. On the one hand, they too come in different varieties, since the number of roars present in snort-roar sentences is variable (although the snort part is fixed). On the other hand, snort-roar sentences are produced almost without pause between the various elements, which makes it rather more plausible that they receive a meaning as wholes.

Taking a hint from our non-compositional treatment of pyow-hack sequences in (39), we posit as a first approximation the lexical entries in (54).

(54)

Lexicalist semantics (1st try)

(As before, r+ represents any non-empty string of roars.)

• a.

〚sr+〛=1 iff there is an alert.

• b.

〚r+〛=1 iff there is a serious (or: a non-ground) alert.

• c.

〚s〛=1 iff there is a serious ground alert.

The key here is that sr+ is, by brute force, given the weakest meaning – and it is thus unsurprising that sr+ can appear in all contexts. An individual s is given the plausible meaning of ground-related alerts (we wouldn’t want to say that they are ground predator-related, since numerous single snorts are found in (moving) cow-related contexts [see (92)]). Importantly, r+ could be given the meaning of non-ground or of serious alert. The distinction is not easy to draw since non-ground predators are probably the most dangerous predators as well. Importantly, we should not posit that r+ is reserved for aerial predators, as we find quite a few r+ in chimpanzee-related contexts; but since chimpanzees are presumably both dangerous and (often) non-ground predators, this fact alone does not suffice to decide between the two choices.

As was the case in our earlier discussions, we must still explain why in most cases we find specific rather than non-specific calls at the beginning of predator-related discourses; for instance, few leopard-related and eagle-related discourses start with sr+ rather than s or r+. These patterns can be captured when two assumptions are made.

(i) First, s, r+ and sr+ (viewed as complete propositional elements) compete with each other. While it’s not clear that there should be an entailment relation between s and r+ (in our initial analysis, there isn’t), sr+ should come out as the least informative of the competitors, as represented in (55). The Informativity Principle in (12) will thus predict that sr+ can only occur when s and r+ are inapplicable.

(55)

Informativity scale according to (54)

(ii) If we stopped here, we would predict that, whenever the Informativity Principle is applied, sr+ fails to occur in predator-related situations. This seems too strong. But we can once again make use of our assumption about Alarm Decay in (41): both s and r+ are presumably specified for serious alerts, and consequently when the alert becomes less serious, only the default call sr+ can be used.

Note that we might refine our analysis a bit by positing that s is more informative than r+. If so, it would be by competition with s that r+ gets its ‘non-ground predator’ meaning, as is represented in (56)-(57).

(56)

Lexicalist semantics (2nd try)

• a.

〚sr+〛=1 iff there is an alert.

• b.

〚r+〛=1 iff there is a serious alert.

• c.

〚s〛=1 iff there is a serious alert and it is a ground alert.

(57)

Informativity scale according to (56)
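The informativity relations induced by (56) can be checked by brute force over a small set of contexts. The sketch below is an illustration under stated assumptions: contexts are modelled as hypothetical (alert, serious, ground) triples, informativity is modelled as asymmetric entailment over those contexts, and `preferred_calls` paraphrases the Informativity Principle as "use a true call that no other true call is strictly more informative than". None of the names comes from the analysis itself.

```python
from itertools import product

# Toy contexts as (alert, serious, ground) triples; an illustrative assumption.
CONTEXTS = list(product([False, True], repeat=3))

# Literal meanings following (56).
MEANINGS = {
    "sr+": lambda alert, serious, ground: alert,
    "r+": lambda alert, serious, ground: alert and serious,
    "s": lambda alert, serious, ground: alert and serious and ground,
}

def entails(a, b):
    """a entails b if b is true in every context in which a is true."""
    return all(MEANINGS[b](*c) for c in CONTEXTS if MEANINGS[a](*c))

def strictly_more_informative(a, b):
    """a is strictly more informative than b: asymmetric entailment."""
    return entails(a, b) and not entails(b, a)

def preferred_calls(context):
    """True calls that are not dominated by a more informative true call."""
    true_calls = [c for c in MEANINGS if MEANINGS[c](*context)]
    return [c for c in true_calls
            if not any(strictly_more_informative(o, c) for o in true_calls)]

print(strictly_more_informative("s", "r+"))    # s above r+ on the scale
print(strictly_more_informative("r+", "sr+"))  # r+ above sr+ on the scale
print(preferred_calls((True, False, True)))    # non-serious alert
```

In this toy model, sr+ is the only preferred call once the alert is no longer serious, which is the pattern attributed to Alarm Decay in the text.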

Our analysis could make a more subtle prediction: since individual snorts only appear at the beginning of discourses, one would expect that it is only in such positions that r+ has a non-ground meaning – which would account for the observation, mentioned in (53)c, that roars are strongly indicative of aerial predators if they appear at the beginning of discourses. We can achieve this result with the definition in (58), which specifies that a sentence S1 competes with a sentence S2 only in case the latter could replace S1 in the discourse in which S1 appears.

(58)

Alternatives

For any sentences S1 and S2 (defined by (51)a), if S1 appears in a discourse D of the form D=AS1B (defined by (51)b, for any strings A and B), then S2 is an alternative to S1 just in case AS2B is a well-formed discourse.

When combined with the Informativity Principle in (12), this restrictive definition of alternatives will have the effect that r+ will be potentially enriched to a ‘non-ground alert’ meaning only if (i) it appears at the very beginning of a discourse, or (ii) it is preceded by single snorts – since it is only in these two cases that replacing r+ with s will yield a well-formed discourse. In case (ii), enrichment to not s will yield a contradiction and thus will not be applicable (this is thus another case in which it matters greatly that pragmatic enrichment is an optional operation). By contrast, some instances of case (i) should yield the desired enrichment to a ‘non-ground predator’ meaning.
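The position-sensitive definition in (58) can be illustrated with a short sketch. It assumes a hypothetical encoding in which a discourse is a list of sentence strings (`s` for a snort, `r` for a roar) and well-formedness is the discourse syntax (S_)*(S’_)* from (51)b; the function names are illustrative only.

```python
import re

# Discourse syntax (S_)*(S'_)* from (51)b, with S = s and S' = sr+ or r+.
DISCOURSE = re.compile(r"(?:s_)*(?:s?r+_)*")

def well_formed(sentences):
    """Check a list of sentences, each followed by a pause '_'."""
    return DISCOURSE.fullmatch("".join(x + "_" for x in sentences)) is not None

def is_alternative(discourse, index, candidate):
    """(58): candidate is an alternative to discourse[index] just in case
    substituting it at that position yields a well-formed discourse."""
    replaced = discourse[:index] + [candidate] + discourse[index + 1:]
    return well_formed(replaced)

# Case (i): r+ at the very beginning of a discourse; s is an alternative.
print(is_alternative(["rr", "srr"], 0, "s"))
# Case (ii): r+ preceded only by single snorts; s is again an alternative.
print(is_alternative(["s", "rr", "srr"], 1, "s"))
# r+ after a snort-roar sentence: s cannot replace it.
print(is_alternative(["srr", "rr"], 1, "s"))
```

Only in the first two configurations does s count as a competitor of r+, which is what restricts the potential enrichment of r+ to cases (i) and (ii).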

5.3 A compositional alternative?

We now briefly explore a compositional alternative to the complex call theory. As will be recalled, our initial problem was that on a conjunctive compositional treatment, sr+ should be logically stronger than its individual component parts s and r+. But from a purely semantic perspective (i. e. without rules of pragmatic enrichment), it is then unexpected that sr+ should serve as a general alert call, whereas s and r+ have a more limited distribution. The solution we will explore is in two steps.

• (i)

First, we posit that s and r have extremely weak meanings – so much so that even their conjunction could be expected to occur in almost every environment. Specifically, we will posit that s is applicable just in case there is a possibility that there is a ground alert, and r just in case there is a possibility that there is a non-ground alert. 22

• (ii)

Second, we will posit a mechanism of pragmatic competition in which sentences of the form s and r+ compete with sr+ sentences. Since concatenation is interpreted as conjunction, sr+ is more informative than s and also more informative than r+. By the Informativity Principle, s yields the inference that not sr+, hence (given that s is asserted) not r+; and by parity of reasoning, r+ yields the inference that not sr+, hence (given that r+ is asserted) not s. On the assumption that calls are usually produced because the caller has information about a ground or a non-ground threat, s will end up being used (on its strengthened meaning) just in case the caller believes that there is a ground threat while r will be used in case the caller believes that there is a non-ground threat. On the assumption that the required theory of mind is available (a non-trivial assumption in this case), hearers would then adjust their beliefs to take into account those of the speaker, hence the fact that they display appropriate reactions, looking up more often than down when they hear r+, and conversely when they hear s.

Let us see in greater detail how the analysis can be developed. We posit the same syntax as before, but s and r will now be treated as words in all their occurrences, including in sr+ (whereas before they were just treated as phonemes in the latter); their lexical meanings are given in (59).

(59)

Word meanings

• a.

〚r〛=1 iff there is a possibility of a non-ground alarm.

• b.

〚s〛=1 iff there is a possibility of a ground alarm.

Once we have these word meanings, they can be assembled into sentences and interpreted conjunctively, as stated in (60).

(60)

Sentence meanings

Concatenation is interpreted as conjunction.

We continue to apply the Informativity Principle in (12), and for simplicity we take it to apply to the alternatives in (61), with the informativity relations in (62) (the definition of alternatives would have to be significantly refined if sentence length were taken into account).

(61)

Alternatives

s, sr+ and r+ are alternatives to each other.

(62)

Informativity scale according to (59)-(60).

When we combine the literal meanings with the enrichment obtained through the Informativity Principle, we obtain the strengthened meanings in (63), with the convention (already used for Campbell’s calls) that 〚r̲〛 and 〚s̲〛 (with underlined letters) refer to the strengthened meanings of r and s respectively, obtained by adding to the literal meanings 〚r〛 and 〚s〛 the inferences that are triggered by the Informativity Principle. 23

(63)

Strengthened meanings

• a.

〚s̲〛=1 iff there is a possibility of a ground alarm but no possibility of a non-ground alarm.

• b.

〚r̲+〛=1 iff there is a possibility of a non-ground alarm but no possibility of a ground alarm.

• c.

〚s̲r̲+〛=1 iff there is a possibility of a ground alarm and there is a possibility of a non-ground alarm.

In this way, we derive the result that s is almost only used for ground threats and r+ is almost only used for non-ground threats (we add ‘almost’ because pragmatic strengthening need not apply in all cases).
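The derivation of (63) from (59)-(62) can be replayed in a few lines. This is a sketch under explicit assumptions: contexts are hypothetical pairs (ground alarm possible, non-ground alarm possible), a sentence type is represented by the set of words it contains (repetition of r adds nothing under conjunction), and strengthening is modelled as conjoining the literal meaning with the negation of every strictly more informative alternative.

```python
from itertools import product

# Contexts as (ground_possible, nonground_possible); an illustrative assumption.
CONTEXTS = list(product([False, True], repeat=2))

# Word meanings from (59).
WORDS = {
    "s": lambda g, n: g,
    "r": lambda g, n: n,
}

# Sentence types from (61), each listed with the words it contains.
SENTENCES = {"s": ["s"], "r+": ["r"], "sr+": ["s", "r"]}

def literal(sentence, ctx):
    """(60): concatenation is interpreted as conjunction."""
    return all(WORDS[w](*ctx) for w in SENTENCES[sentence])

def entails(a, b):
    return all(literal(b, c) for c in CONTEXTS if literal(a, c))

def strengthened(sentence, ctx):
    """Literal meaning plus the negation of each strictly stronger alternative."""
    stronger = [alt for alt in SENTENCES
                if entails(alt, sentence) and not entails(sentence, alt)]
    return literal(sentence, ctx) and not any(literal(alt, ctx) for alt in stronger)

def extension(sentence):
    """Contexts in which the strengthened meaning is true."""
    return [c for c in CONTEXTS if strengthened(sentence, c)]

for name in SENTENCES:
    print(name, extension(name))
```

In this model the strengthened meanings of s and sr+ are true in disjoint sets of contexts, so a caller who uses both within one discourse must be describing a change in what is possible.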

While various refinements could be envisaged, we will now explain why we take this theoretical direction to be somewhat unsatisfactory anyway.

5.4.1 Assessment

Although our compositional analysis succeeds in deriving a general meaning for sr+, it has flaws. The heart of the matter is that despite initial appearances the meaning we predict for sr+ is not that of a general alert call. Rather, sr+ comes with the positive entailment that there is a possibility of a ground threat and there is a possibility of a non-ground threat. Consider for instance one of the many discourses of the form (s_)+ (sr+_)+, in which a series of individual snorts separated by pauses is followed by a series of snort-roar sentences separated by pauses as well. The pragmatic meaning obtained in (63)a for snorts yields an inference that there couldn’t be a non-ground threat. But this is directly contradicted by the meaning of sr+ obtained in (63)c (pragmatic enrichment didn’t matter there because sr+ was already the most informative sentence among the set of competitors). Undesirable contradictions already arose in our pragmatic analysis of Campbell’s monkey semantics in Section 3.3.2. As will be recalled, we had analyzed krak-oo as contributing the information that there is a weak alarm; but then we were hard pressed to explain why krak-oo can co-occur with hok, which is indicative of eagles – a serious threat. Our solution was to relativize the contribution of a call to the caller’s state at the precise time at which the call is uttered. Thus we could imagine that a hok could be followed by a krak-oo if the caller had become less alarmed than it initially was when uttering hok. We could adopt a similar line of explanation with Colobus calls, but this would have unwelcome consequences; for in our analysis it is not the seriousness of the alarm that would need to be re-assessed in the course of the utterance of a discourse, but the nature of the alarm. And we cannot think of a natural reason why a snort, indicative (on its strengthened meaning) of the possibility of a ground threat and the impossibility of a non-ground threat, might after a while be followed by a snort-roar sentence that positively entails the possibility of a non-ground threat.

Pending further investigation, we take the compositional analysis to be somewhat unsatisfactory, which could lend credence to the non-compositional analysis. This is of course a conclusion out of lack of imagination, hence a weak one; but it is at least compatible with the acoustic data, which suggest that sr+ sequences might really form a phonological unit, as shown in (52)b. We believe, however, that more detailed data and further theoretical investigations might well cast doubt on this conclusion.

5.4.2 Comparison with King Colobus data

One remark is worth adding concerning Colobus phylogeny. Strikingly, King Colobus monkeys have calls that are very similar to those of Guerezas, despite the fact that in one recent estimate (see Ting 2008; Grubb et al. 2003), Guereza Colobus monkeys and King Colobus monkeys separated approximately 1.6 million years ago (see Schel et al. 2009 for a brief comparison of their acoustic properties). Furthermore, in the limited dataset we have, their uses are rather similar as well – although the generalizations we find in King Colobus calls are less clear and might be more subject to noise than in Guerezas. Specifically, sr+ sentences appear in all environments, while individual snorts only appear at the beginning of discourses and are usually (but not invariably) indicative of ground threats; while r+ sentences are usually (but not invariably) associated with non-ground threats, especially when they appear in initial positions. Data can be found in (93) in the Supplementary Materials, in the same format as the Guereza data from Kaniyo Pabidi (in (91)) and from Sonso (in (92)).

These similarities highlight the preservation of call form and function over long periods of time – an issue we come back to in Section 7. It is also to be hoped that more fine-grained data in the future will make it possible to study differences among the calls of these two species.

6 Titi monkeys: semantics or cognition?

The previous discussions highlighted the importance of the division of labor between syntax, semantics and pragmatics. For Campbell’s monkey calls, one theory posited a dialectal difference in the lexical semantics, while another theory posited a unified meaning selectively enriched by the Informativity Principle. For Putty-nosed monkey calls, one theory postulated that pyow-hack sequences are interpreted as whole units, whereas the other theory interpreted them call-by-call – but crucially relied on the Informativity Principle and on the Urgency Principle to yield further inferences. We now turn to the case of Titi monkey sequences, which will highlight the importance of the division of labor between semantics/pragmatics and properties of the environmental context (we only summarize the main argument, which is laid out in much greater detail in Schlenker et al., to appear).

We start from Cäsar et al. (2013), who showed in striking field experiments that Titi monkeys can encode information about both predator type and predator location, using just two calls (A and B) rearranged in complex ways. While this might initially appear to argue for a complex syntax/semantics mapping, we argue instead for a very simple semantic analysis, crucially complemented by non-trivial assumptions about the environmental context. Specifically, we first show that the simplest behavioral assumptions make it challenging to provide lexical specifications for A- and B-calls: B-calls rather clearly have the distribution of general alert calls; but A-calls are also found in highly heterogeneous contexts (e. g. they are triggered by ‘cat in the canopy’ and ‘raptor on the ground’ situations). We discuss two possible solutions to the problem. The first analysis posits that entire sequences are endowed with meanings that are not compositionally derived from the individual calls they contain. The second analysis combines a very simple compositional analysis with some more sophisticated assumptions about predator behavior and context change.

6.1 Titi sequences

In Cäsar et al.’s field experiments, the two factors predator={cat; raptor} and location={on the ground; in the canopy} gave rise to four types of sequences obtained with just two calls, the A-call and the B-call, as represented in slightly simplified form in (64). (Notation: if X is any call, X+ represents a series of at least one X-call, and X++ a series of at least two X-calls. We write X+ and X++ for series that display these patterns with up to 3 extraneous calls interspersed.)

(64)

Model predator experiments

• a.

Raptor in the canopy: A++ (4/5 sequences; the 5th contains interspersed O-calls; average length of the 5 sequences=26.8 calls)

• b.

Raptor on the ground: A++B++ (5/7 sequences; 2/7 have the form A+)

• c.

Cat in the canopy: A B++ (4/6 sequences; 1/6 has the form X A B++ with an unidentified call X)

• d.

Cat on the ground: B++ (5/5 sequences)

In addition, Cäsar collected naturalistic data, which provide further useful information and are summarized in simplified form in (65).

(65)

Naturalistic observations

• a.

Flying raptor: A+ (19/20 sequences; average length of the 20 sequences=2.2 calls)

• b.

Calling or perched raptor: A++ (9/9; average length of the 9 sequences=15.8 calls)

• c.

Capuchin in tree: Diverse: A++, A++ B++ A++, etc., with C-calls interspersed

• d.

Non-animal-related (foraging/descending/feeding): B++ (13/16)

Simple inspection shows that the stereotyped call sequences in (64) encode information about both predator threat and predator location. Despite initial appearances, we will argue that this does not argue for a complex syntax/semantics interface; rather, the generalizations in (64) and (65) are compatible with an analysis in which each call has a simple meaning that pertains to the precise moment at which it is uttered; and the complexity of the call sequences is due at least as much to properties of the environment (and the fact that the context changes as calls are uttered) as to the Titi linguistic system per se.

6.2 Initial problems

The Titi generalizations in (64) and (65) pose a challenge for a semantic analysis based on simplistic assumptions about the environment. In a nutshell, the problem is that the B-call occurs in environments that are so diverse that it seems to be a general alert call, with a very weak lexical specification. But it turns out that within predator contexts the A-call also occurs in environments that do not seem to form a natural class, and hence that it seems to function as a general predator alarm call. But if this is so, within predator contexts the difference between A-calls and B-calls becomes hard to analyze.

It is immediate that the lexical contribution of B-calls must be extremely weak, since they appear both in non-predation-related ((65)d) and in predation-related situations ((64)b, c, d; (65)c); furthermore, within predation contexts they occur both in eagle- ((64)b) and in cat-related situations ((64)c,d), and in situations in which the threat is on the ground ((64)b,d) as well as in the canopy ((64)c). In addition, Cäsar et al. (2012b) note that B-calls were produced “sometimes in the absence of external events, especially when monkeys were descending or foraging close to the ground, when an observer was blocking their intended path, during inter-group encounters and, for unhabituated groups, in response to humans”. This suggests the lexical entry in (66).

(66)

B-call

B is applicable if and only if there is a noteworthy event.

But A-calls also occur in heterogeneous environments. While in our data A-calls are not used in non-predatory situations, within predator-related situations they occur both in ‘raptor on the ground’ situations ((64)b) and in ‘cat in the canopy’ situations ((64)c). It is hard to see what these two situations could have in common besides being situations of predation. So it would seem reasonable to posit the lexical entry in (67):

(67)

A-call (initial attempt)

A is applicable if and only if there is a predator-related alert.

Although A- and B-calls have different lexical specifications according to the rules in (66) and (67), within predator contexts it is unclear what could distinguish them. We could of course make use of the Informativity Principle in (12), for instance at the level of individual calls – using our previous terminology, this would treat each call as a sentence, with the assumption that A and B are alternatives to each other, as stated in (68). But this makes incorrect predictions: the B-call is now predicted not to arise in predator-related environments.

(68)

Alternatives

Each individual A- and B-call is a sentence, and they are alternatives to each other.

(69)

Prediction of (68) given (66)-(67) and (12)

When pragmatic strengthening is applied, the B-call should only be applicable when there is a non-predator related alert (since in predator-related alerts the A-call is more informative).

The heart of the matter is that while the A-call is more informative than the B-call, its lexical specification is still very weak. And since the Informativity Principle in (12) has the effect of enriching the meaning of B with the negation of A, the result is an enriched meaning for B which is just too strong.

We believe there are two natural directions to explore to solve the problem we just laid out. One is to posit that the meaning of an entire sequence is not compositionally derived from the meaning of the individual calls it contains, but rather is obtained ‘holistically’. As an alternative, we will posit that the meaning of a sequence is compositionally derived, but we will make more sophisticated assumptions about predator hunting strategies and context change.

6.3 A non-compositional theory

Just as we did in our non-compositional analysis of pyow-hack sequences, we will now assign meanings to whole sentences. Since we do not have information on the role played by sequence length, we omit the alarm parameter from the present analysis (it would be easy to add one if necessary). We posit the syntax in (70) and make use of the Informativity Principle to posit relatively simple meanings, as in (71), with the natural definition of alternatives in (72). As in our statement of the generalizations in (64)-(65), we write X++ to abbreviate XX+ (= the set of strings made of at least two X’s (and only X’s)). Note that the sentences we posit mirror those found in model predator experiments in (64), but also (in simplified form) those obtained from observational data in (65) – with the exception of the (chaotic) Capuchin sequences, to which we return below.

(70)

Syntax

B++, A++, A++ B++, AB++

(71)

Non-compositional Titi meanings

If S is a complete call sequence, then

• a.

if S is of the form B++, 〚S〛=1 iff there is a noteworthy event.

• b.

if S is of the form A++, 〚S〛=1 iff there is a non-ground predator.

• c.

if S is of the form A++B++, 〚S〛=1 iff there is a non-ground predator on the ground.

• d.

if S is of the form AB++, 〚S〛=1 iff there is a ground predator in a non-ground position.

(72)

Alternatives

The alternatives to a Titi sentence S are all the sentences obtained by replacing, call for call, any number of A’s with (the same number of) B’s and any number of B’s with (the same number of) A’s.
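The definition in (72) amounts to enumerating every string over {A, B} of the same length as S. A minimal sketch follows; it takes (72) literally, so zero replacements are allowed and the sentence counts among its own alternatives, and it does not filter the results by the syntax in (70). Whether such a filter is intended is left open here. The function name `alternatives` is hypothetical.

```python
from itertools import product

def alternatives(sentence):
    """All strings obtained by swapping any number of A's and B's, call for call."""
    other = {"A": "B", "B": "A"}
    # For each position, either keep the call or replace it with the other call.
    options = [(call, other[call]) for call in sentence]
    return {"".join(choice) for choice in product(*options)}

print(sorted(alternatives("AB")))
print(len(alternatives("ABBB")))
```

For a sentence of n calls this yields 2^n candidates, so any practical use would presumably restrict the comparison to sentence types rather than individual strings.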

(71) gives rise to the relations of informativity in (73); combined with the Informativity Principle in (12) (and the definition of alternatives in (72)), this gives rise to the strengthened meanings in (74); as in our earlier discussions, we write as 〚S̲〛 (with an underlined S) the strengthened meaning of a sentence S (whose literal meaning is just 〚S〛). 24

(73)

Informativity relations among Titi sentences

(74)

Strengthened Titi meanings

If S is a complete call sequence, then

• a.

if S is of the form B++, 〚S̲〛=1 iff there is an alert but no non-ground predator.

• b.

if S is of the form A++, 〚S̲〛=1 iff there is a non-ground predator in the canopy.

• c.

if S is of the form A++B++, 〚S̲〛=1 iff there is a non-ground predator on the ground.

• d.

if S is of the form AB++, 〚S̲〛=1 iff there is a ground predator in a non-ground position.

Still, this theory has several flaws. First, it is not explanatory: it just stipulates in slightly improved form (thanks to the Informativity Principle) the generalizations it was supposed to derive. Second, the time course of Titi sequences also makes it unlikely that these are treated as whole units. As discussed in Schlenker et al. to appear, the average inter-call interval is 1.4s. In ‘raptor on the ground’ situations, when B-calls are produced, the first B-call appears on average after the 12th position in the sequence (average position: 12.6); very roughly, this gives an average waiting time of 16-17s or so before one hears the first B-call after the first call is produced – hence Titis would sometimes have to wait for something like 16s to tell whether the sequence they are hearing is of the form A++ or A++B++. Finally, Cäsar et al. (2012b) note that after hearing an A-call the Titi monkeys look upwards. Presumably this also applies to the A++B++ sequences, which according to (71)c provide information about non-ground predators on the ground (see Schlenker et al. to appear for details). But this suggests that what is crucial about A-calls is not so much their raptor-related content as the information they provide about a non-ground threat.
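The rough waiting-time estimate can be reproduced with a two-line computation. The sketch assumes, as a simplification, that a call at position p is separated from the first call by p − 1 inter-call intervals of average length; the function name is hypothetical.

```python
INTER_CALL_INTERVAL_S = 1.4  # average inter-call interval, in seconds
FIRST_B_POSITION = 12.6      # average position of the first B-call

def waiting_time(position, interval):
    """Seconds elapsed between the first call and the call at `position`,
    assuming position - 1 intervals of average length."""
    return (position - 1) * interval

# Using the average position directly: about 16.2 s.
print(round(waiting_time(FIRST_B_POSITION, INTER_CALL_INTERVAL_S), 2))
# Using the first whole position after the 12th call: about 16.8 s.
print(round(waiting_time(13, INTER_CALL_INTERVAL_S), 2))
```

Both values fall in the 16-17s range mentioned in the text.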

6.4 A compositional theory based on more sophisticated environmental assumptions

We will now circumvent these problems by assuming (i) that the A-call does not provide information about the nature of the predator and its location, but rather about the appropriate reactions to the relevant threat, and (ii) that the composition of sequences in part reflects the way in which the environment changes as a sentence is uttered. Specifically, we will assume that the A-call is specified for serious non-ground alerts, and that the reason we find an initial A++ sequence in ‘raptor on the ground’ situations is that raptors attack by flying, and thus that even when a raptor is on the ground the threat isn’t a ground one. As for the fact that B++ sequences can follow A++ sequences, this will be taken to reflect a drop in threat level after the appearance of an initial trigger, with the result that the ‘serious non-ground alert’ content of the A-call stops being applicable, leaving B as the only contender.

To develop the analysis in greater detail, we will need three assumptions that have some independent motivation (see Schlenker et al., to appear for details and references).

(75)

• a.

A raptor hunts by being perched or by flying; a raptor on the ground is not in a hunting position.

• b.

Cats become less dangerous once they have been detected (they hunt by ambush rather than pursuit).

• c.

Capuchins are dangerous even if they have been detected (they are pursuit hunters).

Our semantics is now extremely simple: each call contributes a simple meaning to a sequence; when several calls are present, they are composed conjunctively. Importantly, however, we take into account the fact that call rates are relatively slow, and relativize the truth conditions of calls to different times. In effect, this means that each call is treated as a separate utterance. As a result, call repetition is not semantically vacuous: each token makes a new claim, namely that the relevant alarm holds at the time of utterance of that token. If we wish to compute the overall semantic effect of a sequence, we must conjunctively combine the calls while adapting the value of the time parameter with each call.

(76)

Semantics of Titi discourses (partial: A- and B-calls only)

For any time t,

• a.

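$\llbracket B \rrbracket^{t} = 1$ iff there is a noteworthy event at t.

• b.

$\llbracket A \rrbracket^{t} = 1$ iff there is a serious non-ground alert at t.

• c.

If w is a call and S is a sequence of calls,

• $\llbracket wS \rrbracket^{t} = 1$ iff $\llbracket w \rrbracket^{t} = \llbracket S \rrbracket^{t+1} = 1$.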

(76)a is just the statement that B has a highly underspecified semantics. (76)b encodes the treatment of A as a serious non-ground alert call. And (76)c specifies that calls are combined in a conjunctive way, and that calls that follow each other are evaluated at different times. An example is given in (77).

(77)

$\llbracket AB \rrbracket^{0} = 1$ iff $\llbracket A \rrbracket^{0} = \llbracket B \rrbracket^{1} = 1$, iff there is a serious non-ground alert at time 0 and there is a noteworthy event at time 1.
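The time-relativized semantics in (76) can be illustrated with a minimal sketch. The representation of a 'world' as a function from time steps to facts, the key names, and the treatment of the empty sequence as trivially true are all assumptions made for illustration; they are not part of the analysis itself.

```python
from typing import Callable, Dict

# A "world" maps each time step to the facts that hold at that time.
# The two boolean keys are illustrative names, not the authors' notation.
World = Callable[[int], Dict[str, bool]]

def call_true(call: str, world: World, t: int) -> bool:
    """Truth of a single call at time t, following (76)a-b."""
    facts = world(t)
    if call == "B":
        return facts["noteworthy_event"]
    if call == "A":
        return facts["serious_nonground_alert"]
    raise ValueError(f"unknown call: {call}")

def sequence_true(seq: str, world: World, t: int = 0) -> bool:
    """(76)c: a sequence wS is true at t iff w is true at t and S is true at t+1."""
    if not seq:
        return True  # assumption: the empty sequence is trivially true
    return call_true(seq[0], world, t) and sequence_true(seq[1:], world, t + 1)

def cat_in_canopy(t: int) -> Dict[str, bool]:
    # Hypothetical scenario: a serious non-ground alert at time 0 only,
    # with a noteworthy event persisting throughout.
    return {"noteworthy_event": True, "serious_nonground_alert": t == 0}

print(sequence_true("AB", cat_in_canopy))  # True
print(sequence_true("AA", cat_in_canopy))  # False
```

In this toy scenario, AA is false because the second A is evaluated at time 1, when the serious non-ground alert no longer holds; this is the sense in which repetition is not semantically vacuous.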

Unsurprisingly, the distribution of the B-call should be constrained by appealing to the Informativity Principle, combined with the assumption that calls individually compete with each other, as is already stated in (68). (As in our initial discussion in Section 6.2, we take single calls to be complete utterances, with the result that the Informativity Principle enriches meanings on a call-by-call basis.)

(78)

Consequence of (12) and (68)

If B is uttered at time t, one can infer that at time t (i) there is a noteworthy event, but not (ii) a serious non-ground alert (or else the A-call would have been used).

Let us now see how our hypotheses can derive the patterns we observe. We start with the data in (64)-(65), repeated in condensed form in (79).

(79)

Simplified Generalizations

| | a. Raptor | b. Cat | c. Capuchin | d. Non-predator related |
|---|---|---|---|---|
| Experimental, canopy | (i) A++ (mean=26.8 calls) | (i) AB++ | | |
| Experimental, ground | (ii) A++B++ | (ii) B++ | | |
| Naturalistic | (iii) Flying: A+ (mean=2.2 calls); (iv) Calling or perched: A++ (mean=15.8 calls) | | In tree: A++, A++B++A++, etc. | Deer, foraging/descending: B++ |

Raptor situations: The generalizations (79)a(i), (ii) and (iv) are unsurprising in view of our hypotheses about raptor hunting techniques in (75)a. First, a model raptor in the canopy or a perched raptor presents a serious non-ground threat, and the threat should be taken to persist in time since these are typical hunting positions – hence the fact that the sequences are long. Second, sequences of A-calls are shorter in the naturalistic ‘flying raptor’ situations in (79)a(iii) (mean number of As: 2.2) than in the naturalistic ‘calling/perched raptor’ situations in (79)a(iv) (M=15.8 As, W=13, p=.00021) 25 or in the experimental ‘raptor in canopy’ situations in (79)a(i) (M=26.8, W=0, p=.00048); this is presumably because a raptor that flies away is a briefer threat than immobile raptors in typical hunting position (it might also be that in experimental situations the model raptor remains perched longer than in naturalistic ones, although it might become clear at some point that it is not a normal living raptor). Finally, the pattern in (79)a(ii) is expected if we remember that a raptor on the ground will attack (if it does) by flying, as stated in (75)a. Therefore the initial A++ we find is unsurprising and provides the most urgent message first, as a danger may be coming from above.

Since immobility on the ground is not a typical hunting position, it is also unsurprising that after a while the threat stops being considered as serious – presumably because a raptor would not normally remain motionless on the ground for long periods of time. As mentioned above, the first B-call in ‘raptor on the ground’ situations occurs on average in position 12.6 in the sequence – with a possible estimate of 16–17s after the first call. This might give the caller enough time to decide that the threat isn’t too serious any more; if so, it is because the A-call is specified for serious non-ground threats that it stops being applicable after that time, with the result that B-calls start being used instead. By the logic of pragmatic competition, this is the only case in which we see B-calls for raptor-related threats.

Cat situations: In ‘cat on the ground’ situations, only the B-call can be used, hence the B-sequences in (79)b(ii). The production of an A-call (specified for a ‘serious non-ground alert’) at the beginning of ‘cat in the canopy’ situations should give us pause. Given our assumptions, it can be explained:

• When a cat is detected in the canopy, it represents a serious non-ground threat, hence the production of an A-call.

• As a consequence of this A-call, it can be assumed that the cat has been detected by conspecifics. 26 As a result, the threat level diminishes, in accordance with (75)b. Because the A-call is specified for serious non-ground threats, it can’t be used any more, with the result that only the B-call can be used.
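The derivations above combine the lexical entries in (76) with call-by-call competition as in (78). A minimal production sketch follows, under assumptions that are ours: every serious non-ground alert is also a noteworthy event (so A is the more informative call), the caller always picks the most informative applicable call, and the threat counts as serious for a fixed number of call slots (`threat_duration`, a hypothetical parameter).

```python
def choose_call(serious_nonground_alert: bool, noteworthy_event: bool) -> str:
    """Pick the most informative applicable call, or '' if no call applies."""
    if serious_nonground_alert:
        return "A"  # A is more specific than B, so it blocks B
    if noteworthy_event:
        return "B"
    return ""

def produce(threat_duration: int, length: int) -> str:
    """Sequence produced when the situation counts as a serious non-ground
    alert for the first `threat_duration` call slots and stays noteworthy
    for all `length` slots (both parameters are illustrative)."""
    return "".join(choose_call(t < threat_duration, True) for t in range(length))

print(produce(threat_duration=1, length=5))  # ABBBB  (cf. cat in the canopy)
print(produce(threat_duration=3, length=6))  # AAABBB (cf. raptor on the ground)
print(produce(threat_duration=0, length=4))  # BBBB   (cf. cat on the ground)
```

The sketch only encodes the logic of competition; it says nothing about why the threat level drops, which is what the environmental assumptions in (75) are meant to explain.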

Capuchin situations: Naturalistic capuchin situations give rise to a diversity of calling sequences, some of them with quite a few A-calls. This is strikingly different from the stereotyped AB++ sequences we found in ‘cat in the canopy’ model experiments. It is thus notable that two mammal predators in non-ground situations give rise to such different calling behaviors. Now one source of the difference might be that real capuchins move in ways that model cats don’t. But an additional explanation might lie in the difference between leopard and capuchin hunting strategies outlined in (75)b,c: capuchins continue to be dangerous even after they have been detected, hence we have no reason to expect the AB++ pattern we found in ‘cat in the canopy’ situations. While this doesn’t explain the details of the complex patterns we find in capuchin situations, it does give us a way to address an initially surprising difference.

Non-predatory situations: It is clear that in situations that do not involve serious threats the A-call cannot be used, hence the fact that we only find B-calls in these cases.

6.5 Conclusions

With our initial hypothesis that calls directly convey information about predator type and/or predator location, we were not able to provide a coherent meaning for A- and B-calls, and we had to resort to a somewhat unappealing theory in which entire sequences had a non-compositional meaning. Arguably, a better and simpler theory can be obtained if we make use of more sophisticated assumptions about context change and the environment. Two proved particularly crucial: first, we assumed that a raptor on the ground still signals the presence of a non-ground threat; second, we assumed that the meaning of calls is relativized to the precise time at which they are uttered, with the result that the composition of a sequence sometimes reflects the way in which the cognitive situation changes as the sequence is uttered. It should be added that besides semantics and the environmental context, pragmatic competition among calls played an important role in our explorations, since this mechanism was crucial to explain why B-calls are not found in all situations. Thus, as was the case in our earlier explorations, the division of labor between syntax, semantics, pragmatics and properties of environmental context is essential to understand Titi sequences; but in this case assumptions about context change and the environment proved particularly crucial.

7 Evolutionary monkey linguistics

The previous sections have considered the calls of various species as independent systems. But it is also important to consider that a species’ calls are parts of a suite of related, genetically constrained characters. Across monkey species, vocal signals generally appear to be fixed traits, especially in terms of acoustic structure and production (Seyfarth and Cheney 1997). Vocal repertoires, therefore, reflect the evolutionary histories of species and should, in principle, exhibit predictable patterns across taxa. As discussed in Section 5.4.2, Guereza and King Colobus monkeys diverged approximately 1.6 million years ago (see Ting 2008; Grubb et al. 2003), yet appear to display essentially the same calls, with what currently seem to be minor phonetic differences, and no clear differences in use. It is remarkable that, in some cases at least, the form and function of calls seem to be preserved over rather long periods. 27

In this section, we suggest that a comparative approach to the calls of different monkey species could lay the groundwork for an evolutionary monkey linguistics.

• First, by integrating phylogenetic information about different monkey species with a comparison of their call systems, we offer methods for investigating patterns and processes related to repertoire development over millions of years.

• Second, this study would add some empirical depth to our formal theories. One key limitation of our enterprise is that the data are hard to get and leave theories very much underspecified. We could constrain them further by developing parametrized theories that seek to account for similarities and differences among call systems, as well as call evolution.

• Finally, when similar call systems are found in very distant species, they could help specify the forces that might have led to convergent evolution. We mentioned before (in Section 3.4) Wheeler and Fischer’s (2012) observation that across species it tends to be the call associated with terrestrial predators that is given in other contexts, whereas the call associated with aerial predators tends to be more specific. They cite examples from New World monkeys as well as lemurs. Given the date of separation between these species, it is unlikely that common descent is responsible for these similarities, and it is more likely that we are dealing with a case of convergent evolution – hence an interesting question for the future: why do there seem to be more specific calls for aerial predators than for ground predators? We trust that many other similar questions will arise when a comparative approach is adopted.

7.1 Comparative methods

At least two main approaches could help investigate the evolutionary history of monkey languages. First, one could compare whole repertoires of related species and, using principles of phylogenetic analysis, test whether similarities in calls’ acoustic structure and usage are better explained by convergent evolution or by common descent. This enterprise could build on rich data on call comparison originally gathered to help reconstruct phylogenetic trees, as we will see shortly. Second, one could draw inferences about repertoire evolution using a method of internal reconstruction proposed by Fuller (2013). The main idea is that stereotyped calls of a species’ repertoire emerged from ancestral call types through repeated processes of fission and modification. Fuller further posits that in such cases the daughter calls are used in situations that are a subset of those in which the ancestral call was used – in effect, he posits that the division of a call on the phonological side corresponds with a division of semantic labor between the daughter calls. If this idea is on the right track, we can to some extent reconstruct the evolutionary history of a repertoire by considering its internal acoustic and possibly semantic structure. While this method has been applied only to Blue monkey calls, it could yield a powerful tool if combined with cross-linguistic comparisons: one would expect the results of both enterprises to converge. We now turn to a brief discussion of each method.

7.1.1 Cross-species comparisons

Phylogenetic reconstructions have long relied on comparison of phenotypic characters (e. g. cranial morphology, dental patterns) among species, though the utility of behavioral elements in systematics remains debated. Some researchers have explicitly argued that calls are well-preserved aspects of the phenotype, making them appropriate for reconstructing phylogenetic trees (e. g. Cap et al. 2008; Gautier 1988). Evidence that the acoustic structures of many primates’ calls are largely genetically determined (Newman and Symmes, 1982; Seyfarth and Cheney 1997) supports this argument. It is also notable that trees derived using DNA sequences only are similar to those obtained using vocal signals. For example, trees for cercopithecines recently derived using the most up-to-date DNA sequencing data (Guschanski et al., 2013) are remarkably similar to one proposed by Gautier (1988) that included vocal signals along with other phenotypic and genotypic characters. An illustrative example is displayed in (80), where we have put side-by-side part of the phylogenetic tree recently obtained on the basis of DNA data by Guschanski et al., and the tree reconstructed by Gautier (1988) using acoustic similarity among calls. 28

(80)

Simplified phylogenetic trees for the Cercopithecini tribe derived using mitochondrial DNA sequence data (right; Guschanski et al. 2013) compared to one that included vocal signals as well as genetic and morphological characters (left; Gautier 1988). Note that the superspecies C. mona includes Campbell’s monkeys (which appear as * in the two trees).

Phylogenetic reconstructions today rely on DNA sequencing results more than phenotypic characters, and for this reason systematic call comparison seems to have partly fallen into oblivion. We believe that it should be revisited from a different perspective. Instead of using call data to reconstruct phylogenies (e. g. Gautier 1988; Gautier et al. 2002), well-resolved phylogenies might be used to reconstruct the evolutionary history of repertoires and individual calls. The model proposed here builds on the hypothesis that species’ vocal repertoires are inherited from ancestral species, and that interspecific differences in repertoires reflect changes that occurred after speciation. The method could involve the following components.

• (i)

First and foremost, comparisons must use well-resolved phylogenies and complete repertoires. Though questions remain, research in these areas continues to improve the availability of both. Where phylogenetic and call data are adequate, it will be most useful to examine the most closely related congeners (i. e. sister taxa) first, and then expand in a stepwise fashion across taxa.

• (ii)

One should then identify potentially homologous call types among related species – for instance by way of a cluster analysis using acoustic measures of call samples pooled from different species. The results could then be used to construct hypothetical “call trees” from which to infer shared call types among species.

• (iii)

One could then determine similarities in the uses (syntax, semantics and possibly pragmatics) of these calls. Although the form of a call might have been preserved while the uses changed or conversely, the correspondence will be established more strongly if it involves both form and function. Nothing in the logic of this enterprise would prevent one from starting from calls that are semantically similar without thereby being acoustically similar. But establishing semantic similarity is far more difficult than establishing acoustic similarity, and for this reason it seems advisable to start from the acoustic side and add semantic considerations when possible. In addition, in the absence of common descent there might be powerful socio-ecological similarities that explain why some very distantly related species have signals with the same semantics (‘function’) – e. g. many birds, rodents, and primates have calls associated with snakes, and it is unlikely that all of these calls were derived from a single common ancestor.

• (iv)

Finally, one should explore at least four hypotheses to account for the similarities: (a) they might be a simple accident; (b) they might be the product of convergent evolution; (c) they might be explained by common descent; and/or (d) they might be the product of co-evolution of several species (note that sympatric associations are common and could lead to mutual influences in calling behavior). In case (c), one may start to reconstruct the evolutionary history of the relevant part of the language by using the dates of divergence across species to postulate ancestral forms of (part of) a monkey language.

7.1.2 Fuller’s method of internal reconstruction

Fuller (2013) proposes that repertoires evolve by repeated fission of ancestral calls, and that two things typically happen when fission takes place: (i) the daughter calls share some acoustic properties with each other because they are variations on the same ancestral call; (ii) the situations in which the new calls can be used are subsets of the situations in which the ancestral call could be used; in other words, the meanings of the new calls are refinements of the meaning of the ancestral call.

Concretely, Fuller applies clustering algorithms to the acoustics of calls, and takes the resulting tree to reflect their evolutionary history: if call CA and CB are part of a sub-tree that excludes CC, then Fuller posits that CA and CB resulted from the fission of an ancestral call CAB which did not give rise to CC. He then posits that the situations in which CAB could be used were a superset of the situations in which CA and CB can currently be used.
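The logic of this inference can be sketched in a few lines. The source does not specify which clustering algorithm is used, so the average-linkage procedure below is an assumption, and the two-dimensional feature vectors are invented purely for illustration (they are not measurements from any study).

```python
from itertools import combinations
from math import dist

def leaves(tree):
    """Flatten a nested-tuple tree into a list of call names."""
    return [tree] if isinstance(tree, str) else [x for sub in tree for x in leaves(sub)]

def cluster(features):
    """Average-linkage agglomerative clustering; returns a nested tuple."""
    trees = list(features)  # start with one singleton cluster per call

    def gap(a, b):
        # Mean pairwise distance between the calls of two clusters.
        pairs = [(x, y) for x in leaves(a) for y in leaves(b)]
        return sum(dist(features[x], features[y]) for x, y in pairs) / len(pairs)

    while len(trees) > 1:
        a, b = min(combinations(trees, 2), key=lambda p: gap(*p))
        trees = [t for t in trees if t not in (a, b)] + [(a, b)]
    return trees[0]

def subtrees(tree):
    """Yield the tree itself and every sub-tree it contains."""
    yield tree
    if not isinstance(tree, str):
        for sub in tree:
            yield from subtrees(sub)

def fission_candidate(tree, x, y, excluded):
    """True if some sub-tree contains x and y but not `excluded`."""
    return any({x, y} <= set(leaves(s)) and excluded not in leaves(s)
               for s in subtrees(tree))

# Invented acoustic features for three hypothetical calls.
FEATURES = {"CA": (1.0, 1.0), "CB": (1.2, 0.9), "CC": (5.0, 4.0)}
tree = cluster(FEATURES)
print(tree)                                        # ('CC', ('CA', 'CB'))
print(fission_candidate(tree, "CA", "CB", "CC"))   # True
print(fission_candidate(tree, "CA", "CC", "CB"))   # False
```

On Fuller's proposal, a positive result for CA and CB would be read as evidence for an ancestral call CAB that did not give rise to CC; the code only checks the tree shape and makes no claim about the evolutionary interpretation.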

Fuller’s method could be refined along several dimensions. First, in Fuller (2013) this method is applied to all calls, which seems excessively strong: when two calls are produced by entirely different articulatory means, it seems unlikely that they evolved by fission of an ancestral call. For instance, in Blue monkeys and Putty-nosed monkeys, booms are produced with air sacs whereas other calls (e. g. pyows and hacks or kas) aren’t, and it does not seem very plausible that all these calls evolved from one ancestral one. But nothing in Fuller’s logic prevents one from refining the analysis and applying his algorithms to calls that are produced by comparable means – which would set booms aside from all other calls.

Second, several additional operations could be added to Fission in order to explain how repertoires evolve. Logically, a repertoire R (seen as a list of calls) could evolve by any of the following operations:

• (i)

disappearance of a call from R;

• (ii)

de novo emergence of a call that has no part in common with any of the calls of R;

• (iii)

modification of a call of R, for instance by (a) removing a part of it (hence a shortening); (b) adding a new part to it (hence a lengthening); (c) neither (a) nor (b) (e. g. changing duration, lowering or raising frequency components, etc.)

• (iv)

combination of several calls of R into a new call;

• (v)

any combination of (i) through (iv).

Fuller focuses on Fission, but if his method is on the right track the other conceivable processes listed could be explored as well.
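The operations in (i)–(iv) can be made concrete with a small sketch. The representation of a repertoire as a mapping from call names to tuples of acoustic 'parts', as well as all function and call names, are hypothetical choices made for illustration; modifications of type (iii)(c), such as changes in duration or frequency, are not modelled.

```python
from typing import Dict, Iterable, Tuple

# Assumption: a call is a tuple of atomic acoustic "parts".
Repertoire = Dict[str, Tuple[str, ...]]

def disappear(r: Repertoire, name: str) -> Repertoire:
    """(i) A call disappears from the repertoire."""
    return {k: v for k, v in r.items() if k != name}

def emerge(r: Repertoire, name: str, parts: Iterable[str]) -> Repertoire:
    """(ii) De novo emergence: the new call shares no part with existing calls."""
    parts = tuple(parts)
    used = {p for v in r.values() for p in v}
    if used & set(parts):
        raise ValueError("a de novo call must not reuse existing parts")
    return {**r, name: parts}

def shorten(r: Repertoire, name: str) -> Repertoire:
    """(iii)(a) Remove the last part of a call."""
    return {**r, name: r[name][:-1]}

def lengthen(r: Repertoire, name: str, part: str) -> Repertoire:
    """(iii)(b) Add a new part at the end of a call."""
    return {**r, name: r[name] + (part,)}

def combine(r: Repertoire, new_name: str, names: Iterable[str]) -> Repertoire:
    """(iv) Combine several calls into a new call, keeping the originals."""
    return {**r, new_name: sum((r[n] for n in names), ())}

# Hypothetical two-call repertoire.
r0: Repertoire = {"x": ("p1",), "y": ("p2", "p3")}
r1 = combine(r0, "xy", ["x", "y"])
print(r1["xy"])                 # ('p1', 'p2', 'p3')
print(shorten(r0, "y")["y"])    # ('p2',)
```

Each function returns a new repertoire, so operation (v) corresponds simply to chaining calls.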

7.2 Putty-nosed vs Blue monkey comparison

Blue monkeys and Putty-nosed monkeys shared a common ancestor approximately 2.5 million years ago (Guschanski et al. 2013). The vocal repertoires of these species, and particularly those of adult males, are strikingly similar, with both using apparently homologous booms, pyows, and hacks (called kas in Blue monkeys). This fact could be informative about call evolution, and it might also be important to understanding pyow-hack sequences: the key question is whether Blue monkeys have a counterpart of these sequences, and why or why not.

7.2.1 Male Blue monkey calls

Fuller (2013, 2014) evaluated acoustic and functional characteristics of the complete vocal repertoire of adult male Blue monkeys (Cercopithecus mitis stuhlmanni). To diagnose call types, this study applied ordination 29 and cluster analyses to acoustic measurements of hundreds of recordings from 20 different males. Results indicate a repertoire that comprises six distinct call types: boom, pyow, ka, katrain (a rapid string of kas), ant, and nasal scream, as seen in (81).

(81)

Repertoire of male Blue monkeys clustered according to acoustic similarity (from Fuller 2014)

Distinct call types include: boom, nasal scream, pyow, ant, ka, katrain. Note that, acoustically, booms are extremely distinct from the cluster of other call types, within which the nasal scream is separate from the other four calls.

Fuller’s clustering-based method has two advantages. First, it establishes a clear, objective method for identifying distinct categories (i. e. call types) within the system. Second, the results of hierarchical cluster analysis identify natural grouping patterns among call types that can serve as a starting point in evaluating the evolutionary history of the calls.

Fuller further studied the distribution of these calls in naturalistic contexts and in field experiments. The graphs in (82) display the observed frequency of calls in various contexts, compared to the expected frequency if distribution were random.

(82)

Use of different call types by male Blue monkeys (adapted from Fuller 2013 p. 171)

• -

As in Campbell’s monkeys, booms are strongly associated with undisturbed and affiliative contexts, not predators. 30

• -

Kas and katrains typically occur together in the same call sequence and have similar distributions; we will treat them as a single call arranged in different ways (thus the syntax will have to ensure that ka can come singly or in trains). It is clear that kas primarily occur in situations of aerial threats; other contexts include sudden, loud disturbances such as falling trees. This is compatible with a version of the specification for non-ground threats that we posited for Putty-nosed hacks (for hacks, however, we had postulated the more specific meaning of serious non-ground movement-related threat).

• -

Pyows are used in all contexts and occur more than expected by chance in all of them except falling branches, aerial predators, and undisturbed situations. Their distribution is compatible with that of a general alert call, although one would need to explain why they are less common in contexts of aerial threats. Competition with kas is a natural explanation.

• -

Ants might initially seem to be terrestrial predator calls – and this is indeed how Fuller describes their evolutionary function (Fuller 2013). But their occurrence in situations of Tree fall disturbances suggests that their use might reflect a more general reaction of fear. Making full use of the Informativity Principle, we will posit that ants are general calls, but that unlike pyows they are used only in situations of relatively serious alert. If we simultaneously postulate that kas are specified as being used for serious non-ground alerts, we will plausibly derive the result that kas are more informative than ants, and hence the Informativity Principle will explain why we do not find ants in contexts involving aerial predators. 31

• -

Finally, the nasal scream is clearly an agonistic/distress call, used only in extreme fights between males and then usually by the loser (Fuller 2013). 32

We tentatively posit the lexical entries in (83), which must be complemented by the Informativity Principle in (12). At this preliminary stage, we will make the assumption that individual calls are complete utterances, and that all non-boom calls are alternatives to each other, as stated in (84). We treat pyow as a general alert call, ant as a serious alert call, and we borrow the entry for ka in (83)b1 from our analysis of Putty-nosed hacks (we have provided an alternative version as (83)b2, inspired by our arousal-based treatment of hack in Section 4.4). Informativity relations among calls are summarized in (85). We continue to use an alarm parameter to ensure that repetitions have some semantic effect, although this will not play a role in our discussion.

(83)

Blue monkey semantics

For any alarm parameter a≥0,

• a.

$\llbracket \textit{pyow} \rrbracket^{a} = 1$ iff there is an alert and the alarm level is at least a.

• b1.

Version 1: $\llbracket \textit{ka} \rrbracket^{a} = 1$ iff there is a serious non-ground movement-related alert and the alarm level is at least a.

• b2.

[Version 2: $\llbracket \textit{ka} \rrbracket^{a} = 1$ iff there is an alert causing high arousal and the alarm level is at least a.]

• c.

$\llbracket \textit{ant} \rrbracket^{a} = 1$ iff there is a serious alert and the alarm level is at least a.

• d.

$\llbracket \textit{nasal-scream} \rrbracket^{a} = 1$ iff there is distress in a situation of male-to-male aggression and the alarm level is at least a.

• e.

If w is any call and S is any sequence, $\llbracket wS \rrbracket^{a} = 1$ iff $\llbracket w \rrbracket^{a} = 1$ and $\llbracket S \rrbracket^{a+1} = 1$.

(84)

Blue monkey alternatives

Individual calls are complete utterances, and all non-boom calls are alternatives to each other

(85)

Informativity relations among non-boom Blue monkey sentences according to (83) (higher=strictly more informative). Note that we assume that if there is a serious non-ground movement-related alert, then there is a serious alert – which is why ka appears above ant.
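The informativity relations among pyow, ant and ka (Version 1) can be checked mechanically. The sketch below rests on simplifications that are ours: situations are reduced to three boolean features, the alarm parameter and the nasal scream are ignored, and the Informativity Principle is applied categorically, whereas it need not apply systematically in practice.

```python
from itertools import product

FEATURES = ("alert", "serious", "nonground_movement")

def consistent(s):
    # Assumed entailments: a serious non-ground movement-related alert is a
    # serious alert, and a serious alert is an alert.
    return ((not s["nonground_movement"] or s["serious"])
            and (not s["serious"] or s["alert"]))

SITUATIONS = [s for s in (dict(zip(FEATURES, v))
                          for v in product([False, True], repeat=3))
              if consistent(s)]

# Literal meanings from (83)a, (83)b1 and (83)c, without the alarm level.
MEANING = {
    "pyow": lambda s: s["alert"],
    "ant": lambda s: s["serious"],
    "ka": lambda s: s["nonground_movement"],
}

def extension(call):
    """Indices of the situations in which the call is literally true."""
    return {i for i, s in enumerate(SITUATIONS) if MEANING[call](s)}

def more_informative(c1, c2):
    """c1 is strictly more informative than c2 (true in strictly fewer situations)."""
    return extension(c1) < extension(c2)

def strengthened(call, s):
    """Literal meaning plus the negation of every more informative alternative."""
    return MEANING[call](s) and not any(
        more_informative(alt, call) and MEANING[alt](s) for alt in MEANING)

print(more_informative("ka", "ant"), more_informative("ant", "pyow"))  # True True
for s in SITUATIONS:
    print(s, [c for c in MEANING if strengthened(c, s)])
```

Under these assumptions each situation licenses at most one call: pyow for non-serious alerts, ant for serious alerts that are not non-ground movement-related, and ka for the latter.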

In (86), we provide three possible theories of the meanings of pyow and ant (where Theory I corresponds to the one sketched above). It is worth noting that if the Informativity Principle were absolute, the three options in (86) would make equivalent predictions.

(86)

Three theories of pyow and ant (informally stated, i.e. without alarm level)

• a.

Theory I (with ant more informative than pyow)

• b.

Theory II (with pyow more informative than ant)

• c.

Theory III (with neither pyow nor ant more informative than the other)

In Theories I and II, the logic of competition ensures that the weaker call gets enriched by applying the Informativity Principle, hence the strengthened meanings in (87). In Theory III, neither call entails the other, and thus the strengthened meanings are identical to the literal meanings in (86)c; but these are already equivalent to the meanings in (87).

(87)

Strengthened meanings of pyow and ant on all three theories in (86)

Importantly, however, we needn’t assume that the Informativity Principle applies systematically. In fact, in our pragmatic analysis of Campbell’s monkey calls, we posited that in the Tai forest the call krak has a literal meaning of general alert but a strengthened meaning of ‘serious ground threat’; and we took strengthening to apply in most but not in all cases, as there seemed to be residual ‘general alert’ uses of krak in that site. Something similar seems to hold of Blue monkey pyows in (82): this call occurs in all contexts, unlike ant, which is absent from Undisturbed and Branch contexts. This gives an advantage to the main analysis we develop in the text, namely Theory I in (86)a: because pyow is lexically specified as a general call, we expect it to occur in all contexts when its meaning is not strengthened. Theories II and III would have to explain why pyow can arise in serious threat contexts – though this could be done if it turned out that in such cases pyow appears late in sequences, when the seriousness of the alarm could have decayed. Theory III has the additional drawback that it must explain why the general alert call ant fails to occur in Branch and Undisturbed situations. We tentatively conclude that the analysis we develop in the text is preferable.

We leave for future research a discussion of the role of the nasal scream in the system, and in particular of its effect on competition relations when the Informativity Principle is taken into account. If the nasal scream enters in these relations, its effect will be limited anyway: since its use is so specific, no other call will be strictly more informative. And for the same reason, the enrichment it will give rise to will be very weak; specifically, with the informativity relations obtained in (85), its sole effect will be to enrich pyow with an inference that one is not in a situation of male-to-male aggression with distress. By contrast, the left-hand side of the informativity hierarchy in (85) will be strongly affected by the Informativity Principle. We already saw that pyow and ant should have a strengthened meaning that blocks their use in situations of serious aerial alert. In addition, pyow will be enriched with a not ant component, which predicts that it should not generally be used in cases of serious alert.

7.2.2 Comparison with Putty-nosed monkeys

Several interesting questions are raised by the comparison between Blue monkey and Putty-nosed monkey calls: (i) how similar are their repertoires and what does this tell us about their evolutionary history? (ii) how similar is their syntax?

7.2.2.1 Repertoire comparison and evolution

Let us first point out the acoustic and semantic similarities between the two repertoires.

Acoustic structure: the call types used by male Blue monkeys are remarkably similar to those used by male Putty-nosed monkeys, as seen in (88). Booms and pyows, for example, are acoustically matched between species, and the Blue monkey ka and katrain are clearly counterparts to the Putty-nosed hack, which is used singly or in a rapidly repeated string.

(88)

Spectrographs of Blue monkey calls (top line) vs. Putty-nosed monkey calls (bottom line).

Representative spectrographs (produced in Raven 1.4, by JF) for each call type in the vocal repertoire of adult male Blue monkeys (Cercopithecus mitis) and adult male Putty-nosed monkeys (Cercopithecus nictitans), showing considerable similarity between the two species. Samples are from field recordings (by JF and KA) in wild populations. Names for call types are from published accounts (Fuller 2014; Arnold et al. 2006a).

Although published reports of male Putty-nosed calls do not name a counterpart to Blue monkey ants, Arnold and Zuberbühler (2006b) describe Putty-nosed pyows as having easily distinguishable long and short variants. The “short pyows” of Putty-nosed monkeys are extremely similar to Blue monkey ants, suggesting a homology possibly overlooked due to differences in categorization methods. If so, only the Blue monkey nasal scream is left without an obvious counterpart in the Putty-nosed repertoire. This may be an actual difference between the two repertoires; however, given the extreme rarity of the Blue monkey nasal scream (see above), the call might simply have been missed in Putty-nosed monkeys.

Semantics: Despite differences in field methods and data coding among researchers, there are several obvious similarities in how Blue monkeys and Putty-nosed monkeys use their calls. The clear association with aerial predators of the Blue monkey ka/katrain is observed in the acoustically similar Putty-nosed hack. In both species, pyows are used across a wide variety of contexts. Importantly, the Informativity Principle might make it possible to posit a relatively stable semantics across Blue monkey and Putty-nosed pyows despite the fact that in Blue monkeys they are not used at much more than chance in ground predator situations; if this turns out to be a difference with Putty-nosed pyows, this could be entirely due to the competition with ants rather than to an intrinsic difference in the lexical semantics of pyows across the two species.

Should we conclude that calls that “match” between species are homologous traits, i. e. are inherited by common descent? This is a plausible hypothesis in view of the close genetic relatedness of these two species, but more work is needed – in particular to compare these repertoires to those of related cercopithecines. In addition, the results of cross-species comparison should in principle converge with Fuller’s method of internal reconstruction, as Fuller’s method should make it possible to determine which calls resulted from the Fission of earlier calls. But it is too early to report any real results, and thus the application of these methods is left for future research.

7.2.2.2 Syntax

Two questions about the syntax would be of great interest.

• First, in diagnosing the repertoire of male Blue monkeys, Fuller (2013) distinguishes between kas and katrains. Are the latter just sequences of the former, possibly with induced acoustic differences? And if so, should a similar distinction be drawn among Putty-nosed hack sequences?

• Second, and most importantly, do we find Blue monkey counterparts of Putty-nosed pyow-hack sequences? Murphy et al. (2013) cautiously answered in the negative, though the data were from 20 vocal episodes only. Fuller’s data from natural observations and field experiments indicate that there are instances of pyows followed by kas or katrains, though such sequences were quite rare. 33 But the crucial question is whether these are correlated with group movement as pyow-hack sequences are in Putty-nosed monkeys. It would thus be particularly interesting to assess the correlation between group movement and the appearance of putative pyow-ka sequences, if possible using the same methods and criteria as in Arnold and Zuberbühler’s Putty-nosed studies. 34

7.3 Calls for the ages: the example of boom

Having seen the potential interest of ‘local’ comparisons among closely related monkey species, we turn to a case in which it might be possible to reconstruct the ‘deep history’ of a call, boom, which is found across cercopithecines.

7.3.1 Form and function

Three spectrograms of booms are given in (89)a-b, from the species Cercopithecus pogonias, wolfi ((89)a) and neglectus (= de Brazza monkey) ((89)b). As Gautier-Hion et al. (1999) emphasize, the production of booms in de Brazza monkeys is accompanied by a specific postural behavior, as well as tree shaking; postural changes are also represented in (89)b.

(89)

Production of boom

Spectrographic analyses of loud call sequences of three Cercopithecini species beginning with booms (from Gautier-Hion et al. 1999). (a) Pairs of booms of Cercopithecus pogonias (top) and Cercopithecus wolfi (bottom). (b) In this graph, a male Cercopithecus neglectus (= de Brazza monkey) is represented as sitting on a branch while producing a pair of booms, separated by a third one, which corresponds to the filling of the air sacs.

Superficial inspection reveals that across species booms are highly distinct compared to other calls, since (i) their production requires highly developed air sacs; (ii) they often come in pairs, in triples or in quadruples, whereas most other calls come in sequences of varying lengths; (iii) they are produced at a very low pitch (120–140 Hz) and have low attenuation rates, as emphasized by Waser and Waser (1977) (one of their functions might relate to intergroup spacing). Finally, (iv) across cercopithecines, they are used in situations that do not involve predation. This can be seen rather clearly in the data used by Ouattara et al. (2009a, 2009b) in (19): booms entirely fail to occur in Eagle and in Leopard contexts; by contrast, they are found in situations of Intergroup encounters, Tree fall, and Coherence and travel. The same trend is seen in (82) for Blue monkey booms: they are almost absent from situations involving aerial predators, and uncommon in situations involving terrestrial predators. By contrast, they are particularly common in undisturbed situations and in ones involving falling branches. Specific functions that have been suggested in the literature include group cohesion, intergroup spacing, and mate attraction (see for instance Fuller 2013 and Gautier-Hion et al. 1999).

7.3.2 Distribution in cercopithecines and evolution

Can the current distribution of booms in cercopithecines tell us something about the evolutionary history of these calls? Based on sources collated by Gautier, we have highlighted on the phylogenetic trees from Guschanski et al. (2013) those cercopithecines which are currently believed to have booms. The result, in (90), suggests two polar hypotheses. One is that booms were present in the common ancestors of Cercopithecus hamlyni, pogonias, mitis and nictitans, which lived approximately 7 million years ago; if so, they might have been lost in other species of cercopithecines (or might have been missed in extant descriptions of repertoires). The other is that booms didn’t exist in these ancestors, and evolved independently at least three times in cercopithecines.

(90)

The distribution of boom and of air sacs

Phylogenetic tree of cercopithecines (from Guschanski et al. 2013), with boldfaced names for species that have booms, # for species that have air sacs (all species with booms have air sacs, but some species that have air sacs don’t have booms), and ? if it is unknown whether the species has air sacs. If nothing is indicated, the species has no booms, and only undeveloped air sacs.

It is of course possible to have intermediate theories. For instance, there may have been two separate evolutions of booms, one in an ancestor of Cercopithecus hamlyni, and one in a common ancestor of the Mona group and of the Nictitans group (Mitis/Albogularis-Subgroups). On this view, one would need to explain how booms disappeared in several subfamilies that are also descended from this ancestor.

On the other hand, it does seem likely that the common ancestor of the mitis group, which according to this phylogeny lived approximately 2.5 million years ago, had booms: since all extant mitis subgroups have booms, it is parsimonious to posit that their common ancestor had them as well, and that they weren’t lost. Similarly for the group that includes Pogonias, Mona, Campbell and Neglectus: it would seem parsimonious to posit that their common ancestor had booms, more than 5 million years ago. 35 We take it to be an extraordinary fact about monkey languages that part of their history can apparently be reconstructed over millions of years.
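The parsimony reasoning above can be made concrete with a standard small-parsimony count (the Fitch algorithm). The following is only a minimal sketch under stated assumptions: the four-leaf tree and the presence/absence values are a hypothetical toy example, not the phylogeny of Guschanski et al. (2013) or the data in (90); the names `fitch`, `toy_tree`, `all_boom` and `one_without` are ours; and gains and losses of booms are assumed to be equally costly.

```python
from typing import Dict, Set, Tuple, Union

# A tree is either a leaf name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, has_boom: Dict[str, bool]) -> Tuple[Set[bool], int]:
    """Return (candidate states at the root, minimal number of state changes).

    Assumption: a gain and a loss of booms each count as one change.
    """
    if isinstance(tree, str):
        # Leaf: the state is simply the observed presence or absence.
        return {has_boom[tree]}, 0
    left_states, left_cost = fitch(tree[0], has_boom)
    right_states, right_cost = fitch(tree[1], has_boom)
    shared = left_states & right_states
    if shared:
        # The two subtrees agree on some state: no extra change is needed.
        return shared, left_cost + right_cost
    # The subtrees disagree: one change must have occurred below this node.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical clade with four extant species A-D (not real species data).
toy_tree: Tree = (("A", "B"), ("C", "D"))
all_boom = {"A": True, "B": True, "C": True, "D": True}
one_without = {"A": True, "B": True, "C": True, "D": False}

print(fitch(toy_tree, all_boom))     # root must have booms, zero changes
print(fitch(toy_tree, one_without))  # root has booms, one change (a loss in D)
```

When every extant species in the toy clade has booms, the only zero-change reconstruction assigns booms to the common ancestor, which mirrors the argument made for the mitis group. Weighting gains and losses differently would require a different algorithm and is not attempted here.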

Still, the study of booms should not be based solely on information that pertains to cercopithecines. One key prerequisite for booms is the existence of air sacs. This can be determined biologically and distributionally, as in (90): all species with booms have air sacs, but some species with air sacs (e. g. Cercopithecus preussi) don’t have booms. In order to assess different theories of the evolution of booms, one would need to determine (i) how likely it is for air sacs to appear or disappear in a certain time period, and (ii) how likely booms are to emerge on the assumption that air sacs are present, a relevant question since Fitch (2006) notes that air sacs appear to have evolved independently in several animal lineages. In the case of great apes, all have air sacs, but these are only embryonic in humans – which leads Fitch (2006) to posit that they existed in the ancestors of all apes and were lost in humans. Addressing a similar question for air sacs in cercopithecines might be the first step towards an understanding of the evolution of booms.

7.3.3 Comparative syntax of booms

We expect that future research will explore the comparative syntax of booms across cercopithecines. In Campbell’s monkeys, nearly all the booms we have in our data appear as a single pair at the beginning of sequences. But interestingly the situation is different in Mona monkeys, which are part of a sister group to Campbell’s monkeys (although in the phylogeny of Guschanski et al. (2013) their most recent common ancestor lived more than 5 million years ago). In data collected by Glenn (1996), booms given alone usually appear as a single pair; but booms that precede what she calls ‘low hack series’ usually come in pairs of pairs, and sometimes also in triple and quadruple pairs. An exploration of the data and of possible analyses would be of great interest.

8.1 Methodology

On a methodological level, the development of a ‘formal primate linguistics’ should help sharpen existing analyses and precipitate the emergence of new ones. At this very early stage, it seems wise to pit several theories against each other and to assess the advantages and drawbacks of each, rather than to jump to grand conclusions on the basis of shaky generalizations. We take the first order of business to be to establish a clear methodology, one in which formal models make it possible to derive crucial predictions, which can then be tested on the basis of experimental and observational data. While the present paper pertains to monkey calls, it should be clear that the same methodology can be applied to other systems of animal communication; our explorations can thus be seen as a contribution to ‘primate linguistics’, and possibly to a broader field of ‘animal linguistics’ (see for instance Yip 2006; Berwick et al. 2011).

8.2 Linguistic modules

On a substantive level, our syntactic generalizations were modest and could be handled with very simple finite state grammars. It would be interesting to explore in future research (i) whether all monkey languages can indeed be described in such simple terms, especially when larger databases are considered, and (ii) if so, which subset of finite state grammars best characterizes the syntax of these languages (see for instance Pullum and Rogers 2006 and Rogers and Pullum 2011).
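To give a sense of how little machinery a simple finite state grammar requires, here is a minimal sketch of a deterministic finite-state acceptor. It encodes a hypothetical pattern loosely inspired by the Campbell's generalization in Section 7.3.3: a non-empty sequence consisting of an optional single pair of booms followed by any number of non-boom calls. The state names, the call labels used in the examples, and the pattern itself are illustrative assumptions, not a grammar proposed in this paper.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical state names for the toy acceptor.
START, ONE_BOOM, TWO_BOOMS, BODY = "start", "one_boom", "two_booms", "body"
ACCEPTING = {TWO_BOOMS, BODY}

# Transitions are keyed by (state, is_boom); missing keys mean rejection.
TRANSITIONS: Dict[Tuple[str, bool], str] = {
    (START, True): ONE_BOOM,      # first boom of the initial pair
    (START, False): BODY,         # sequence without booms
    (ONE_BOOM, True): TWO_BOOMS,  # second boom completes the pair
    (TWO_BOOMS, False): BODY,     # non-boom calls follow the pair
    (BODY, False): BODY,          # further non-boom calls
}


def accepts(calls: Iterable[str]) -> bool:
    """Return True iff the call sequence fits the toy pattern."""
    state = START
    for call in calls:
        nxt: Optional[str] = TRANSITIONS.get((state, call == "boom"))
        if nxt is None:
            return False
        state = nxt
    return state in ACCEPTING


print(accepts(["boom", "boom", "krak", "krak"]))  # True
print(accepts(["krak", "boom", "boom"]))          # False: booms are not initial
print(accepts(["boom", "krak"]))                  # False: single boom
```

Because the acceptor only needs four states and no memory beyond the current state, it falls well within the finite state class; finding the smallest subclass that covers real data is the open question raised above.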

Our semantics mostly relied on simple propositional meanings, although we did posit a non-trivial semantics for the Campbell’s suffix -oo. While it would have been tempting to posit predicative meanings rather than propositional ones for some complex calls, we believe that this should only be done in the face of strong evidence: predicative types are more expressive and thus less constrained than propositional ones, hence theories might lose in explanatory force if predicative meanings are multiplied without strict necessity. The device of an alarm parameter added a bit of complexity to our analyses, but it had the benefit of offering a semantic distinction among sentences that only differed in the number of repetitions of some calls. Still, the same result could be achieved within a more modular analysis in which the calling rate has a semantic/pragmatic effect per se, independently from the semantic contribution of sentences; we leave this possibility for future research.

Our pragmatics was largely based on implicature-like rules of informativity-based competition among calls or sequences, although we did explore in our analysis of Putty-nosed semantics the possibility of using competition based on Urgency.
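One way to picture informativity-based competition is to treat the meaning of a call as the set of situations in which it is true, and to say that a call is blocked in a situation when a competing call is also true there and strictly more informative (its meaning is a proper subset). The sketch below makes this concrete under strong simplifying assumptions: the two-call lexicon, the situation labels, and the function names are hypothetical and chosen for illustration only, and the competition rule is a simplified stand-in for the rules discussed in this paper.

```python
from typing import Dict, FrozenSet, List

Meaning = FrozenSet[str]  # set of situation labels in which a call is true


def true_calls(lexicon: Dict[str, Meaning], situation: str) -> List[str]:
    """Calls whose truth conditions are satisfied in the situation."""
    return sorted(call for call, meaning in lexicon.items() if situation in meaning)


def unblocked_calls(lexicon: Dict[str, Meaning], situation: str) -> List[str]:
    """True calls that have no strictly more informative true competitor."""
    candidates = true_calls(lexicon, situation)
    return [
        call
        for call in candidates
        # '<' on frozensets tests for a proper subset, i.e. strictly more informative.
        if not any(lexicon[other] < lexicon[call] for other in candidates)
    ]


# Hypothetical lexicon: a general alert call and a more specific aerial call.
toy_lexicon: Dict[str, Meaning] = {
    "general": frozenset({"eagle", "leopard", "tree_fall"}),
    "aerial": frozenset({"eagle"}),
}

print(true_calls(toy_lexicon, "eagle"))         # ['aerial', 'general']
print(unblocked_calls(toy_lexicon, "eagle"))    # ['aerial']
print(unblocked_calls(toy_lexicon, "leopard"))  # ['general']
```

In the toy lexicon the general call is true in eagle situations but is outcompeted there, so its observed distribution is narrower than its lexical meaning, which is the kind of effect the competition-based analyses rely on.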

A constant theme hovered over our investigations: in each case, we had to ask in detail what was the division of labor between syntax, semantics, pragmatics, and properties of the environmental context. We gave rather different answers in different analyses; for instance, our lexicalist theory of Campbell’s calls had little place for pragmatics, whereas in our pragmatic alternative, rules of competition among calls played a crucial role. And when we turned to Titi calls, properties of the environmental context turned out to be crucial to understand their complex distribution. We believe that the division of labor among linguistic modules, which has played an important role in recent human linguistics, will turn out to be crucial in monkey linguistics as well.

Finally, phonetics and phonology were largely absent from our investigations, but this is primarily an effect of the particular areas of interest and competence of the authors. It is clear that a more detailed understanding of monkey phonetics and phonology will prove essential to all aspects of our enterprise – in particular to the analysis of monkey morphology, but also to the ‘evolutionary monkey linguistics’ we sketched at the end of this paper.

8.3 Theory of truth

Within our semantic investigations, we developed a very standard theory of truth, and thus accounted for generalizations of the form: If sentence S is uttered in situation c, then S is true if and only if ___ holds in c. On the assumption that sentences are only uttered if they are (thought to be) true, we make predictions of the form: If sentence S is uttered in situation c, then ___ holds in c. Importantly, such a theory does not seek to predict which sentences are uttered in a given situation; this would require the converse conditional, of the form: If ___ holds in situation c, then sentence S is uttered in c. This is also not something that semantic theories for human languages seek to deliver; whether it is a reasonable goal for monkey languages could be explored in the future.
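The asymmetry between the two conditionals can be illustrated in a toy setting. In the sketch below, a theory of truth is a mapping from sentences to predicates on situations; it can be checked against observed utterances (each uttered sentence should be true in its situation), but it remains silent on which of the true sentences is actually uttered. The sentence labels, situation features, and function names are hypothetical and serve only as an illustration.

```python
from typing import Callable, Dict, List, Tuple

Situation = Dict[str, bool]
TruthCondition = Callable[[Situation], bool]

# Hypothetical truth conditions for two toy sentences.
truth_conditions: Dict[str, TruthCondition] = {
    "S_aerial": lambda c: c["aerial_threat"],
    "S_alert": lambda c: c["aerial_threat"] or c["ground_threat"],
}


def consistent_with_theory(observations: List[Tuple[str, Situation]]) -> bool:
    """Check the prediction: if S is uttered in c, S's truth condition holds in c."""
    return all(truth_conditions[s](c) for s, c in observations)


def sentences_true_in(c: Situation) -> List[str]:
    """All sentences true in c; the theory does not say which one is uttered."""
    return sorted(s for s, cond in truth_conditions.items() if cond(c))


eagle: Situation = {"aerial_threat": True, "ground_threat": False}
leopard: Situation = {"aerial_threat": False, "ground_threat": True}

print(consistent_with_theory([("S_alert", eagle), ("S_alert", leopard)]))  # True
print(consistent_with_theory([("S_aerial", leopard)]))                     # False
print(sentences_true_in(eagle))  # ['S_aerial', 'S_alert']
```

The last line shows why the converse conditional is a separate matter: two sentences are true in the eagle situation, and choosing between them requires something beyond the theory of truth, for instance pragmatic competition.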

We also remained quite conservative in assuming that the same theory of truth applies to the speaker and to the hearers. In human languages, this is a reasonable assumption since the same individuals can be speakers and hearers. But for monkey languages this need not be the case: females do not usually produce the same alert calls as males, and thus it is conceivable that their comprehension of meaning is not quite the same as that of the males. Here too, it is too early to tell whether it would be fruitful to explore different theories of truth for speakers and hearers.

More radically, one might also ask in the future whether a theory of monkey truth is the right way to go. We argued that, at a minimum, monkeys must know under what conditions a call is or isn’t applicable – and the bipartition between ‘applicable’ and ‘inapplicable’ is just the distinction between ‘true’ and ‘false’ under a different name. Still, our analyses remained linguistic in that they provided an analysis of calls that does not obviously extend to other information-bearing phenomena in nature. One could explore a more minimal analysis in the future. In a nutshell, the motivation for a less linguistic analysis could go like this:

• There is widespread inter-species comprehension among forest species – as noted, Diana monkeys react appropriately to Campbell’s calls and conversely (e. g. Zuberbühler 2002). It is likely that general cognitive abilities are responsible for this capacity: just as an eagle shriek is indicative of eagle presence, any eagle-related call from species X heard by species Y is understood by Y to be indicative of eagles.

• But this might suggest that there is nothing specific about the comprehension of the alert calls of species X by members of species X. This is of course an empirical question; one key element is whether members of species X understand the calls of species X in a more fine-grained fashion than the members of species Y do (especially if species Y is only distantly related to species X). If no difference is found, this might suggest that there is nothing specifically linguistic about a theory of call comprehension.

• This might still leave open the possibility that the conditions under which calls are produced are quite specific, and possibly not related to general reasoning abilities. If so, we might posit a kind of speaker-bound theory of truth as we have done in this paper, with no corresponding hearer theory of truth. If not, it might be that both the speaker and hearer sides need to be analyzed using general reasoning abilities.

Of course it remains to be seen how such a semantics based on general reasoning alone could deal with the data we have adduced; but this would be an interesting problem for future research.

8.4 Evolution

We were cautious not to claim that monkey languages share non-trivial properties with human language. What counts as ‘non-trivial’ lies in the eye of the beholder, and one could take the possible existence of complex calls and of suffixes, or the existence of implicatures, to be such ‘non-trivial properties’. Arguably, however, these are extremely natural properties for any linguistic system which (i) is severely limited by the size of the vocabulary, as is likely the case for monkey languages, and which (ii) conveys information that has greater utility when it is more specific. While we hope, like everybody else, that studies of monkey languages will eventually provide insights into the biological evolution of human language, we think that this will first require a good understanding of the systems to be compared – hence the importance of the development of a formal monkey linguistics. On the other hand, we are convinced that it is now possible to approach the simpler question of the ‘local’ evolution of monkey languages: there are enough diverse species with partly shared call systems that one can hope to gain real insights into their evolutionary history. The development of an ‘evolutionary monkey linguistics’ would thus seem to us to be a topic of great interest, and it should offer a fertile testing ground for theories of language evolution, in particular for game-theoretic analyses of the evolution of meaning (e. g. Skyrms 2010 and Franke and Wagner 2014).

If one is interested in the evolution of human language, one should apply the present methods to the development of an ape linguistics, which should serve as a particularly useful point of comparison for human linguistics; and since apes have not just vocalizations but also rich gestural inventories (e. g. Genty et al. 2009, Hobaiter and Byrne 2011), both modalities should be relevant for this further project.

8.5 Linguistics

Finally, we hope to have shown that non-human primate communication systems can be illuminated with methods from formal linguistics. Even if the properties of these systems are quite distinct from those of human language, linguists can bring to the table their understanding of general issues of modularity as well as particular techniques they have developed to deal with them. Primate data could also allow linguists to ask in an empirically satisfying fashion questions that are currently rather speculative in human language, pertaining to the typology of linguistic systems within related species, and to their evolutionary history. In the short term, we call on the field to follow the lead of several linguistics journals in displaying openness to primate work that uses linguistic methods. 36 In the long term, we hope that primate linguistics will become a sub-area of linguistics simpliciter, just as primate cognition is now a sub-area of cognitive psychology.

Supplementary materials: Colobus data

We include below Guereza Colobus data from Kaniyo Pabidi (as in (49)) and from Sonso, as well as King Colobus data. Conventions are the same as in (49). Thus in each case, each box represents the response to one variant of a specific stimulus, as described at the top of the columns. The first line in the column header represents the predator type (e. g., Eagle, Leopard) and the second line the way it was induced, e. g., through a playback of its “shrieks” or “growls”, or through the playback of calls from Diana monkeys or Black and White Colobus (bwC) as produced in response to such a predator’s acoustic manifestation. For conciseness and legibility, no more than 10 groups (or sentences) of calls are represented, and no more than 15 calls within each of these groups/sentences are represented.

(91)

Guereza Colobus data from Kaniyo Pabidi

(92)

Guereza Colobus data from Sonso

(93)

King Colobus data

Acknowledgments

Special thanks to Lucie Ravaux for considerable practical help with the manuscript. Among others, Lucie Ravaux prepared the references and some of the figures. We are very grateful to Hans-Martin Gärtner and Manfred Krifka for detailed suggestions, and to Shane Steinert-Threlkeld for sending us a list of typos found in the penultimate version. We are also grateful to Derek Murphy and Sarah Papworth for sharing with us some of their Blue monkey data, pertaining to the potential absence of pyow-hack sequences in that species. We hope the latter topic, which is only alluded to in this piece, will be the object of further studies in the future.

Grant acknowledgments:

Arnold: The research leading to these results received funding from the Leverhulme Trust.

Cäsar: The research leading to these results received funding from the CAPES-Brazil, FAPEMIG-Brazil, S.B. Leakey Trust and the University of St Andrews.

Chemla, Kuhn and Schlenker: Research by Schlenker and Chemla was conducted at Institut d’Etudes Cognitives, Ecole Normale Supérieure – PSL Research University. Institut d’Etudes Cognitives is supported by grants ANR-10-LABX-0087 IEC and ANR-10-IDEX-0001-02 PSL.

Schlenker: The research leading to these results received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP/2007–2013) / ERC Grant Agreement n°324115-FRONTSEM (PI: Schlenker).

Ryder: Ryder was partly funded by a CNRS-PSL PEPS grant.

Zuberbühler: The research leading to these results received funding from the European Research Council under ERC grant ‘Prilang 283871’ and also from the Swiss National Science Foundation under grant ‘FN 310030_143359/1’. The project also benefited from the support of the Centre Suisse de Recherches Scientifiques en Côte d’Ivoire and Taï Monkey Project.

References

• Arnold, Kate & Klaus Zuberbühler. 2006a. The alarm calling system of adult male putty-nosed monkey Cercopithecus nictitans martini. Animal Behaviour 72. 643–653.

• Arnold, Kate & Klaus Zuberbühler. 2006b. Semantic combinations in primate calls. Nature 441. 303.

• Arnold, Kate & Klaus Zuberbühler. 2008. Meaningful call combinations in a non-human primate. Current Biology 18(5). R202–R203.

• Arnold, Kate & Klaus Zuberbühler. 2012. Call combinations in monkeys: Compositional or idiomatic expressions? Brain and Language 120(3). 303–309.

• Arnold, Kate & Klaus Zuberbühler. 2013. Female putty-nosed monkeys use experimentally altered contextual information to disambiguate the cause of male alarm calls. PLoS One 8(6). e65660.

• Berwick, Robert C., Kazuo Okanoya, Gabriel J.L. Beckers & Johan J. Bolhuis. 2011. Songs to syntax: the linguistics of birdsong. Trends in Cognitive Sciences 15(3). 113–121.

• Bott, Oliver, Sam Featherston, Janina Radó & Britta Stolterfoht. 2011. The application of experimental methods in semantics. In C. Maienborn et al. (eds.), Semantics. An International Handbook of Natural Language and Meaning, 303–319. Berlin: Mouton de Gruyter.

• Cap, Henri, Pierre Deleporte, Jean Joachim & David Reby. 2008. Male vocal behavior and phylogeny in deer. Cladistics 24. 917–931.

• Caro, Timothy M. 2005. Antipredator defenses in birds and mammals. Chicago: University of Chicago Press.

• Candiotti, Agnes, Klaus Zuberbühler & Alban Lemasson. 2012. Context-related call combinations in female Diana monkeys. Animal Cognition 15. 327–339.

• Cäsar, Cristiane, Richard W. Byrne, Robert J. Young & Klaus Zuberbühler. 2012a. The alarm call system of wild black-fronted titi monkeys, Callicebus nigrifrons. Behavioral Ecology and Sociobiology 66(5). 653–667.

• Cäsar, Cristiane, Richard W. Byrne, William Hoppitt, Robert J. Young & Klaus Zuberbühler. 2012b. Evidence for semantic communication in titi monkey alarm calls. Animal Behaviour 84. 405–411.

• Cäsar, Cristiane, Klaus Zuberbühler, Robert J. Young & Richard W. Byrne. 2013. Titi monkey call sequences vary with predator location and type. Biology Letters 9(5). 20130535.

• Chemla, Emmanuel & Raj Singh. 2014a. Remarks on the experimental turn in the study of scalar implicature: Part I. Language and Linguistics Compass 8(9). 387–399.

• Chemla, Emmanuel & Raj Singh. 2014b. Remarks on the experimental turn in the study of scalar implicature: Part II. Language and Linguistics Compass 8(9). 373–386.

• Chevallier, Coralie, Ira A. Noveck, Tatjana Nazir, Lewis Bott, Valentina Lanzetti & Dan Sperber. 2008. Making disjunctions exclusive. Quarterly Journal of Experimental Psychology 61(11). 1741–1760.

• Chomsky, Noam. 1957. Syntactic Structures. The Hague: Mouton.

• Chomsky, Noam. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.

• Collier, Katie, Balthasar Bickel, Carel P. van Schaik, Marta B. Manser & Simon W. Townsend. 2014. Language evolution: syntax before phonology? Proceedings of the Royal Society of London, Series B: Biological Sciences 281. 1788.

• Crockford, Catherine, Roman M. Wittig, Roger Mundry & Klaus Zuberbühler. 2012. Wild chimpanzees inform ignorant group members of danger. Current Biology 22(2). 142–146.

• Fitch, Tecumseh W. 2006. Production of vocalisations in mammals. Visual Communication 3. 145.

• Franke, Michael & Elliott O. Wagner. 2014. Game theory and the evolution of meaning. Language and Linguistics Compass 8/9(2014). 359–372, 10.1111/lnc3.12086.

• Fuller, James. 2013. Diversity of form, content, and function in the vocal signals of adult male blue monkeys (Cercopithecus mitis stuhlmanni): An evolutionary approach to understanding a signal repertoire. PhD thesis, Columbia University.

• Fuller, James. 2014. The vocal repertoire of adult male blue monkeys (Cercopithecus mitis stuhlmanni): a quantitative analysis of acoustic structure. American Journal of Primatology 76(3). 203–216.

• Gautier, Jean-Pierre. 1988. Interspecific affinities among guenons as deduced from vocalizations. In A. Gautier-Hion et al. (eds.), A Primate Radiation – Evolutionary Radiation of the African Guenons, 194–226. Cambridge, UK: Cambridge University Press.

• Gautier, Jean-Pierre, R. V. Drubbel & Pierre Deleporte. 2002. Phylogeny of the Cercopithecus lhoesti group revisited: combining multiple character sets. In M. Glenn & M. Cords (eds.), The Guenons: Diversity and Adaptations in African Guenons, 34–48. New York, USA: Plenum Press.

• Gautier-Hion, Annie, Marc Colyn & Jean-Pierre Gautier. 1999. Histoire naturelle des primates d’Afrique Centrale. Libreville, Gabon: Ecofac Editions. 162 pages.

• Genty, Emilie, Thomas Breuer, Catherine Hobaiter & Richard W. Byrne. 2009. Gestural communication of the gorilla (Gorilla gorilla): repertoire, intentionality and possible origins. Animal Cognition 12(3). 527–546.

• Glenn, Mary E. 1996. The Natural History and Ecology of the Mona Monkey (Cercopithecus mona Schreber 1774) on the Island of Grenada, West Indies. Ph.D. Dissertation, Northwestern University, Evanston, Illinois, USA.

• Grice, Paul. 1975. Logic and conversation. In P. Cole & J. Morgan (eds.), Syntax and Semantics, 3: Speech Acts. New York: Academic Press.

• Grubb, Peter, Thomas M. Butynski, John F. Oates, Simon K. Bearder, Todd R. Disotell, Colin P. Groves & Thomas T. Struhsaker. 2003. Assessment of the diversity of African primates. International Journal of Primatology 24(6). 1301–1357.

• Guschanski, Katerina, Johannes Krause, Susanna Sawyer, Luis M. Valente, Sebastian Bailey, Knut Finstermeier, Richard Sabin, Emmanuel Gilissen, Gontran Sonet, Zoltán T. Nagy, Georges Lenglet, Frieder Mayer & Vincent Savolainen. 2013. Next-generation museomics disentangles one of the largest primate radiations. Systematic Biology 62(4). 539–554.

• Hobaiter, Catherine & Richard W. Byrne. 2011. The gestural repertoire of the wild chimpanzee. Animal Cognition 14(5). 745–767.

• Hopcroft, John, Rajeev Motwani & Jeffrey Ullman. 2001. Introduction to automata theory, languages, and computation, 2nd edn. Addison-Wesley.

• Keenan, Sumir, Alban Lemasson & Klaus Zuberbühler. 2013. Graded or discrete? A quantitative analysis of Campbell’s monkey alarm calls. Animal Behaviour 85. 109–118.

• Kershenbaum, Arik, Ann E. Bowles, Todd M. Freeberg, Dezhz Z. Jin, Adriano R. Lameira & Kirsten Bohn. 2014a. Animal vocal sequences: not the Markov chains we thought they were. Proceedings of the Royal Society of London B: Biological Sciences 281(1792). 20141370.

• Kershenbaum, Arik, Daniel T. Blumstein, Marie A. Roch, Çağlar Akçay, Gregory Backus, Mark A. Bee & Veronica Zamora‐Gutierrez. 2014b. Acoustic sequences in non‐human animals: a tutorial review and prospectus. Biological Reviews.

• Kuhn, Jeremy, Sumir Keenan, Kate Arnold & Alban Lemasson. 2014. On the /-oo/ ‘suffix’ of Campbell’s monkeys (C. campbelli). Manuscript. http://jeremykuhn.net/papers/Kuhn-oo-suffix-10-2014.pdf.

• Lemasson, Alban, Karim Ouattara, Hélène Bouchet & Klaus Zuberbühler. 2010. Speed of call delivery is related to context and caller identity in Campbell’s monkey males. Naturwissenschaften 97(11). 1023–1027.

• Montague, Richard. 1970a. English as a formal language. In B. Visentini et al. (eds.), Linguaggi nella società e nella tecnica, 189–224. Milan: Edizioni di Comunità.

• Montague, Richard. 1970b. Universal grammar. Theoria 36. 373–398.

• Murphy, Derek, Stephen E. Lea & Klaus Zuberbühler. 2013. Male blue monkey alarm calls encode predator type and distance. Animal Behaviour 85(1). 119–125.

• Newman, John D. & David Symmes. 1982. Inheritance and experience in the acquisition of primate acoustic behaviour. In C. T. Snowdon et al. (eds.), Primate communication, 259–278. New York, NY: Cambridge University Press.

• Ouattara, Karim, Alban Lemasson & Klaus Zuberbühler. 2009a. Campbell’s monkeys use affixation to alter call meaning. PLoS ONE 4(11). e7808.

• Ouattara, Karim, Alban Lemasson & Klaus Zuberbühler. 2009b. Campbell’s monkeys concatenate vocalizations into context-specific call sequences. PNAS 106(51). 22026–22031.

• Perelman, Polina, Warren E. Johnson, Christian Roos, Hector N. Seuánez, Julie E. Horvath, Miguel A. M. Moreira, Bailey Kessing, Joan Ponitus, Melody Roelke, Yves Rumpler, Maria Paula, C. Schneider, Artur Silva, Stephen J. O’Brien & Jill Pecon-Slattery. 2011. A molecular phylogeny of living primates. PLoS Genetics 7(3). e1001342.

• Pullum, Geoffrey K. & James Rogers. 2006. Animal pattern-learning experiments: Some mathematical background. Ms. Radcliffe Institute for Advanced Study/Harvard University.

• Rainey, Hugo J., Klaus Zuberbühler & Peter J. B. Slater. 2004a. Hornbills can distinguish between primate alarm calls. Proceedings of the Royal Society B: Biological Sciences 271. 755–759.

• Rainey, Hugo J., Klaus Zuberbühler & Peter J. B. Slater. 2004b. The responses of black-casqued hornbills to predator vocalisations and primate alarm calls. Behaviour 141. 1263–1277.

• Rogers, James & Geoffrey K. Pullum. 2011. Aural Pattern Recognition Experiments and the Subregular Hierarchy. Journal of Logic, Language and Information 20. 329–342.

• Schel, Anne M., Sandra Tranquilli & Klaus Zuberbühler. 2009. The alarm call system of two species of black-and-white colobus monkeys (Colobus polykomos and Colobus guereza). Journal of Comparative Psychology 123(2). 136–150. Google Scholar

• Schel, Anne M. & Klaus Zuberbühler. 2009. Responses to leopards are independent of experience in Guereza colobus monkeys. Behaviour 146(12). 1709–1737. Google Scholar

• Schlenker, Philippe. to appear. The Semantics/Pragmatics Interface. To appear in M. Aloni & P. Dekker (eds.) Cambridge Handbook of Semantics. Cambridge: Cambridge University Press.

• Schlenker, Philippe, Emmanuel Chemla, Kate Arnold, Alban Lemasson, Karim Ouattara, Sumir Keenan, Claudia Stephan, Robin Ryder & Klaus Zuberbühler. 2014. Monkey semantics: two ‘dialects’ of Campbell’s monkey alarm calls. Linguistics and Philosophy 37(6). 439–501.

• Schlenker, Philippe, Emmanuel Chemla, Kate Arnold & Klaus Zuberbühler. 2016. Pyow-Hack revisited: Two analyses of putty-nosed monkey alarm calls. Lingua 171. 1–23.

• Schlenker, Philippe, Emmanuel Chemla, Cristiane Cäsar, Robin Ryder & Klaus Zuberbühler. to appear. Titi semantics: Context and meaning in Titi monkey call sequences. Natural Language and Linguistic Theory.

• Seyfarth, Robert M. & Dorothy L. Cheney. 1980. The ontogeny of vervet monkey alarm calling behavior: A preliminary report. Zeitschrift für Tierpsychologie 54(1). 37–56.

• Seyfarth, Robert M. & Dorothy L. Cheney. 1997. Some general features of vocal development in nonhuman primates. In M. Hausberger & C. T. Snowdon (eds.), Social influences on vocal development, 249–273. Cambridge: Cambridge University Press.

• Singh, Raj, Ken Wexler, Andrea Astle, Deepthi Kamawar & Danny Fox. 2015. Children interpret disjunction as conjunction: Consequences for theories of implicature and child development. Unpublished manuscript.

• Skyrms, Brian. 2010. Signals: evolution, learning, and information. Oxford: Oxford University Press.

• Stalnaker, Robert C. 2002. Common ground. Linguistics and Philosophy 25(5–6). 701–721.

• Ting, Nelson. 2008. Mitochondrial relationships and divergence dates of the African colobines: evidence of Miocene origins for the living colobus monkeys. Journal of Human Evolution 55(2). 312–325.

• Veselinović, Dunja, Agnes Candiotti & Alban Lemasson. 2014. Female Diana monkeys (Cercopithecus diana) have complex calls. New York University, ms.

• Waser, Peter M. & Mary S. Waser. 1977. Experimental Studies of Primate Vocalization: Specializations for Long‐distance Propagation. Zeitschrift für Tierpsychologie 43(3). 239–263.

• Wheeler, Brandon C. & Julia Fischer. 2012. Functionally referential signals: a promising paradigm whose time has passed. Evolutionary Anthropology 21. 195–205.

• Yip, Moira. 2006. The search for phonology in other species. Trends in Cognitive Sciences 10(10). 442–446. doi:10.1016/j.tics.2006.08.001.

• Zuberbühler, Klaus. 2002. A syntactic rule in forest monkey communication. Animal Behaviour 63(2). 293–299.

• Zuberbühler, Klaus. 2003. Referential signalling in non-human primates: cognitive precursors and limitations for the evolution of language. Advances in the Study of Behavior 33. 265–308.

• Zuberbühler, Klaus. 2009. Survivor signals: the biology and psychology of animal alarm calling. Advances in the Study of Behavior 40. 277–322.

Footnotes

• 1

In human language, a further distinction is needed between sentences and clauses that might appear within sentences. This distinction won’t be needed here.

• 2

We try to use the term ‘alert call’ with a broader meaning than ‘alarm call’: the latter only pertains to dangers, whereas the former can also be used for other noteworthy events.

• 3

A potential fourth call, wak, was argued by Keenan et al. (2013) to just be a variant of hok, on the basis of acoustic and clustering analyses. More precisely, they argued that if one wishes to treat wak as being different from hok, then one should also subdivide krak into distinct categories.

• 4

On the basis of an acoustic analysis, Kuhn et al. (2014) confirmed that the phonetic realization of -oo is indeed consistent with the suffixal hypothesis.

• 5

The one exception is the boom call, which Zuberbühler (2002) argues obeys a syntactic rule according to which it may only occur at the very beginning of sequences. It might be that the articulatory properties of boom are responsible for this fact (it is produced with filled air sacs and might require considerable energy, and hence a resting phase).

• 6

Importantly, different DNA methods may lead to different results – e.g. divergence dates may differ between Guschanski et al. (2013), based on mitochondrial DNA, and Perelman (2011), based on nuclear DNA. When details matter (as in the evolutionary considerations of Section 7), such differences should be revisited in future research.

• 7

Cercopithecinae include cercopithecini and papionini (the latter include macaques and baboons, among others).

• 8

See for instance Schlenker (to appear) for a survey of some standard approaches to the analysis of or.

• 9

For instance, it was found that the ‘enriched’, exclusive reading of or takes time to process, as is natural if it involves negating the alternative with and (see for instance Chevallier et al. 2008); and also that children acquire the ‘bare’ meaning before the ‘enriched’ one – again a natural result if comparing the sentence with its alternatives is somehow taxing (see for instance Singh et al. 2015).

• 10

We follow standard practice in writing s for the language made of the singleton {s}, and similarly for r and {r}.

• 11

In our pragmatic analysis of ‘real’ Campbell’s calls, we will make use of similar rules of competition, but solely at the single call level, for reasons to be outlined below. On the other hand, competitions among entire sentences will play a role in our pragmatic analysis of Putty-nosed calls.

• 12

For human languages, a more sophisticated capacity is usually assumed, namely the ability to determine what is common belief between the speaker and addressee; see for instance Stalnaker (2002) for discussion.

• 13

As noted in footnote 3, we leave wak out of our discussions because it is not clearly an independent call (Keenan et al. 2013 argued on the basis of a clustering analysis that it might be better treated as an instance of hok).

• 14

Still, Schlenker et al. (2014) noted that in the two more recent datasets they focused on, some facts went in the opposite direction: in Tai ‘Diana Leopard’ contexts and in Tiwai ‘predator Eagle’ contexts, there are respectively 21 and 66 hoks, but no hok-oos.

• 15

Neither Schlenker et al. (2014) nor the present summary explains why hok-oo is rare on Tiwai.

• 16

We could relativize this further to the caller’s epistemic state, by having: … the caller of s is aware of an alarm level of at least t – time(s).

• 17

They also have a boom call, which is produced very differently and is not indicative of alerts; it will play no role in the present discussion.

• 18

See Collier et al. (2014) for an alternative pragmatic analysis, and Schlenker et al. (2016) for a comparison between these two approaches.

• 19

In field experiments, model eagles are of course stationary, and ‘real’ eagles can be as well. Still, hacks are produced.

• 20

In Schel’s work, what we call here a ‘roar’ and transcribe as r was named a ‘roaring phrase’. We avoid the latter term in this piece due to its unwanted syntactic implications.

• 21

We note that this result is not altered by excluding transitions longer than a certain threshold, to ensure that in all cases we are looking at transitions within a phrase. For instance, restricting attention to intervals below 200 ms gives the values in (i). Furthermore, as is expected given our syntactic generalizations, when such a threshold is added, there are very few transitions from roar to snort, as seen in (i)c.

(i)

a. Average time from snort to snort: no such transition anymore

b. Average time from snort to roar: 76 ms (116 cases)

c. Average time from roar to snort: 119 ms (7 cases)

d. Average time from roar to roar: 121 ms (331 cases)

• 22

A potential argument in favor of this weakening is that s and r both appear in morning choruses, where their informational content is presumably very weak or non-existent.

• 23

The desired results could also be obtained with the sole alternatives {{s, r+}} if the Informativity Principle worked by negating the non-weaker meanings rather than just the stronger ones. On this view, s would trigger the inference that not r+ and r+ would trigger the inference that not s. This modified version of the Informativity Principle is in fact used in much recent work on implicatures in human languages (see Schlenker to appear for a recent survey). But the problems mentioned in Section 5.4.1 would still arise; one would need to take into account information about entire discourses to circumvent these difficulties.
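
The contrast between the two versions of the Informativity Principle can be sketched as follows. This is only an illustration under stated assumptions: meanings are modeled as sets of situations, the situation labels and the two toy meanings are hypothetical placeholders (not the lexical entries of the analysis), and they are chosen so that neither s nor r+ entails the other.

```python
# Toy model: a meaning is the set of situations in which a sequence is true.
# The situation labels and both meanings are hypothetical placeholders,
# chosen so that neither meaning is stronger than the other.
MEANINGS = {
    "s": frozenset({"ground", "both"}),
    "r+": frozenset({"aerial", "both"}),
}


def strictly_stronger(a, b):
    """True if meaning a is strictly more informative than b (proper subset)."""
    return a < b


def weaker_or_equal(a, b):
    """True if meaning a is at most as informative as b (superset or equal)."""
    return a >= b


def enrich(call, negate_non_weaker):
    """Strengthen the meaning of `call` by negating some of its alternatives.

    negate_non_weaker=False: negate only strictly stronger alternatives.
    negate_non_weaker=True:  negate every alternative that is not weaker.
    """
    meaning = MEANINGS[call]
    enriched = set(meaning)
    for alt, alt_meaning in MEANINGS.items():
        if alt == call:
            continue
        if negate_non_weaker:
            negated = not weaker_or_equal(alt_meaning, meaning)
        else:
            negated = strictly_stronger(alt_meaning, meaning)
        if negated:
            # Negating an alternative removes the situations where it is true.
            enriched -= alt_meaning
    return frozenset(enriched)
```

With these toy meanings, `enrich("s", negate_non_weaker=False)` leaves the meaning of s unchanged, because r+ is not strictly stronger, whereas `enrich("s", negate_non_weaker=True)` excludes every situation in which r+ is true, which corresponds to the inference that not r+.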

• 24

Note that due to the definition of alternatives in (72), only sequences that have the same number of calls can be alternatives to each other. Thus a sequence of the form A++ need not be an alternative to a sentence of the form B++, although the relations of informativity in (74) will still hold. See Schlenker et al. (to appear) for more details.

• 25

We ran Mann-Whitney tests, treating each call as an independent data point. The structure of the data forces us to merge all groups into one for these analyses, rather than studying a possible group effect (there is no group for which there are more than three calls for two of the situations). But we have no reason to believe that there exists such a group effect.

• 26

See Schlenker et al. (to appear) for a discussion of the cognitive implications of this reasoning, and for further data.

• 27

By contrast, we did discuss instances of variation among groups of Campbell’s monkeys, but these raised the same general issues as dialectal or cross-linguistic variation in human languages, where differences are unlikely to have a biological/genetic explanation.

• 28

Note that Gautier had access to other data as well, including DNA data; hence one cannot assert that he predicted on the basis of call comparison a phylogenetic tree that was only obtained later; rather, Gautier showed that there is a reasonable way to set up a similarity measure among calls that dovetails nicely with DNA data, and this similarity measure might in some cases help decide among competing phylogenies.

• 29

Ordination is a statistical method that uses variation in multiple variables to arrange objects in such a way that more similar objects are nearer each other and more dissimilar objects are further apart.

• 30

For example, Fuller (2013) notes that booms are typically associated with males approaching or being approached by females, suggesting they function to facilitate affiliative interactions.

• 31

We write ‘plausibly’ because the precise assumption we need is that if there is a serious aerial-related alert, then there is a serious alert. This follows if aerial alerts are on average at least as serious as alerts in general – which is probably the case.

• 32

“Nasal screams were rare, observed only 11 times during this study (<1 % of all vocal episodes). Usage was unambiguously associated with intense aggression between males.” (Fuller 2013: 74–75).

• 33

Of 10,321 observed vocal episodes, ~1 % were pyows followed by kas or katrains; looking at call combinations only (i.e. episodes in which a male produced more than one call type), however, the pyow-katrain sequences constitute ~10 % of combinations.

• 34

• (i)

Since ants could also be the counterparts of some subspecies of pyows, one could in principle study the effect of ant-ka sequences as well.

• (ii)

Different theories of Putty-nosed pyow-hack sequences don’t make quite the same predictions about other species. For non-compositional theories, there could be another species with the very same call system and environment as Putty-nosed monkeys but without pyow-hack sequences, since the semantic properties of these are not derived from anything else in the system (on this theory, one could still ask where in a phylogenetic tree pyow-hack sequences arose, essentially treating those as a complex call). For compositional/pragmatic theories, by contrast, the use of pyow-hack sequences follows from other properties of the semantics/pragmatics of the system, and thus if Blue monkeys have the same calls as Putty-nosed monkeys but lack pyow-ka sequences, one would expect that the semantics or pragmatics of these calls is in fact slightly different from that of their Putty-nosed counterparts.

• 35

At this point, it would be essential to combine the information we have about calls with the genetic information available in order to assess the probability of various scenarios. After all, the distribution of booms is a relevant fact when we seek to reconstruct the evolutionary history of cercopithecines; for instance, in case two phylogenies A and B are equally likely on the basis of genetic data, but A makes for a much more parsimonious analysis of the evolutionary history of booms, then A should presumably be preferred over B. Thus one should seek to develop a combined analysis of possible phylogenies based both on genetic information and on the distribution of booms.

• 36

So far, Linguistics & Philosophy, Lingua, Natural Language & Linguistic Theory, and Theoretical Linguistics have published (or accepted for publication) articles devoted to primate linguistics.

Published Online: 2016-07-05

Published in Print: 2016-07-01

Citation Information: Theoretical Linguistics, ISSN (Online) 1613-4060, ISSN (Print) 0301-4428,
