In 1999, things were simple. In my 1999 book, I argued that only humans have culture, language, social institutions, and everything else because only humans understood others as intentional agents (Tomasello 1999). But then new data started rolling in suggesting that great apes also understood that others have goals (e.g. they reacted to what another was trying to do, not what he actually did) and that others perceive things (e.g. they knew when a competitor saw food and when he did not). But of course they still did not have culture, language, and all the rest. Something was amiss.
In 2001 I attended a workshop here in Leipzig (hosted by Georg Meggle) on Margaret Gilbert's work on social facts. Reading her work, I decided to present a paper on “Can chimpanzees take a walk together?” The answer was of course that they cannot in the relevant sense, that is, they cannot make a joint commitment to take a walk together. And this provided a new way of looking at things. Previously I had thought that understanding others as intentional agents brought with it – somehow naturally and for free – the skills and motivations to share intentional states with one another in collaborating and communicating with them. The new ape data suggested that this was not the case: one can understand others as intentional agents for purposes of competing with them. So the answer must be that only humans both understand others as intentional agents and have the skills and motivations to share intentional states with them.
Consequently, my colleagues and I began to apply the basic concepts of philosophers of action such as Gilbert, Bratman, Searle, and Tuomela to our empirical problems. The result was one theoretical paper in Mind and Language (Tomasello and Rakoczy 2003) and another in Behavioral and Brain Sciences (Tomasello et al. 2005) arguing that indeed what most clearly distinguishes humans from other great apes, from a psychological point of view, is that humans operate with skills and motivations of shared intentionality. These papers were followed by several dozen studies conducted by my colleagues and me directly comparing great apes and human children on a variety of cognitive and social tasks. The resulting data provided general support for the theory, but at the same time generated some empirical surprises that required theoretical adjustments.
My goal in the current book was to summarize these new data and the current theoretical framework as it now stands. Because my goals were simultaneously empirical and theoretical, I decided to develop the account in the context of a hypothetical evolutionary scenario (broadly consistent with the paleoanthropological record). To focus things I decided to zero in on the process of thinking, as an occurrent phenomenon comprising three key sets of cognitive processes: (i) off-line cognitive representations; (ii) inferential simulations based on an understanding of causal, intentional, and logical relations; and (iii) cognitive self-monitoring. The resulting theoretical account thus represents an application of philosophical concepts of shared intentionality to empirical phenomena. But in making this application, I had to adjust and extend these concepts in ways that could potentially make a philosophical contribution, although it is also possible that the application is nothing more than that.
If the book does make a theoretical and/or philosophical contribution it is in distinguishing between joint intentionality and collective intentionality (the superordinate category being shared intentionality). Many people have characterized human uniqueness in terms of such things as language, culture, and social institutions, but it is difficult to imagine how human evolution might have jumped directly from chimp-like creatures to cultural creatures. Chimpanzees are constantly competing with one another and working against one another – cooperating on occasion, but in fairly limited and limiting contexts, such as teaming up in a fight – making it difficult to see how they could have made the leap. But perhaps there was some middle step in which humans’ ancestors became more cooperative, but not yet fully cultural. This would not solve the evolutionary problem in toto, but it would break it down into smaller and more manageable sub-problems. A hint that such a middle step is empirically possible is the fact that pre-linguistic human infants – well before they have become cultural creatures in any active sense – already collaborate and communicate with a partner in ways that other great apes cannot.
The theoretical move is to distinguish cases such as Bratman’s house painters and Gilbert’s walkers – who are essentially collaborating as dyadic partners with joint goals and joint commitments – from cases such as Searle’s patron at a French café enmeshed in the institutional reality of money, café owners, and social norms of café behavior, all in the context of governmental laws, licenses, and restaurant inspectors. The former cases embody what Darwall (2006) calls second-personal social relations: I have a joint goal with you; I am jointly committed with you; I am trying to communicate something to you. In the modern world second-personal interactions always take place in the context of an institutional reality, but perhaps there was an evolutionary moment during which things were only second-personal because there was not yet any significant form of cultural and/or institutional organization. Again, although the analogy should not be pushed too far, prelinguistic and just-linguistic human infants provide a kind of existence proof that such creatures can exist and interact in ways that already differentiate them from other apes.
And so the central theoretical structure of the book contrasts the three forms of intentionality: individual intentionality, as characteristic of great apes in general; joint intentionality, as characteristic of some early humans before culture, and, to some degree, of prelinguistic and just-linguistic human infants; and collective intentionality, as characteristic of modern human adults in their cultural and institutional realities. It may be that for philosophical purposes these distinctions are of no great theoretical moment. But to provide a compelling evolutionary account, we need to distinguish at least these three forms of intentionality (and perhaps, at some point, more).
Focusing on the occurrent process of thinking, in Chapter 2 I argue and present evidence that great apes do indeed think. Much evidence suggests that they cognitively represent entities and situations in the world in schematic fashion, that is, as perceptual abstractions in iconic format. These representations are created from voluntary acts of attention, and voluntary acts of attention are always about things relevant to the individual’s goals. There is therefore no problem of referential indeterminacy of the kind sometimes attributed to iconic representations because the original acts of attention were from the beginning interpreted with respect to goals. In addition, I argue that perhaps the most important representations for great apes are of “situations” or states of affairs. Although many theorists seem to consider representations of objects and simple events as basic, when the individual is facing a behavioral decision – for example, whether to climb a tree to take and eat some bananas – what one must attend to are “facts” relevant to the decision, for example, that the bananas are ripe, that there are no predators in the tree, that the tree will be easy to climb, etc. I argue, following Davidson and others, that the reason for this is the fact that goals are represented in such fact-like format already – my goal is not the banana but that I have the banana or that I eat the banana – and so I attend to fact-like situations that are relevant to that goal. If the cognitive representation of such fact-like situations is primate-wide, the evolution of propositional thought is a bit less mysterious.
Inspired by Bermudez (2003), I also argue that great apes not only make inferences but make patterns of inferences that fit the most basic logical paradigms. The trick is that the inferences are based on, and only on, causal and intentional relations. Great apes make inferences all day every day of the Modus ponens type: Poking with a stick causes ants to emerge. I poke. This causes ants to emerge. They also make exclusion inferences in contexts analogous to disjunctive syllogisms (if I know there is food in one of two cups, and cup A has no food, then cup B must have it), and also something like Modus tollens in which they use a proto-form of negation in terms of mutually exclusive contraries (presence-absence, noise-silence, etc.). Something like these forms also applies to their thinking in social contexts in terms of practical syllogisms such as: if he wants a banana, and he sees it in location A, then he will go to location A (and if he did not go to location A, then he either does not want the banana or he did not see it). A number of experiments also show that great apes self-monitor their own cognitive processes, so that, in some sense, as they are thinking they know what they are doing.
A colleague opined to me that many of the most difficult and contentious philosophical problems are already embedded in this great ape starting point. Agreed. But my goal in the book is only to explain uniquely human thinking, and my contention, backed by experimental studies (albeit, of course, experiments interpreted in certain ways) is that this is the great ape starting point. It presents many mysteries, but they are bracketed. Our question is how, from this starting point, to get full-blown human thinking.
The middle step between great apes and modern human thinking, as alluded to above, arose in some new forms of collaborative activity, specifically those structured by joint intentionality. In acts of joint intentionality, some early humans (the paleoanthropological details are not crucial here) began to form with one another joint goals toward mutually beneficial ends, structured by joint attention. In pursuing their joint goals structured by joint attention these early humans also recognized simultaneously different individual roles in the collaborative activity and different individual perspectives on their joint focus of attention. This dual-level structure of joint agency, I hypothesize, created a new form of practical activity spawning new types of experiences. In effect, collaborative activities pursued by a joint agent created for the individuals involved a shared world comprising distinct individual perspectives. This structure means that collaborating individuals create, through their acts of voluntary joint attention, perspectival cognitive representations, a step on the road to linguistic aspectuality.
In this middle step, early humans also began to communicate cooperatively with one another in unique ways, that is, to inform one another of things helpfully, through the natural gestures of pointing and pantomiming (Tomasello 2008). This created the basic ostensive-inferential structure of uniquely human communication in which a communicator intends that a recipient attend to something and infer something else as a result (e.g. attend to the crack in the branch as a potential danger for it breaking and you falling), and the recipient is motivated to do, and capable of doing, exactly that. This process involves recursive inferencing of the form: he intends that I attend to the crack in the branch (and that I infer its relevance to my current activities). This process can go smoothly only if there is common ground between the communicative partners, for example, that cracked branches may break, and this may cause falling and injury, and injury is bad, and so forth. In this process of communication, the communicator, in Meadian fashion, self-monitors her communicative act by simulating the recipient’s potential acts of attention and inference and then chooses her communicative means (e.g. whether to point to the crack in the branch or to pantomime a breaking branch or a falling person) as a result. Beyond great apes’ cognitive self-monitoring of individual acts, early humans now began to socially self-monitor, that is, to self-monitor their social and communicative acts from the point of view of the partner, a step on the road to the kind of normative self-monitoring characteristic of modern human cultural beings.
In sum, then, the early humans who began interacting with one another based on skills and motivations of joint intentionality engaged in acts of thinking based on (i) perspectival cognitive representations; (ii) recursive inferences; and (iii) social (dyadically normative) self-monitoring. This was a radical break from the purely individual intentionality and thinking of other great apes.
My account of modern human cultural beings is perhaps not so novel – though it, of course, makes some theoretical choices with which not everyone would agree – and so here I will be brief. As modern humans transitioned to culture they became group-minded creatures whose collective intentionality included all kinds of things not just in their personal common ground with other individuals, but in their cultural common ground with the group, such supraindividual things as cultural conventions, norms, and institutions. Early humans’ dyadic collaboration scaled up to modern humans’ collectively known cultural practices – including those constituting the conventional symbols and constructions (in some cases situations symbolized as propositions) of the local linguistic community – to which anyone who would be one of “us” must conform. This designation in principle – “anyone who would be one of us” – led to an objectification of the group’s social and institutional norms, including the way that we in this group understand the objective world. This comes out clearly in acts of pedagogy in which mature members of the culture teach youngsters generic facts like “Elders know best” or even “Chestnuts are found under these kinds of trees”, in which the instructor is not just stating her personal opinion but rather representing the authoritative voice of the culture. Internalizing the voice of collective intentionality constituted something like normative self-monitoring or self-governance. The outcome, then, was that the modern humans who began interacting with one another based on skills and motivations of collective intentionality engaged in acts of thinking based on (i) “objective” cognitive representations; (ii) self-reflective inferences; and (iii) generically normative self-monitoring. This was the culmination of early humans’ new ways of putting their heads together with others.
My most immediate goal with this book was to provide an evolutionarily plausible and satisfactory explanation for the emergence of uniquely human cognition and thinking. But, to do this, I needed to adapt and extend the seminal work on shared intentionality by Gilbert, Bratman, Searle, Tuomela, and others. From a philosophical point of view, the question is whether these adaptations and extensions make sense and are productive. I have already received criticism from a number of philosophers for picking and choosing isolated concepts from philosophers divorced from their overall philosophical programs, and indeed in some cases I have not been true to the philosophers’ original intentions. I plead guilty to this charge, but defend myself by saying that I am looking for concepts that will be useful for my explanatory enterprise, and in some cases I not only borrow but borrow and modify concepts from particular philosophical systems. Whether or not this is justified and/or useful is not for me to judge.
Bermudez, J. (2003): Thinking without Words. New York: Oxford University Press.
Darwall, S. (2006): The Second-Person Standpoint: Respect, Morality, and Accountability. Cambridge, MA: Harvard University Press.
Tomasello, M. (1999): The Cultural Origins of Human Cognition. Cambridge, MA: Harvard University Press.
Tomasello, M. (2008): Origins of Human Communication. Cambridge, MA: MIT Press.
Tomasello, M. and H. Rakoczy (2003): “What Makes Human Cognition Unique? From Individual to Shared to Collective Intentionality”. In: Millennial Perspective Series in Mind and Language 18, p. 121–147.
Tomasello, M., M. Carpenter, J. Call, T. Behne and H. Moll (2005): “Understanding and Sharing Intentions: The Origins of Cultural Cognition”. In: Behavioral and Brain Sciences 28, p. 675–691.