Kogi Demonstratives and Engagement

While demonstratives typically signal aspects of the spatial configuration of speech act participants and objects in the speech situation, intersubjective parameters, such as the attentional state of the interlocutor, have recently gained importance in the analysis of such forms. Several systems have been described in which the use of certain forms is conditioned by shared vs. non-shared attention towards a referent. Phenomena of this kind have recently been considered under the notion of ‘engagement’, i.e. the expression of a speaker’s assumptions about the knowledge or attention of their interlocutor (Evans et al. 2018a, b). The present study contributes to the ongoing investigation of engagement with a descriptive account of demonstratives in Kogi (Chibchan). It is argued that the use of certain (ad)nominal forms that were initially associated with addressee proximity cannot be accounted for in merely spatial terms. The paper proposes a novel analysis in terms of engagement and shows that the forms apply when a referent is in the attention of, or is known to, both interlocutors. Evidence in support of this comes from elicited data as well as an interactive matching game in which attentional states of participants can be observed.


Introduction
Judging from descriptive as well as typological work on demonstratives, it is evident that the most prominent parameter encoded in such forms relates to spatial deixis, e.g. the distance at which a referent is located relative to the deictic center (e.g. Anderson & Keenan 1985; Diessel 1999; Dixon 2003). Further parameters that are signaled by demonstratives include, for example, the visibility of a referent, or its location on the vertical axis (at a higher or lower elevation, Diessel 1999:41). All of these parameters are concerned with the physical configuration of speech act participants and objects in the speech situation. By contrast, other accounts point to the importance of intersubjective factors in the analysis of demonstratives (e.g. Hanks 1990, 2005), and in recent years, more and more language descriptions have surfaced that challenge a merely spatial account of the respective demonstrative systems. Intersubjective aspects of demonstratives relate to the attentional state of speech act participants in face-to-face conversation. The choice of a certain demonstrative form may be conditioned by shared vs. non-shared attention between speaker and addressee towards an object, rather than by spatial conditions. Languages that have been shown to exhibit such distinctions include, for example, Yucatec (Mayan; Hanks 2005, Bohnemeyer 2018), Tiriyó (Cariban; Meira 2018), Turkish (Turkic; Özyürek & Kita 1998, n.d.), Jahai (Mon-Khmer; Burenhult 2003) and Yélî Dnye (isolate, Papua New Guinea; Levinson 2018). Evans et al. (2018) have recently linked such demonstrative contrasts to more grammatical means for encoding speakers' assumptions about the cognitive state of their interlocutors and subsume these under the notion of 'engagement'. In the initial definition of engagement, the concept of cognitive state covers different types of mental access a person has to a referent or state of affairs, for example in terms of attention or knowledge.
With regard to demonstratives, the notion of attention is most central in the descriptions of the languages listed above. However, in some instances, the deciding factor prompting the use of a certain form is whether a referent is known by one's interlocutor (e.g. from previous mention in discourse), rather than whether it is in their focus of attention.
The aim of the present paper is to detail the (ad)nominal demonstrative system of Kogi, a Chibchan language spoken in Northern Colombia. Previous, partial descriptions of the language (e.g. Ortiz Ricaurte 2000) mention some demonstrative forms, yet lack a detailed discussion of their semantics and conditions of use. This study addresses this gap and provides an analysis of Kogi demonstratives with special reference to the role of attention and knowledge.
It is argued here that a distance-based analysis may be rejected for a subset of Kogi (ad)nominal demonstratives, and that their use is instead conditioned by visual or cognitive accessibility. Observations that led to this analysis were initially made in elicitation sessions in which the relevant forms were discussed with native speakers. While the author analyzed twẽhié as an addressee-anchored proximal demonstrative in the first stage of description, it later became apparent that the form can also be used for distant referents, the crucial factor being that the addressee has correctly identified the object pointed out by the speaker. Thus, a new proposal for the function of twẽhié (and two related forms) associates the demonstrative with the alignment of the speaker's and the addressee's attention toward a referent. This revised hypothesis was further confirmed by data from an interactional matcher-director task in which attentional contrasts can readily be observed.
The proposed analysis of demonstratives as signaling joint attention or knowledge parallels the descriptions from the literature mentioned above. While the paper provides a mainly descriptive account of the relevant forms, the findings potentially reveal another example of demonstratives with engagement semantics that can be considered in the ongoing efforts to describe and compare systems of engagement marking.
Section 2 provides background information about the Kogi language with special reference to the domain to be investigated, namely demonstratives. The notion of engagement and its role in demonstrative systems are outlined in more detail in Section 3. Section 4 discusses the role of addressee attention in the use of the demonstrative twẽhié. The function of twẽhié and two related forms is furthermore explored in a matcher-director task, the method and results of which are detailed in Section 5. Finally, Section 6 summarizes the findings of this study.

Aspects of the Kogi language
Kogi is spoken by roughly 10,000 individuals in the region of the Sierra Nevada de Santa Marta, a mountain range in the north of Colombia. Kogi belongs to the Chibchan language family and is part of the Arwako subgroup together with the closely related languages Ika and Damana spoken in the same area. The language can be considered potentially endangered (Crevels 2012). While the use of Spanish is becoming more widespread, there remain many monolingual speakers and Kogi is still used by all generations.
A distinctive trait of Kogi grammar is the expression of engagement, which was introduced above as a system for signaling a speaker's assumptions about the knowledge or attentional state of the addressee. Engagement is reflected in a set of four mutually exclusive verbal prefixes, ni-, na-, shi-, sha-, which signal (a)symmetries between speech act participants in epistemic access to a state of affairs. The basic semantic distinction between ni-/shi- and na-/sha- can be described in terms of shared vs. non-shared access. The former two forms express that an event is accessible to both speech act participants, while the latter signal that access is exclusive to one of them. Moreover, the forms reference epistemic authority, i.e. whose knowledge is targeted. The forms ni-/na- concern information that is primarily known by the speaker, whereas shi-/sha- target addressee knowledge. The prefixes are not obligatory and are used by speakers when they wish to epistemically qualify a proposition. More precisely, they serve as a resource for, for example, requesting information, argumentation, signaling unexpected information or claiming epistemic primacy. For a more detailed description of these prefixes, the reader is referred to Bergqvist (2016; cf. also Evans et al. 2018b).
As argued in the present paper, the speaker's assumptions about the addressee's cognitive state, in particular their attention, are a relevant parameter in the use of demonstratives. The Kogi demonstrative system features (ad)nominal as well as adverbial forms. Among the adverbial demonstratives, one can distinguish between those that express location (e.g. here) and those that denote manner (e.g. like this). Most forms across the different sets share common roots, i.e. h(e)-, tw- and kw-. While the h(e)- and kw- forms in all sets correspond to speaker proximity and distance respectively, the tw- forms cannot conclusively be associated with a location, as will be elaborated in the course of the paper.
For endophoric reference (i.e. reference to entities mentioned in discourse), a separate set of demonstratives involving the base e-/ẽ- is typically used. Given that the focus of the present study lies on the exophoric use of demonstratives (i.e. reference to objects and places in the extra-linguistic world), the remainder of this section introduces the inventory of relevant forms.
The semantic parameters reflected in exophoric demonstratives are distance (proximal vs. distal), the deictic center (speaker- or addressee-anchored), and visibility of a referent. Moreover, a special set of (ad)nominal forms is used when a speaker singles out one object among several similar objects.
Locative adverbial demonstratives (Table 1) point to the place where a referent is located. Two of the visible forms have counterparts that are marked with the locative marker -ka; both forms appear to be used interchangeably. A second set of adverbial demonstratives (Table 2) applies in contexts in which a location is out of sight, either for the addressee, the speaker, or both. Note that this visibility distinction is attested in locative adverbial, but not in (ad)nominal demonstratives. Manner adverbial demonstratives (Table 3) indicate the manner in which an action is performed. While contemporary Spanish, for instance, has a single manner demonstrative así 'like this', Kogi exhibits three different exophoric forms which can be translated as 'like this (like I am doing)', 'like that (like you are doing)' and 'like that (like someone else is doing)'. The ensuing discussion will primarily be concerned with (ad)nominal demonstratives. Previous descriptions (e.g. Ortiz Ricaurte 2000) translate these forms with equivalent Spanish words, pointing to an analysis in terms of distance. A more detailed discussion of the semantics and examples of their use, however, is lacking.
(Ad)nominal demonstratives include two sets that can be used as modifiers of nouns (adnominal use) or constitute noun phrases on their own (pronominal use). The set of contrastive demonstratives, i.e. set II, is employed in contexts in which one object is singled out in a group of similar objects, in this way emphasizing a contrast to other potential candidates (e.g. 'I want this apple [not another one in a pile of apples].'). The non-contrastive forms of set I are less specific in meaning, as they can be used when singling out one of several objects (with no emphasis on contrast) as well as in more general contexts. Based on elicited data (i.e. from the discussion of different hypothetical scenarios involving referents at varying distance from both speaker and addressee), an initial distance-based analysis of the forms was proposed, as shown in Table 4. The variants of the forms in set I (e.g. hẽhié and hẽ) can be used interchangeably according to speaker judgements, and the status of -hié as a morpheme is unclear at present. It seems that there are individual preferences among speakers for each of the variants. As noted above, unlike in the locative adverbial paradigm, there is no separate set of (ad)nominal forms for invisible referents. In fact, a referent that is out of sight for the speech act participants cannot be referred to with an (ad)nominal demonstrative; instead, a construction with an adverbial form (e.g. uni 'over there, where we cannot see') must be used. As indicated in Table 4, hẽ(hie) and halde denote objects in the speaker's vicinity, and kwe(hie) refers to objects that are located at a distance. While these forms appear to reflect contrasts in distance, a revised analysis for the addressee-proximal demonstratives is presented in Section 4.
It may be noted here that the contrast among locative adverbial demonstratives is indeed one of distance. In contrast to (ad)nominal twe(hie) or twalde, the adverbial forms twai, tweka and tweni seem to correlate with addressee proximity, i.e. they denote locations close to the addressee, and thus far no counterevidence for this analysis has emerged.
Before detailing the (ad)nominal demonstratives of Kogi in Section 4, the notion of engagement and its role in demonstrative systems are outlined in the following section.

Engagement in demonstrative systems
As stated, engagement refers to a "grammatical system[s] for encoding the relative accessibility of an entity or state of affairs to the speaker and addressee" (Evans et al. 2018a:118). The concept of accessibility in this definition applies in a broad sense in that it can refer to perceptual access (e.g. visually) to a referent or state of affairs, cognitive access to entities or states of affairs that are activated in a person's mind, or epistemic access to knowledge. In simpler terms, engagement marking signals whether an event or an entity is attended to or known by either of the speech act participants. Evans et al. (2018a, b) show that grammaticalized expressions encoding contrasts in access are attested in unrelated languages from different parts of the world, and that they can surface in different grammatical domains (e.g. in demonstratives and in the verb morphology). They further note that the scope of a morpheme, be it semantic (e.g. referent/location vs. proposition) or syntactic (e.g. noun phrase vs. clause), varies across systems. In the case of the Kogi prefixes mentioned above, engagement is encoded in the verb morphology and takes scope over an entire proposition. By contrast, the expression of engagement can also target a single entity referenced by a noun phrase. This is the case with demonstratives, which are generally considered to be a device for aligning the speech act participants' focus of attention (Diessel 2006).
The basic task of demonstratives in exophoric use is to draw the interlocutor's attention to a concrete referent in the speech situation (Diessel 1999:94). It is evident that the speaker's consideration of the addressee's attentional state is crucial in this task. That is, at the beginning of such a communicative act, the speaker attempts to introduce a referent that previously was not in the addressee's attention. The communicative act is successful once the addressee shifts their focus of attention to the intended referent. Evans et al. (2018a) discuss languages that reflect such attentional contrasts in their demonstrative systems. Two of them, namely Turkish and Jahai (Mon-Khmer), are discussed in the remainder of this section to exemplify the role of engagement in demonstrative systems.
Turkish exhibits a three-way contrast in its (ad)nominal demonstratives: bu, o and şu. Descriptions of bu and o generally agree that the forms contrast in terms of distance, referring to objects close to and far from the speaker, respectively. The uses of şu, by contrast, have prompted contradictory proposals. For example, Lyons (1977) described it as an addressee-anchored proximal demonstrative, whereas Kornfilt (1997) suggested that the form signals medium distance from the speaker. These earlier studies mainly based their descriptions on written data and consequently neglected the basic functions of demonstratives in face-to-face communication. Özyürek and Kita (1998, n.d.) propose a revised analysis drawing on recordings of face-to-face interactions (e.g. in a pottery class) where it was possible to monitor attentional cues such as eye-gaze and pointing gestures alongside demonstrative use. The findings revealed that şu is employed by speakers when attempting to draw their interlocutor's attention to a specific object, irrespective of its location. Once the addressee has shifted their gaze to the relevant object and joint attention is established, the speaker switches to refer to it with one of the distance-encoding forms, bu or o. That is, the Turkish demonstrative system, shown in Table 5, expresses an attentional contrast in addition to distance: while şu implies that the addressee has yet to align their attention with that of the speaker, bu and o signal that shared attention is established and, furthermore, encode a contrast in distance.
A comparable distinction has been described for Jahai, an Aslian language of Malaysia (Burenhult 2003, 2008). Jahai has a set of eight (ad)nominal demonstratives, presented in Table 6. Seven of these forms express spatial contrasts and carve up the space of a speech situation in a complex way (see Burenhult 2003 for details). Of special interest here is the addressee-anchored accessible ton.
While earlier descriptions had associated ton with addressee proximity (Burenhult 2002:113-117), it later became evident that the use of the form depends on shared attention rather than a spatial distinction (Burenhult 2003).
This revised analysis was based on data obtained in an interactional picture matching task (i.e. the "Shape Classifier Task" (Seifart 2003), see Section 5) which involves a set of object stimuli of different shapes and sizes. In the task, a 'director' has access to photographs that show a subset of these objects in a specific arrangement, and instructs a 'matcher' to search for specific objects. The matcher does not have access to the photographs and must rely solely on the director's verbal descriptions in order to rebuild the depicted arrangement. In the case of Jahai, these instructions consisted of a number of guiding sequences in which the director introduced a referent with a description and further guided the matcher's attention by the use of space-encoding demonstratives. Once the matcher had identified the correct object and the director confirmed their choice, such a guiding sequence was typically completed with the use of ton. This is exemplified in (1) (Burenhult 2003:373).
It became evident that the demonstratives used in the task, with the exception of ton, were employed to direct matchers to the location of objects which previously had not been in their attention. Ton, by contrast, occurred at the end of these guiding sequences once both participants had aligned their attention, and was used irrespective of the referent's location. The data obtained in this task thus revealed that the function of ton can more adequately be described in terms of joint attention, rather than an object's spatial location. While the context of shared attention is predominant in the study, Burenhult (2003:377-78) also mentions instances in which ton refers to entities that cannot be construed as currently having the addressee's attention, but are mutually known as they have been introduced previously into discourse. This suggests that ton is not only associated with shared attention but also with shared knowledge.
Burenhult (2003:378) consequently proposes a more accurate definition of ton's function in terms of general 'cognitive accessibility'.
To sum up, we may note that the systems of Turkish and Jahai are similar in that the addressee's attentional state, as estimated by the speaker, determines the use of some demonstrative forms. In addition to spatial distinctions, both systems signal whether joint attention is established ('+ joint attention') or not ('− joint attention'). However, the systems differ with regard to the combination of attentional state and spatial contrasts: In Turkish, on the one hand, the form şu indicates '− joint attention' and does not specify location. It contrasts with two distance-encoding forms that additionally indicate '+ joint attention'. Jahai ton, on the other hand, expresses '+ joint attention' and does not specify location. The remaining forms of the paradigm encode spatial distinctions in addition to '− joint attention'.
Engagement may serve as an overarching notion under which demonstrative systems such as the ones found in Turkish and Jahai can be subsumed and potentially be compared to other grammatical means that are sensitive to contrasting attentional or epistemic access. In particular for Jahai, it is evident that the accessibility of referents can be construed in terms of both attention and knowledge on the part of the addressee.
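The mirror-image relation between the two paradigms can be made explicit with a small feature table. The following Python sketch is purely illustrative and not part of the cited analyses: the class `Demonstrative`, the helper `forms_for` and the placeholder entry for the remaining Jahai forms (which are not individually named in the text) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Demonstrative:
    form: str
    joint_attention: bool      # True = '+ joint attention', False = '- joint attention'
    spatial: Optional[str]     # None = location left unspecified


# Turkish, following the summary above: only şu is spatially unspecified.
TURKISH = [
    Demonstrative("şu", False, None),
    Demonstrative("bu", True, "proximal"),
    Demonstrative("o", True, "distal"),
]

# Jahai: only ton is named in the text; the seven space-encoding forms are
# collapsed into a single hypothetical placeholder entry.
JAHAI = [
    Demonstrative("ton", True, None),
    Demonstrative("<space-encoding form>", False, "spatial"),
]


def forms_for(system: List[Demonstrative], joint_attention: bool) -> List[str]:
    """Return the forms compatible with the given attentional state."""
    return [d.form for d in system if d.joint_attention == joint_attention]


print(forms_for(TURKISH, joint_attention=False))  # forms used before attention is aligned
print(forms_for(JAHAI, joint_attention=True))     # forms used once attention is aligned
```

Under these assumptions, the spatially unspecified form is the one used before alignment in Turkish, and the one used after alignment in Jahai.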
Finally, an important point to note is that for both cases discussed above, the consideration of interactional data, be it from natural face-to-face conversations or an interactional elicitation task, has contributed remarkably to the understanding of the systems.

The (ad)nominal demonstrative twẽhié
As noted in Section 2, on first inspection, the (ad)nominal demonstratives appear to express distinctions in distance. During elicitation, an initial discussion of the first set suggested that hẽhié refers to objects close to the speaker, twẽhié to objects close to the addressee, and kwẽhié to referents far away from both. However, other examples revealed that distance is not the decisive factor in the choice of twẽhié.
Firstly, it appeared that twẽhié is not applicable when referring to objects close to the addressee in certain contexts. In one scenario that was discussed, the speaker points out a referent located in the vicinity of the addressee, yet invisible to them (e.g. located behind them). The (ad)nominal demonstrative twẽhié is not accepted in such a context, and instead, a different strategy must be used involving the adverbial demonstrative tweka in a relative clause (2).
(2) plato  tweka         té   nuk-ká          na-gé-gwa!
    bowl   loc.adv.addr  sit  be.located-prs  1sg.obj-hand-imp.sg
    'Hand me the bowl that is there (near you)!'
Once the addressee has shifted their gaze to the object and asks hehié? 'This one (near me)?', the speaker may confirm their choice with aha, twehié 'Yes, that one'.
Secondly, it became evident that twẽhié can also refer to objects located at a distance from both interlocutors. In the following hypothetical scenario, which was provided by a consultant, two speakers are in the same location talking about distant objects.
The speaker (S) points to an object using the distal demonstrative. When the addressee (A) identifies the object and asks for confirmation, the speaker approves the choice with twẽhié. As the locations of the object and the speech act participants remain the same, distance is not at stake, only the addressee's shift of attention. This example, as well as the one in (2), points to the fact that the use of twẽhié is motivated by joint attention, i.e. it is used once both speaker and addressee have shifted their focus to the same referent.
A further instance supporting this hypothesis was observed in the more naturalistic setting of an interactional storytelling task (i.e. the "Family Problems Picture Task", cf. San Roque et al. 2012). The following images depict two participants in the second part of the task, in which they are asked to organize images of a picture story in a coherent order, thereby referring to different cards in front of them.
In Figure 1a, the speaker on the left refers to a card located close to him by pointing to it with his right hand and uttering the speaker-proximal demonstrative hẽhié. While his interlocutor on the right is looking at the same card, the speaker is unaware of this as his gaze is focused on the table. In Figure 1b, the female participant reaches out to the card in question with her left hand, which leads the speaker to assume that she has shifted her attention to the relevant card. Note that the speaker in this scene is not relying on his interlocutor's gaze to determine her attentional focus, as he is looking down at the table. At this point, once the speaker assumes that joint attention has been established, he switches to twẽhié to refer to the same card.
The instances discussed above suggest that a spatial analysis cannot account for the choice of twẽhié. The demonstrative refers to objects that have previously been indicated by hẽhié (for referents close to the speaker) as well as kwẽhié (for distant referents). Moreover, twẽhié is not used when an addressee has not yet shifted their attention to the object in question.
To summarize, while hẽhié and kwẽhié signal contrasts in distance (proximal and distal), twẽhié reflects speakers' assumptions about the attentional state of their interlocutors. More precisely, speakers use twẽhié to refer to an object only if they assume that their interlocutors have shifted their focus to the relevant object and joint attention is established.
Note that the association of such expressions with addressee proximity can be seen as a 'typicality effect' (Burenhult 2003:367): Objects in the vicinity of the addressee are most likely to be in their focus of attention, and therefore tend to be referred to with demonstratives associated with shared attention.
In order to corroborate the new proposal for the function of twẽhié further, an interactional elicitation task inspired by Burenhult's (2003) study was conducted, which is detailed in the following section.

Twe-demonstratives in the Shape Classifier Task
The functions of the demonstrative twẽhié and its related forms twẽ and twalde (henceforth subsumed under the preliminary label twe-demonstratives) were investigated by means of an elicitation task based on Seifart's (2003) "Shape Classifier Task". In the task, a director instructs a matcher to reconstruct an arrangement of objects depicted in a photograph. The verbal instructions of the director consist of a number of referential sequences in which the speaker refers to a specific object in a set of stimuli of different shapes and sizes, and attempts to guide the matcher's focus of attention to it.
While the Shape Classifier Task was originally designed to investigate shape-encoding expressions, it can yield revealing data about demonstrative use, as observed by Burenhult (2003, see Section 3 above) for Jahai. The setting is especially suitable for investigating the exophoric use of demonstratives, as the attentional states of the participants can be observed. A referential sequence typically starts with an attentional asymmetry between participants, which is eventually resolved when the director has successfully aligned the matcher's attention with their own.

Method
For the exploration of twe-demonstratives, a modified version of Seifart's (2003) Shape Classifier Task was created. Since Kogi does not have shape classifiers in its grammar, a reduced number of stimuli and photographs was employed. The materials consist of 25 small clay objects (Figure 2a) as well as 8 images that depict a number of these objects in certain constellations (examples in Figure 2b). As in the Shape Classifier Task, the object stimuli feature subsets of similar shapes, e.g. squares differing in thickness and size. This was expected to prompt more elaborate verbal instructions, as addressees are confronted with more than one possible choice given the speaker's initial description in terms of shape.
The setup of the task is illustrated in Figure 3. It involves two speakers who are seated at a 90-degree angle to each other, facing the object stimuli. The director (on the right) has the photographs and provides the matcher (on the left) with verbal descriptions of the configuration in the photograph. The director has a full view of the photographs as well as the arrangement that the matcher is building. The matcher, by contrast, does not see the photographs and needs to reconstruct the arrangement of objects relying solely on the director's verbal description. Two sessions of the task were carried out by a total of four speakers, who switched roles halfway through. The task was recorded on video and audio.

Analysis and results
The focus of the analysis lies on the demonstrative twẽhié, including the related forms twẽ and twalde, which were also found in the data. Furthermore, since we are concerned with exploring attention-guiding by means of demonstratives, the analysis is limited to utterances by the director.
Twe-demonstratives occurred frequently in directors' utterances, whereas other (ad)nominal demonstratives were absent.¹ The proximal hẽhié (and hẽ, halde) and the distal kwẽhié (and kwe) are not attested in the directors' speech, which may be explained by the fact that all the stimuli are located approximately in the same place, close to the speech act participants. In this situation, the distance-encoding forms do not provide the addressee with any useful information as to where to look for a referent.
Given the aim of investigating the role of addressee attention in demonstrative choice, a way of determining the attentional relation between the addressee and the relevant object is necessary (cf. Burenhult 2003:370). The attentional cue of gaze direction proves difficult to determine in the task, given the piled-up nature of the stimuli, which are scattered in a small space on the table. Following Burenhult (2003), the addressee's attention was instead determined on the basis of their (potential) physical contact with the referent, i.e. whether they were reaching for, or touching the object. In order to test the relation between attentional contrasts and the use of demonstratives, the matcher's attentional state was evaluated at the moment of utterance of twe-demonstratives. Thus, whenever the matcher touched or reached for the intended referent, it could be assumed that this referent was in the matcher's attention. Moreover, as the director monitored the actions of the matcher, their feedback can be considered a further cue. Once the director confirms the matcher's choice (e.g. by uttering 'Yes, that one.') the attention of both participants has been successfully aligned.
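The coding criterion described above can be illustrated schematically. The following Python sketch is an assumption-laden illustration rather than the procedure actually used in the study: the record type `DemonstrativeToken`, its fields and the sample annotations are all hypothetical and do not reproduce any data from the task.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DemonstrativeToken:
    form: str               # demonstrative uttered by the director
    matcher_reaching: bool  # matcher reaches for the intended referent
    matcher_touching: bool  # matcher touches or holds the intended referent


def matcher_attending(token: DemonstrativeToken) -> bool:
    """Physical contact or reaching counts as evidence of the matcher's attention."""
    return token.matcher_reaching or token.matcher_touching


def share_with_contact(tokens: List[DemonstrativeToken]) -> float:
    """Proportion of tokens uttered while the matcher was attending to the referent."""
    if not tokens:
        return 0.0
    return sum(matcher_attending(t) for t in tokens) / len(tokens)


# Invented annotations for illustration only (not data from the study).
sample = [
    DemonstrativeToken("twẽhié", matcher_reaching=True, matcher_touching=False),
    DemonstrativeToken("twẽ", matcher_reaching=False, matcher_touching=True),
    DemonstrativeToken("twalde", matcher_reaching=True, matcher_touching=True),
    DemonstrativeToken("twẽ", matcher_reaching=False, matcher_touching=False),
]

print(share_with_contact(sample))
```

A proportion of 1.0 for a given context would correspond to the situation in which every token of a form coincides with reaching or contact.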
One session of the task includes a total of 36 referential sequences, i.e. stretches of utterances by the director in which the matcher is instructed to search for a specific object. Such a sequence typically starts with the description of the referent, e.g. 'Take the one, which is flat and has four edges'. As the matcher searches for the correct object, the speaker may provide further instructions involving more detailed descriptions of the object's shape, e.g. 'Not that one, a thicker one.', or information about its location, e.g. 'It's next to the ball.'. While these descriptive strategies were most prominent in the data, the location was occasionally indicated by the use of adverbial demonstratives, as illustrated in examples (4) and (5).
All three twe-demonstratives are attested in the obtained data, yet each speaker had a clear preference for certain forms (see Table 7). Two speakers, who are sisters, preferred twẽhié, while one of them occasionally used twẽ. Speaker 3 made equal use of twẽhié and twalde (i.e. the form that contrasts a referent with other potential candidates). In the speech of Speaker 4, only twẽ is attested.
With regard to semantic distinctions, it is apparent that twe-demonstratives do not target the specific location of a referent in the pile of stimuli. Instead, they are used to refer to objects irrespective of their location and do not specifically refer to those located closer to the addressee. The use of twe-forms was found to occur in three different contexts: (i) reference to an object in the pile at the end of a referential sequence to confirm the matcher's choice, (ii) reference to an object in the pile during a sequence to refute their choice, and (iii) reference to an object previously manipulated by the matcher in the arrangement.
Of the 72 referential sequences across both sessions, 42 were terminated by the use of a twe-demonstrative to confirm the matcher's choice. In other instances, the matcher's choice was reinforced by positive backchanneling, or simply by not diverting their attention further.
Each case in which the director uttered a twe-demonstrative for confirmation coincided with proximity (i.e. reaching for) or physical contact (i.e. touching or holding the referent) between matcher and referent. This suggests that the matcher has aligned their attention with the director's at the time of utterance of twe-forms.
Examples of such sequences are given in (6) and (7), both of which are terminated by the director's confirmation, using twẽ or twẽhié (preliminarily glossed as dem.twe).
The second context corresponds to situations in which the matcher had physically engaged with a potential referent, yet their choice was incorrect and was therefore rejected by the director. In this context, even though the guiding sequence had not yet been successfully completed, both participants are attending to the same referent. This is exemplified in (8). All of these instances involved physical contact between the matcher and a potential referent. Note that in these situations it is the director who realigned their attention with the matcher's, rather than the other way around. That is, the matcher drew the director's attention to a potential candidate in the pile by reaching for or touching it. The director then shifted their focus of attention, evaluated the matcher's choice, and subsequently refuted it.
Finally, twe-demonstratives were used when the director referred back to an object that the matcher had previously manipulated in the arrangement. This was typically observed at the beginning of a guiding sequence, as illustrated in example (9), in which the director introduces a new referent (i.e. the next object to be picked) by relating it to the referent of the preceding sequence.
(9) twẽ hana-gatse=ga naldatshák akldé wẽźhildukka
    dem.twe similar-seem=emph but more long
    'It looks similar to that one but it's longer.'
This use of twe-demonstratives is infrequent (seven instances), as it competes with another strategy for referring to previously mentioned referents: the endophoric demonstrative ẽ 'that one (mentioned before)' was employed more frequently in such contexts.
The third context differs from the first two in one important respect: the matcher did not always physically engage with the referent, which suggests that the addressee's visual attention was not at stake. Thus, twe-forms may signal shared cognitive accessibility in addition to attention. This is elaborated in the next section, which summarizes and discusses the results.

Discussion
Twe-demonstratives occur in three contexts, which are repeated below.

(i) reference to an object in the pile at the end of a referential sequence to confirm the matcher's choice,
(ii) reference to an object in the pile during a sequence to refute their choice, and
(iii) reference to an object previously manipulated by the matcher in the arrangement.
Context (i), in which the matcher physically engages with the referent and receives positive feedback from the director, is the most prominent use of twe-demonstratives, observed in more than half of the guiding sequences (42 of 72). These instances arguably constitute a strong case for the hypothesis that shared attention warrants the use of twe-forms.
Context (ii) also involves contact or proximity between the matcher's hand and a referent, yet not the one indicated by the director. In these cases, the director realigned their attention with the matcher's in order to evaluate the object and refute the matcher's choice. While this use is not frequent in the data, it is a further case in point for the analysis that twe-forms are licensed by joint attention. In this situation, however, it is the speaker who realigns their attention with the addressee's, rather than the other way around.
In (iii), as opposed to the first two contexts, the utterance of twe-forms did not coincide with a close physical relation between the referent and the matcher's hand. Therefore, these uses cannot be associated with shared attention, which suggests that twe-demonstratives do not exclusively target attentional states. Instead, referents that were introduced previously by the director are still prominent in the matcher's mind and therefore constitute shared knowledge.
Based on these findings, it is evident that the Kogi paradigm of (ad)nominal demonstratives shares properties with Jahai demonstratives. Even though the two systems differ considerably in complexity, the function of the Kogi twe-forms parallels that of Jahai ton (example (1) above). Both are used for referents that are in the focus of attention of both the speaker and the addressee. In the Kogi data, this use occurred most prominently at the end of guiding sequences in the task, when the director confirmed the matcher's choice. In addition to shared attention, both forms also target shared knowledge when they refer to an entity that was mentioned in previous discourse and is known to both interlocutors. Thus, we may conclude that twe-demonstratives signal shared epistemic access in terms of attention and knowledge.
As for other (ad)nominal demonstratives, Kogi speakers, in contrast to speakers of Jahai, do not make use of such forms to direct the matcher's attention. Instead, directors draw on other strategies, for example, indicating the referent's location with adverbial demonstratives. This clearly is a shortcoming of the conducted study, as it fails to provide information about the contrast between twe-forms and other (ad)nominal demonstratives. In order to understand the functional extensions of twe-demonstratives more thoroughly, it is necessary to study them in relation to other forms of the paradigm in future research.
Finally, we may note that the Kogi system structurally resembles the Turkish one in that both make a three-way contrast. As shown in Tables 8 and 9, the systems constitute mirror images of each other: while Kogi features one value for '+ joint attention' contexts and two for '- joint attention' contexts, i.e. proximal and distal, Turkish exhibits two values for '+ joint attention' and a neutralized spatial contrast for '- joint attention'.

Conclusion
This paper provides an analysis of Kogi (ad)nominal twe-forms in terms of shared attention and knowledge. It was argued that a description solely in terms of space cannot account for the use of twẽhié, twẽ and twalde. While these forms were initially hypothesized to signal addressee proximity, this analysis was later rejected, as they were observed to refer to objects irrespective of location. More elaborate discussions in elicitation revealed that, for example, twẽhié can be used for referents close to the speaker as well as for referents at a distance from both interlocutors. It was concluded that the deciding factor is the attentional status of the addressee, and that twẽhié is used whenever both speech act participants attend to the same referent.
To situate the analysis of the Kogi demonstratives, Turkish and Jahai were discussed as examples of languages that feature demonstratives targeting the assumed cognitive state of the addressee. Both languages feature forms whose use is dependent on whether speaker and addressee focus their attention jointly on a given referent. With this as a background, it was argued that Kogi twẽhié and related forms can be analyzed in a similar vein with respect to shared attention or knowledge.
Evidence for this analysis comes from elicited data, which suggested that twẽhié can be used for referents that were previously referred to with hẽhié (speaker proximal) or kwẽhié (distal), the crucial factor being that joint attention is established. The hypothesis received further support from data obtained in an interactional matcher-director task. In the task, twe-forms were most prominently used by the director to confirm the addressee's choice of object once joint attention was established. Moreover, the forms occurred when the director had realigned their attention with the matcher's in order to evaluate the choice of object.
While utterances of twe-forms most frequently coincide with joint attention, the forms can be associated with shared knowledge rather than attention in some instances. That is, twe-forms are used to refer to referents that do not have the addressee's attention, yet can be assumed to be known to both interlocutors as they were previously introduced into discourse. In this respect, the function of the forms in question parallels that of Jahai ton.
In conclusion, the findings obtained in elicitation as well as in the matcher-director task reveal that twe-demonstratives target shared attention as well as shared knowledge and thus constitute a further example, along the lines of Jahai, that can be considered in the study of engagement marking. The study further reaffirmed the relevance of interactional data for the analysis of demonstratives, and showed that structured referential tasks such as the Shape Classifier Task are a convenient method for obtaining them.
The investigation is limited to (ad)nominal demonstratives and does not discuss the functions of adverbial demonstratives in detail. The set of locative adverbial demonstratives that is related to the (ad)nominal twe-forms (tweka, twai and tweni) seems to correlate with addressee proximity (see Section 2). That is, these forms denote locations close to the addressee, and thus far no evidence counter to this analysis has emerged.
An open question concerns the use of demonstratives in reference to entities that have previously been introduced and are known to both speech act participants. While twe-forms denote mutually known referents in the matcher-director task, speakers more frequently used the anaphoric demonstrative ẽ in these contexts. Thus, one goal of future research is to clarify the relationship between these two types of demonstratives in reference to shared knowledge.
Lastly, the setup of the task, which featured stimuli close to and at a relatively equal distance from both interlocutors, did not engender any use of speaker-proximal or distal (ad)nominal demonstratives on the part of the director. Thus, it could not be investigated how twe-demonstratives directly contrast with the other members of the paradigm. In order to obtain a fuller understanding of the demonstrative system of Kogi, the findings of the present study will ideally be complemented with investigations based on additional naturalistic and interactional data.