BY 4.0 license Open Access Published by De Gruyter Open Access December 23, 2021

How grammar grows out of social interaction: From multi-unit to single-unit question

Simona Pekarek Doehler
From the journal Open Linguistics


This article scrutinizes interactional motivations for the sedimentation of grammatical usage patterns. It investigates how multi-unit questioning turns may have routinized into a single-unit social action format. Multimodal sequential analysis of French conversational data identifies a recurrent pattern in which a question-word question is followed by a candidate answer (formally: [question-word question + phrase/clause]). The data show a continuum of synchronic usage, the pattern being implemented as either two or one turn-constructional unit(s), with intermediate cases displaying fuzzy boundaries. In usage (i), a candidate answer emerges in response to the recipient’s lack of uptake as a way of pursuing response; in (ii) the candidate answer occurs immediately after the question, with fuzzy prosodic boundaries between the two units; in (iii) the pattern is produced as a single turn-constructional unit, showing important lexico-syntactic and prosodic consistency. It is argued that the integrated format (iii) originates in the repeated interactional sequencing of two subsequent actions, as in (i), and serves as a resource for proffering a highly tentative guess: It is the routinized product of frequent combinations in use, emerging from the interactionally motivated two-unit format. The findings support an understanding of interaction as a driving force for the routinization of patterns of language use.

1 Introduction

The way a turn at talk is built centrally contributes to the type of action it is understood to implement and to the type of response it makes relevant next. That is, the design of turns is consequential for action formation, action ascription, and action projection (Sacks 1992, Schegloff 2007). This design – and the grammatical and bodily resources it builds on – is accomplished in real time: It is the emergent product of turn- and action trajectories rather than of the implementation of discrete (linguistic) units. As Hopper (2011, 23) puts it, “speakers do not possess a bird’s eye view of an utterance, but rather move forward in time through it”. They do so in ways that are contingent upon the local circumstantial details of the interaction (Goodwin 1979), including recipients’ co-occurring conduct. Based on such contingencies, speakers adapt turn-designs and the related grammatical-bodily trajectories, and expand or revise these in the very course of their production (Auer 2009, Hopper 1987, 2011, Pekarek Doehler 2011, Streeck 2009). Recurrent turn-designs – including recurrent on-line adaptations – for accomplishing precise actions may in turn lead to the routinization (cf. Haiman 1994) of frequent combinations of grammatical (and bodily) units, ultimately ensuing in the sedimentation of grammatical action formats from frequent combinations in use (Bybee 2010, Hopper 1987, 2011). Yet, to date we have little empirical evidence for how such sedimented formats grow out of interaction, i.e., routinize or even grammaticize in response to speakers’ repeated dealing with local interactional needs (but see Barth-Weingarten 2014, Couper-Kuhlen 2011, Pekarek Doehler and Balaman 2021, Pekarek Doehler, De Stefani and Horlacher 2015). This article sets out to investigate an instance of such routinization of grammar-for-interaction.

The analytic focus is on a precise type of sequentially first turn in which an initial question-word question (QWQ) is followed by a candidate answer offered by the same speaker, as in example (1) (the focal pattern is highlighted in bold; see Appendix for transcription conventions):

Each of the two spates of talk in lines 01 and 03 forms an intonation unit ending on final intonation, the first unit is syntactically complete, the second depends on the first for its interpretation, and each implements a question and hence stands as a turn constructional unit (TCU: Sacks et al. 1974) in its own right. The first is hearable as a QWQ[1] seeking information, the second as a polar question (PQ) that offers a candidate answer to the first question and seeks confirmation. Together, they build a formal pattern of the type [QWQ + phrase], or sometimes [QWQ + clause], with a gap in between (no verbal or embodied recipient response). The recipient’s response shows how she treats the preceding questions: She first responds to the QWQ (l. 04, in overlap) and then to the PQ (l. 05). The initial question is hence re-designed, and so is the action implemented therein, from an information request to a confirmation request.

Regularly in conversational French, however, what at first sight may appear as two subsequent questions is delivered under one coherent prosodic contour and is treated by recipients as a single TCU, implementing a single action, as in (2):

In (2), the speed-up of tempo on the segment elle fait quoi ‘what’s she doing’, the continued pitch (lack of pitch up/down-step) as well as the continuing voicing between the two segments – and hence latching (Schegloff 2000) of the second onto the first – present the whole stretch of talk in line 1 as a single unit, ending on final intonation; furthermore, the quoi ‘what’ does not carry any phonetic exponents of finality (such as nuclear accent in French, Delattre 1966, Persson 2014). Through all these features, the second segment, though syntactically separate, is prosodically tightly integrated with the first. The whole stretch of talk represents one complex construction grammatically configured as [QWQ + phrase], which is treated as a PQ by the recipient (l. 02), who provides a “no” response and then adds the alternative “bachelor”. Although lexico-syntactically, the turn is built in ways that mirror the QWQ plus candidate answer in (1), here the whole stretch implements one single action.

These two initial examples show two different realizations (I will refer to these as “formats”), representing the extreme ends of a continuum of synchronic usage of the formal [QWQ + phrase/clause] pattern. They raise three key issues that I wish to address in this article:

  1. The action-formation issue: What are the interactional jobs speakers get accomplished by means of the single- vs the multi-unit formats? How do these jobs differ from “conventional” question formats?

  2. The delimitation issue: Is there a categorical distinction between the single- and the multi-unit format of the pattern at hand or are there fuzzy boundaries between them? What are the syntactic, prosodic, bodily-visual, and sequential cues for delimiting the units? And to what extent is the pattern’s unit-compositionality, including fuzzy boundaries, relevant for participants?

  3. The emergence-routinization issue: What conclusions can we draw from the synchronic usage of the single- and multi-unit formats as to the in situ emergence and the over-time routinization of patterns of language use for interaction?

In what follows, I first offer a short overview of question–answer adjacency pairs and multi-unit questions and address the routinization of patterns of language use (Section 2). I then present the data under analysis (Section 3). Based on multimodal sequential analysis of video-recorded ordinary conversations in French, I subsequently document a continuum of synchronic usage of the formal [QWQ + phrase/clause] pattern, ranging between one and two TCUs, with intermediate cases that display fuzzy boundaries (Sections 4–6). I argue that the integrated single-unit format (as in (2) above) has grown out of the repeated interactional sequencing of two successive actions (as in (1) above) and has routinized as a resource for offering a tentative guess. As opposed to a conventional PQ (e.g., “is she doing her Masters thesis?”), through which speakers offer a candidate which they fully endorse (Pomerantz 1988, 369), the focal format flags the candidate as just a try – a highly tentative guess. After discussing the findings (Section 7), I conclude by addressing some of the consequences we can draw from observing recurrent in situ emergent grammatical turn-designs, specifically as regards the over-time routinization of “social action formats” (Fox 2007)[2] out of frequent combinations in use (Section 8).

2 Background

Questions can be vehicles for a range of social actions. They may seek information, initiate repair, request confirmation, accomplish offers, requests, or assessments, and so forth. Questions exert normative preference constraints on responses: Answers are preferred over non-answers (Clayman 2002), confirming answers are preferred over disconfirming answers (with PQs; e.g., Stivers 2010), type-conforming answers are preferred over non-type conforming ones (Raymond 2003). Type-conforming responses to PQ are responses such as yes or no that “fit” the formal design of the question. For QWQ, type conformity is looser: Schegloff (2007, 78) argues that with QWQ any answer providing the sought-for information (who, where, when, etc.) is type-conforming. Overall, responses that do answering, accept the terms of the question and are structurally fitted to it, tend to be delivered as preferred responses, typically being produced contiguously, without delay, prefaces, mitigations, or accounts (Pomerantz 1984, Sacks 1987).

Questioners typically gaze at the recipient during or at the end of questions (Rossano 2012, Stivers 2010), and their gaze can also be a resource to indicate that a response is due (Heath 1986, Kendon 1967) or to pursue response after lack of uptake (Stivers and Rossano 2010). Equally, holds (i.e., suspension of bodily movements) can be a means for displaying that a response is still awaited (Li 2013).

Heritage (2013), focusing on PQs, suggests that the grammatical design of questions encodes epistemic status and/or stance of the questioner, and attributes epistemic rights and entitlement to recipients: Declarative formats are heard as providing information when they target issues in the epistemic domain of the speaker, and as requesting confirmation when they target issues in the epistemic domain of the recipient. It follows that on-line adaptations in question-design may be vectors for adapting the epistemic gradient between questioner and recipient.

A few researchers have investigated so-called “multi-unit questioning turns”, implemented through two or more TCUs. These are part of “a family of types” (Linell et al. 2003) covering a wide range of formal realizations. An initial discussion of multiple questions was offered by Sacks, who notes that a “second question in a series will commonly be a candidate answer to the first” (1987, 60) designed to reverse the preference after lack of recipient response: It is a pre-emptive move to avoid a disconfirming response. Linell et al. (2003) argue that multi-unit questioning turns most often work in a “narrowing” way to increase the pressure on (or guidance to) the recipient to provide a precise type of answer (see also Clayman and Heritage 2002). Other research has shown how interviewers may re-formulate a question as third-turn repair in response to recipients’ problems with responding (Houtkoop-Steenstra and Antaki 1997), and that questioners’ production of candidate answers may work in the same way (Svennevig 2013). While the quoted studies focus on a large variety of formal realizations and types of questions, they investigate multi-unit-question turns that are typically composed of several syntactically complete questions “hearable as two distinct questions” (Linell et al. 2003, 545).

Different from existing research, in this article I zoom in onto a precise question pattern that shows high consistency as regards its syntactic (and to some extent lexical) constituency: [QWQ + phrase/clause]. I show how, on occasions, that pattern emerges in real time in response to local interactional contingencies, such as a lack of a recipient’s response (ex. 1 above), how, on most occasions, it is produced in more condensed ways, being prosodically delivered as a single TCU and treated as such by co-participants (ex. 2 above), and how, in between these two forms of realizations, we find cases displaying fuzzy boundaries. I argue that this continuum of synchronic usage suggests a possible routinization (or even grammaticization) path from a “double-unit question format” that emerges locally in response to interactional contingencies to a “single-unit format” that conflates the two units, and is used for the purpose of proffering a highly tentative guess.

This argument is in line with an understanding of grammaticization as a form of routinization of language (Haiman 1994) – typically involving features such as loss of semantic meaning and morphophonological substance, along with shift in pragmatic significance and in grammatical structure, relations, and/or constituency (cf. Hopper and Traugott 2003). Grammatical routines may specifically be motivated by social-interactional needs (Couper-Kuhlen 2011) such as turn-taking (Detges and Waltereit 2011), the maintenance of progressivity (Pekarek Doehler 2011, Pekarek Doehler and Balaman 2021), and speaker–hearer negotiations of meaning (Hopper and Traugott 2003, 71). Frequent combinations in use (collocations) may lead to routinization and ultimately grammaticization of constructions (Bybee 2010). While change begins in individual contextual instances of language use, continua of synchronic usage can be understood as the manifestations of a cline in grammaticization – and hence as synchronic evidence for ongoing routinization/grammaticization (cf. Lehmann 1985, Hopper and Traugott 2003). In what follows, I document such a continuum in use of the formal pattern [QWQ + phrase/clause] and provide evidence suggesting an interactionally motivated routinization of the “double question” format into a single unit question.

3 Data and procedure

The data consist of 33 video-recorded ordinary conversations (totaling 12 h) in French among two to four student-friends (76 participants) during coffee breaks in a university cafeteria. Participants, who provided informed consent for collection and publication of the data, were recorded from two different camera angles. Transcriptions of verbal conduct follow Jefferson (2004); transcriptions of bodily conduct follow Mondada (2018); see Appendix.

An initial inventory of all QWQs in the data was established as part of a study investigating the effects of the word order of QWQs on the timing of responses (Pekarek Doehler 2021). The occurrence of the [QWQ + phrase/clause] pattern emerged as a salient feature of the data. A total of 49 occurrences of the target pattern were found, equaling 19% of all QWQs in the data. In 32 of the 49 occurrences, the second segment consisted of a phrase (65%), and in 17 of a clause (35%). Possible prosodic boundaries within the pattern were first assessed auditorily, and then examined using PRAAT. Prosodic completion in French is marked by the nuclear accent; the nuclear accent, showing important pitch movement and augmented syllable length, is hence generally final (i.e., on the last full syllable), coinciding with TCU or even turn-ends (Delais-Roussarie et al. 2015, Persson 2014).

Based on multimodal sequential analysis, forms of realization of the [QWQ + phrase/clause] pattern were found to be distributed along a continuum of integration of the two parts, and were grouped into three categories:

  (i) instances clearly composed of two prosodic units, each ending on final intonation, equaling two TCUs; the second unit is delivered as a post-gap turn-extension after the first unit had reached a transition relevance place (TRP) marked by syntactic and prosodic completion, but was met with lack of recipient response (as in ex. 1 above);

  (ii) instances with less clear (i.e., weak) prosodic boundaries, e.g., with minor pitch movement at the end of the first segment combined with other more integrating characteristics, such as no audible gap between the two and latching of the second unit onto the first, displaying fuzzy boundaries, i.e., weak cesuras, between them;

  (iii) instances where the two parts were encompassed under one coherent prosodic contour, conflated into one single TCU without a TRP in between; these show syntactic completion but clearly no prosodic completion after the first part (as in ex. 2 above).

It should be clear from the above that the categorization into (i), (ii), and (iii) is a heuristic, representing what in fact is a cline of integration of the two units. As shown in Table 1, format (iii) is by far the most frequent.

Table 1

Relative frequencies of formats (i), (ii), and (iii)

Format                            Share    n
Format (i) – two units            18%      9
Format (ii) – fuzzy boundaries    31%      15
Format (iii) – single unit        51%      25
Total                             100%     49
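
The relative frequencies reported in this section can be recomputed from the raw counts. The following is only a minimal sketch for readers who wish to check the arithmetic: the counts are those given in Section 3 and Table 1, while the variable names and the helper `pct` are illustrative assumptions and not part of the original analysis.

```python
# Counts per format as reported in Table 1 (labels are illustrative).
format_counts = {
    "(i) two units": 9,
    "(ii) fuzzy boundaries": 15,
    "(iii) single unit": 25,
}

# Counts by form of the second segment, as reported in Section 3.
segment_counts = {"phrase": 32, "clause": 17}


def pct(n: int, total: int) -> int:
    """Return n as a percentage of total, rounded to the nearest integer."""
    return round(100 * n / total)


total = sum(format_counts.values())  # all occurrences of the target pattern

format_shares = {fmt: pct(n, total) for fmt, n in format_counts.items()}
segment_shares = {seg: pct(n, total) for seg, n in segment_counts.items()}

for fmt, share in format_shares.items():
    print(f"Format {fmt}: {share}% (n = {format_counts[fmt]})")
```

Rounded to whole percentages, the three formats come out at 18%, 31%, and 51% of the 49 occurrences, and the phrase/clause split at 65% and 35%, matching the figures given in the text.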

In what follows, I discuss each of these formats, identify their distinctive grammatical properties, scrutinize speakers’ co-occurring bodily-visual conduct, and examine the formats’ interactional workings.

4 Format (i): QWQ plus post-gap turn-extension – a multi-unit questioning turn

This section offers an analysis of representative excerpts illustrating format (i), in which the two parts of the pattern are produced as two clearly distinct prosodic units and two distinct actions with a gap in between: A phrasal/clausal element is added on as a post-gap turn-extension after an initial QWQ was met with absence of recipient response. The added-on element offers a candidate answer to the QWQ and, by virtue of that fact, operates a shift in the epistemic gradient: Questioners thereby not only pursue a response, but upgrade their own epistemic stance from displaying “no knowledge” (in the QWQ) to claiming some epistemic access through offering a guess that invites confirmation or disconfirmation. Questioners’ embodied conduct provides additional cues as to how they package their talk into units of action.

Excerpt 3 offers an illustration. Following the closing of a preceding sequence with both participants’ gaze averted from each other, Ekta launches an information request:

Ekta’s QWQ (l. 02) is met with a lack of recipient response (l. 03), upon which Ekta extends her turn with the phrase une heure et demi ‘one thirty’ (l. 04). Prosodically, both the initial question and the turn-extension are delivered as independent units, each carrying its own focal accent and ending on final (high-rising) intonation, with clear cessation of articulation between the two. Because the second unit consists only of a phrase, it is by virtue of its occurrence after the first that it can be heard as implementing a question (a PQ: “[is it] one thirty?”). Sequentially, the second unit is produced in response to the recipient’s lack of uptake (l. 02), and praxeologically, it offers a candidate answer to the just-produced question, marked as a “try” by rising intonation (l. 04; Sacks and Schegloff 1979), thereby inviting confirmation from the recipient. Joana’s confirming response is delivered immediately thereafter (l. 05) and is formally fitted to the PQ (Raymond 2003): It treats Ekta’s turn-extension (l. 04) as seeking confirmation.

So, speakers may extend a question turn accomplishing a request for information by offering a candidate answer as a way of “fishing for a response” when one is lacking. Thereby, not only is the formal nature of the question redesigned – shifting from a QWQ to a PQ – but so is the action it accomplishes: The second TCU works as a confirmation request while the first implemented an information request.

The practice of re-designing an initial question after a lack of recipient uptake has been discussed in prior research as a way of facilitating an agreeing response (Sacks 1987) or guiding the recipient with regard to the expected response (Linell et al. 2003). The cases discussed here differ from these prior findings both in their grammatical form (here, the second question consists of just a phrase) and their interactional workings (here, the second question offers a candidate answer, see also Svennevig 2013). While they do sequential repair in the absence of a recipient response, the added-on candidate answers can hardly be seen as facilitating agreement or guiding the recipient’s response, given the issues they target (e.g., “what time is it?”, ex. 3; “what is your nationality?”, ex. 4 below). What the redesign of the question does, however, is to narrow down the recipient’s leverage to provide a type-fitted response to the mere choice between “yes” and “no” (type conformity to QWQs being “looser” than to PQs; Schegloff 2007, 78) while at the same time reducing the displayed epistemic asymmetry between speaker and recipient: By providing a candidate answer, the speaker claims more access to the issue at hand than (s)he did in her initial QWQ. As Pomerantz (1988, 369) notes, through PQ speakers offer a candidate that they fully endorse: a “best guess”. In excerpt 3, then, Ekta ventures a guess as to the object of her own question (“what time is it”), further reducing the assertiveness of the proposed candidate answer through final rising intonation.

One further observation deserves mention. With such questions accomplished through two subsequent TCUs, the speaker’s bodily conduct often shows a break between the first and the second unit, and this is markedly different from the cases where the two parts of the formal pattern are produced in more condensed manners (see below). In excerpt 3, Ekta (on the left) gazes at Joana’s watch during the delivery of the QWQ (l. 01-4, Figure 1), then shifts her gaze and head toward Joana’s phone exactly with the delivery of the candidate answer (Figure 2).

Figure 1

EKT gazing at JOA’s watch.

Figure 2

EKT’s head turned left, gazing at JOA’s phone.

While Ekta’s gaze shift may here be responsive to Joana’s turning her own gaze toward her phone (l. 03 sq.), it still creates a bodily hiatus between her first and her second question. It provides one instance (for a clearer illustration see ex. 4 below) of how gesture/posture is coordinated not only with linguistic units but also with interactional units (cf. Goodwin 2000) such as actions, the bodily shift being deployed in concert with the speakers’ moving into a subsequent action. Furthermore, it is noteworthy that Ekta holds her shifted position, relinquishing it only right after the reception of Joana’s answer (l. 06, cf. Li 2013), thereby displaying an embodied packaging of the PQ–answer pair together into one sequence, after having embodiedly dissociated the prior QWQ from that sequence. Such a hiatus in the questioner’s bodily conduct between the QWQ and the candidate answer, followed by a hold until the recipient’s response delivery, is recurrent with the multi-unit format in which each question is delivered as a separate TCU.

A second example is shown in (4), taken from a conversation between Stephan, Rosa, and Eliot (Eliot is on the right, off the captures, Figures 3 and 4). Eliot inquires whether Stephan voted in last weekend’s popular vote, to which Stephan replies that he did not because he is not Swiss by nationality (l. 02) and therefore can vote only at the communal level (l. 05). Rosa’s question (l. 08) comes in overlap with Stephan’s turn-extension (l. 07).

Figure 3

ROS gazes sideways at STE.

Figure 4

ROS shifts body/head back and left, then holds still.

Figure 5

MAR gazes at PAT, holding fingers.

Grammatically, prosodically and sequentially, Rosa’s questions (l. 08, 10) are shaped in a manner similar to Ekta’s in (3) above: Rosa offers a QWQ, followed by a phrase after a lack of recipient uptake (l. 09), each of these being delivered as a TCU ending on final intonation (here: low-falling, l. 08; high-rising, l. 10). By means of her turn-extension, Rosa not only re-issues an invitation for Stephan to respond, creating a second sequential opportunity to do so, but offers a guess as to Stephan’s nationality.

As Rosa’s question targets an issue that is fully in Stephan’s epistemic domain, her guess cannot be seen as a means of facilitating a response or guiding the recipient. Rather – and just like Ekta in (3) – by providing the guess, Rosa shifts from displaying an unknowing stance (through the QWQ) to claiming at least potential access (through the candidate NP), which she marks as tentative by means of the rising intonation and the lower volume on the candidate. Stephan immediately responds that he is binational French-Italian, thereby only partially confirming the adequacy of Rosa’s guess.

Here, too, the questioner’s bodily conduct shows a notable break between the two units. With the delivery of her QWQ (l. 08), Rosa shifts her upper body toward her right, further away from Stephan, and at the same time twists her head slightly back and toward her left, so that she is now gazing straight at Stephan (Figure 4) rather than from a slight angle as before (Figure 3). This dynamic movement contrasts with her holding her posture notably still during the subsequent gap and the delivery of the candidate, again creating a bodily hiatus which converges with the prosodic break between the two actions.

Rosa’s maintaining her gaze and position still after the QWQ embodies her expectation that a response is due already here (cf. Heath 1986, Kendon 1967, Li 2013). This expectation is then loosened only after Stephan’s response (l. 11, produced in overlap with Rosa’s candidate), with Rosa’s nod and her sequence-closing third (l. 12).

In sum, then, the aforementioned excerpts illustrate the following sequential pattern, which accounts for roughly a fifth (18%, n = 9) of the occurrences of the formal [QWQ + phrase/clause] pattern in the data:

Adding a candidate to a QWQ after a lack of uptake is among the practices speakers deploy for fishing for a recipient response after such a response had been missing; by the same token, speakers change the action accomplished by their question from an information request to a confirmation request. They offer a tentative guess that is seeking confirmation. The tentative nature of the guess is consistently marked in the data by high-rising intonation, qualifying it as a try (Sacks and Schegloff 1979; ex. 1, 3, 4), and sometimes by lower volume (ex. 4), while the intonation on the prior QWQ may be high-rising or low-falling (e.g., ex. 1, 3 vs 4) but is consistently final. The added-on candidate is part of what Schegloff, Jefferson, and Sacks (1977, 373) group together as “guess, candidate, or ‘try’” that do not assert, but are proffered for acceptance or rejection.

Finally, it is noteworthy that the questioner’s bodily conduct converges with the prosodic break, showing a hiatus between the two units/actions. This stands in sharp contrast to what we find with the more condensed patterns, suggesting that speakers’ bodily conduct may provide cues for how they orient to action-units in interaction.

5 Format (ii): QWQ followed by immediate turn-extension with weaker boundaries

Let us now turn to cases where both segments still appear to stand as TCUs, yet with less clear boundaries between the two. First of all, there are no gaps between the two components. Moreover, we only find small pitch movements at the end of the first segment, combined with latching. These features make the boundaries between the two parts less clear than in examples 1, 3, and 4. Praxeologically, this has the effect that the QWQ accomplishing an information request appears to get transformed “on the fly”, in the very course of turn-production, into a PQ accomplishing a confirmation request. The adding on of the second unit is here not done for the purpose of pursuing a response but merely proposes a candidate answer that is displayed as tentative – a guess.

An example is provided in (5). Both participants gaze at each other throughout the excerpt.

Marie asks when Pat has to leave for his next course (l. 01). After Pat informs her that his course starts at quarter past (l. 06), Marie launches a QWQ asking about the precise hour (l. 08): Her de quelle heure (roughly: “of what hour”) is hearable as “quarter past what hour?”. In immediate turn-extension, Marie then offers de midi et quart ‘twelve fifteen’ (literally: “from twelve fifteen”). Just like in the excerpts above, both segments stand as TCUs: Each can be heard as accomplishing an action in itself, the first proffers an information request and the second a candidate answer inviting confirmation by the recipient. As to the latter, note that what in the transcripts is noted as final rising intonation (l. 09), in French may occur as a concave final pitch contour or a high-rise ending on a high pitch plateau (Delattre 1966), as shown in Figure 7 (for Swiss-French, both high-rise and high-rise-fall contours have been documented for PQs; Delais-Roussarie et al. 2015). Also prosodically, lines 08 and 09 are formatted as two units, yet in ways that are less prominent than in excerpts 3 and 4 above: As shown in Figure 7, each carries its own focal accent, and there is a break in phonation between the two, yet only a mid-falling intonation at the end of the first part (marked as “;” in the transcript – as opposed to the low falling in 3 and 4) as well as only a small up-step to the second. The span of the pitch movement at the end of the first unit is therefore rather small (note that there is quite some background noise in this excerpt, hence the “bumpy” pitch trace):[3]

Furthermore, not only are the two units delivered without any audible gap, but the acceleration on the end of the first unit (>quelle heure<) further diminishes cues of finality (no syllable lengthening, which, together with pitch movement, would materialize a nuclear accent, which is unit-final in French; Persson 2014). On the contrary, the acceleration accomplishes a rush through – a method for circumventing turn-transition (Schegloff 1982). This, together with the only slightly falling pitch movement on heure, suggests that the second unit may designedly follow the first, i.e., its production may be projected already toward the end of the first candidate unit.

The above properties converge to suggest that the QWQ was transformed into a PQ “on the fly”, the added-on de midi et quart ‘from twelve fifteen’ being again produced as a guess whose tentative character is marked by the final rising intonation (see above). Also note that the recipient comes in only in line 10 with a response to the second segment – the response is both confirming and “type-connected” (Sacks 1987) to the PQ, not to the QWQ; it responds to the call for confirmation.

Furthermore, the questioner’s bodily conduct equally contrasts with what we observed in the preceding excerpts, each of which showed a bodily hiatus between the two parts. In excerpt 5, Marie remains immobile throughout the question–response sequence, only very slightly moving her right hand with her delivery of the second candidate unit, keeping her gaze fixed on Pat (Figures 5 and 6).

Figure 6

MAR gazes at PAT, slightly opening LH fingers.

So, here we have two prosodic units occurring in immediate follow-up, yet the upcoming of the second is foreshadowed by speed up and only small pitch movement at the end of the first, and bodily cues package the two stretches together. Most importantly, the two are not treated as two distinct actions: The speaker provides a sequential slot for recipient response only after the second unit; and the recipient takes the turn exactly after that unit and not after the first. This is so despite the fact that the preference for contiguity (Sacks 1987) would normatively privilege an answer to come in immediately after the QWQ (e.g., in overlap with the questioner’s turn-extension, as in ex. 1 and 4). What syntactically appears as two subsequent questions is prosodically formatted, and treated, as one single TCU and action, and responded to as one single PQ accomplishing a request for confirmation.

Figure 7: Pitch trace of de quelle heure.

Figure 8: Pitch trace tu mets lieu quoi.

Excerpt (6) shows a similar – yet prosodically even fuzzier – case. Katja and Michaela are filling out the consent form for the recording they are part of:

By means of the added-on caf(h)e(h):t? (l. 03), Katja offers a non-serious candidate to her question as to what place name to fill in the form (see her laughing voice and laughter tokens). Michaela does not affiliate with this non-serious stance, but instead produces an eu:::h-prefaced disconfirming response (l. 04): She provides an alternative place name with a neutral voice quality, thereby discarding Katja’s candidate and returning to a serious tone. Figure 8 shows the pitch trace of this segment. (Note that the high-rise-slight-fall final contour on cafet is a feature often found with PQs in Swiss-French; see above.)

This, too, is a case of what Barth-Weingarten (2016), suggesting the notion of “cesuring”, highlights as being part of a continuum of prosodic integration: The fuzziness is due to the varying degrees to which the relevant cesuring parameters (pitch, tempo, volume, etc.) change, indicating stronger or weaker boundaries (“cesuras”) between candidate units (see also Barth-Weingarten and Ogden in this SI). Here, the presence of a focal accent in each of the segments (on lieu and on caf(h)e(h):t, respectively) cues them as two distinct prosodic units, yet the only very slight rise on quoi at the end of the first segment (l. 02) with no further pitch movement to the second establishes a notably weak pitch boundary between them. The impression of tight integration is further enhanced by the latching of caf(h)e(h):t to quoi – a resource through which speakers hold the turn past a TRP (Schegloff 2000) and minimize the set of cesuring features in that there is no final lengthening, for instance. The questioner hence here again actively works to eliminate a sequential slot that was projected syntactically and praxeologically, by means of the QWQ, for her recipient to take the turn, and instead adds her own candidate answer to the question.

Furthermore, just like in (5), but in contrast to format (i), the speaker here deploys a bodily movement that stretches coherently across the two segments, progressively moving her head and then gaze upward, ending with her gaze on the recipient at the very end of the second segment (Figures 9–11).

Figures 9–11: KAT lifting her head and gaze toward MIC.

It is as if the continuity of this bodily movement, in concert with the latching and the only slight pitch movement at the end of the first segment, presented the two segments as a single unit/action. This resonates with Barth-Weingarten’s (2016, 81) suggestion that visual cues may be “potentially relevant for cesuring”. Furthermore, the speaker’s gaze – as a response-mobilizing feature (Heath 1986, Kendon 1967) – turns to the recipient only right after the delivery of the second segment (Figure 9c), which contrasts with what we have observed for format (i), and further suggests that the QWQ-segment is not designed to call for a response.

In sum, format (ii) groups together cases in which no gap occurs in between the potential units, and these are prosodically integrated to a lesser (ex. 5) or greater (ex. 6) extent. While the two parts still carry their own focal accent and exhibit some unit-final pitch movement, the latter is only small and they are audibly connected by latchings, thereby displaying a more or less weak boundary between them, so that their unit-ness becomes questionable. Format (ii), which represents 31% (n = 15) of the [QWQ + phrase/clause] pattern found in the data, schematically runs as follows:

While syntactically identical to (i), in format (ii) there is no gap between the two parts, and the format’s compositionality as one or two intonation units cannot always be determined straightforwardly. Furthermore, contrary to (i), the speaker’s bodily conduct shows typically a coherent contour across the two units, and speaker’s gaze suggests that a response is relevant only after the second part. Respondents, however, orient to the whole pattern as accomplishing one rather than two consecutive actions, responding only to the second part as offering a candidate answer.

Rather than re-doing a question in response to recipient’s lack of uptake, the initial QWQ is transformed into a PQ “on the fly”, and so is the information request into a confirmation request. Format (ii) hence provides a case in point for the incrementally emergent nature not only of grammatical trajectories (Hopper 1987) but also of actions.

While the QWQ prospectively frames the subsequent phrase as a candidate, the consistently rising intonation on that candidate (again, as opposed to pattern (i), which shows more variation) retrospectively conveys the speaker’s uncertainty. Both work in concert to highlight the tentative character of the candidate, qualifying it as just a rough guess. In some cases, the tentativeness of the guess is further highlighted by turn-final epistemic downgrades (ex. 7) or by alternative formulations of the candidate (ex. 8):

Compared to straightforward PQs (e.g., “are you Italian?”, “are we the fifteenth?”), this format upgrades the tentative character of the guess, yet, compared to a QWQ (e.g., “what’s your nationality?”, “what’s the date?”), it augments the speaker’s claim to knowledge while narrowing down the recipient’s leverage to provide a formally fitted and hence preferred response (see above).

The very existence of a continuum in strength of boundary marking in the target pattern may suggest a cline of routinization of the pattern between two and one unit, i.e., it may be symptomatic of an ongoing sedimentation of [QWQ + phrase/clause] into a single-unit format as a routinized grammatical resource for venturing a tentative guess. This is what we turn to in the next section, which documents an important degree of lexico-syntactic consistency of the single-unit format.

6 Format (iii): a routinized single-unit format for venturing a tentative guess

6.1 Prosodic, bodily-visual, and sequential characteristics

The most frequent occurrences (51%, n = 25) of the target pattern in the data are found at the other end of the continuum of synchronic usage: cases where the second segment is still syntactically separate from the first, yet the two are produced under one coherent prosodic contour, conflated into one TCU (often a single-unit turn) ending on final intonation and implementing a single action, namely the offering of a tentative guess that invites confirmation from the recipient. A first illustration is provided in (9), which occurs immediately after (5) discussed above, where Marie asked about Pat’s course:

In line 14, Marie first asks where the course is taking place. Her question bears the lexico-syntactic and prosodic features of the integrated single-TCU format typically found in the data: speed-up of tempo on the QW-segment – as a consequence, there is no final lengthening – relative shortness of that segment, absence of unit-final pitch movement, often continuation of voicing between the first and the second segment (here and au are contracted), and one single focal accent on the whole stretch of talk (Figure 13).

Moreover, throughout the segment Marie steadily gazes at Pat, without moving, while Pat’s gaze is averted (Figure 12). Pat’s response (l. 15) unmistakably treats Marie’s turn as offering a PQ rather than a QWQ: his ouais “yeah” is type-conforming and it is doing confirmation.

Figure 12: MAR gazing at PAT.

Figure 13: Pitch trace c’est où.

While in (9) the single-unit question ends on falling intonation, in most cases in the data it shows final rising intonation (see Section 6.2), as in excerpt (10):

We see again a speed-up of tempo on the first segment, latching of the second to the first, continued voicing at the transition between segments, and an overall prosodic packaging of the whole stretch as a single unit, with one focal accent and final rising intonation at its end (Figure 15).

The packaging of the whole stretch of talk as one unit is mirrored in Pat’s co-occurring bodily conduct: Pat gazes at Marie (Figure 14) throughout his turn, deploying a continuous rounded up-down movement of his shoulders, with both arms crossed on his belly, which comes to a stop right before Marie delivers her response (l. 04).

Figure 14: PAT gazes at MAR with arms crossed over his belly.

In sum, (9) and (10) illustrate cases where the two parts of the formal pattern [QWQ + phrase/clause] are conflated into a single TCU, used as a social action format (Fox 2007) for conferring a highly tentative guess. While, retrospectively, the addition of the “second” piece transforms the QWQ into a PQ, and an information request into a confirmation request, prospectively, the QWQ augments the tentativeness conferred to the candidate by framing it as just one candidate among others: consider “we came what time – ten to?” (ex. 10) as opposed to “we came at ten to?” or “it’s where – at Mail?” (ex. 9) compared to “it’s at Mail?”. This is in line with the analysis of the segments occurring as an immediate turn-extension in format (ii). While in (ii) we observe an on-line transformation of an information request into the proposing of a candidate-guess, produced in turn-extension and marked as seeking confirmation, (iii) represents a routinized format for offering a guess and marking it as highly tentative. More precisely, it is a practice for offering a tentative guess in first position, i.e., as sequence-initiating action that calls for a response.

In some cases in the data, the second part in this format does not consist of a phrase but of a short clause that recycles the subject and the verb from the QWQ segment, so that the only “new” information that part conveys is again via a phrase (here: le huit ‘the eighth’):

Figure 15: Pitch trace on est venu à quelle heure.

Again we have speed-up of tempo on the (end of the) first segment, latching of the second onto the first, a single nuclear accent (on huit), and a coherent prosodic contour that packages the two as one single intonation unit. Interestingly, the second unit being clausal in nature, the prosody here unites two syntactically independent clauses under one single contour. The speaker, Camille, deploys again a coherent bodily movement, which consists of slowly but steadily lowering her head toward her paper with her gaze fixed on it (Figures 17–20), and this dynamic movement is deployed exactly synchronously with the whole question turn (l. 02), contrasting with her preceding gaze at her watch with her head still (Figure 16, l. 01).

Figure 16: CAM looking at her watch.

Figures 17–20: CAM continuous lowering of head and torso.

Cedric’s response comes immediately, yet formally it is fitted neither to a QWQ-question (according to Fox and Thompson 2010, clausal responses treat the question as problematic), nor to the PQ (it is neither type conforming, nor does it do confirmation; Raymond 2003). Rather, it disconfirms the candidate on est le huit ‘we are the eighth’, correcting it to on est le sept ‘we are the seventh’, and mitigating this other-correction with je crois ‘I think’ (cf. Schegloff, Jefferson and Sacks 1977, 387f).

There are not enough comparable occurrences in our collection to make a conclusive point, yet excerpt 11 suggests that the immediate incoming of the response may be symptomatic of the fact that a disconfirming response may not be exactly dispreferred (compare the late delivery of the “dunno”-response in excerpt 10 which, as a non-answer response, is dispreferred). In other words, the fact that the guess is indexed as highly tentative may weaken or even neutralize the preference for a confirming response. Excerpt 12 provides further support to this point. Leila is explaining what type of mushroom she found in the forest (l. 01):

Vivi offers a tentative guess (l. 02) as to the nature of the mushroom, which is then disconfirmed by Leila’s straightforward “non”, though it is produced with a slight delay during which Leila displays cognitive search (l. 03: she deploys an “out of focus ‘middle-distance’ look”; Goodwin 1987, 117).

Just like in pattern (ii), speakers may further enhance the tentative character of their guess by ending their turn with ou bien ‘or’, an epistemic downgrade or a tag-like element (ex. 13–15):

As Drake (2015) has shown, turn-final or, when occurring with PQs, relaxes the polarity implemented by the question. The ou bien ‘or’ in (13) hence further mitigates the guess, neutralizing the preference for a confirming response by virtue of downgrading the speaker’s commitment to the tentative candidate, and so does the “I don’t know what” and the “is that it” in (14) and (15). Most consistently in the data (see below), however, speakers display the tentative character of their candidate by means of the QWQ-preface and final rising intonation, indexing the candidate as a try, i.e., as just a rough guess.

6.2 Lexico-syntactic and prosodic consistency

The single-unit question format that conflates what formally appears as a QWQ plus a candidate answer shows remarkable structural and lexical consistency, and this also contrasts with formats (i) and (ii), which show more variation. The following table lists all 25 occurrences of format (iii) found in the data (Figure 21).

Figure 21: Inventory of the single-unit format in the data.

As to the QWQ-segment of the single-unit format:

  1. 92% contain a pronominal subject (only one lexical subject, item x; plus two ellipses; items i, j); ce ‘it’ and on ‘we/one’ alone make up 60% of these subjects;

  2. the verbs être ‘to be’, faire ‘to make’ and avoir ‘to have’ make up 72% of the predicates; the copula être ‘to be’ alone makes up 48%;

  3. in all cases but one (y), the question word occurs in post-verbal position (i and j have no verb);

  4. in 64% of the cases the question-word is quoi ‘what.’

The second segment:

  1. consistently ends on high-rising intonation (except items a and g; plus some indeterminate cases due to overlap or turn suspension);

  2. consists typically of a phrase (76%, n = 19), and sometimes (24%, n = 6) of a short clause that recycles the subject plus verb from the first piece (items s-w and y).

The strong predominance of phrasal elements for this format contrasts with formats (i) and (ii) in which, taken together, the distribution between phrasal and clausal elements is more balanced (54%/46%; n = 13/11). This further highlights the relative structural consistency of format (iii), the prototypical form of which can be schematized as follows:

The lexico-structural and prosodic consistency of (iii), together with its relatively high frequency (51%) compared to the prosodically non- (18%) and less-integrated (31%) two-unit formats, suggests that it represents a highly routinized, lexically semi-fixed grammatical practice. It can be seen as the sedimentation of frequent combinations in use (cf. Bybee 2010, Hopper and Traugott 2003), involving not only lexico-structural streamlining, but the “fusion” of two units into one (cf. Bybee 2010). While the format shows features typical of grammaticization such as loss of semantic and pragmatic meaning (e.g., what formally appears as a QWQ does not work as such any more), the very existence of continua in usage combined with its lexically not entirely fixed form may be seen as symptomatic of an ongoing routinization process, in which a multi-unit question format is on its way to sediment as a single-unit format specialized for specific interactional purposes.
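The proportions reported for the three formats can be rechecked from the counts. The following is a minimal sketch and not part of the original analysis: the counts for formats (ii) and (iii) and the phrase/clause split are those given in the text, whereas the count for format (i) is an inference from the combined total for formats (i) and (ii) (n = 13 + 11); the function and variable names are illustrative assumptions.

```python
def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)

# Counts reported in the text for formats (ii) and (iii).
n_ii, n_iii = 15, 25
# Assumption: format (i) is inferred from the combined (i)+(ii) total of 13 + 11.
n_i = (13 + 11) - n_ii
total = n_i + n_ii + n_iii

# Share of each format in the whole [QWQ + phrase/clause] collection.
shares = {
    "i": percent(n_i, total),
    "ii": percent(n_ii, total),
    "iii": percent(n_iii, total),
}

# Phrase vs. clause as second segment within format (iii).
second_segment_iii = {
    "phrase": percent(19, n_iii),
    "clause": percent(6, n_iii),
}

print(total, shares, second_segment_iii)
```

Under these assumptions the collection comprises 49 occurrences, and the rounded shares match the 18%, 31%, and 51% cited for formats (i), (ii), and (iii).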

6.3 Interactional motivations for the routinization of patterns of language use: a grammatical format for proffering a tentative guess in first position

But what does this format do that is different from what a straightforward PQ, such as c’est au Mail? ‘is it at Mail?’, would do? Most importantly, the format confers a more tentative nature on the candidate (e.g., au Mail) than a straightforward PQ would do. In her discussion of PQs, Pomerantz (1988, 369–70) treats these as offering a candidate that is fully endorsed by the speaker. That is, in an utterance such as “is that Temple?”, the speaker displays a relatively high degree of knowledge, offering “Temple” not just as a guess, but as a “best guess” (p. 369). As opposed to the conventional interrogative format of PQ, the format discussed here (e.g., “what’s that Temple?”) can be heard as offering just a try: a tentative guess – not a “best guess”. This in turn affects preference structure. If we re-word the format as a conventional PQ in French, c’est au Mail? ‘is it at Mail?’, on est le huit? ‘are we the eighth?’, on est venu à moins dix? ‘did we come at ten to?’, then such PQs clearly invite a confirming response. They are what Sacks (1992, Lecture 3, fall 1964–Spring 1965) refers to as correction-invitation devices.[4] By contrast, the [QWQ + candidate] format confers a more tentative character, offering a candidate that is not fully endorsed by the speaker. The QWQ and the turn-final rising intonation can be seen as working in concert, the former framing the candidate as just one among others, the latter marking it as a try.

While Pomerantz is concerned with interrogative formats, Heritage (2013) compares interrogative and declarative formats of PQs, suggesting that through the declarative format (i.e., “That’s Temple”) speakers display a more knowing epistemic stance with regard to the issue at hand. I would like to argue that the [QWQ + phrase/clause] format works exactly in the other direction: compared to a conventional PQ, it reduces the speaker’s claim to knowledge by tagging the candidate as just one possible answer to an open (QW) question. As a consequence, it moderates the preference for a confirming response. While this is confirmed by the preferred action turn-format with which the non-confirming responses in excerpts (6), (7), (11), and (12) are delivered, it certainly remains to be further corroborated by more extensive analysis of the type of responses to these guesses.

Furthermore, it is important to note that in informal spoken French, PQs are typically delivered in a declarative format,[5] that is, they are marked as questions only optionally through turn-final rising intonation (see Couper-Kuhlen 2012 on the non-systematicity of rising intonation with questions in English). This has two possible consequences. For one thing, if question design is a way of displaying epistemic stance, then speakers of French would need to resort to other design features than speakers of English to display differences in their epistemic claims through the grammatical formatting of PQs. In the absence of interrogative (vs declarative) PQ formats, French speakers’ resorting to the QWQ + candidate format may be one way of doing so.

For another thing, as PQs in conversational French are typically marked only optionally by turn-final rising intonation, the integrated [QWQ + phrase/clause] format provides a means for displaying the question-in-progress unmistakably as a question, and for doing so earlier on in the turn than turn-final intonation could do (note that QWQs in conversational French most typically have post-verbal QW). Notable in this regard is the fact that, for instance in excerpts (9) and (11), respondents’ answers are delivered in fast follow-up, as if the respondents’ actions were facilitated by the QW-segment, alerting them early to an answer being relevant as a next. Pekarek Doehler (2021) has recently demonstrated, for French, how the action-recognition point of an ongoing turn affects the timing of the response: early recognition of a question as a question in the course of its production (e.g., through turn-initial placement of the QW) is most often associated with fast (or even early) answer onset, while late recognition of the question as a question is regularly associated with answers coming in comparatively later (after the next beat). Such on-line unfolding of the question-in-progress configures the temporal opportunities of action ascription, and, concurrently, projection of the relevant next action. In this light, one of the affordances of the integrated format is that it allows for relatively early recognition (compared to mere turn-final intonation) of the fact that a question is under way, and hence for early projection of some aspects of the expected response. This may be one feature allowing for fast delivery of answers, as shown in (9) and (11).

In a nutshell, practical interactional issues, such as action formation (offering a tentative guess) or projection (allowing for early anticipation of the relevant next action), are what participants can gain with the integrated format; as such, these interactional issues may well be the driving forces for the routinization of the pattern [QWQ + phrase/clause] into a single social action format.

7 Summary and discussion

The present study documents a continuum of synchronic usage (cf. Bybee 2010) of what formally appears as a [QWQ + phrase/clause] pattern. A small proportion of the occurrences in the data (18%) show speakers first producing a QWQ and then, after lack of recipient response, offering a candidate answer as a means of pursuing response. At the other end of the continuum, speakers produce the formal pattern as a single TCU whose frequency (51% of the occurrences in the data) and lexico-syntactically as well as prosodically consistent form suggest its being used as a routinized social action format for offering a highly tentative guess. The remaining 31% are intermediate cases, without a gap between the two units, where prosody tends to indicate two units, but with often a fuzzy boundary between the two; in these cases, bodily conduct (e.g., gaze) and recipient response concur to suggest that the pattern is neither designed nor treated as accomplishing two distinct actions. The fact that, in these latter cases, the pattern is consistently responded to by recipients as accomplishing one single action provides evidence that the existence of fuzzy boundaries based on non-convergent cues in the various dimensions (syntax, prosody, action, body) is not a participants’ concern. However, for the researcher, such fuzziness, as part of a synchronic continuum in formal realization, is one possible indicator of change in language use.
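The three-way distinction summarized above can be made explicit as a decision rule. The sketch below is deliberately crude and is offered only as an illustration: it reduces each occurrence to two features named in the analysis (a gap after the QWQ, and the number of focal accents across the stretch), whereas the analysis itself treats the formats as points on a continuum with fuzzy boundaries and also draws on pitch, tempo, and bodily conduct. The class name, field names, and function name are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Occurrence:
    """One [QWQ + phrase/clause] occurrence, reduced to two coded features."""
    gap_after_qwq: bool   # lack of recipient uptake before the candidate answer
    focal_accents: int    # focal accents across the whole stretch of talk


def classify(occ: Occurrence) -> str:
    """Assign an occurrence to format (i), (ii) or (iii).

    Simplifying assumption: real cases are gradient, so this only
    approximates the endpoints and midpoint of the continuum.
    """
    if occ.gap_after_qwq:
        # Candidate answer pursues a response after lack of uptake.
        return "i"
    if occ.focal_accents >= 2:
        # No gap, but each segment still carries its own focal accent.
        return "ii"
    # One contour and a single focal accent: the integrated single-unit format.
    return "iii"


print(classify(Occurrence(gap_after_qwq=False, focal_accents=1)))
```

A coding scheme of this kind would still require an analyst's judgment for each feature; it does not replace the sequential analysis of how recipients respond.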

The existence of this continuum, the frequency of the three discussed formats, the occurrence of fuzzy boundaries and the only relative fixedness of the single-unit format suggest that there is currently an ongoing routinization process, in French, of the locally emergent “double question format” into one single complex grammatical construction. Formally, the latter conflates a QWQ plus a candidate answer, in which the first piece frames the upcoming piece as doing a question and, by that token, itself loses its status as a QWQ/information request.

It can be argued that this process of routinization (or even grammaticization) has its roots in speakers’ recurrent dealings with locally emerging social-interactional needs. Questions and the type of responses they make relevant are configured incrementally, in real time. An initial question may be redone in another shape, accomplishing another action, in response to interactional contingencies such as the lack of a recipient response (format i). Or it may be revised along the very temporal unfolding of the question in progress, whereby a QWQ can be transformed into a PQ “on the fly” (format ii). Such incremental configurations also adapt the projection of the relevant next action, so that conditional relevance is configured and altered incrementally in real time (Pekarek Doehler 2019). The single-unit format (iii) can be seen as the routinized product of such repeated interactionally driven incremental configurations: What on occasion is configured as two subsequent actions becomes one integrated whole, specialized for the precise interactional purpose of venturing a tentative guess in first position, that is: as a sequence-initial action. The findings thus provide evidence in support of “a path of grammaticization from vertical to horizontal development” whereby grammar-in-interaction grows out of social interaction, as hypothesized by Couper-Kuhlen (2011, 436). This further suggests that documenting the continuum of synchronic usage of grammar in interaction may shed light on how grammatical usage patterns emerge in real time and sediment over time in response to speakers dealing with local interactional needs.

8 Conclusion

Let me conclude with some broader implications. Grammar can be seen as a repertoire of linguistic practices (Fox and Thompson 2010, Schegloff 1996) that have evolved out of speakers accomplishing repeated actions, and have become operational in use based on precise sequential positions they occupy within locally organized social interactions: “speakers do not follow combination rules, but assemble familiar fragments – experientially – creating grammar as they go. Grammar is emergent and epiphenomenal to the ongoing creation of new combinations of forms in interactive encounters” (Hopper 2011, 26). The findings reported in this study provide further empirical support to this view. They retrace a path of routinization: An integrated social action format has grown out of the repeated sequencing of two subsequent actions – an information request followed by a candidate answer seeking confirmation – and has become specialized for a precise interactional purpose, namely the venturing of a tentative guess in first position.

Now, if we continuously assemble fragments on-line, and if frequently co-occurring fragments sediment over time as grammatical resources for action, then the fuzziness of prosodic and/or TCU boundaries is not a nuisance, nor an inconsistency, but an integral and inevitable part of the continuous interplay between on-line emergence on the one hand, and continuous sedimentation on the other. It ensues that unprecedented grammatical configurations that emerge locally in response to precise interactional needs may become entrenched over time as practical solutions to recurrent interactional problems, and eventually sediment as (canonical) grammatical usage patterns over time: Emerging and emergent grammar are inextricably intertwined (Hopper 2011, Pekarek Doehler 2011). The data under scrutiny suggest that the [QWQ + phrase/clause] pattern used for proffering a tentative guess may be the product of such entrenchment.

Furthermore, if routinized patterns are retraceable to locally emergent, moment-by-moment composed ad hoc configurations, as suggested by the empirical data examined here, then it might be trajectories rather than units that are of interest (cf. Ford et al. 2013). Namely those trajectories (and their expandability) that Goodwin (1979), Hopper (1987, 2011), Schegloff (1996), and Auer (2009), each in their way, have argued to be the very essence of grammatical patternings in talk-in-interaction, i.e., trajectories that are indissociably related to the local formation of action, in which grammatical usage patterns originate: grammar as epiphenomenal to social action (Ford et al. 2013, Hopper 2011). Prosodically and/or syntactically delimited units may well represent an experientially entailed cognitive reality and have linguistic categorical status; but what is primary from an action-formation perspective is trajectories as local accomplishments out of which what linguists call units or grammatical structure may grow and sediment over time.

Last but not least, if grammar grows out of action, and is a resource for action, it inevitably sits there in the midst of a complex ensemble of multisemiotic resources (Goodwin 2000). Can units, trajectories, and boundaries be better understood when considered within the larger ecology of language use in interaction? In this article, I have provided some observations as to how participants’ bodily conduct, along of course with prosody, can feed into our understanding of how these participants orient to units and trajectories, as part of action formation and ascription. While this last observation awaits further research, it converges with other ongoing work stressing the analytical import of analyzing grammar as it occurs embedded not only in the organization of turns and sequences, but also in participants’ use of other semiotic resources.


The present study was carried out with the generous support of the Swiss National Science Foundation, grant no. 100012_178819, project The emergent grammar of clause-combining in social interaction. The author thanks Sandra Schwab for her help with the PRAAT generated graphs. And the author is deeply grateful to the editors of this special issue, Dagmar Barth-Weingarten and Richard Ogden, as well as to Betty Couper-Kuhlen, one “internal” reviewer, and two “external” reviewers for their helpful comments on a prior version of this article.

  1. Funding information: The present study was carried out with the generous support of the Swiss National Science Foundation, grant no. 100012_178819, project The emergent grammar of clause-combining in social interaction.

  2. Author contribution: The author has accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Conflict of interest: Author states no conflict of interest.

  4. Data availability statement: The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.


Transcription conventions for verbal conduct

[ Start of overlap
] End of overlap
= Latching
(0.7) Measured pause in seconds and tenths of seconds
wo- Truncated word
wo:rd Syllable lengthening (number of “:” depending on length of lengthening)
? Rising final intonation
¿ Mid-rise final intonation
. Falling final intonation
; Mid-falling intonation
, Continuing intonation
word Emphasis
°word° Softer than surrounding speech
WORD Louder than surrounding speech
↑word Marked high rise in pitch (refers to the next syllable)
>word< Faster than surrounding talk
.h In-breath
wo(h)rd Laughing voice
he. Laughter token
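For readers who work with such transcripts computationally, some of the conventions above lend themselves to simple pattern matching. The following is a minimal sketch, assuming transcript lines are available as plain strings; the pattern names, the two helper functions, and the sample line (assembled from the generic examples in the list above, not taken from the data) are all hypothetical. It covers only a subset of the symbols and would, for instance, misread the laughter token he. as falling intonation.

```python
import re
from typing import Optional, Set

# Patterns for a subset of the verbal-conduct conventions listed above.
PATTERNS = {
    "measured_pause": re.compile(r"\(\d+\.\d+\)"),   # (0.7)
    "latching": re.compile(r"="),                    # =
    "faster_talk": re.compile(r">[^<>]+<"),          # >word<
    "laughing_voice": re.compile(r"\(h\)"),          # wo(h)rd
    "lengthening": re.compile(r"(?<=\S):+"),         # wo:rd
}

# Turn-final punctuation marks and the intonation they stand for.
FINAL_INTONATION = {
    "?": "rising",
    "¿": "mid-rise",
    ".": "falling",
    ";": "mid-falling",
    ",": "continuing",
}


def tag_features(line: str) -> Set[str]:
    """Return the names of all conventions found in a transcript line."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(line)}


def final_intonation(line: str) -> Optional[str]:
    """Return the final intonation marked by the last character, if any."""
    stripped = line.rstrip()
    return FINAL_INTONATION.get(stripped[-1]) if stripped else None


# Hypothetical line built from the generic examples in the conventions list.
sample = ">word<=wo(h)rd (0.7) wo:rd?"
print(sorted(tag_features(sample)), final_intonation(sample))
```

Automatic tagging of this kind can help locate candidate cases (e.g., latched segments ending on rising intonation), but identifying the formats discussed in this article still depends on sequential and multimodal analysis.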

Transcription conventions for embodied conduct

* * Symbols such as these indicate start and end of embodied conduct
× ×
± ±
*----->l.12 Continuation of the described embodied conduct until line 12 of transcript
------>* End of the described embodied conduct
*----->> Continuation of the described embodied conduct until end of excerpt
…. *points at X Points indicate the preparation phase of the embodied conduct
*points,,,,,, Commas indicate the retraction phase of the embodied conduct.

Abbreviations used in glosses






past participle






Auer, Peter. 2009. “On-line syntax: thoughts on the temporality of spoken language.” Language Sciences 31, 1–13.

Barth-Weingarten, Dagmar. 2014. “Dialogism and the emergence of final particles: The case of ‘and’.” In: Grammar and dialogism, eds. Günthner, Susanne, Wolfgang Imo and Jörg Bücker, p. 335–66. Berlin: de Gruyter.

Barth-Weingarten, Dagmar. 2016. Intonation units revisited. Cesuras in talk-in-interaction. Amsterdam: Benjamins.

Bybee, Joan. 2010. Language, usage and cognition. Cambridge: Cambridge University Press.

Clayman, Steven and John Heritage. 2002. “Questioning presidents: Journalistic deference and adversarialness in the press conferences of U.S. Presidents Eisenhower and Reagan.” Journal of Communication 52, 749–75.

Clayman, Steven. 2002. “Sequence and solidarity.” In: Group cohesion, trust and solidarity, eds. Thye, Shane R. and Edward J. Lawler, p. 229–53. Oxford: Elsevier Science.

Couper-Kuhlen, Elizabeth. 2011. “Grammaticalization and conversation.” In: The Oxford handbook of grammaticalization, eds. Narrog, Heiko and Bernd Heine, p. 424–37. Oxford: Oxford University Press.

Couper-Kuhlen, Elizabeth. 2012. “Some truths and untruths about final intonation in conversational questions.” In: Questions: Formal, functional and interactional perspectives, ed. De Ruiter, Jan P., p. 123–45. Cambridge, UK: Cambridge University Press.

Delais-Roussarie, Elisabeth, Brechtje Post, Mathieu Avanzi, Carolin Buthke, Albert Di Cristo, Ingo Feldhausen, Sun-Ah Jun, Philippe Martin, Trudel Meisenburg, Annie Rialland, Rafèu Sichel-Bazin and Hi-Yon Yoo. 2015. “Intonation phonology of French: Developing a ToBI system for French.” In: Intonational variation in Romance, eds. Frota, Sonia and Pilar Prieto. Oxford University Press.

Delattre, Pierre. 1966. “Les dix intonations de base du français.” The French Review 40(1), 1–14.

Detges, Ulrich and Richard Waltereit. 2011. “Turn-taking as a trigger for language change.” In: Rahmen des Sprechens. Beiträge zu Valenztheorie, Varietätenlinguistik, Kognitiven und Historischen Semantik, eds. Dessi Schmid, S., Detges, U., Gevaudan, P., Mihatsch, W. and Waltereit, R., p. 75–190. Tübingen: Narr.

Drake, Veronika. 2015. “Indexing uncertainty: The case of turn-final ‘or’.” Research on Language and Social Interaction 48(3), 301–318.

Ford, Cecilia, Barbara Fox and Sandra Thompson. 2013. “Units and/or action trajectories? The language of grammatical categories and the language of social action.” In: Units of talk – Units of action, eds. Szczepek Reed, Beatrice and Geoffrey Raymond, p. 13–56. Amsterdam: Benjamins.

Fox, Barbara and Sandra Thompson. 2010. “Responses to wh-questions in English conversation.” Research on Language and Social Interaction 43(2), 133–56.

Fox, Barbara. 2007. “Principles shaping grammatical practices: an exploration.” Discourse Studies 9(3), 299–318.

Goodwin, Charles. 1979. “The interactive construction of a sentence in natural conversation.” In: Everyday language: Studies in ethnomethodology, ed. Psathas, George, p. 97–121. New York: Irvington.

Goodwin, Charles. 1987. “Forgetfulness as an interactive resource.” Social Psychology Quarterly 50(2), 115–31.

Goodwin, Charles. 2000. “Action and embodiment within situated human interaction.” Journal of Pragmatics 32(10), 1489–522.

Haiman, John. 1994. “Ritualization and the development of language.” In: Perspectives on grammaticalization, ed. Pagliuca, William, p. 3–28. Amsterdam: Benjamins.

Heath, Christian. 1986. Body movement and speech in medical interaction. Cambridge, England: Cambridge University Press.

Heritage, John. 2013. “Action formation and its epistemic (and other) backgrounds.” Discourse Studies 15(5), 511–78.

Hopper, Paul and Elizabeth Closs Traugott. 2003. Grammaticalization, 2nd ed. Cambridge: Cambridge University Press.

Hopper, Paul. 1987. “Emergent grammar.” Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society, p. 139–57.

Hopper, Paul. 2011. “Emergent grammar and temporality in interactional linguistics.” In: Constructions: Emerging and emergent, eds. Auer, Peter and Stefan Pfänder, p. 22–44. Berlin: de Gruyter.

Houtkoop-Steenstra, Hanneke and Charles Antaki. 1997. “Creating happy people by asking yes-no questions.” Research on Language and Social Interaction 30, 285–313.

Jefferson, Gail. 2004. “Glossary of transcript symbols with an introduction.” In: Conversation analysis: Studies from the first generation, ed. Lerner, Gene H., p. 13–31. Amsterdam: John Benjamins.

Kendon, Adam. 1967. “Some functions of gaze-direction in social interaction.” Acta Psychologica 26, 22–63.

Lehmann, Christian. 1985. “Grammaticalization: synchronic variation and diachronic change.” Lingua e Stile 20, 303–18.

Li, Xiaoting. 2013. “Leaning and recipient intervening questions in Mandarin conversation.” Journal of Pragmatics 67, 34–60.

Linell, Per, Johan Hofvendahl and Camilla Lindholm. 2003. “Multi-unit questions in institutional interactions: Sequential organizations and communicative functions.” Text 23(4), 539–71.

Mondada, Lorenza. 2018. Conventions for multimodal transcription.

Pekarek Doehler, Simona, Elwys De Stefani and Anne-Sylvie Horlacher. 2015. Time and emergence in grammar. Amsterdam: Benjamins.

Pekarek Doehler, Simona and Ufuk Balaman. 2021. “The routinization of grammar as a social action format: a longitudinal study of video-mediated interactions.” Research on Language and Social Interaction 54(2), 282–302.

Pekarek Doehler, Simona. 2011. “Emergent grammar for all practical purposes: The on-line formatting of dislocated constructions in French conversation.” In: Constructions: Emerging and emergent, eds. Auer, Peter and Stefan Pfänder, p. 46–88. Berlin: Mouton de Gruyter.

Pekarek Doehler, Simona. 2019. “At the interface of grammar and the body. Chais pas (‘dunno’) as a resource for dealing with lack of recipient response.” Research on Language and Social Interaction 52(4), 1–23.

Pekarek Doehler, Simona. 2021. “Word-order affects response latency: action projection and the timing of responses to question-word questions.” Discourse Processes 58, 328–352. doi: 10.1080/0163853X.2020.1824443.

Persson, Rasmus. 2014. Ressources linguistiques pour la gestion de l’intersubjectivité dans la parole en interaction. Analyses conversationnelles et phonétiques. PhD thesis, Lund University.

Pomerantz, Anita. 1984. “Agreeing and disagreeing with assessments: some features of preferred/dispreferred turn shapes.” In: Structures of social action, eds. Atkinson, J. Maxwell and John Heritage, p. 57–101. Cambridge: Cambridge University Press.

Pomerantz, Anita. 1988. “Offering a candidate answer: an information seeking strategy.” Communication Monographs 55, 360–73.

Raymond, Geoffrey. 2003. “Grammar and social organization: yes/no-type interrogatives and the structure of responding.” American Sociological Review 68, 939–67.

Rossano, Federico. 2012. Gaze behavior in face-to-face interaction. Nijmegen: Radboud University.

Sacks, Harvey, Emanuel A. Schegloff and Gail Jefferson. 1974. “A simplest systematics for the organization of turn-taking for conversation.” Language 50(4), 696–735.

Sacks, Harvey and Emanuel A. Schegloff. 1979. “Two preferences in the organization of reference to persons in conversation and their interaction.” In: Everyday language: Studies in ethnomethodology, ed. Psathas, George, p. 15–21. New York: Irvington Publishers.

Sacks, Harvey. 1987. “On the preferences for agreement and contiguity in sequences in conversation.” In: Talk and social organisation, eds. Button, Graham and John R. E. Lee, p. 54–69. Clevedon: Multilingual Matters.

Sacks, Harvey. 1992. Lectures on conversation. Oxford: Blackwell.

Schegloff, Emanuel A. 1982. “Discourse as an interactional achievement: some uses of ‘uh huh’ and other things that come between.” In: Analyzing discourse: Text and talk, ed. Tannen, Deborah, p. 71–93. Washington, D.C.: Georgetown University Press.

Schegloff, Emanuel A. 1996. “Turn organization: one intersection of grammar and interaction.” In: Interaction and grammar, eds. Ochs, Elinor, Emanuel A. Schegloff and Sandra Thompson, p. 52–133. Cambridge: Cambridge University Press.

Schegloff, Emanuel A. 2000. “Overlapping talk and the organization of turn-taking for conversation.” Language in Society 29, 1–63.

Schegloff, Emanuel A. 2007. Sequence organization in interaction: A primer in conversation analysis. Cambridge, UK: Cambridge University Press.

Schegloff, Emanuel A., Gail Jefferson and Harvey Sacks. 1977. “The preference for self-correction in the organization of repair in conversation.” Language 53, 361–82.

Stivers, Tanya and Federico Rossano. 2010. “Mobilizing response.” Research on Language and Social Interaction 43(1), 3–31.

Stivers, Tanya. 2010. “An overview of the question-response system in American English conversation.” Journal of Pragmatics 42(10), 2772–81.

Streeck, Jürgen. 2009. “Forward-gesturing.” Discourse Processes 46(2–3), 161–79.

Svennevig, Jan. 2013. “Reformulation of questions with candidate answers.” International Journal of Bilingualism 17(2), 189–204.

Received: 2020-08-10
Revised: 2021-03-07
Accepted: 2021-04-06
Published Online: 2021-12-23

© 2021 Simona Pekarek Doehler, published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.