LOOKing for multi-word expressions in American Sign Language

Usage-based linguistics postulates that multi-word expressions constitute a substantial part of language structure and use, and are formed through repeated chunking and stored as exemplar wholes. They are also re-used to produce new sequences by means of schematization. While there is extensive research on multi-word expressions in many spoken languages, little is known about their status in the mainstream U.S. variety of American Sign Language (ASL). This paper investigates recurring multi-word expressions, or sequences of multiple signs, that involve a high-frequency sign of visual perception glossed as LOOK and the family of 'look' signs. The LOOK sign exhibits two broad functions: LOOK/'vision' references literal or metaphorical vision, and LOOK/'reaction' signals a person's reaction to a visual stimulus. Data analysis reveals that there are recurring sequences in distinct syntactic environments associated with the two functions of LOOK, suggesting that LOOK is in the process of grammaticalization from a verb of visual perception to a stance verb. The sequences demonstrate the emergence of linguistic structure from repeated use through the domain-general cognitive process of chunking in ASL.


Introduction
Multi-word expressions form a central part of language. They come in all shapes and sizes, varying in complexity and specificity. A multi-word expression is a unit that is longer than one word and conveys meaning that may not be predicted from its individual words (Arnon and Snider 2010; Barlow and Kemmer 1994; Biber 2009; Bybee 2010; Bybee and Torres Cacoullos 2009; Ellis 2002; Erman and Warren 2000; Goldberg 2006; Haiman 1985; Hopper 1987; Sinclair 1991; Thompson). Consider the ASL sign glossed as LOOK: it is produced with a V-handshape in which the extended index and middle fingers point outward from the signer, and it moves in any direction in the space ahead of the signer's body, as shown in Figure 1. 2 This form is traditionally viewed as an unmodified form produced without context. Some researchers would call this a 'citation form' or a lexeme that serves as an organizing unit for all morphophonological variants of LOOK (Fenlon et al. 2015). Now consider the two instances of LOOK, boldfaced in Figure 2, accompanied by the English glosses of the ASL signs and an English translation. 3,4 The images are extracted from a vlog (video blog) posted to a public group ASL That! on Facebook. The first instance exhibits path movement that targets a spatial location on the left side of the signer's body that is associated with the 'video' under discussion. 5 The signer produces a visible mouthing of the English word 'look' in co-occurrence with LOOK and subsequently fingerspells V-I-D-E-O. The sequence PRO.2 LOOK V-I-D-E-O gives a straightforward reading of the video as the object of visual perception from the perspective of a second-person agent. The subsequent sign, DECIDE.FOR.YOURSELF, refers to the signer's urging the viewers to interpret the meaning of the video once they watch it.

2 The English translations provided for this sign are approximate and may only cover part of the range of meanings associated with it. There are morphologically related forms that can convey similar meanings to these translations too; for example, 'observe' can be used to refer to a two-handed form with repeated circular movement. This is further discussed in §2.2.
3 ASL signs are represented as English glosses in small capitals, although for some individual signs, I do not provide glosses and instead explain what they may mean. The choice of a gloss does not represent the linguistic functions of the sign in question. If a sign has a gloss that contains more than one word, e.g. MIND.PUZZLED, the sign bears an approximate meaning of that phrase. If a sign is fingerspelled, the letters are represented in dashes, e.g., V-I-D-E-O. Where appropriate, a sign may be represented by a sequence of film stills to direct the reader's attention to the relevant morphophonological properties of the sign.
4 For theoretical and methodological reasons, I do not focus on the marking of clausal boundaries in ASL. See §3.1 for further discussion on this.
5 This kind of spatial arrangement has been analyzed as verb agreement or directionality in sign language linguistics (Hou and Meier 2018; Lillo-Martin and Meier 2011).
The second instance of LOOK, on the other hand, exhibits reduced path movement that is not targeted at any discourse-meaningful location in space; it points away from the signer's body, mirroring their own outward gaze. Note that the second instance neither contains an explicit agent nor points in the same direction as the first instance did in the previous utterance. There is also no visible English mouthing co-occurring with LOOK. Rather, the signer's face assumes furrowed brows, squinted eyes, and pressed lips. This constellation of facial expressions gives the reading of a puzzled reaction to a previously identified visual stimulus. The LOOK instance is followed by MIND.PUZZLED; combined with the facial expressions, the whole construction functions as a unit and can be interpreted to mean "it's baffling". The unit signals the use of LOOK as a pivot to the signer's reaction to the video; the reaction is an attitudinal stance, providing a window into the mind of the signer.
These preliminary observations form the basis of the main arguments of the study presented here: (1) the LOOK sign is grammaticalizing from a verb of visual perception to a stance verb, and these changes can be observed in the context of multi-word expressions (as well as the form-meaning mappings); (2) the more high-frequency expressions are highly conventionalized units, likely prefabs; and (3) the schematization of multi-word expressions allows the productivity of new constructions. ASL multi-word expressions offer evidence of frequency effects for the grammaticalization of LOOK. The frequency effects substantiate chunking, entrenchment, and automatization as domain-general cognitive processing mechanisms that are not specific to spoken languages but also occur in signed languages (Lepic 2016, 2019; Wilkinson 2016; Wilkinson et al. in press).
This paper is organized as follows. Section 2 reviews the background on multi-word expressions in signed languages and the theoretical approaches for analyzing them. Section 3 discusses the data used for the present study and the corpus-like approach for analyzing the data. The same section also discusses the results, segueing to Section 4 for a qualitative analysis of the multi-word expressions and the theoretical implications of chunking in the formation of these sequences and the grammaticalization and schematization of LOOK. The paper then wraps up the discussion of multi-word expressions in signed languages.

Background on multi-word expressions in signed languages
Broadly, a multi-word expression in a signed language is defined as a sequence of identifiable signs functioning as a larger unit that may be conventionalized in meaning and form (Hou and Morford 2020; Lepic 2019; Wilkinson 2016; Wilkinson et al. in press). 6 Currently, the development of signed language corpora lags far behind that of corpora for globally spoken languages such as English and Spanish. There is no publicly available, machine-readable corpus for ASL yet (Lepic 2019; Morford and MacFarlane 2003; Occhino et al. 2021; Wilkinson 2016). 7 Even when such corpora become a reality, there is a very long way to go before one can search for the frequency of individual signs and the frequency of co-occurrence of strings of multiple signs as easily as one can for English (Börstell 2022). There are several existing signed language corpora that are publicly available online, such as those for Australian Sign Language (Auslan), Sign Language of the Netherlands (NGT), and Swedish Sign Language (STS) (Börstell 2022). Although these corpora are at a stage where one could start searching for n-grams, or sequential strings of words of any length, the small size of these corpora would not yield highly generalizable results beyond the dataset. While ASL has a few documented idioms, including the classic TRAIN GO SORRY (one English equivalent is 'missing the boat'), the use and frequency of such idioms are unknown (Wilkinson et al. in press). Apart from the lack of large-scale corpus data for signed languages, there is also the methodological question of identifying multi-word expressions. Signed language corpora may differ in how they treat the more entrenched sequences, especially if they exhibit phonetic reduction and/or fusion (Börstell et al. 2016). In their study of two-sign compounds in the STS corpus, Börstell et al. (2016) investigated the distribution and duration of compounds that were tagged as either reduced or non-reduced.
They found that the reduced compounds exhibited significantly shorter duration than the non-reduced ones. Moreover, the reduced compounds had more recurring sign types, whereas the non-reduced compounds had more sign types overall but fewer tokens per type and more hapaxes.
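To make the notion of an n-gram search concrete, here is a minimal sketch of how recurring sign sequences could be counted over glossed annotations, assuming utterances are available as lists of gloss strings. The glosses and utterances below are invented for illustration and are not drawn from any actual corpus:

```python
from collections import Counter

def extract_ngrams(utterance, n):
    """Return all contiguous n-sign sequences from one glossed utterance."""
    return [tuple(utterance[i:i + n]) for i in range(len(utterance) - n + 1)]

# Hypothetical glossed utterances (invented for illustration).
utterances = [
    ["PRO.2", "LOOK", "V-I-D-E-O"],
    ["PRO.1", "LOOK", "V-I-D-E-O", "FINISH"],
    ["LOOK", "MIND.PUZZLED"],
]

bigram_counts = Counter()
for utt in utterances:
    bigram_counts.update(extract_ngrams(utt, 2))

print(bigram_counts[("LOOK", "V-I-D-E-O")])  # 2
```

In a machine-readable corpus, the same counting would run over every annotated utterance; the small corpus sizes noted above mean such counts would remain tied to the particular dataset.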
Finally, structuralist and generative-formal approaches have prevailed in the scholarship of sign language linguistics since the advent of the discipline, whereas usage-based approaches have been applied in earnest only for the past decade (Janzen 2018; Lepic 2019; Lepic and Occhino 2018; Wilcox 2014; Wilcox and Occhino 2016; Wilkinson et al. in press). Most scholars have focused on describing how individual signs are made of discrete building blocks, positing derivational rules for forming grammatical sentences out of these blocks. The underlying implication is that users access individual signs and parse them according to the rules. Scholars have also focused on the 'simultaneity' of linguistic structures, such as the use of space for reference tracking through agreeing/indicating verbs (Lillo-Martin and Meier 2011; Schembri et al. 2018) or the simultaneous use of the hands and parts of the body for conveying different kinds of information at the same time (Napoli and Sutton-Spence 2010; Vermeerbergen et al. 2007). Yet scholars have not paid the same amount of attention to the sequential process by which signs chunk together and form larger conventionalized units akin to word chunks in spoken languages.

6 [...] conventionalized labels with signed languages and may not be familiar with more specialized labels such as 'multi-sign expressions' or 'multi-sign sequences.'
7 There is the ongoing creation of an ASL corpus dedicated to child ASL data, the Sign Language Acquisition, Annotation, Archiving and Sharing (SLAASh) project: https://slla.lab.uconn.edu/slaaash/. There is also the ASL Signbank, an electronic resource that functions as a lexical database (and often as a dictionary for many people who use it to look up signs): https://aslsignbank.haskins.yale.edu/.
There is some compelling evidence that chunking is not limited to auditory processing. Wilkinson (2016) conducted a comprehensive study of the collocational frequency of NOT constructions in ASL. She identified three high-frequency collocations: NOT HAVE.TO, WHY NOT, and NOT UNDERSTAND. Figures 3 and 4 exhibit non-reduced and reduced collocations of NOT HAVE.TO, respectively. Figure 3 shows how NOT HAVE.TO is easily analyzed as a sequence of two distinct signs. Figure 4 shows how chunking leads to the fusion of the signs as a unit, evidenced by the co-occurrence of the reduced path movement and the extension of the thumb and the bent index finger. The two figures differ in the analyzability of the internal structure, which also shapes meaning. Wilkinson proposes that the non-reduced form of NOT HAVE.TO gives a literal reading of obligatoriness, whereas the reduced form denotes a more bleached meaning of obligatoriness.
Wilkinson proposed that the frequency effects in WHY NOT and NOT UNDERSTAND can be observed in the semantic bleaching of the meaning, the pragmatic strengthening of subjectivity, and the speaker's involvement in the discourse. The non-reduced form of WHY NOT gives the literal reading of cognitive reasoning of 'why', whereas the reduced form conveys the meaning of suggestion. For the third collocation, the non-reduced form of NOT UNDERSTAND gives the literal meaning of the cognitive inability to process information. The reduced form marks indifference to a given topic in the discourse, foregrounding the user's subjectivity that extends beyond the cognitive inability to understand. Wilkinson argued that the analysis of NOT collocations shows that ASL is sensitive to the frequency effects of chunking, just as spoken languages are, exhibiting loss of analyzability of internal structure and semantic bleaching, and attesting to the automatization and fluidity of processing that come from repetition.
ASL has other multi-word expressions that vary along the continuum of analyzability of internal structure. Lepic (2019) identified two ASL verb-argument constructions, INTERPRETER BRING.IN and TAKE.TO HOSPITAL. These constructions are not entirely fixed, in the sense that the ordering of the signs can be rearranged without altering the basic meaning, although one order is more common than the other. Lepic suggested that they constitute conventionalized multi-word expressions that exhibit more clearly analyzable structure than the NOT collocations (Wilkinson et al. in press). The preliminary observations of the verb-argument constructions and the NOT collocations suggest that multi-word expressions emerge from chunking and that higher-frequency ones can lead to changes in internal structure and even semantic-pragmatic shift, forming what Wilkinson calls "schematic, fused constituent structures." My observations of the ASL data, such as Figure 2, led me to hypothesize that LOOK is an excellent candidate for investigating multi-word expressions. The first argument is that LOOK has been identified as a high-frequency sign in ASL, reported to occur 6.3 times per 1,000 signs in a small-scale lexical frequency study of 4,111 signs (Morford and MacFarlane 2003). The high lexical frequency of this sign does not appear to be specific to ASL. In British Sign Language (BSL), one sign glossed as LOOK ranked 15th and another sign glossed as LOOK2 ranked 56th among the top 100 most frequent signs in conversational data (Fenlon et al. 2014). These same two signs were determined to be the second and third most frequent verbs in a dataset of 1,612 verbs. In Auslan, which is related to BSL, one sign glossed as LOOK was ranked as the fifth most frequent type in a lexical frequency study of 63,436 tokens (Johnston 2012). In Swedish Sign Language (STS), one sign glossed as LOOK.AT ranked 39th out of 300 sign types (Börstell et al.
2016). These lexical frequency studies did not investigate the frequency effects of LOOK in the context of n-grams. However, Wilkinson's (2016) study of NOT collocations in ASL and Börstell et al.'s (2016) study of the frequency and duration of signs in STS suggested that multi-word expressions in signed languages are sensitive to frequency effects, just as multi-word expressions in spoken languages are. A high-frequency sign like LOOK could exhibit frequency effects that may include loss of internal structure and semantic bleaching with a shift to subjectivity. Moreover, in the absence of an ASL corpus, it is somewhat easier to investigate the frequency effects of a high-frequency sign like LOOK than those of a low-frequency sign, although the sampling can limit the potential for statistical testing.
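A note on the reported rates: lexical frequency in these studies is normalized per 1,000 signs so that counts from samples of different sizes can be compared. A small sketch of that normalization follows; the token count is back-calculated from the reported rate for illustration, not a figure taken from the cited study:

```python
def rate_per_thousand(tokens, corpus_size):
    """Normalized lexical frequency: occurrences per 1,000 signs."""
    return tokens / corpus_size * 1000

# Morford and MacFarlane (2003) report LOOK at 6.3 per 1,000 signs in a
# 4,111-sign sample; that rate implies roughly 26 tokens in the sample.
estimated_tokens = round(6.3 / 1000 * 4111)
print(estimated_tokens)                        # 26
print(round(rate_per_thousand(26, 4111), 1))   # 6.3
```

The same normalization lets a 4,111-sign ASL sample be set beside, say, a 63,436-token Auslan count on a common scale.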
The second argument is that sensory perception is a rich domain of inquiry for language change. Cross-linguistic studies have shown how the physical domain of sensory perception verbs extends to the more metaphorical and abstract domains of experience such as visual and auditory cognition and communication (Evans and Wilkins 2000; Majid et al. 2018; San Roque et al. 2018; Sweetser 1990; Traugott and Dasher 2005; Viberg 1983). These studies demonstrated how the meaning of verbs shifts from activity to experience; very likely, this change occurred through repetition of use in particular syntactic environments. Moreover, the extension of these verbs led to the development of pragmatic discourse markers in spontaneous conversation, such as the English look forms (Brinton 2001; Romaine and Lange 1991), and of evidential markers such as see (Kendrick 2019) and their Romance language equivalents (Fagard 2010; Waltereit 2006). Third, vision dominates among the sensory domains as a source of metaphorical extensions cross-linguistically (Sweetser 1990; Winter et al. 2018). Signed languages are no exception to this tendency. Many sighted deaf signers rely on visual information in the world. They talk about themselves in relation to the world through what they can access the most. Their lived experiences tend to be grounded in visual orientation, naturally shaping their language structure and use. Thus, it is not implausible that verbs of visual perception in ASL, including LOOK, could be used for stance-marking to convey one's experiences and understanding of the world.

Background on LOOK-AT
The term 'American Sign Language' refers to a constellation of language varieties used by deaf and hard-of-hearing people in the United States and anglophone Canada, as well as other parts of the world, and, in the contemporary period, generally refers to the standard variety in use at Gallaudet University (Hill 2015). 8,9 The language has the basic word order of SVO in transitive clauses, as the example in Figure 2 illustrates, and SV in intransitive clauses (Fischer 1975; Liddell 1980). Pronominal and nominal arguments can be omitted and understood implicitly once they have been established in discourse (Wulf et al. 2002). In some instances the word order is more flexible, as with topic-comment structure (Janzen 1999) or agreeing verbs, which have been observed in many different signed languages. Some verbs mark their core arguments through spatial modification of the verb forms (Hou and Meier 2018; Mathur and Rathmann 2012; Meir 1998; Padden 1988). Such verbs have been traditionally analyzed as directional verbs, agreeing verbs, or indicating verbs; the terminological choice depends on the researcher's theoretical position. These verbs, including LOOK, generally denote transfer with two animate arguments. But LOOK can be an exception to this generalization, since the visual stimulus of the verb does not have to be animate, as evidenced by Figure 2.
The grammatical category of LOOK is generally defined as an agreeing/indicating verb. This leads to the grouping of different morphophonological variants into one "lexeme" for analysis, as demonstrated in Figures 5 and 6 (Fenlon et al. 2015). These variants differ in the direction of the path movement, that is, where the verb points, or in the number of hands involved, but they do not fundamentally change in meaning, since they all pertain to the general activity of looking at a visual stimulus. Figure 5 shows one variant of LOOK that points at the signer as the referent, giving the interpretation of 'look at me', whereas Figure 6 shows another variant of LOOK that means two referents are looking at each other. A similar approach has been taken for BSL. There is one BSL sign glossed as LOOK2 that bears a strong resemblance to the ASL LOOK; it is listed as a one-handed sign in the BSL SignBank (Fenlon et al. 2015). 10 Signers can produce this form as a two-handed sign in a way that conveys the meaning of either two people looking at something or two people looking at each other. Fenlon et al. (2015) do not treat such variants as separate lexemes but rather as different variants of one lexeme.
For the other ASL signs presented in Figures 7-10, researchers may treat them as separate lexemes for lexicographic purposes. These signs pertain to different types of looking activities as well as metaphorical and subjective dimensions of vision. The change in meaning corresponds to change in form. The formational changes appear to involve a combination of changes in the direction, manner, and orientation of path movement, the selection and representation of facial expressions, the number of hands, and the configuration of the hands (Frishberg and Gough 2000; Klima and Bellugi 1979; Naughton 2001). Figure 7 means 'to observe' or 'to examine' and is a prototypically two-handed sign, although it can be produced with one hand. Both hands are symmetrical in having the same handshape and movement; they have the same V-handshape and move alternatingly in a circular manner. Figure 8 means 'view'/'perspective' or 'to look at something.' The sign is a non-symmetrical two-handed sign, in which one hand moves and the other hand does not. The active hand has the V-handshape pointing towards the other hand, a stationary 1-handshape. Figure 9 means 'to read.' The sign is also a non-symmetrical two-handed sign. The active hand has the V-handshape and moves downward over a stationary B-handshape. These signs are a few of the many signs with other extended meanings such as 'look forward to', 'reminisce', 'admire', 'look down on someone', and 'look someone up and down' (Naughton 2001). Figure 10 shows a one-handed sign that functions as an imperative for directing one's attention to a stimulus. This sign co-occurs with a visible mouth configuration: rounded lips with a protruding tongue that exhibits trilled flapping (Liddell 2003: 131-132). The tongue movement resembles the articulation of a lateral approximant [l] with a flapping action, while the lips resemble the back rounded vowel [ʊ].
This sign points at an intended referent and may exhibit reduced path movement, co-occurring with heightened affective facial expressions. Signers can 'hold' this sign with the mouthing for as long as necessary for dramatic effect, and can use the mouthing covertly, without the accompanying manual sign, to direct one's attention to the stimulus.
All the above signs differ more in meaning and form compared to the cluster of different variants of the lexeme LOOK. What they share is the form of the V-handshape and the meaning of concrete or abstract visual perception. They form a network of associations on the basis of morphological and semantic relations. In this network, LOOK can be conceptualized as a prototypical and central member that extends its meaning to other signs (Naughton 2001). There are other ASL signs relating to visual perception. Some are more distinct in form compared to the family of 'look' signs and thus may be considered separate lexemes. One sign, conventionally glossed as SEE in Figure 11, bears a similar meaning to LOOK. Both signs share the V-handshape but differ in palm orientation, the facing of the fingertips, and location. They also differ in the meaning of visual perception with respect to agency. Naughton (2001) states that the difference lies in the event type representing sensory perception: an activity constitutes a process controlled by a human agent, whereas an experience is a process that happens to an agent who cannot control it (cf. Viberg 1983). LOOK is an activity verb. SEE is an experience verb. Naughton also states that both verbs can extend to different subjective meanings. SEE has epistemic functions of anticipation, possibility, and doubt, and can function as an evidential imperative. A few signs from the family of 'look' signs mark the signer's evaluation of a visual stimulus; among the signs cited as examples are LOOK.UP.AND.DOWN, LOOK.DOWN.ON, and VIEW, but not LOOK. 11

Subjective uses of LOOK
A few other scholars have made preliminary observations about the subjective meaning of LOOK in addition to referent marking and the metaphorical extensions of other signs from the family of 'look' signs. In an elicited study of psych verb constructions in ASL, Winston (2013) suggested that LOOK (glossed here as LOOK-AT in the example below) is a potential "light verb" that is followed by a psych verb functioning as the main verb. The outcome is the production of a caused psych verb event. In (1), the signer says that when the children looked at the clown, they laughed loudly in amusement, rendering LOOK as more of an experience than an activity. LOOK is directed towards the object, as indicated by the subscripts, and co-occurs with affective non-manual markers, which spread to the adjacent main verb BELLY.LAUGH.AT.

11 The glosses are from Naughton (2001). She did not supplement them with figures, only providing textual descriptions.
(1) CLOWNb CHILDRENa aLOOK-ATb 'BELLY-LAUGH-AT++'
"As for the clown, the children looked at him and enjoyed." (Winston 2013: 34)

Winston proposes a template for LOOK-AT when it functions as a potential light verb. In the template, the object is fronted, the subscripts index verb agreement, and the brackets represent the scope of the affective non-manual markers as well as a user's expressive language, thoughts, and/or actions. In the case of (1), the signer enacts the action of the children laughing. This enactment has been referred to as role shift, depiction, or constructed action (the most common term), though not quite interchangeably, for different signed languages, and is marked by a perceptible shift in the signer's body through a change of non-manual markers and sometimes an explicit pronoun or noun for introducing the character and enacting them (Hodge and Cormier 2019; Lillo-Martin 2012; Quer 2016). This enactment also encompasses a variety of quotative and non-quotative constructions with the common denominator of representing the character. While there is no unique marker or unique group of markers that signals constructed action, LOOK seems to be a common trigger for signaling it in ASL and even other signed languages (Engberg-Pedersen 1993; Healy 2015; Liddell 2003; Meier 1990; Naughton 2001). 12,13 In a study of affective constructions in ASL, Healy (2015) found that LOOK is a common occurrence in these types of constructions and analyzed it as a discourse marker that anticipates an experiencer's reaction to a visual stimulus. The stimulus can be either a concrete or non-concrete entity. Signers may point either to a meaningful spatial location associated with the entity or to an arbitrary spatial location that is not associated with any entity, as exemplified by the second construction of LOOK in Figure 2. In some instances, the signer can experience the stimulus through other senses, such as touch, by extension of LOOK (glossed as LOOK-AT in the example below), as in (2):

(2) LAST-NIGHT PRO1 WORK TYPING-ON-COMPUTER FEEL SUDDEN-VIBRATION LOOK-AT WHAT'S-UP
"Last night I was working at my desk and felt a sudden vibration. I wondered what caused it." (Healy 2015: 150)

Healy also observed that the verb does not exhibit path movement and, moreover, that the verb and the signer's eye gaze are not always aligned with one another. The verb may be pointing in one direction while the eye gaze is directed in another. The reduced path movement and the misalignment of the manual and non-manual properties underscore the experiencer's cognitive attending to the stimulus in question while looking at it.

12 A reviewer asked whether LOOK might have a logophoric function. Their question also opened up the possibility of logophoric pronouns in signed languages (Lillo-Martin 1995; Lillo-Martin and Klima 1990; Nilsson 2004). This is an empirical question that merits further investigation with extensive analysis of the pronominal coreference and clause structure of the data used here.
13 The data for the present study was not annotated and analyzed for the co-occurrence of constructed action. There may be a strong correlation for constructed action to co-occur with LOOK/'reaction', but it also occurs with LOOK/'vision'. The relationship between the functions of LOOK and constructed action merits further investigation.
The observations of Winston (2013) and Healy (2015) generally corroborate my observations of LOOK as a stance verb. Yet there are a few fundamental differences that must be highlighted. First, I do not readily accept the analysis of LOOK as a light verb, because the data presented in this paper shows that LOOK does not always have an adjacent main verb with which it forms a complex predicate. 14 Rather I analyze it as a stance verb that is in the process of grammaticalizing from a verb of visual perception. Second, my analysis builds on more naturalistic data sources from the internet, whereas the earlier analyses are largely based on elicited data. This has potential implications for analysis from a usage-based perspective, as more usage data may yield a wider range of functions of the constructions in which LOOK occurs.

Data sources
For the present study, the data consists of 65 videos and vlogs of ASL by 38 distinct deaf signers. 15 The data totaled 8 h and 21 min and consisted of three major genres: news, monologue, and conversation. The appendix lists the major details of the data and the video sources. Since there is no publicly accessible, machine-readable ASL corpus yet, the Internet is an opportunistic corpus in which researchers can forage for data in ASL and other signed languages instead (Hou et al. 2022). The drawback is that identifying, collecting, and transcribing data is an immensely labor-intensive and time-consuming process. There are no standardized notation systems equivalent to the International Phonetic Alphabet for representing signed languages, and thus one cannot search for data directly. One must manually search for signed language videos on the Internet and annotate them with English glosses, which are generally arbitrary, as researchers have different research goals and approaches to interpreting signs. There is likewise no standard for representing morphosyntactic analysis in signed languages like the Leipzig Glossing Rules for spoken languages.

14 It is beyond the scope of this paper to present an argument for or against the light verb analysis, since previous research has not fully evaluated LOOK as a light verb in consideration of cross-linguistic criteria among spoken languages. These criteria posit that light verbs are part of complex predicates that map onto mono-clausal syntactic structures (Butt 2010). The current research on complex predicates in signed languages is not as well developed as that on many spoken languages.
15 The demographic backgrounds of the signers are not fully known; making speculations based on one's physical appearance alone is problematic (Hou et al. 2022), but some signers have explicitly identified themselves as coming from a deaf or hearing family.
Yet this drawback is offset by the availability of multiple videos and vlogs on the Internet. These materials have been voluntarily created and produced by deaf signers on video platforms such as YouTube and social media such as Facebook. Such data can be more representative of different types of usage of ASL from a larger and more diverse pool of deaf signers, and, relevant for the present study, these data are likely to include usage that is rich with signer subjectivity. This offers the opportunity for researchers to add internet data to their existing corpus data and/or their collections of elicited data from idealized deaf native signers.

Methodology
The methodology of the study is modeled after Wilkinson's (2016) study of NOT collocations in ASL, with some modifications. As previously mentioned, LOOK has two broad functions with some potentially distinct formational properties. The 'vision' function seems to be associated with prototypically one- and two-handed forms that exhibit clear path movement and less affective facial expressions and can co-occur with visible mouthing of English words. The 'reaction' function seems to be associated with prototypically one-handed forms that may exhibit reduced path movement and more affective facial expressions, and with constructions that convey the signer's sayings, thoughts, and/or feelings. The 'vision' and 'reaction' functions are not distinguished exclusively by these formational properties, which are not analyzed here, but also by the analysis of the phrasal context in which LOOK occurs. Some tokens are ambiguous in the sense that the function of a LOOK token simultaneously exhibits vision and reaction or overlaps with both. In some instances, the function is unclear.
The first step was to identify all forms belonging to the family of 'look' signs. Many forms are represented by other English glosses for approximate meaning. There are two rationales for considering the family of 'look' signs instead of just the LOOK sign like the one in Figure 1. One, preliminary observations indicated that the reaction function did not always occur with variants of the LOOK lexeme but potentially with a few other signs in the family. Two, looking at the whole family can better capture emergent patterns of multi-word expressions across various 'look' signs, regardless of how lexemes may be categorized. Some forms of the LOOK lexeme share the core meaning of looking at a stimulus but differ in the direction of path movement, e.g., one form means 'to look to the right' and another 'to look at me'. Such forms would generally be lumped together, since they are considered variants of the same lexeme due to their semantic association. However, I treated a form as distinct when I observed recurring patterns, such as the four tokens of the sequence LOOK.AT.ME PRO.1 'to look at me' and two tokens of HAPPEN LOOK.AT.ME PRO.1 'happen to look at me'. These patterns justified splitting LOOK.AT.ME from most LOOK forms and considering that the associations of related forms of a lexeme "are gradient and depend upon the degree of semantic and phonological similarity and the token frequency of the specific items" (Bybee and Torres Cacoullos 2009: 188). Other forms such as 'to read' or 'to observe' are separated from LOOK on the basis of the combination of meaning and formational properties.
Next, the signs preceding and following the LOOK forms were coded. The scope of the string of signs for identifying sequences was not limited to bigrams, i.e., the one sign that immediately preceded or followed the target sign, but included trigrams and quadgrams. This leeway allowed for identifying and grouping sequences and examining the types of syntactic environments in which the sequences appeared. Five preceding signs and five following signs of the target sign were coded, unless the boundary of the utterance was marked by the signer clasping their hands or putting their hands down. This scope allowed for analyzing the function of LOOK beyond its formational properties and the immediate adjacency of signs, yielding a more in-depth understanding of the functions of LOOK at the utterance and clausal levels and of the different types of syntactic environments in which higher-frequency sequences emerged. Next, the recurring sequences were identified. A sequence was considered recurring if it met the frequency threshold of two occurrences, i.e., it occurred at least twice in the dataset. Once all the sequences were identified for each LOOK token, they were scanned and flagged for all recurring sequences.
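The coding procedure described above — collecting up to five signs on either side of each target token, grouping bigrams through quadgrams, and flagging sequences that meet the two-occurrence threshold — can be sketched in code. This is an illustrative sketch only, not the tool used in the study; the glossed utterances below are invented examples, not items from the dataset.

```python
from collections import Counter

def extract_ngrams(utterance, target, max_n=4, window=5):
    """Collect n-grams (bigrams up to quadgrams) containing the target
    sign, restricted to a window of signs around each target token."""
    ngrams = []
    for i, sign in enumerate(utterance):
        if sign != target:
            continue
        # Window of up to `window` signs before and after the target.
        lo, hi = max(0, i - window), min(len(utterance), i + window + 1)
        span = utterance[lo:hi]
        t = i - lo  # position of the target inside the span
        for n in range(2, max_n + 1):
            # Every n-gram of length n that includes the target position.
            for start in range(max(0, t - n + 1), min(t, len(span) - n) + 1):
                ngrams.append(tuple(span[start:start + n]))
    return ngrams

# Hypothetical glossed utterances (for illustration only).
utterances = [
    ["PRO.1", "LOOK", "OIC"],
    ["YESTERDAY", "PRO.1", "LOOK", "OIC"],
    ["PRO.2", "LOOK", "V-I-D-E-O"],
]

counts = Counter(g for u in utterances for g in extract_ngrams(u, "LOOK"))
# A sequence is "recurring" if it meets the frequency threshold of two.
recurring = {g: c for g, c in counts.items() if c >= 2}
```

On this toy input, PRO.1 LOOK and LOOK OIC surface as recurring bigrams and PRO.1 LOOK OIC as a recurring trigram, while PRO.2 LOOK falls below the threshold.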
The SEE forms and the strings of adjacent signs were then coded. Some words occur more frequently than others, and there are differing views on whether recurring words co-occur more often than by chance (Gries 2012) or co-occur because users select them, perceiving them to have a sequential relation on the basis of meaning (Bybee 2010). The entire dataset used in this study has not been annotated, which limits the statistical testing options, such as comparing the frequency of individual signs with the frequency of co-occurrence of sequences of multiple signs, or measuring the strength of association between signs and constructions for a collostructional analysis (Stefanowitsch and Gries 2003). Coding SEE and its strings of signs allowed me to determine whether there were sufficient data to compare shared signs among potentially recurring, overlapping sequences of LOOK/'reaction', LOOK/'vision', and SEE by means of chi-square tests.

Results
The data yielded a total of 706 tokens and 36 types from the family of 'look' signs (the 'see' signs are discussed in §3.3). The functions of LOOK are distinguished by the labels, LOOK/'reaction' and LOOK/'vision'. Table 1 summarizes the tokens and types for the functions, and the tokens categorized as ambiguous. Table 2 summarizes the recurring n-grams (n ≥ 2). The results of the quadgrams are not reported here, as there were no more than two or three tokens for each frequent quadgram.

Reaction
There are multiple recurring sequences observed in the 174 tokens of LOOK/'reaction'. Table 3 presents 24 recurring trigrams, and Table 4 presents 38 recurring bigrams. The "s" is short for sign, representing LOOK; "s−1" represents the sign preceding LOOK, while "s+1" and "s+2" represent the first and second signs following LOOK, respectively. These tables show that PRO.1 recurs in the majority of the quadgrams, trigrams, and bigrams. Table 4 shows that 55% of the bigrams have PRO.1 in the s−1 slot and … Padden (1986) and Lillo-Martin (1995) used for translating quotative and non-quotative constructions in ASL to English. This interpretation also echoes the grammaticalized English 'like' used to introduce reported speech and thought (Romaine and Lange 1991). In these examples, PRO.1 represents the first-person pronoun, which does not vary in form for case; PRO.3 represents a third-person pronoun; and OIC is an interactive sign commonly used for two purposes: backchanneling in conversations or signaling realization. PALM.UP is a polysemous discourse marker in both signed languages and co-speech gestures. As indicated by the gloss, the form is the rotation of one or two open hands with an … Figure 12, in which the signer immediately signals an intimate connection to what another signer was saying. Although PRO.1 precedes LOOK in 55% of reactions, non-first-person referents also occupy the experiencer role, such as the third-person pronoun PRO.3 and PEOPLE, which together account for 11% of the reactions. Such referents occur both in more conventionalized sequences like LOOK OIC and in more schematic sequences such as LOOK CONCERNED. Figure 13 shows PRO.3 LOOK OIC; the signer was recalling her first meeting with the former U.S. president Barack Obama and conveying his realization that the signer was deaf. Figure 14 shows PEOPLE LOOK CONCERNED followed by another reaction that demonstrates the fatigue of being concerned.
The particular sequences PRO.1 LOOK PRO.1 and LOOK PRO.1 warrant further explanation, as without context, these sequences can give the wrong reading of "I look at myself" and "X looks at me." Rather, the repetition of the first-person pronoun signals a pivot to the signer's reaction, highlighting stance-taking from a first-person perspective.
In other bigrams, beyond the frequent OIC, there are recurring specific signs such as YES, WOW, GET.INSPIRED, GUT.INSTINCT, HOLD.ON, and MIND.PUZZLED. These recurrences suggest that LOOK/'reaction' co-occurs with a variety of signs, some cognitive in nature and others exclamatory, that are commonly used to …

Vision
There are 369 tokens with the vision function from the family of 'look' signs, which includes OBSERVE and READ. The gloss LOOK/'vision' refers to the LOOK form specifically. Table 5 lists all 18 'look' types that occurred in the dataset. For the sake of space, the tables of recurring n-grams are limited to LOOK. Table 6 and Table 7 list recurring trigrams (n = 6) and bigrams (n = 40), respectively. These tables show a lower frequency of recurring sequences compared to those reported for the n-grams of LOOK/'reaction'. First, the most frequent sequence, PRO.1 LOOK, accounts for only 15% of the sample. This sequence is used to mark visual perception of an object, as exemplified in Figure 15. Second, the trigrams and bigrams show a lower distribution of the first-person pronoun. Finally, the bigrams show a wider variety of modals, negators, possessives, and referents co-occurring with LOOK. None of these signs group together as a category that would distinctly signal the signer's reaction to a visual stimulus, even when considered in the larger context of discourse. The bigrams also have more hapaxes; there are 61 hapaxes preceding LOOK (41%) and 96 hapaxes following LOOK (64%).
The patterning of the n-grams for LOOK/'vision' differs from the patterning of n-grams for the LOOK/'reaction' with respect to the distribution of different signs that follow LOOK. The link between these two sets of patterns can be observed in the ambiguous LOOK tokens.

Ambiguous tokens
There are 163 tokens from the family of 'look' signs that have been categorized as ambiguous. Table 8 lists all the 'look' types that occurred in the dataset; there is a clear overlap between this table and Table 5. The tables of recurring n-grams are limited to LOOK. Tables 9 and 10 list recurring trigrams (n = 2) and bigrams (n = 10), respectively. The sequence PRO.1 LOOK is one of the most frequent recurring sequences in the bigrams and trigrams, similar to what has been observed for LOOK/'reaction' and LOOK/'vision'. What makes a sequence ambiguous is the expression of stance. One example is the exclusive use of facial expressions to show one's reaction to a visual stimulus. In Figure 16, the signer produces a facial expression following PRO.1 LOOK which can be interpreted as a negative emotional stance from witnessing a scene of a person whispering to another person and then walking away; however, the signer does not make an explicit comment about their stance nor elaborate on it, and instead continues narrating the events following the scene. Another example of ambiguity is the simultaneous expression of vision and reaction, as demonstrated by the type of token and by the subsequent signs. Consider Figure 17 for its demonstration of two instances of PRO.1 READ. The first instance encodes vision, with Edgar Allan Poe as the object of reading; the second instance is ambiguous because the sign following PRO.1 READ encodes a reaction that shows the signer finding Poe too difficult to understand. The combination and interaction of the two functions in the phrasal context renders the second instance of PRO.1 READ ambiguous.

Frequency distribution of LOOK and SEE
There are 210 tokens of SEE from the family of 'see' signs. The gloss SEE refers to the form in Figure 11. This sign can also be a symmetrical two-handed form. The two-handed forms (n = 16) were excluded from the study for the time being, since their functions warrant additional investigation. One form, SEE-SEE, is a distinctly reduced form of SEE that only indicates one's anticipation about the outcome of a situation, i.e., 'let's see' (Naughton 2001). Whereas SEE is used to refer to the perception of a visual stimulus, SEE-SEE cannot be used likewise. Table 11 presents the different types and tokens of the SEE signs. Tables 13 and 14 list trigrams and bigrams, respectively, for SEE. The quadgrams, which overlap with some of the trigrams, are not listed here, since they are largely part of scripted lines from the ASL news show, The Daily Moth. The bigrams and trigrams for SEE-SEE are not listed here due to the low token count, which makes it difficult to draw generalizations about the co-occurrence of signs with that particular sign. According to these tables, the most frequent trigram is SEE WHAT.DO HAPPEN 'see what will happen', which signals anticipation of an outcome of a situation. The most frequent bigram is PRO.1 SEE, followed closely by CAN SEE. Other frequent bigrams show a distribution of various modals, possessives, and referents, and have few signs that are cognitive in meaning and would be associated with attitudinal stance. The clustering of signs co-occurring with SEE is similar to what is observed for LOOK/'vision'.

Statistical analysis
The LOOK/'reaction', LOOK/'vision', SEE, and SEE-SEE sequences share certain recurring signs in their co-occurrence patterns. Are the observed frequencies of co-occurrence of certain signs in the overlapping sequences significantly more likely than their alternatives (e.g., X vs. Y)? Twelve chi-square tests were conducted for pairs of relevant alternatives. Given the number of comparisons, the critical p-value of 0.01 is corrected to p < 0.0008. Table 15 summarizes the results of the chi-square tests that showed a statistically significant difference between pairs of overlapping sequences that co-occurred with PRO.1, the most frequent sign co-occurring with all the targeted signs. The results show that for PRO.1 in the s−1 slot, there is a preference for PRO.1 to collocate with LOOK/'reaction' over LOOK/'vision', SEE, and SEE-SEE. Other pairs of overlapping sequences had lower frequencies of co-occurrence with other signs and, crucially, did not show any statistically significant differences. There was no difference, for example, between LOOK/'reaction' and LOOK/'vision' for the co-occurrence of PRO.1 in the s+1 slot.
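The comparisons described above can be sketched as a 2×2 chi-square test on co-occurrence counts. This is a minimal illustration, not the study's actual analysis: the counts below are invented for the sketch, and the corrected threshold mirrors the paper's adjustment of 0.01 across twelve tests (0.01/12 ≈ 0.0008).

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table
    [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

# Invented counts (illustration only): how often PRO.1 occupies the
# s-1 slot before LOOK/'reaction' versus LOOK/'vision'.
#               PRO.1 in s-1   other sign in s-1
# reaction           96                 78
# vision             22                128
stat = chi_square_2x2(96, 78, 22, 128)

# For df = 1, the chi-square critical value at p = 0.001 is 10.828;
# the paper's corrected alpha of 0.0008 is marginally stricter, but a
# statistic this far above the cutoff is clearly significant.
significant = stat > 10.828
```

With these invented counts the statistic comes out around 57, far above the critical value, which is the kind of result that would license the claim that PRO.1 prefers the s−1 slot of LOOK/'reaction'.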

Analysis and discussion
The study revealed that there are recurring sequences involving LOOK and its family of morphologically related signs, including at least one likely conventionalized multi-word expression, PRO.1 LOOK/'reaction'. The number of signs in these sequences is consistent with the literature on multi-word expressions in different spoken languages, which typically fall in the range of two to four words (Green 2017; Pothos and Juola 2001). The study also showed that there are emergent patterns of recurring sequences associated with the different functions of LOOK and with the particular varieties of syntactic environments in which they occur.

Syntactic environments of different LOOK types
The high-frequency sequence PRO.1 LOOK is the only sequence that recurs across the three categories of vision, reaction, and ambiguous. It is most robust for LOOK/'reaction', accounting for 55% of the 174 tokens, and least robust for LOOK/'vision', accounting for only 15% of the 150 tokens. It is relatively frequent in the ambiguous category, accounting for 34% of the 58 tokens. The frequency with which the sequence PRO.1 LOOK occurs across these functions appears to correlate with the type of LOOK, the co-occurrence of a cluster of signs with this sequence, and the syntactic environment. First, LOOK/'reaction' exhibits a highly specialized meaning which presents the signer's attitudinal stance towards a visual stimulus. This type may be accompanied by phonetic reduction, manifested by reduced path movement in the LOOK sign, and heightened affective facial expressions, but this warrants further investigation. Second, the stance is strengthened by the frequent co-occurrence of a first-person singular pronoun. Third, LOOK/'reaction' occurs in a more restricted syntactic environment, whereas LOOK/'vision' and the ambiguous LOOK constructions occur in the following environments: (a) the presence of an explicit object in a post-verbal position, e.g., 'comments' (Figure 15); (b) the co-occurrence with modals in a pre-verbal position, e.g., CAN LOOK (Figure 18); (c) the co-occurrence with negators in a pre-verbal position, e.g., NOT LOOK (Figure 19); (d) the formation of a complex predicate by the co-occurrence of LOOK with another verb, e.g., LOOK SEE (Figure 20); and (e) the nominalization of LOOK, e.g., 'look back' or 'reminiscence' as a noun (Figure 21). These syntactic environments are not mutually exclusive of one another. Theoretically, LOOK/'vision' can exhibit a combination of these properties, such as the co-occurrence of LOOK with both a modal and an explicit object (see Figure 18).
Second, a more in-depth investigation of the internal structure of the constructions could yield a more fine-grained analysis, but this is hindered by the difficulty of identifying clausal boundaries in ASL. There is an ongoing discussion about identifying clausal and sentential boundaries in signed languages (Hodge 2014; Jantunen 2017; Johnston 2019; Ormel and Crasborn 2012).
There has yet to be a systematic investigation of the body of syntactic and prosodic cues for identifying the boundaries of basic and complex utterances in spontaneous ASL discourse specifically, though there is some research based on contextually isolated, elicited data. So, for the time being, I do not make any specific claims about the clausal boundaries of the LOOK constructions, particularly the LOOK/'reaction' ones; what is instantiated in the 'reaction' slot may be in the same clause or constitute another clause. Apart from the clause issue, the present data show that there are observable differences in the syntactic environments of LOOK/'vision', LOOK/'reaction', and the cases in between. This is illustrated in the three basic constructional schemas in Figure 22. The reaction construction, however, has a more restricted syntactic environment, suggesting a loss of broad syntactic usage of the sign. This loss paves the way for the emergence of a new construction from the broader constructions, an indicator of grammaticalization.

Grammaticalization
Grammaticalization research on signed languages has focused on the incorporation of manual gestures and facial expressions as grammatical and lexical morphemes into signed languages, from cognitive linguistics perspectives (Janzen 2012, 2018; Janzen and Shaffer 2002; Wilcox 2004, 2007) and from formal linguistics perspectives (Meir 2003; Pfau and Steinbach 2006). These studies have demonstrated how certain elements of grammaticalization phenomena can be specific to the modality, i.e., transmission channel, of language, while other elements are not specific to modality. Of interest is the ASL case study of KNOW (Janzen 2018). As a verb, it co-occurs with subject and object pronouns or noun phrases. As a discourse marker, i.e., 'I know' or 'you know', KNOW generally does not co-occur with any nominals. As a topic marker, KNOW occurs at the beginning of a topic phrase and co-occurs with raised eyebrows and potentially a slight backward head tilt. The changes observed in the lexical and grammatical uses of KNOW show a transformation along the syntactic dimension: relatively free syntactic units become constrained grammatical morphemes. Usage-based theories postulate that usage drives the grammaticalization of lexical items along the phonetic, semantic, and syntactic dimensions (Bybee 2003, 2010; Traugott 2003). The lexical item not only becomes a grammatical morpheme, as indicated by phonetic and semantic changes, but a new construction also emerges from an old construction: both form and meaning change in the emergence of new structures. The English construction going to/gonna is a well-documented example. The grammaticalization of gonna from going to did not happen through mere repetition of the item itself, but rather through repetition of the instantiation of the item in the purpose construction [movement verb + Progressive + purpose clause].
This step produced a new construction [be going to VERB] that conveys the intention reading (Bybee 2003). Other movement verbs, such as traveling, riding, or journeying, can be applied in the purpose construction. However, these verbs cannot be instantiated in the verb slot of the intention construction because they do not give the same reading as gonna does.
In the case of LOOK in ASL, the grammaticalization process is ongoing. The syntactic environment of LOOK/'vision' narrows to that of LOOK/'reaction' as the meaning becomes more specialized, with an emphasis on pragmatic strengthening. Different 'look' types can be used to preface reactions, as observed in some of the ambiguous constructions. In Figure 23, the LOOK form means 'to look up at an object', which in turn refers to an antiquated telephone in a museum display. This type is also distinct in the upward direction of its path movement, and the ambiguous function rests on the meaning of possibility conveyed by the modal CAN and on the non-subjectivity of a third-person viewpoint, combined with a hypothetical stance. Many ambiguous constructions can be viewed as intermediaries between vision and reaction, or as part of a continuum of LOOK constructions that are undergoing grammaticalization. Figure 24 provides a visual representation of the grammaticalization process for LOOK moving from vision towards reaction. The brackets represent the schema, and the parentheses represent an optional slot; the formational properties associated with the functions have yet to be fully investigated quantitatively. The schemas illustrate that the subject transitions from the agent who looks at a stimulus to the experiencer who expresses their reaction to it. The change in syntactic construction exhibits a greater degree of subjectivity. A sequence [PRO.1 LOOK/'vision' (object)] can represent some degree of subjectivity on the basis of an experience of a looking activity from a first-person viewpoint, whereas a sequence [PRO.1 LOOK/'reaction' reaction] clearly represents a stance from a similar viewpoint.

Prefabs and schematization
In the data, the most robust patterns are the LOOK/'reaction' constructions. The most frequent sequence is PRO.1 LOOK, which can be properly viewed as a prefab. Other sequences cluster around a group of recurring, cognitively oriented signs that precede and/or follow LOOK, indicating the strength of association of such signs with reaction. These sequences may constitute prefabs, stored as multiple instances of exemplar wholes rather than as individual component parts; they would be entrenched as autonomous chunks, facilitating retrieval and processing as chunks (Bybee 2010). These prefabs also allow for the instantiation of a more schematic template in which the immediate slots that precede and follow LOOK can be filled with other signs. The template, [(experiencer) LOOK/'reaction' reaction], accounts for the creation of novel sequences, as evidenced by the lower-frequency sequences in the data. Although the [experiencer] slot is commonly filled by a first-person pronoun, it is also filled by other pronominal and nominal arguments such as PRO.3 and PEOPLE, and by the occasional discourse marker PALM.UP. 16 In some instances, the experiencer is not explicitly mentioned. The [reaction] slot is filled by the interactive OIC in 21% of the bigram tokens and 10% of the trigram tokens. This slot is also filled by other lower-frequency but recurring signs such as PRO.1, PALM.UP, WOW, GET.INSPIRED, HOLD.ON, MIND.PUZZLED, and YES, as well as by hapaxes and even longer strings of signs that constitute the reaction, as seen in Figure 25. The reaction is not necessarily limited to individual signs but can be a string of signs that conveys the signer's stance.
The repeated co-occurrence of certain signs shows how they cohere together in recurring sequences and give rise to the emergence of relatively fixed, conventionalized strings of units as prefabs. They also give rise to the schematization of these units, allowing for the creation and re-use of new structures. The issue of whether users use abstraction or analogy, or both, for producing such novel structures remains an open question.

Conclusion
Sign language linguistics has come a long way since its advent in the 1960s, when ASL was first heralded as a full-fledged language with its own grammar. However, the investigation of multi-word expressions has only begun to advance to the point where researchers are moving beyond structuralist and formal-generative approaches and looking at the structure of ASL in terms of recurring chunks of structure in discourse (Lepic 2019; Wilkinson et al. in press). This paradigmatic shift provides the opportunity to ascertain whether recurring sequences in signed languages emerge from domain-general cognitive mechanisms and how these sequences contribute to linguistic structure and meaning. This opportunity is made possible by the rise and spread of internet data for ASL and of signed language corpora such as those for Auslan, BSL, STS, and many more.

16 It has not escaped my attention that PALM.UP recurs in the more frequent sequences associated with the vision, reaction, and ambiguous LOOK.AT constructions and even the SEE constructions. While I have yet to investigate the functions of the PALM.UP forms, I believe that the distribution of PALM.UP may be analogous to fillers in spoken languages.
Usage-based linguistics posits that grammar emerges from repeated use in particular discourse contexts. This use is shaped by the application of domain-general cognitive processes, including chunking, entrenchment, and automatization. Chunking leads to the formation of multi-word expressions, and higher-frequency chunks contribute to the grammaticalization of certain units. The empirical inquiry into what multi-word expressions exist and how they emerge in different signed languages has only recently begun with the rise of corpus data. The present study contributes to this inquiry with a case study of the ongoing grammaticalization of a high-frequency ASL sign, LOOK. The study offers evidence for chunking, based on the co-occurrence of the first-person pronoun with different functions of the family of 'look' signs in various syntactic environments.
First, it appears that LOOK/'vision' occurs in a wide variety of sequences in diverse syntactic environments, whereas LOOK/'reaction' occurs in a much more restricted syntactic environment and tends to co-occur with a first-person pronoun in a construction that represents the signer's attitudinal stance. Second,

Data availability statement
Where available, the links to the videos are provided in the Appendix and as part of the Figure