Transitivity on a continuum: the transitivity index as a predictor of Spanish causatives

This paper contributes to the study of transitivity as a general property of the clause. Unlike most previous work on the subject, however, transitivity in the present article is used to study a lexical alternation, namely the two causative predicates dejar ‘let’ and hacer ‘make’ in Spanish. To do this, I use the transitivity index (TI), a weighted continuous measure of transitivity based on Hopper and Thompson’s (1980, transitivity in grammar and discourse, Language 56, 251–299) transitivity parameters. The advantage of the TI is that it assigns different weights to each of the transitivity parameters and it is therefore sensitive to the particular construction it is applied to. I show that the TI can correctly predict the two Spanish causatives dejar ‘let’ and hacer ‘make’ with 80% accuracy and demonstrate that hacer is associated with higher transitivity contexts. In addition, linguistic features of the causer such as grammatical person and number are found to help distinguish between the two predicates. The finding that a lexical alternation can be reduced to a difference in transitivity raises important questions regarding the structure of the lexicon and the type of information it may contain.


Introduction
Transitivity is a pervasive, perhaps universal, phenomenon in natural language (Naess 2007), defined as the effects of an action performed by an agent on a patient (e.g., Lazard 1998; Lyons 1968; Tsunoda 1985). Under this semantic definition, features such as agency, volitionality and affectedness are important aspects distinguishing transitive from intransitive clauses. Givón (1995) claims that the prototypical transitive clause describes an event that is non-durative, bounded, non-perfect and realis. Thus, the prototype of a transitive event is fast-paced, completed, real and perceptually and cognitively salient.
From a structural perspective, a basic transitive clause must have two arguments and no distinction is made based on the relationship between the two arguments (Jacobsen 1985). The more traditional view considers transitivity a property of verbs, not clauses (Lazard 1998). Under this definition, verbs such as kick and eat (i.e., eventives) are as transitive as mean and know (i.e., states). In other definitions based on structural considerations the type of arguments is taken into account to classify a clause as transitive or intransitive: clauses with accusative patients or ergative agents are considered transitive, whereas a clause that has two arguments but no patient, for example, is not (Drossard 1991; Helbig and Buscha 1993).
These different approaches to transitivity all have in common that they conceive of transitivity as a binary phenomenon. A clause or a verb is transitive or intransitive and there is nothing in between these two opposite categories. A more general view of transitivity was proposed by Hopper and Thompson (1980), who argue that transitivity should be modelled as a scale. Based on cross-linguistic evidence, they put forth ten co-varying parameters (discussed in Section 2), all of which describe the effectiveness with which an action takes place. Working within a functional approach to language, they also claim that the features of transitivity derive from its discourse function, namely the fact that high transitivity is associated with foregrounding and low transitivity with backgrounding. The most important aspect of their approach was to view transitivity as a scale, or a continuum, as opposed to the more traditional binary view. As a result, clauses can be more or less transitive than others and no one parameter determines the transitivity status of a clause. In fact, Hopper and Thompson point out that a clause with a 1-place predicate such as leave could potentially be more transitive than a clause with a 2-place predicate like eat if other elements of the clause all score high in transitivity. Because of the profound influence that it has had in linguistics, I adopt their approach to transitivity as first published while acknowledging that others have proposed improvements to the way the parameters should be organized (e.g., Malchukov 2006) or even to what the parameters should be (e.g., Givón 1995; Tsunoda 1985). The goal of this study is to take their proposal that transitivity is a property of the clause at face value and operationalize the parameters as numerical variables so that we can obtain a single transitivity score for each clause.
This score, called the transitivity index (TI) in Guajardo (2021), is used to study the Spanish causatives dejar 'let' and hacer 'make'.
The goal of the paper is therefore twofold: (i) to demonstrate how to use and calculate the TI and (ii) to contribute to the study of causative choice in Spanish.

Clausal transitivity
In a seminal paper, Hopper and Thompson (1980) develop the proposal that transitivity is best understood as a property of the whole clause, which can be broken down into 10 parameters. All the parameters are binary, except for INDIVIDUATION, which subsumes a number of semantic features of the object. The parameters are shown in Table 1 and the features of the object comprising INDIVIDUATION are shown in Table 2.
The parameter PARTICIPANTS refers to whether the predicate has one participant (low) or two or more participants (high). KINESIS distinguishes between states (low) and non-states (high). ASPECT describes telicity, where telic predicates (high) are  distinguished from atelic predicates (low). PUNCTUALITY refers to punctual (high) and non-punctual (low) events. VOLITIONALITY concerns features of the subject, distinguishing between volitional (high) and non-volitional subjects (low). AFFIRMATION refers to whether the clause is affirmative (high) or non-affirmative (low). MODE concerns the modality of the clause, distinguishing between realis (high) and irrealis (low), and AGENCY refers to whether the subject of the clause is agentive (high) or non-agentive (low). The last two parameters describe features of the object. AFFECTEDNESS concerns the degree to which an action is transferred to a patient. Clauses with totally affected objects are considered more transitive than those with non-affected objects. Last, INDIVIDUATION is made up of six features describing the grammatical object. For example, a concrete and count Noun Phrase (NP) is more individuated than an abstract and mass NP and therefore higher in transitivity.
Based on cross-linguistic evidence, Hopper and Thompson further propose the transitivity hypothesis in (1).
1. If two clauses (a) and (b) in a language differ in that (a) is higher in transitivity according to any of its subcomponents then, if a concomitant grammatical or semantic difference appears elsewhere in the clause, that difference will also show (a) to be higher in transitivity (Hopper and Thompson 1980: 255).
The transitivity hypothesis proposes that the values of each of the transitivity parameters will co-vary systematically. For example, if a language distinguishes between telic and atelic predicates in its morphology and requires overt marking of the object that appears with telic predicates (high transitivity) then their hypothesis predicts that the objects should also bear markings of high transitivity such as being highly individuated (Hopper and Thompson 1980: 255). Tsunoda (1985) suggests that the transitivity hypothesis as it stands is too strong because not all parameters can be expected to co-vary to the same degree or even to co-vary at all. For example, he argues that the correlation between AFFECTEDNESS and AGENCY is non-existent as one can kill someone with the same efficacy whether it is done accidentally or intentionally. Likewise, VOLITIONALITY and AGENCY describe almost the same property as it is very difficult to picture a subject who is volitional but non-agentive or non-volitional but agentive (Tsunoda 1985: 392). In fact, subsequent work has proposed that volitional involvement is a prerequisite for agenthood (e.g., Dowty 1991; Lehmann 1991; Van Valin and Wilkins 1996) so it is likely that these two parameters can be replaced by a single one.
Others have pointed out as a possible weakness of the proposal the fact that some sort of hierarchy among the parameters is missing. For example, Givón (1985) and Malchukov (2006) suggest distinguishing between A-features, V-features and O-features, depending on which argument of the clause they pertain to. Malchukov (2006: 333) puts forth the transitivity scale in (i). He proposes a weaker form of the transitivity hypothesis whereby only parameters that are semantically related (i.e., placed adjacently on the scale) will show systematic co-variation. This is an important observation constraining the type of co-variation likely to be found in natural language, and it makes clear predictions that can be tested empirically. However, while the non-hierarchical nature of the parameters may be a disadvantage of Hopper and Thompson's proposal, an overarching scale like that in (i) is also problematic because it ignores the role of the construction where the parameters are computed. In other words, the scale in (i) suggests that the relationship among the parameters is static and does not vary by construction. In addition, by collapsing the O parameters into Individuation one disregards the possibility that within this supra-parameter there might also be a hierarchy among its members. The TI proposed in Guajardo (2021) addresses this issue by acknowledging that the parameters must be hierarchically organised, but, crucially, ensuring that the parameter hierarchy is dynamic and determined construction by construction. I show how to formalise this idea in Section 5.
Although the transitivity hypothesis was first proposed to account for obligatory morphological marking, if languages develop morphological systems in line with this hypothesis then we should expect to find that in general, ceteris paribus, languages will still show sensitivity to the transitivity parameters even in contexts where no overt obligatory marking is present.
After all, obligatory morphological marking begins in iconic contexts where certain features tend to co-vary frequently (Bybee et al. 1994). For example, a language may not distinguish between count and mass nouns morphologically, but speakers may still be sensitive to this distinction regardless of the lack of overt morphology, which may have consequences in various grammatical constructions. Thus, researchers have studied the effect of transitivity in different areas of the grammar with or without obligatory morphological marking. In what follows, this will be illustrated with studies based on Spanish.
Clements (2006) uses the transitivity parameters to investigate non-anaphoric se in Spanish (e.g., Se venden sillas 'Chairs are sold', Se rompió el florero 'The vase broke'), that is, uses that are neither reflexive nor reciprocal. Clements proposes that the clitic se has two distinct functions in relation to transitivity: it can reduce transitivity by decreasing the valency of a verb by one argument or by disallowing the appearance of a nominal or pronominal subject of an intransitive verb. Furthermore, she shows that the presence of se co-varies with higher transitivity whereas its absence correlates with lower transitivity. The former corresponds to aspectual differences of verbal minimal pairs with and without se such as comer 'to eat' and comerse 'to eat up'. The forms with se occur with count NPs and bare plurals are not possible. The latter (i.e., lower transitivity) concerns middle, passive, unaccusative, antipassive and impersonal uses of se.
Vázquez Rosas (2006) uses this framework to study Spanish reverse psychological predicates (e.g., Le molestan las moscas 'Flies annoy her'). A subclass of these predicates allows the experiencer to be marked with either the accusative or the dative clitic and Vázquez Rosas shows that this alternation is also governed by transitivity; higher levels of transitivity favour accusative-marking and lower levels favour dative marking. In particular, she argues that accusative marking signals dynamic and telic events with physically affected objects while the dative clitic appears in stative and atelic contexts with objects that are psychologically affected. Importantly, Spanish does not differentiate in the morphology between affected or non-affected objects, or telic and atelic predicates, but the transitivity parameters proved to be useful in characterizing these predicates.
Ganeshan (2019) also investigates case alternation of Spanish clitics in reverse psychological predicates. She finds the alternation seems to be tied to the agentivity of the subject and affectedness of the object, such that accusative appears with agentive subjects and affected objects and dative with the opposite values of those two features.
An important difference in how transitivity is used in this article is that the alternation between dejar 'let' and hacer 'make' is not a grammatical alternation, as these two causatives are not synonymous with each other. The studies presented above have all used the transitivity parameters to try to narrow down the contexts in which one of two synonymous elements is more likely to occur. However, the two causative predicates this paper is concerned with have different meanings and cannot alternate without a drastic change in meaning. Thus, this paper is an attempt to employ the transitivity parameters at the lexical level, an issue I return to in the Discussion section. In the next section, I describe the causative construction and some of the previous research that has compared dejar and hacer in Spanish.

The causatives dejar and hacer
The two causatives dejar 'let' and hacer 'make' constitute factive constructions where the causer lets or makes the embedded event happen, which in turn comprises a second participant. In addition, hacer is said to constitute positive causation while dejar negative causation (Soares da Silva 1999).
The specific causative construction to be studied in this paper is one in which the causative takes an infinitival complement and the subject of the infinitive (i.e., the causee) is realized as a pronominal clitic (1a-b). In (1a) the causative hacer 'to make' is preceded by the clitic los 'them' and followed by the infinitive abandonar 'to abandon'. In (1b) the causative dejar 'to let' is preceded by the clitic lo 'him' and followed by the infinitive abordar 'to board, to get on'. In both cases, the clitic in the matrix clause is the logical subject of the infinitival clause. A peculiarity of this construction is that the third-person clitic can appear in either the accusative (2a-3a) or the dative case (2b-3b). Note that in Spanish third-person clitics are the only clitics where a case distinction is found between accusative and dative. The generalization has been that intransitive verbs take an accusative clitic and transitive verbs a dative clitic (e.g., Aissen and Perlmutter 1983; Comrie 1976; Rosen 1990). However, case marking of the clitic in this construction is highly variable (Labelle 2017). Some have proposed that the difference in case can be explained by directness of causation (Enghels 2012; Strozer 1976; Treviño 1994). For example, if an intransitive predicate appears with a dative clitic then the causation is considered indirect. Likewise, a transitive predicate with an accusative clitic is said to mark direct causation (Moore 2010; Strozer 1976). Moore (1996) also observes that when hacer takes an accusative clitic the referent must be animate (4a); an inanimate accusative clitic is ungrammatical (4b). In contrast, no such restrictions hold with dative clitics (5). Example (4a) is ambiguous depending on the syntactic function assigned to the clitic la 'her'. In one reading la 'her' is the subject of the embedded verb esconder(se) 'to hide' and the clitic can only have an animate referent.
In the second reading, la is the object of the infinitive and it can have an animate or an inanimate referent. In (4b), the clitic is the subject of the infinitive perder 'to lose' and the referent el coche 'the car' is inanimate so this sentence is ungrammatical. In (5) the clitic is realized in the dative case so it is free to have an inanimate referent. However, there are cases in which the accusative clitic can, in fact, refer to an inanimate NP. For example, in (6) hacer appears with an accusative clitic and the sentence is fully grammatical.

6. Lo hice arrancar enseguida.
   it.ACC made.1SG.PAST start.INF right away
   'I made it start up right away'

In (6), the clitic lo is singular, masculine and accusative and the most natural way to interpret the sentence is that I made a machine (e.g., a car) start right away. I return to this issue in Section 7.2. In the next section, I review some of the previous related work on the two Spanish causatives.

Previous work on Spanish causatives
Although most work on Spanish causatives has focused on the study of hacer, a small number of papers discuss the two causatives hacer and dejar by comparing their behaviour with respect to different semantic and syntactic features. I describe some of their findings below and discuss how the present work can build, and shed more light, on our current knowledge of these predicates.
Ruiz-Sánchez (2006) compares dejar and hacer against Vendler's (1967) lexical aspect of the infinitive verb (i.e., states, activities, accomplishments and achievements). The data come from examples created by the author to illustrate the contexts in which each causative is more likely and the analysis is restricted to animate subjects. She concludes that hacer implies intentionality, direct causation and unwillingness of the causee for the event to take place. She also claims that hacer makes reference to the whole event for states, accomplishments and achievements but with activities it makes reference to the beginning of the event. In addition, states, accomplishments and activities imply high involvement of the causer whereas achievements denote low involvement. Causative dejar also implies intentionality on the part of the causer, but contrary to hacer, it refers to indirect causation, willingness and control of the causee for the event to happen and low causer involvement across the four lexical aspectual categories.
Enghels (2012) studies both causative constructions in relation to the differences between positive and negative causation and the case marking of the causee realized as a clitic. The data come from CREA (Corpus de Referencia del Español Actual 'Corpus of Reference of Contemporary Spanish') (RAE 2008) and the analysis is limited to Peninsular Spanish. She claims that the case of the clitic is independent of the transitivity status of the infinitive verb.
She follows Soares da Silva (2001) in distinguishing hacer from dejar in terms of positive and negative causation, respectively, and aims to establish whether this semantic difference can be tied to the variability in clitic case. Her point of departure is the claim that accusative clitics denote direct causation and dative clitics indirect causation (e.g., Moore 1996). She concludes that when the causer lacks control or coercion (e.g., inanimate subjects) then hacer favours the dative clitic whereas the reverse is true for the accusative clitic. She also finds that the behaviour of dejar is more complex because the case of the clitic depends on the specific semantics of the causative. She identifies three basic meanings: (i) "to cause" prefers accusative, (ii) "not to permit" prefers dative, and (iii) when it means "not to oppose" case assignment is dependent on the semantics of the subordinate event. Overall, she reports the dative is found more often than the accusative clitic with both causatives regardless of whether the infinitive is transitive or intransitive.
In a comparative study, Enghels and Roegiest (2012) compare dejar with infinitival or subjunctive complement clauses in a sample of 1,000 sentences from CREA. They find that dejar mostly appears with animate subjects (80%) but the subject is not always in control as is the case with hacer. They relate this lack of control on the part of the subject to the frequent use of dejar in their data with intransitive verbs and inanimate causees. In addition, they report that dejar appears mostly with a dative clitic. When the accusative clitic is used, the object tends to be either inanimate or feminine.
While these studies highlight important characteristics of the causative constructions, the methodologies impose some limitations on the generalizations observed. Ruiz-Sánchez's (2006) study focuses only on animate subjects and the examples are constructed by the author, a fact that undermines the generalizations made in the paper because one single sentence per condition is simply not enough data to rely on. An important caveat of the studies by Enghels (2012) and Enghels and Roegiest (2012) is the focus on Peninsular Spanish. As mentioned in footnote (10) in Enghels (2012: 22), Peninsular Spanish uses the dative clitic for masculine animate direct objects (a phenomenon known as leísmo), thus a morphologically dative clitic cannot be interpreted as marking the causee as an indirect object. This makes the data difficult to interpret, weakening the conclusion that both causatives prefer the dative clitic. Methodologically, although both studies are a step in the right direction by using corpus data, no statistical analysis is conducted, thus it is difficult to assess the true effect of the percentage differences reported.
The present study addresses these issues by using a relatively large data sample of over 4,500 sentences from a corpus covering 19 Spanish-speaking countries. Crucially, Peninsular Spanish is not included in the sample for the reasons just explained about leísmo. Moreover, the data will be analysed with advanced statistical methods to arrive at a fine-grained understanding of the linguistic elements involved in causative constructions.

Research questions, hypotheses and predictions
The generalizations and claims presented in Sections 1-3 allow us to formulate clear research questions, hypotheses and predictions that can be tested empirically with the help of statistical modelling. I will first introduce the guiding research questions followed by the hypotheses and end the section with the predictions that follow from the previous literature. The three research questions (RQ) I try to answer are the following (note that questions (ii-iii) are dependent on (i)).
i. Can transitivity correctly predict which causative will appear in a specific context?
ii. Which parameters are the most important in distinguishing between the two causatives?
iii. Are there other linguistic elements of the clause such as tense, person, number and clitic case that can help distinguish between the two causative predicates?
If it turns out that transitivity is not a property that can distinguish the two causatives, then we must stop there. However, if a relationship can be established between transitivity and the causatives then more specific questions can be pursued. RQ (ii) seeks to determine which parameter(s) helps the most in distinguishing between the two causatives. RQ (iii) is concerned with linguistic variables beyond the transitivity parameters that may help constrain the semantic contexts of each causative. An important aspect worth highlighting is that, in reference to RQ (i), the goal is not just to see whether transitivity is a statistically significant factor but also to determine how big of an effect it has. We can surmise that we may find a significant but relatively small effect of transitivity or we may find a bigger effect, which would indicate a much stronger relationship between transitivity and the causative predicates.
These three research questions together with some of the previous findings lead us to the formulation of the following hypotheses:
Hypothesis 1: The causatives dejar and hacer will be predictable from the transitivity parameters.
Hypothesis 2: The causative hacer will be more transitive than dejar.
Hypothesis 3: The case of the clitic will be a reliable cue for causative choice.

Predictions
The findings from previous work on the Spanish causatives, which has found features such as agency and animacy of the causer to be relevant in causative constructions, lead to the formulation of Hypothesis 1. The prediction is that each causative can be accurately characterized by assigning specific values to each parameter of the transitivity scale. If the null hypothesis is true, however, then we do not expect the models to have a predictive power higher than chance. Hypothesis 2 follows from what we know about the semantics of hacer and dejar, so the expectation is that hacer will be characterized by higher values of transitivity (i.e., PARTICIPANTS = transitive, AFFECTEDNESS = affected, INDIVIDUATION = individuated, etc.). As for Hypothesis 3, since accusative clitics have been found to be associated with higher transitivity (Ganeshan 2019), I expect the accusative clitic will occur more often with hacer than with dejar.
In the next section, I explain the methodology for data extraction, calculation of the TI and the statistical methods used for the analysis.

Methodology
All statistical analyses were performed in R version 4.0.3 (R Core Team 2020). The main analysis is done within the Bayesian inference framework by means of mixed-effects logistic regression models of a dataset with over 4,500 sentences. The statistical analysis consists of two different models, Model-1 and Model-2, described in Section 5.3.

Data extraction and annotation
The dataset used in this paper is the same dataset that Guajardo (2021) used for his study on clitic case alternation in causative constructions. The data were extracted from Corpus del Español WebDialects and NOW versions (News on the Web) (Davies 2002). The current web interface of the corpus allows for extraction of a maximum of 500 random concordances per search, so 500 random instances were extracted of both causatives with each clitic followed by an infinitive (la + DEJAR + INF, las + DEJAR + INF, le + DEJAR + INF, etc.). Since the accusative clitic inflects for gender as well as number, this resulted in having twice as many accusative clitics as dative clitics (500 × 8 = 4,000 vs. 500 × 4 = 2,000). Therefore, to obtain a more balanced sample, 2,000 more sentences were extracted with the dative clitic from the NOW version of the corpus (500 for each causative + clitic number combination). Both corpora are made up of texts from the Internet, including newspapers, blogs and general websites so it is safe to assume that they have equivalent registers for the present study. The WebDialects corpus has nearly two billion words and the NOW corpus has 5.5 billion words. The resulting dataset contained data from 21 Spanish-speaking countries including the USA. Two countries were removed from the analysis. Spain was removed due to the reasons discussed in Section 3 about leísmo. The USA data were also removed because in the USA there are many speakers of other varieties as well as non-native speakers, which would add extra noise to the data. After removal of duplicates and false positives the resulting dataset contained 4,589 sentences, of which 2,157 contain dejar and 2,432 contain hacer, which translates into a 0.47 and 0.53 relative proportion, respectively. Table 3 shows all the variables and the corresponding levels used in the analysis.
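The sampling arithmetic above can be checked with a short sketch. The variable names are illustrative and do not come from the original study; the counts are those reported in the text, and the full clitic lists are filled in from the standard Spanish paradigm. The majority-class baseline at the end is an added reference point, assuming a model that always guesses the more frequent causative.

```python
# Sampling design as described in the text; all names are illustrative.
PER_QUERY = 500                          # maximum random concordances per search
CAUSATIVES = ["dejar", "hacer"]
ACCUSATIVE = ["lo", "la", "los", "las"]  # inflect for gender and number
DATIVE = ["le", "les"]                   # inflect for number only

accusative_hits = PER_QUERY * len(CAUSATIVES) * len(ACCUSATIVE)  # 500 x 8
dative_hits = PER_QUERY * len(CAUSATIVES) * len(DATIVE)          # 500 x 4
extra_dative_hits = PER_QUERY * len(CAUSATIVES) * len(DATIVE)    # NOW corpus top-up

# Final counts after removing duplicates and false positives (from the text).
counts = {"dejar": 2157, "hacer": 2432}
total = sum(counts.values())
proportions = {verb: round(n / total, 2) for verb, n in counts.items()}

# Assumption: a baseline that always predicts the more frequent causative.
majority_baseline = max(counts.values()) / total

print(accusative_hits, dative_hits + extra_dative_hits)
print(total, proportions, round(majority_baseline, 2))
```

Under these figures, any model accuracy is best read against a baseline of roughly 0.53 rather than 0.50, since the two causatives are not perfectly balanced.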
The data were manually annotated with the transitivity parameters except for VOLITIONALITY and two of the subcomponents of INDIVIDUATION, namely proper names versus common and referential versus non-referential. VOLITIONALITY was not included because, as Tsunoda (1985) pointed out, it is unlikely to find contexts in which VOLITIONALITY does not equal AGENCY so I only coded for AGENCY. The other two subcomponents were not relevant because there were no proper names in the dataset and the objects in this construction tend to be referential. Four more variables were added: CASE, PERSON, NUMBER OF SUBJECT and TENSE. CASE refers to the case of the clitic; the other three refer to features of the causative verb. In addition, two variables were used as random effects in the statistical models, namely VERB and COUNTRY. VERB refers to the infinitive verb in each sentence and COUNTRY to the variety of Spanish in the corpus. Due to data sparsity (i.e., few data points for some levels of a variable), the variables TENSE and PERSON were binarized such that TENSE was coded as past versus non-past and PERSON as third versus non-third.
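The binarization of TENSE and PERSON can be illustrated with a minimal sketch. The fine-grained tense labels below are hypothetical, since the original annotation labels are not given in the text; only the two-way contrasts (past versus non-past, third versus non-third) come from the description above.

```python
# Hypothetical fine-grained labels; the annotation scheme is not given in the text.
PAST_TENSES = {"preterite", "imperfect", "pluperfect"}

def binarize_tense(tense: str) -> str:
    """Collapse a tense label into 'past' versus 'non-past'."""
    return "past" if tense in PAST_TENSES else "non-past"

def binarize_person(person: int) -> str:
    """Collapse grammatical person (1, 2 or 3) into 'third' versus 'non-third'."""
    return "third" if person == 3 else "non-third"

# One hypothetical annotated row.
row = {"TENSE": "preterite", "PERSON": 1}
coded = {
    "TENSE": binarize_tense(row["TENSE"]),
    "PERSON": binarize_person(row["PERSON"]),
}
print(coded)
```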

The transitivity index
The TI is a weighted continuous measure of transitivity ranging from 0 to 1 (lowest to highest transitivity). In line with the discussion in Section 2 about the lack of a hierarchy among the transitivity parameters, the term weighted refers to the way the index is calculated, which takes into consideration the importance of each individual parameter in a specific construction. This is important because it means that the parameter weights can change across constructions. Without the weights, we would be assuming that the parameters are all equally important across constructions and there are good reasons to believe this is not correct. For example, in differential object marking, languages differ in where they draw the line between marked and unmarked objects. This line has been shown to lie somewhere between definiteness, animacy or specificity of the object (e.g., Aissen 2003; Bossong 1991; Comrie 1979). As these features characterise the object, it seems logical to assume that they will be more important in determining the contexts for differential object marking than a verbal parameter such as MOOD. This characteristic of the index is key to its explanatory power and its potential as a standard measure against which different constructions both within the same language and across different languages can be compared.
The calculation of the index involves four steps: (i) subsetting the dataset, (ii) training 1,000 random forests, (iii) calculating the variable importance of each random forest and (iv) averaging over all the variable importances to obtain each parameter weight. In what follows, I explain each step in detail.
The first step consisted of creating a random subset with 20% of the total data (i.e., 917 sentences). This dataset was then used to train 1,000 random forests of 3,000 trees. These data were only used in this step and were not used anywhere else in the analysis (except for the descriptive statistics). For each random forest, the conditional variable importance was computed yielding 1,000 variable importance scores for each parameter. The final weight of each transitivity parameter is the average over the total 1,000 variable importance values.
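Steps (i) and (iv) can be sketched in a few lines. This is an illustration under stated assumptions, not the original R code: the function names are hypothetical, the subset size is assumed to be truncated to a whole number (which reproduces the 917 sentences reported), and the importance scores are toy values standing in for the conditional variable importances of steps (ii) and (iii), which are not computed here.

```python
import random
from statistics import mean

def split_off_weighting_subset(n_sentences, fraction=0.20, seed=1):
    """Return (weighting_ids, remaining_ids) for a random split of row indices."""
    rng = random.Random(seed)
    ids = list(range(n_sentences))
    # Assumption: the subset size is truncated to a whole number.
    weighting_ids = set(rng.sample(ids, int(n_sentences * fraction)))
    remaining_ids = [i for i in ids if i not in weighting_ids]
    return sorted(weighting_ids), remaining_ids

def average_importances(runs):
    """Average per-parameter importance scores over repeated forest fits."""
    parameters = runs[0].keys()
    return {p: mean(run[p] for run in runs) for p in parameters}

weighting_ids, remaining_ids = split_off_weighting_subset(4589)
print(len(weighting_ids), len(remaining_ids))

# Toy importance scores from three hypothetical forest fits (not real results).
toy_runs = [
    {"AFFIRMATION": 0.13, "KINESIS": 0.05},
    {"AFFIRMATION": 0.12, "KINESIS": 0.07},
    {"AFFIRMATION": 0.14, "KINESIS": 0.06},
]
weights = average_importances(toy_runs)
print(weights)
```

In the study itself the average runs over 1,000 forests rather than three, but the averaging logic is the same.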
The final parameter weights are presented in Table 4 in decreasing order. The weights show that the most important parameter in distinguishing between the two causatives is AFFIRMATION, followed by AGENCY OF SUBJECT and KINESIS. Regarding AFFIRMATION, the data show that hacer appears 96% of the time in an affirmative sentence while dejar appears 60% of the time in the same context. The second most important parameter, AGENCY OF SUBJECT, shows that hacer and dejar appear 42% and 67% of the time with an agentive subject, respectively. KINESIS shows that dejar appears 86% of the time with a non-stative verb while with hacer this figure goes down to 63%. The three least important parameters are COUNT, AFFECTEDNESS and NUMBER OF OBJECT.
The last step consisted of replacing each high transitivity value with the corresponding weight and the low transitivity value with 0. For example, if AFFIRMATION had the value affirmative, this was replaced with 0.12965 and if it was non-affirmative it received 0. The TI for each sentence was obtained by adding together the values of the individual parameters (AFFIRMATION + AGENCY OF SUBJECT + KINESIS + MOOD …, etc.). The index was normalized between 0 and 1 for easier interpretation.
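The calculation just described can be sketched as follows. Only the AFFIRMATION weight (0.12965) is taken from the text; the other two weights are placeholders, and the real index sums over all annotated parameters. The normalization shown, dividing by the largest attainable sum, is one way to map the score onto the 0 to 1 range and is an assumption, as the text does not specify the method.

```python
# Only the AFFIRMATION weight comes from the text; the others are placeholders.
WEIGHTS = {"AFFIRMATION": 0.12965, "AGENCY OF SUBJECT": 0.10, "KINESIS": 0.08}

def transitivity_index(clause, weights=WEIGHTS):
    """Sum the weights of the parameters coded 'high' and rescale to [0, 1].

    Assumption: normalization divides by the maximum possible sum, so a
    clause that is high on every parameter scores exactly 1.
    """
    raw = sum(w for param, w in weights.items() if clause.get(param) == "high")
    return raw / sum(weights.values())

# Hypothetical clause: affirmative, non-agentive subject, non-stative verb.
clause = {"AFFIRMATION": "high", "AGENCY OF SUBJECT": "low", "KINESIS": "high"}
ti = transitivity_index(clause)
print(round(ti, 3))
```

Because low values contribute 0, a clause that is low on every parameter scores 0 and the index needs no further shifting.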

Statistical analysis
The remaining data were partitioned into a training and a testing dataset in order to test the prediction performance of the models on unseen data. The training dataset contained 75% of the remaining data (2,755 sentences) and the testing dataset the remaining 25% (917 sentences). Two different Bayesian mixed-effects models were tested: Model-1 with only the TI as a predictor and Model-2 with the four additional variables. The Bayesian models were fitted using the Stan modelling language (Carpenter et al. 2017) with the brms package (Bürkner 2017). To test whether the TI or any of the other variables showed evidence of an effect, I calculated Bayes factors. The Bayes factor quantifies how strongly the data support an effect for each parameter relative to the null hypothesis of no effect. To do this, I defined a null region such that an effect falling within this region is practically equivalent to the null hypothesis (Kruschke 2010). The null region was computed automatically with the rope_range function in the bayestestR package and was (−0.18, 0.18). The interpretation of Bayes factors is as follows (Jeffreys 1961): BF < 1 indicates evidence in favour of the null hypothesis (the parameter does not contribute to explaining the outcome), BF = 3-10 moderate evidence, BF = 10-30 strong evidence, BF = 30-100 very strong evidence and BF > 100 extreme evidence.
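For readers who want to apply this interpretive scale mechanically, the following Python sketch maps a Bayes factor to the labels listed above. The function name is hypothetical, and the label for values between 1 and 3, a band the scale above does not name, is an assumption added only so that the function covers every input.

```python
def interpret_bayes_factor(bf):
    """Map a Bayes factor to the evidence labels listed in the text."""
    if bf < 1:
        return "favours the null"
    if bf < 3:
        # Assumption: this band is not named in the text.
        return "anecdotal"
    if bf < 10:
        return "moderate"
    if bf < 30:
        return "strong"
    if bf <= 100:
        return "very strong"
    return "extreme"


# Example inputs chosen to fall in different bands.
labels = {bf: interpret_bayes_factor(bf) for bf in (0.5, 6.1, 28.83, 6000)}
```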
In addition, the predictive power of Model-1 and Model-2 is compared to assess the extent to which transitivity on its own can account for causative choice.

Results
I first present the descriptive results and then I introduce and explain the results of the statistical models. A complete table with the model diagnostics is in the Appendix as well as a figure with trace plots for each parameter in each model. Figure 1 shows the relative frequency of each causative across the four added variables, and Figure 2 the distribution of the TI for each causative verb. The descriptive results are based on the entire dataset of 4,589 sentences.

Descriptive statistics
In Figure 1, there are two major differences between the two causatives. In plot (A), dejar appears with first or second persons more often than hacer does (0.23 dejar vs. 0.10 hacer), which mostly appears in the third person (0.90). Similarly, the number feature of the subject seems to distinguish between the two causatives. In plot (B), hacer appears with singular subjects more often than dejar does (0.72 hacer vs. 0.47 dejar). Conversely, dejar is more likely than hacer with plural subjects (0.28 hacer vs. 0.53 dejar). The differences in clitic case are quite small, as shown in plot (C). Specifically, the relative frequency of the accusative clitic is 0.57 for dejar and 0.51 for hacer, while the dative clitic has a relative frequency of 0.43 with dejar and 0.49 with hacer. TENSE, in plot (D), clearly makes no distinction between the two causatives, as both are equally likely in either tense.
The boxplot in Figure 2 shows the overall distribution and mean of transitivity across the two causative verbs. The mean for dejar is 0.63 while for hacer it is 0.76. A Mann-Whitney-Wilcoxon test confirms this difference is statistically significant (W = 2,222,658, p < 0.0001, effect size r = 0.13). The plot also shows that hacer mostly appears in higher transitivity contexts with some outliers (the black dots) at the lower levels of transitivity, suggesting a more consistent and constrained behaviour. On the other hand, dejar covers a much broader transitivity range, suggesting it can appear in a larger number of transitivity contexts.

Model-1
Model-1 contains the TI as the only fixed-effect with a random slope on COUNTRY, and VERB as a random intercept. The results are shown in Table 5.
The Bayes factor (BF) for the TI is larger than 6,000 indicating that the index shows extremely robust evidence against the null hypothesis that transitivity does not account for causative verb choice. The positive coefficient mean estimate of 3.52 (CI = 2.36, 4.53) means that an increase of transitivity favours hacer. Thus, hacer is associated with higher levels of transitivity as is visually shown in Figure 3. This figure shows the predicted probability of hacer and dejar as a function of the TI per the model. At the lowest transitivity levels, dejar has a predicted probability of over 0.90 and this starts to decrease as the TI goes up. Clearly, the opposite is true for hacer, which reaches a probability of over 0.81 at the highest transitivity level.
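The relation between the coefficient and the predicted probabilities can be checked with a short Python sketch of the logistic link. It is a simplification under stated assumptions: it ignores the random effects, assumes the index runs from 0 to 1, and derives the intercept (which is not reported in this section) from the probability of 0.11 for hacer at the lowest level of transitivity given for Model-1 in the model comparison. All names are hypothetical.

```python
import math


def logistic(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


slope = 3.52             # posterior mean for the TI in Model-1
intercept = logit(0.11)  # assumption: implied by P(hacer) = 0.11 at TI = 0


def p_hacer(ti):
    """Predicted probability of hacer at a given TI (fixed effects only)."""
    return logistic(intercept + slope * ti)


p_low = p_hacer(0.0)   # lowest transitivity level
p_high = p_hacer(1.0)  # highest transitivity level
```

With these values the sketch yields a probability of roughly 0.81 for hacer at the highest transitivity level, in line with the figure reported above.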

Model-2
Model-2 contains the four additional variables. Model selection was conducted by comparing different models with the loo package (Vehtari et al. 2020), which performs leave-one-out cross-validation of each model (Vehtari et al. 2017). The best model contains the TI and CLITIC CASE as single terms and the interaction PERSON*NUMBER OF SUBJECT. In addition, the TI and CLITIC CASE were modelled as random slopes on COUNTRY, and VERB was a random intercept as in Model-1. Figure 4 shows the posterior distribution intervals of each variable in Model-2. Figure 4 indicates that hacer is less likely to appear with a singular non-third person causer (i.e., hacer is less likely with first or second persons). In addition, even though the model including CLITIC CASE was assessed as the best model (compared to models without this variable), the BF for CASE shows no evidence of an effect (BF = 0.094). This can be confirmed visually by looking at the posterior distribution for CASE: since 0 lies within the posterior interval, the coefficient may plausibly be 0. The coefficient estimate for the TI is 4.13 (CI = 3.01, 5.04), which is larger than the estimate in Model-1, with a BF of 10,000.
Regarding the BF of the other variables in Model-2, PERSON shows strong evidence with a BF of 28.83 and NUMBER OF THE SUBJECT shows extreme evidence with a BF larger than 10,000. The interaction PERSON*NUMBER OF SUBJECT shows moderate evidence (BF = 6.10). Figure 5 shows the marginal effects of transitivity by causative predicate. As in Model-1, we see that an increase in transitivity lowers the probability of dejar and increases that of hacer. However, there are some differences between these predictions and the predictions of Model-1, so I discuss these in Section 6.4. Figure 6 shows the marginal effects of the interaction PERSON*NUMBER OF SUBJECT, which show the predicted median of all drawn posterior samples. The confidence intervals are Bayesian predictive intervals. The interaction PERSON*NUMBER OF SUBJECT is driven by third person subjects, which behave differently based on whether they are singular or plural. More specifically, a singular third person subject favours hacer with a predicted probability of 0.73, while this figure goes down to 0.47 if the subject is plural.

Model comparison
As I said above, I am also interested in the predictive power of the models. The goal is to find out how much of the data can be explained by the TI alone and how much the model improves by the addition of the three variables in Model-2. I will compare the models' performance both on training and testing data to assess how well the model can generalize beyond the training set. The confusion matrix with the models' performance on training and new data appears in Table 6.
Both models perform equally well on the training data and nearly the same on the testing set. The models reach 0.87 and 0.80-0.81 accuracy on training and new data, respectively. Most of the improvement of Model-2 is obtained because it correctly predicts more cases of dejar than Model-1. While Model-1 correctly predicts 322 cases of dejar, Model-2 predicts 330 (i.e., a 2% increase). Recall from the boxplot in Figure 2 that dejar occupies a much wider range of transitivity, so Model-1 has more difficulty in predicting dejar based solely on transitivity. The addition of the three variables PERSON, NUMBER OF SUBJECT and CASE appears to help Model-2 better identify the contexts in which dejar is most likely.
In the previous section, I presented the predicted probabilities of each causative as a function of the TI in Figures 3 and 5 and, while both models show that high transitivity is associated with hacer, I also noted that the predicted probabilities are not exactly the same. More concretely, the probability range of dejar in Model-1 is 0.89-0.19: at the lowest level of the index dejar has a probability of 0.89 and at the highest end of transitivity its probability goes down to 0.19. In Model-2, the probability range changes to 0.96-0.30 and the predictions in Table 6 show that Model-2 is slightly better at predicting dejar than Model-1, suggesting that this probability range is a more accurate representation of the contexts for dejar. Thus, once other variables are controlled for, the probabilities of dejar increase but the range also becomes slightly narrower because the contexts are more constrained. The same observations apply to hacer but in the opposite direction. That is, the range for hacer in Model-1 is 0.81-0.11 and in Model-2 0.70-0.04. This shows that hacer is extremely unlikely at the lowest levels of transitivity, but it also shows that when other variables are controlled for, the probability of hacer at the highest levels of transitivity is lower.
Another important comparison is the effect size of the TI. In Model-1, the coefficient estimate is 3.53 (CI = 2.34, 4.53) and in Model-2 the estimate is 4.13 (CI = 3.01, 5.04). This results in an effect size of the TI of 34.12 in Model-1 and 62.08 in Model-2. Thus, the effect size is larger in Model-2 and the credible interval of the coefficient is narrower. This indicates that the effect of the index is larger and that its certainty is higher. Consequently, when other predictors can account for some of the variance in the data, the TI becomes an even stronger predictor of the causatives.
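The reported effect sizes appear to correspond to odds ratios, that is, the exponentiated coefficients of the logistic model. This reading is an assumption, since the text does not state how the effect sizes were derived, but a brief Python check suggests it is consistent with the figures above.

```python
import math

# Coefficient estimates for the TI as reported in the text.
coefficients = {"Model-1": 3.53, "Model-2": 4.13}

# Assumed conversion: odds ratio = exp(coefficient on the log-odds scale).
odds_ratios = {model: math.exp(beta) for model, beta in coefficients.items()}
```

Under this assumption the Model-1 value comes out at about 34.12 and the Model-2 value at about 62.2, close to the reported 62.08. The small gap would be expected if the published coefficient was rounded before being reported.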

Discussion
In general, the analysis shows that the TI can correctly predict the causative predicates. I will now address the hypotheses laid out in Section 4 and then I will discuss how these results compare to previous studies. I will conclude with possible future avenues of research.

Hypotheses
Hypothesis 1: The causatives dejar and hacer will be predictable from the transitivity parameters.
Hypothesis 1 is clearly borne out. We saw in the model comparison that transitivity alone can correctly predict 80% of the sentences, and additional variables such as PERSON and NUMBER OF SUBJECT minimally increase the predictive power of the model. This is a clear indication that the lexical choice between dejar and hacer is highly influenced by the semantic features comprising transitivity.
Hypothesis 2: The causative hacer will be more transitive than dejar.
The results also support Hypothesis 2. As is clear from Figures 3 and 5, the chances of hacer increase and those of dejar decrease as transitivity goes up. This is true of both models, with or without the additional variables. In fact, the effect size of the TI in Model-2 is larger than in Model-1. We saw that the increase in predictive power of Model-2, albeit small, was mostly due to an increase in accuracy in predicting dejar, confirming the observation in Section 6.1 that dejar covers a much wider range of transitivity and, therefore, other features of the clause might be necessary to better characterise the semantic contexts of dejar.
Hypothesis 3: The case of the clitic will be a reliable cue for causative choice.
Hypothesis 3 was not borne out. We saw in Figure 1 that both causatives appear with both clitics at very similar proportions. This was confirmed in Model-2, where the BF for CLITIC CASE was 0.092, suggesting that clitic case is not predictive of the causatives.

Comparison with previous findings
In this section, I evaluate the following three claims by Moore (1996), Ruiz-Sánchez (2006) and Enghels (2012).
(a) Hacer places selectional restrictions on the causee such that it can only take an accusative object provided the causee is animate (Moore 1996).
(b) Intentionality has been attributed to hacer but lack of intentionality to dejar (Ruiz-Sánchez 2006).
(c) The dative clitic is more common than the accusative with both causatives (Enghels 2012).
To assess the claim in (b) we must look at AGENCY OF SUBJECT. If the claim holds, then it is expected that agentive subjects will appear with hacer at a higher proportion than with dejar. The data show that dejar appears 67% of the time with an agentive subject while hacer only appears 42% of the time in this context. Therefore, agentive subjects disfavour hacer and the claim is not supported by the data. The examples in (8) illustrate the type of non-agentive subjects with hacer in the sample. The difference between these results and Ruiz-Sánchez's is probably because she studied animate subjects and the present study includes all types of subjects. Thus, hacer may appear more often with agentive subjects when these are animate. Unfortunately, I did not code for animacy of the subject, so I leave this possibility as an open question.
The claim in (c) says that both causatives should appear more often with the dative clitic. As shown in Figure 1C, hacer is almost equally likely with both clitics (0.51 accusative vs. 0.49 dative) whereas dejar appears slightly more often with the accusative clitic. However, we found no evidence of CLITIC CASE being a predictor for the causative verbs in Model-2. At the very least, the raw data indicate that we cannot generalise that the dative clitic appears more often than the accusative clitic in this construction in the Spanish varieties included in the study. A likely explanation for the difference between the present study and Enghels's (2012) study is that she focused only on Peninsular Spanish. When Peninsular Spanish is removed from the sample, the amount of data containing dative clitics decreases substantially, so we find no preference for the dative clitic.
In sum, the data in the present study do not support the above claims. Needless to say, there can be a myriad of explanations for why the results differ. First, the present study focused on American varieties of Spanish whereas most of the studies discussed have focused on Peninsular Spanish or have discussed the construction more generally without mentioning a specific variety. Second, the scope and data type of the studies also differ. For example, Ruiz-Sánchez (2006) is only concerned with animate causers and uses self-constructed examples and Moore (1996) is a theoretically focused investigation that also uses introspective examples.
Before I conclude, I would like to highlight the significance of the fact that transitivity can predict a lexical difference so well. In most, if not all, transitivity studies the phenomenon investigated involves a choice between nearly synonymous expressions, where the phenomenon per se is of a grammatical nature (e.g., clitic case with reverse-psychological predicates, case marking of arguments, differential object marking, possessive structures, etc.). In this paper, transitivity is used to study two causative verbs that are not synonymous with each other, where use of one usually precludes use of the other. Naively speaking, lexical choice should be driven by what the speaker intends to express and should therefore be independent of the linguistic context in which the lexical item appears (excluding idioms and collocations). At least, that is what one would expect if no other assumptions were made. The results presented herein suggest that, even at the lexical level, the linguistic context plays a crucial role in favouring one lexical item over another. This idea is not new and it has been studied previously in corpus linguistics (e.g., Gries 2010; Gries and Divjak 2009; Heylen et al. 2012, 2015; Schütze 1998), but what is new is the application of transitivity to a lexical alternation. In light of this finding, one cannot help but wonder where this type of information could be stored. Is transitivity part of the semantics of lexical entries, or is it an epiphenomenon simply arising from the semantics of the causative predicates? Ganeshan (2019) claims that the lexical entries of reverse-psychological predicates must be based on transitivity and causation. In my view, it is more likely that transitivity is an epiphenomenon resulting from the types of situations that different predicates describe, rather than being an inherent component of the semantics of a predicate that must be included in the lexicon.
In fact, positing that transitivity should be part of a verb's lexical entry runs counter to the core proposal of Hopper and Thompson (1980) that transitivity is a property of the whole clause and not just the verb. Based on what we know from grammaticalisation research, it is likely that morphosyntactic reflexes of transitivity arise exactly from the co-occurrence of a morpheme and a specific semantic context. There is no need for this information to be specified in the lexicon.
Last, but not least, an additional implication of any corpus-based statistical analysis is the question of whether these models represent natural grammatical systems or are simply statistical descriptions of linguistic data (Milin et al. 2016). Needless to say, these are complex empirical questions but ones that I contend we should pursue.

Future research and directions
The results reported in the present study open the door to applying the transitivity parameters to a broader range of linguistic phenomena. Most research has tended to focus on a subset of the parameters and transitivity has mostly been used to study grammatical alternations that are similar in meaning. A natural next step is to investigate what other non-synonymous lexical alternations can be accounted for by transitivity. In addition, those phenomena that have been found to be sensitive to some of the transitivity parameters should be put under statistical scrutiny to see whether the claims still stand.

Conclusion
In this paper, I applied the TI to the study of two causative predicates in Spanish. By means of advanced statistical analyses, I have shown that the two Spanish causatives dejar and hacer can be accurately predicted by the TI. The models show that causative hacer is associated with higher levels of transitivity, and person and number features of the causer are also significant predictors that help distinguish between the two predicates.
The method used to calculate the TI solves one of the most problematic issues with Hopper and Thompson's parameters, namely the lack of hierarchical structure among the parameters. This hierarchy is dynamic and re-calculated every time the index is applied on a new construction, making it ideal for cross-linguistic comparisons along the transitivity continuum.