A variationist perspective on the comparative complexity of four registers at the intersection of mode and formality

In this paper, we operationalize register differences at the intersection of formality and mode, and distinguish four broad register categories: spoken informal (conversations), spoken formal (parliamentary debates), written informal (blogs), and written formal (newspaper articles). We are specifically interested in the comparative probabilistic/variationist complexity of these registers – when speakers have grammatical choices, are the probabilistic grammars regulating these choices more or less complex in particular registers than in others? Based on multivariate modeling of richly annotated datasets covering three grammatical alternations in two languages (English and Dutch), we assess the complexity of probabilistic grammars by drawing on three criteria: (a) the number of constraints on variant choice, (b) the number of interactions between constraints, and (c) the relative importance of lexical conditioning. Analysis shows that, contrary to theorizing in variationist sociolinguistics, probabilistic complexity differences between registers are not quantitatively simple: formal registers are consistently the most complex ones, while spoken registers are the least complex ones. The most complex register under study is written-formal quality newspaper writing. We submit that the complexity differentials we uncover are a function of acquisitional difficulty, of on-line processing limitations, and of normative pressures.


Introduction
This paper marries the study of language complexity to register analysis via the variationist method. In what follows, we discuss each of these research orientations in turn. The study of LANGUAGE COMPLEXITY has a long history in applied linguistics and Second Language Acquisition (SLA) research, where researchers typically consider complexity a gauge for learners' proficiency in the target language, a descriptor for performance, and an index to benchmark development (see Ortega 2012 for discussion). But more recently, language complexity has also attracted attention in more theoretically oriented fields such as cross-linguistic typology, dialectology, and sociolinguistics (see Ehret et al. 2021 for a recent literature review). We now know that human languages, and varieties/dialects of the same language, can differ in their complexity: for example, the grammars of creole languages are demonstrably simpler than the grammars of "older" languages (McWhorter 2001). Complexity differentials across languages or dialects are theoretically interesting because they cannot have biological or communicative reasons: human beings, whatever their language background, are endowed with the exact same linguistic capacities, and especially in elementary speech situations, languages tend to have similarly complex communicative functions. That said, a crucial theme in the empirical literature on language complexity concerns the question of how best to measure complexity. Many analysts distinguish between measures of absolute complexity and measures of relative complexity (Miestamo 2009). Relative complexity measures boil down to difficulty, for example, for language users and language learners (Kusters 2003; Trudgill 2001). Absolute complexity measures (also known as "objective" measures) include quantitative complexity (e.g., the length of the minimal description of a linguistic system; Dahl 2004), redundancy-induced complexity (a.k.a. "baroque accretion"; see McWhorter 2001: 126), or irregularity-induced complexity (see e.g., Nichols 2013).
We have mentioned above the mantra that in elementary speech situations, languages tend to have similarly complex communicative functions. But of course not all speech situations are (equally) elementary, and so one would naturally expect that situational language varieties, a.k.a. registers, differ in complexity. This is why in the field of REGISTER STUDIES, the issue of (comparative) complexity within and across registers (with a particular focus on specialized registers, such as academic prose) has received considerable attention as well. In register studies, the dominant approach is text-linguistic in nature, and as such uses methodologies such as Multidimensional (MD) analysis (see, e.g., Biber 1988; Biber and Conrad 2012). MD analysis is designed to investigate the functional relationship between linguistic variation and the situational context: in a bottom-up fashion, registers are identified based on co-occurrence patterns of linguistic features in individual corpus texts. Building on the sizable text-linguistic literature utilizing MD analysis, the Register-Functional (RF) approach to grammatical complexity (see the papers in Biber et al. 2021a) draws inspiration from the fact that variation is inherent in human language, and that text-linguistic variation has a functional-contextual basis. But unlike classical complexity research (e.g., in SLA research and in applied linguistics), the RF approach eschews the customary notion that textual complexity is proportional to structural length and elaboration (see, in particular, Biber et al. 2011).
Similar to RF approaches, the study we report in this paper avoids simplistic complexity measures such as structural length and elaboration. But unlike RF approaches, we will not be engaging in text-linguistic analysis either. Rather, we seek new horizons by exploring the intersection between register studies, complexity analysis, and VARIATIONIST LINGUISTICS. Variationist linguistics takes an interest in variation between "alternate ways of saying 'the same' thing" (Labov 1966: 188). The focus is therefore on intra-speaker variability (or "variability in the linguistic signal within a given language", in the parlance of van Hout and Muysken 2016: 250). Variationist linguists carefully account for competing variants and use quantitative (typically multivariate/multifactorial) methodologies to model the probabilistic factors that constrain how language users choose between semantically and functionally equivalent variants (for seminal work, see Labov 1969; Gries 2003; Bresnan et al. 2007). Crucially, the basic unit of analysis in variationist linguistics is individual linguistic choices (and not texts, as in text-linguistic approaches). This is another way of saying that variationist linguists are primarily interested in the probabilistic conditioning of linguistic choices (and not so much in text frequencies, as in text linguistics). For a more detailed discussion of the differences between text linguistics and variationist linguistics, see Biber et al. (2016) and Szmrecsanyi (2019).
PROBABILISTIC GRAMMARS (a.k.a. "variable grammars" in variationist sociolinguistics parlance), then, refer to the set of constraints and their probabilistic effects on how people choose between variants of a particular alternation/variable (Szmrecsanyi et al. 2019: 2). We exemplify with the well-known dative alternation in English. To encode dative relations, speakers and writers of English may use two semantically and functionally roughly equivalent structural patterns: the ditransitive dative variant, as in, for example, someone gives you a DVD, and the prepositional dative variant, as in someone gives a DVD to you (see Section 2.1 for more discussion and exemplification). Bresnan et al. (2007) explore the conditioning factors that constrain language users' dative choices in American English (telephone) conversations, as sampled in the Switchboard corpus of US American English (Godfrey et al. 1992). Bresnan et al. find that in that particular register and regional variety, variation between the two dative options is constrained by about 10 probabilistic constraints, including, for example, pronominality of the recipient/theme, discourse accessibility (pragmatics), constituent length, and animacy of the recipient. If, for example, the recipient is inanimate (as in the browser sends a connection request to that site) instead of animate (as in someone gives you a DVD), Bresnan et al.'s regression model predicts that the odds for the prepositional dative increase by a factor of about 4. This is the probabilistic effect that inanimate recipients have on dative choice in telephone conversations. The combined effects of all relevant constraints, then, constitute the probabilistic grammar of the dative alternation.
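To make the odds arithmetic concrete, the following minimal sketch converts a baseline probability into odds, multiplies the odds by the factor of about 4 mentioned above, and converts the result back into a probability. The function name and the baseline probabilities are hypothetical illustration values; they are not figures from Bresnan et al. (2007).

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Return the probability that results from multiplying the baseline odds by odds_ratio."""
    odds = p_baseline / (1.0 - p_baseline)   # probability -> odds
    new_odds = odds * odds_ratio             # effect of the constraint on the odds scale
    return new_odds / (1.0 + new_odds)       # odds -> probability


# Hypothetical baselines (assumption): probability of a prepositional dative
# when the recipient is animate and all other constraints are held constant.
for p in (0.10, 0.20, 0.50):
    shifted = apply_odds_ratio(p, 4.0)
    print(f"baseline {p:.2f} -> with inanimate recipient {shifted:.2f}")
```

The same factor shifts probabilities by different amounts depending on the baseline, which is one reason why constraint effects are customarily reported on the odds scale.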
Previous work indicates that probabilistic grammars, for instance of the dative alternation in English, may be register-specific, in that constraints may have different effects in different registers (Engel et al. 2021; Szmrecsanyi and Engel 2021). In this paper, we investigate if register-specific probabilistic grammars also differ in terms of complexity. We are specifically going to be interested in the comparative probabilistic complexity of registers, and take a variationist approach to conceptualizing complexity. When speakers have grammatical choices, are the probabilistic grammars regulating these choices more or less complex in particular registers than in others?
This paper operationalizes register differences as follows. We investigate four registers defined at the intersection of formality and mode in line with Koch and Oesterreicher (2012). Formality is defined by situational characteristics such as the setting (private vs. public), topic/communicative purpose, and relationship between participants. In this view, formality constitutes a continuum, but note that the registers chosen here do not necessarily represent endpoints of this continuum. We specifically cover the following registers:
- SPOKEN INFORMAL: conversations between family members and friends
- SPOKEN FORMAL: parliamentary debates
- WRITTEN INFORMAL: blogs/chats
- WRITTEN FORMAL: quality newspaper articles
On the linguistic side, as case studies we will be studying three grammatical alternations: the dative alternation in English (someone gives you a DVD ∼ someone gives a DVD to you); the future temporal reference alternation in English (alien life forms we will come across ∼ alien life forms we are going to come across); and the dative alternation in Dutch (ik geef Sophieke ne friet ∼ ik geef ne friet aan Sophieke). More information about these alternations will be provided in Section 2.
On the operational plane, we assess the complexity of probabilistic grammars by drawing on three criteria: (a) the number of constraints (e.g., recipient animacy in the dative alternation) on variant choice, (b) the number of interactions between constraints (e.g., in the dative alternation, does recipient animacy have a stronger effect when the recipient is comparatively short?), and (c) the relative importance of lexical conditioning (e.g., in the dative alternation, do particular recipient lemmas strongly favor either dative variant?). The probabilistic complexity measures we utilize here pertain to absolute as well as relative complexity. For one thing, increases in the number of constraints, in the number of interactions between the constraints, and in the importance of lexical conditioning will increase the length of the description of probabilistic grammars, thus adding to absolute complexity. At the same time, it is plausible that absolutely more complex (that is, lengthier) grammars are, all other things being equal, more difficult to acquire and handle for language users, thus increasing relative complexity (see, e.g., Audring 2017: fn1).
The rationale behind our investigation is that the comparative probabilistic complexity of registers is largely unexplored, but has important implications for theorizing in variation studies and beyond. The customary view in variationist sociolinguistics is that stylistic variation (and, by extension, register variation) is "quantitatively simple" (Guy 2005: 562; see also e.g., Labov 2010: 265; Rickford 2014: 596). This is another way of saying that probabilistic grammars should be fairly similar across registers. Hence, major complexity differentials across stylistic and (again, by extension) situational varieties are not predicted. In this paper, we rely on multivariate/multifactorial analysis (specifically, mixed-effects logistic regression analysis and conditional inference tree analysis) to test this prediction empirically, the research question being: Is it true that probabilistic complexity does not vary substantially as a function of register? Analysis shows that probabilistic complexity differences between registers are, in fact, not quantitatively simple.
This paper is structured as follows. Section 2 will discuss the alternations, methods, and data sources. Section 3 explains the analysis techniques and complexity criteria that this study relies on. Section 4 presents the results. In Section 5, we discuss our findings against the backdrop of what we know about register and complexity variation. Section 6 offers some concluding remarks.

Alternations and data
In this paper, we investigate three alternation patterns, two in English and one in Dutch. For English, we cover the dative alternation and the future temporal reference alternation. For Dutch, we will investigate the dative alternation, which is formally rather parallel to the dative alternation in English. We would like to emphasize that in the context of the present paper, the selection of alternations subject to study is not a conceptual issue. As in Biber (1988)-inspired research, we use alternations as features to assess complexity differentials across registers. That being said, the rationale behind the selection of these particular alternations is the following. There is a huge variationist literature on both the dative alternation in English and the future temporal reference alternation in English, so that we can draw on a sizable body of previous findings. Note also that the two alternations cover different types of grammatical variation: the dative alternation is essentially a constituent order alternation, while the future temporal reference alternation is a substitution alternation in which different modal verb constructions alternate. Further, we study variation in English because register differences in English are comparatively well understood: consider that much previous MD and RF work is concerned with English. We include the case study on the Dutch dative alternation as a cross-linguistic plausibility check: Do complexity differences in Dutch, a language that is closely related to English, resemble those that we see in English? The dative alternation is a suitable target phenomenon in this context because of its formal and functional similarity in both English and Dutch.
In the remainder of this section, we sketch the alternations and datasets subject to analysis. Detailed dataset documentations are available as supplementary materials at https://osf.io/gdmtr/.

The dative alternation in English
In English, there are two ways of expressing dative relations: the ditransitive dative construction (recipient-theme order), and the prepositional dative construction (theme-recipient order), as in (1). This alternation is very well understood. In their seminal study, Bresnan et al. (2007), discussed in the Introduction, identify 10 probabilistic constraints on the dative alternation. Recent work has demonstrated that probabilistic constraints such as animacy and length have been subject to weakening in real time (Wolk et al. 2013), while the effects of pronominality and length differ across regional varieties (Röthlisberger et al. 2017). Importantly, there is also preliminary evidence that the effect of theme definiteness differs between spoken and written English (Theijssen et al. 2013), suggesting that probabilistic grammars may be register-specific.

Corpus materials
We tap into the following data sources:
- SPOKEN INFORMAL: conversations between family members and friends as provided in the Spoken BNC 2014 (∼11.4 million words) (Love et al. 2017); private setting, private topics and various communicative purposes (Biber et al. 2021a), mostly close relationships between the participants
- SPOKEN FORMAL: parliamentary debates from the House of Commons (∼59.4 million words) (Marx and Schuth 2010); public/institutionalized setting, public topics, persuasive and informative purposes (see Ilie 2015), (mostly) distant relationships between the participants (members of own party or opposition party, wider public)
- WRITTEN INFORMAL: British English blogs that are part of GloWbE (∼148 million words) (Davies and Fuchs 2015); public setting, private and public topics, various communicative purposes (Biber and Egbert 2018), mixed audience: writers and readers may or may not know each other closely
- WRITTEN FORMAL: online newspaper articles scraped from the website of The Independent (∼113.5 million words) (Bušta et al. 2017); public setting and topics, informative purpose, distant relationships between writer(s) and reader(s)
We randomly sampled 650 variable tokens of the dative alternation with give per register, yielding a dataset of N total = 2,600 observations. This dataset had a balanced distribution of dative variants per register: from each corpus, we included an equal number of ditransitive dative and prepositional dative tokens. Tokens were automatically extracted and sampled, manually checked for the variable context, and manually annotated for the constraints subject to analysis.
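The balanced sampling design can be illustrated with a minimal sketch. It assumes a hypothetical pool of automatically extracted candidate tokens, each tagged with its register and variant; the data structure, the function name and the pool size are illustrative assumptions, not a description of the actual extraction pipeline (which also involved manual checking).

```python
import random
from collections import defaultdict


def balanced_sample(tokens, per_cell, seed=0):
    """Draw per_cell tokens at random from every (register, variant) cell."""
    rng = random.Random(seed)                  # fixed seed so the draw is reproducible
    cells = defaultdict(list)
    for tok in tokens:
        cells[(tok["register"], tok["variant"])].append(tok)
    sample = []
    for key in sorted(cells):
        pool = cells[key]
        if len(pool) < per_cell:
            raise ValueError(f"cell {key} has only {len(pool)} tokens")
        sample.extend(rng.sample(pool, per_cell))
    return sample


# Toy candidate pool (hypothetical): 4 registers x 2 variants x 400 extracted tokens.
registers = ["spoken informal", "spoken formal", "written informal", "written formal"]
variants = ["ditransitive", "prepositional"]
pool = [{"register": r, "variant": v, "id": i}
        for r in registers for v in variants for i in range(400)]

# 325 tokens per variant give 650 tokens per register, i.e. 2,600 in total.
sample = balanced_sample(pool, per_cell=325)
print(len(sample))
```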

Defining the variable context
For reasons of space, we keep the description of the variable context short and to the point. Consistent with previous research (Bresnan et al. 2007;Röthlisberger et al. 2017;Theijssen et al. 2013), we excluded instances of non-canonical word order, particle verbs, passivized and relativized constituents, clausal or gerundial constituents, and fixed expressions. The full definition including exemplification can be found in the supplementary materials, available at https://osf.io/gdmtr/.

Annotation for probabilistic constraints
In what follows, we give a short overview of the constraints subject to analysis. Again, we refer the reader to the documentation for more detail (https://osf.io/gdmtr/).
Most of the following probabilistic constraints (except for verb sense, which is not constituent-specific) were annotated for both the recipient and the theme constituents separately.

The future temporal reference alternation in English
Future temporal reference (henceforth: FTR) in English is overtly expressed by means of either the modal verb will, or the semi-modal construction be going to. Consider (8):
(8) In practice, however, experts think the most likely alien life forms we will come across are going to be some kind of alien microbes. (The Independent, 2018-07-02)
The variationist literature suggests that the choice between will and be going to is conditioned by factors such as clause type and sentence type: be going to is favored in subordinate clauses (especially protases) and interrogative sentences, while will is favored in main clauses and declarative sentences (e.g., Denis and Taglia […]). Other predictors, such as proximity of future time reference or polarity, are more variable and may be subject to grammaticalization processes (Tagliamonte et al. 2014). As to register differences, we know that be going to is more common in informal language use than in formal language use (see, e.g., Mair 1997).

Corpus materials
Tokens were retrieved from the same corpus materials as in our study on the English dative alternation (see Section 2.1.1). Again, we created a random balanced sample of N total = 2,600 observations of will and be going to.

Defining the variable context
Following Denis and Tagliamonte (2018), we excluded be going to in past tense contexts, and will in tag-questions (for more details, see the supplementary materials at https://osf.io/gdmtr/). We also distinguish between apodosis (11a) versus protasis (11b) in conditional clauses versus non-conditional clauses (11c).
(11) a. apodosis: If you put that in the washing machine it'll be fine (BNC2014, SUVQ)
b. protasis: It should be proof that if mitochondrial treatment is going to take place then doctors should begin as soon as they can, they said.
(13) a. first person: It looks excellent and well worth reading, I am sure I could learn from it so I am going to buy it! (GloWbE-GB, blogs)
b. second person: You'll have to de-stress (BNC2014, S6YA)
c. third person: Ofcom is going to take up the cudgels, and I hope it will do that sooner rather than later.

The dative alternation in Dutch
Most previous research on the Dutch dative alternation is not particularly data-driven. One notable exception is Geleyn (2017), who on the basis of a corpus consisting of novels demonstrates that the Dutch dative alternation is governed by largely the same predictors as the English dative alternation.

Corpus materials
The aim was to select corpora that are maximally comparable to those investigated for English. In this endeavor, we restricted attention to Belgian Dutch (as opposed to Netherlandic Dutch).
- SPOKEN INFORMAL: informal conversations between family members and friends (face-to-face and via telephone), and informal interviews with teachers of Dutch as provided in the Spoken Dutch Corpus (∼2 million words) (Oostdijk 2000); private setting, private topics and various communicative purposes (Biber et al. 2021a), mostly close relationships between the participants
- SPOKEN FORMAL: parliamentary debates from the Flemish parliament (∼9.2 million words) (Marx and Schuth 2010); public/institutionalized setting, public topics, persuasive and informative purposes (Ilie 2015), (mostly) distant relationships between the participants (members of own party or opposition party, wider public)
- WRITTEN INFORMAL: chats from the Dutch SoNaR corpus (∼10 million words) (Oostdijk 2000); public setting, private and public topics, various communicative purposes, mixed audience: writers and readers may or may not know each other closely
- WRITTEN FORMAL: online newspaper articles scraped from the website of De Morgen (∼22 million words) (Bušta et al. 2017); public setting and topics, informative purpose, distant relationships between writer(s) and reader(s)
We sampled tokens of both variants from these corpora. Due to sparse data in the informal corpora, the resulting dataset of N total = 2,110 tokens is not perfectly balanced across registers: there are only 220 tokens (189 ditransitive dative, 31 prepositional dative) from the spoken informal register, and 590 tokens (325 ditransitive dative, 265 prepositional dative) from the written informal register, while 650 tokens are included from each of the two formal registers. According to a customary rule of thumb, the maximal number of parameters in a regression model should be calculated as the frequency of the less frequent outcome divided by 10 (Hosmer and Lemeshow 2000: 346-347). With only 31 prepositional dative tokens, the regression model covering the spoken informal register can reliably accommodate about three parameters only, and will therefore be unreliable. We will be flagging this issue in Sections 4 and 5.
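The rule of thumb can be spelled out with a few lines of arithmetic. The sketch below applies it to the token counts reported above for the two informal Dutch registers; the function name is a hypothetical helper, and rounding down to whole parameters is our assumption.

```python
def max_parameters(n_variant_a: int, n_variant_b: int, events_per_parameter: int = 10) -> int:
    """Rule-of-thumb ceiling: frequency of the rarer outcome divided by 10, rounded down."""
    return min(n_variant_a, n_variant_b) // events_per_parameter


# Token counts (ditransitive, prepositional) as reported for the informal Dutch registers.
counts = {
    "spoken informal": (189, 31),
    "written informal": (325, 265),
}
for register, (ditransitive, prepositional) in counts.items():
    print(register, max_parameters(ditransitive, prepositional))
```

On these counts, the spoken informal data support about three parameters, against roughly 26 for the written informal data.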

Defining the variable context
The variable context was defined in the same way as for the English dative alternation. The full definition, including exemplification, is available in the supplementary materials at https://osf.io/gdmtr/.

Annotation for probabilistic constraints
The predictor set matches that of the English dative alternation. For more details, see the supplementary materials at https://osf.io/gdmtr/.
- ANIMACY: animate (humans, animals) versus inanimate (including collective nouns), as in (19)
(22) Als ik een klas van twintig leerlingen heb dan probeer ik het eerste trimester uhm [de leerlingen elk]given [een taak]new te geven.
If I a class of twenty students have then try I the first trimester uhm the students each a task to give.
'If I have a class of 20 students, I try to give each student a task in the first trimester.' (Spoken Dutch Corpus, fv400152)
- COMPLEXITY: simple versus complex (post-modification of the head), as in (23)

Analysis methods and complexity criteria
We will use two analysis methods in this paper. The first one, binary logistic regression analysis, is the workhorse analysis technique in variationist linguistics, and quantifies the simultaneous effect of multiple individual explanatory factors (or: constraints/predictors/independent variables) on a binary dependent variable, such as dative or future marker outcomes. We utilize a modern refinement of logistic regression analysis known as mixed-effects logistic regression (Pinheiro and Bates 2000). In addition to so-called fixed effects (which are classically estimated predictors suited for assessing predictor significance and effect directions), mixed-effects modeling also covers so-called random effects designed to capture variation dependent on open-ended, potentially hierarchical and unbalanced groups (see Wolk et al. 2013: 399-400 for more discussion). Specifically, we will include in our analysis so-called by-item random effects to measure lexical effects: to what extent do, for example, particular theme lemmas favor particular dative constructions? To what extent do particular lexical verbs favor particular future markers? This is information that mixed-effects regression models are ideally suited to provide.
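For readers who prefer a formula, a mixed-effects logistic regression with a by-item random intercept can be written as follows. This is the generic textbook form; the notation is ours and is introduced for illustration only.

```latex
\begin{align*}
\operatorname{logit} P(y_{ij} = 1) &= \beta_0 + \sum_{k=1}^{K} \beta_k x_{kij} + u_j, \\
u_j &\sim \mathcal{N}\left(0, \sigma^2_{\text{lemma}}\right)
\end{align*}
```

Here $y_{ij} = 1$ if token $i$ containing lemma $j$ is realized as one of the two variants (say, the prepositional dative), the $x_{kij}$ are the $K$ fixed-effect predictors with coefficients $\beta_k$, and $u_j$ is the intercept adjustment for lemma $j$. The larger the variance $\sigma^2_{\text{lemma}}$, the more strongly individual lemmas pull variant choice in one direction or the other.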
In addition, we will be conducting a series of conditional inference tree analyses. Szmrecsanyi et al. (2016: 113-114) succinctly summarize the basic idea behind conditional inference trees as follows: conditional inference trees […] predict outcomes by recursively partitioning the data into smaller and smaller subsets according to those predictors that co-vary most strongly with the outcome. Informally, binary splits in the data are made by trying to maximize the homogeneity or "purity" of the data partitions with respect to the values of the outcome (e.g., all s-genitives vs. all of-genitives). At each step, the dataset is recursively inspected to determine the variable (and its values) that yields the purest split in the data. This splitting process is repeated until no further split that significantly reduces the impurity of the data partitions can be found. The result is visualized as a flowchart-like decision tree.
Crucially, each split can be interpreted as an interaction between predictors: the further down a tree one moves, the more the effect of particular predictors depends on predictors further up in the tree. Conditional inference trees are therefore extremely convenient and elegant tools to detect interaction effects in a bottom-up fashion, without the model fitting/model selection/model pruning complexities that regression models would incur. For an accessible introduction to conditional inference tree analysis from a linguistic angle, see Tagliamonte and Baayen (2012).
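The logic of recursive partitioning, and the way in which nested splits encode interactions, can be illustrated with a deliberately simplified sketch. Note the assumptions: the code handles binary predictors only and selects splits with a Bonferroni-adjusted chi-square test, whereas conditional inference trees proper rely on a permutation-test framework; the predictor names and the toy counts are hypothetical and are not taken from our datasets.

```python
import math


def chi2_p(rows, predictor, outcome):
    """p-value of a 2x2 chi-square test (1 df) between a binary predictor and the outcome."""
    n = len(rows)
    a = sum(1 for r in rows if r[predictor] and r[outcome])
    b = sum(1 for r in rows if r[predictor] and not r[outcome])
    c = sum(1 for r in rows if not r[predictor] and r[outcome])
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                            # predictor or outcome is constant here
        return 1.0
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2))     # survival function of chi-square with 1 df


def grow(rows, predictors, outcome, alpha=0.05):
    """Split on the most strongly associated predictor until no split is significant."""
    if not predictors:
        return {"n": len(rows)}
    best_p, best = min((chi2_p(rows, p, outcome), p) for p in predictors)
    if min(1.0, best_p * len(predictors)) >= alpha:   # Bonferroni adjustment
        return {"n": len(rows)}                       # terminal node
    rest = [p for p in predictors if p != best]
    return {
        "split": best,
        "yes": grow([r for r in rows if r[best]], rest, outcome, alpha),
        "no": grow([r for r in rows if not r[best]], rest, outcome, alpha),
    }


def count_nodes(tree):
    """Count inner and terminal nodes of the tree."""
    if "split" not in tree:
        return 1
    return 1 + count_nodes(tree["yes"]) + count_nodes(tree["no"])


def make_rows(animate, short, n_true, n_false):
    """Build toy tokens for one predictor combination (hypothetical counts)."""
    base = {"recipient_animate": animate, "recipient_short": short}
    return ([dict(base, ditransitive=True) for _ in range(n_true)]
            + [dict(base, ditransitive=False) for _ in range(n_false)])


# Toy data: recipient length matters only when the recipient is animate.
rows = (make_rows(True, True, 45, 5) + make_rows(True, False, 10, 40)
        + make_rows(False, True, 5, 45) + make_rows(False, False, 5, 45))
tree = grow(rows, ["recipient_animate", "recipient_short"], "ditransitive")
print(tree["split"], count_nodes(tree))
```

In the toy data, the length predictor produces a split only in the animate branch, so the resulting tree represents an interaction between animacy and length; counting its nodes corresponds to the kind of tally used for the interaction criterion.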
Based on output from the aforementioned analysis methods, we subsequently use three criteria (or: metrics) to assess the complexity of probabilistic grammars:
1. SHEER NUMBER OF CONSTRAINTS ON VARIATION: This is the one probabilistic complexity criterion that has actually been proposed in the literature (Shin 2014): the number of constraints on variation is proportional to probabilistic complexity. Hence, a variety or register where, say, the dative alternation is constrained by 10 factors is probabilistically more complex (with regard to that alternation!) than a variety/register where we find only five significant constraints. The rationale is that (a) as per absolute-quantitative complexity definitions (Dahl 2004), more constraints require more description; and (b) probabilistic grammars involving more constraints are arguably harder to acquire and so more complex, as per the relative complexity criterion (Kusters 2003). We operationalize this criterion by counting the number of significant constraints in register-specific regression models.¹
2. NUMBER OF INTERACTIONS BETWEEN CONSTRAINTS: Even a small set of constraints can engender complexity if the constraints interact robustly. For example, the dative alternation may be constrained by no more than, say, three constraints in some hypothetical register, but three constraints may still give rise to up to three two-way interactions and one three-way interaction, which yields a fairly complex constraint system after all. The rationale is that as per the definition of absolute-quantitative complexity, interactions require more description, and hence induce complexity. We operationalize this criterion by counting the number of nodes in conditional inference trees.
3. RELATIVE IMPORTANCE OF LEXICAL CONDITIONING: Some constraints on variation are lexical in nature. For example, it is well known that particular dative verbs may be biased towards particular dative variants (Gries and Stefanowitsch 2004). Our assumption that lexical constraints induce probabilistic complexity is primarily motivated by relative complexity considerations: consider that while end weight effects, which are likely to be rooted in processing, seem to be acquired relatively early (Tanaka 1987), verb bias in the dative alternation appears to be observable only in advanced learners (Wolk et al. 2011), presumably because the acquisition of verb bias requires sufficient exposure to the specific lexical items. Further, as regards absolute complexity it is clear that lexical conditioning will increase the length of the shortest possible description of probabilistic grammars. We operationalize this criterion by including in mixed-effects regression modeling by-item lexical random effects, and subsequently gauging the explanatory power of these random effects.
¹ We acknowledge that the sheer number of constraints could probably also be extracted from conditional inference tree models (although isolating main effects from interaction effects would pose challenges). Because we need regression models anyway to assess criterion #3 (lexical conditioning), we choose to infer the sheer number of constraints from regression models, as in Shin (2014). We would further like to acknowledge that the identification of significant constraints relies on an interpretation of p-values at the customary alpha level of 0.05. Constraints that score p-values greater than 0.05 will not count as significant constraints. Needless to say, absence of statistical significance does not equate with evidence of absence of substantial significance. The full label for criterion #1 is therefore "sheer number of constraints that pass the customary significance threshold of alpha = 0.05".
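The combinatorics behind criterion 2 can be checked with a short sketch. The constraint names below are placeholders for a hypothetical three-constraint grammar, and the helper function is ours; the point is merely that the number of logically possible interactions grows very quickly with the number of constraints.

```python
from itertools import combinations
from math import comb

# Hypothetical three-constraint grammar (placeholder names).
constraints = ["animacy", "pronominality", "length"]

two_way = list(combinations(constraints, 2))      # all pairs of constraints
three_way = list(combinations(constraints, 3))    # the single triple
print(len(two_way), len(three_way))


def max_interactions(k: int) -> int:
    """Number of possible interactions of order 2 to k among k constraints (2^k - k - 1)."""
    return sum(comb(k, order) for order in range(2, k + 1))


print(max_interactions(3), max_interactions(10))
```

Three constraints allow for four interactions in total, whereas ten constraints already allow for more than a thousand, which is why a bottom-up detection method is attractive.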
It should be noted here that from a purely statistical point of view, criterion 1 (sheer number of constraints) and criterion 2 (number of interactions) both address quantitative model complexity: in conjunction, these criteria gauge the number of parameters in multivariate models. However, on the interpretational plane, main effects and interactions strike us as conceptually distinct in nature, which is why we would like to keep them apart analytically. In all, then, our results will be based on 24 multivariate models (12 regression models and 12 conditional inference trees, that is, four of each per alternation). This comparatively large number of models may strike some as excessive at first glance. Consider, however, that we investigate three alternations in four registers with regard to three complexity criteria. Consider also that text-linguistic studies on register in the spirit of Biber (1988) are based on a frequency analysis of dozens of features in potentially hundreds of texts, which can easily sum up to thousands of frequency measurements, and few would argue that this rich empirical basis is excessive. Because in this paper we engage in variationist/probabilistic linguistics, we need probabilistic measurements, and these (unlike frequency measurements) require prior multivariate modeling. This is why the number of models we calculate is proportionate, given our research objectives.

Results
This section reports the multivariate models upon which our complexity analysis is based. Specifically, for each of the three alternations, we report register-specific regression models (relevant to complexity criteria #1 and #3; see the previous section) and register-specific conditional inference trees (complexity criterion #2). Because of the comparatively large number of models we report, we will keep the summaries below short and to the point. Table 1 indicates that the regression model for the written formal register includes seven significant constraints, while the model for the written informal register includes six. Regression analysis further shows that the model for the spoken informal register includes only five significant constraints, and that the model for the spoken formal register is the least complex one, with only four significant constraints. Interestingly, the four registers differ not only with regard to the number of constraints involved, but also in terms of the set of constraints. Apart from WEIGHT RATIO, no constraint is consistently included in all four probabilistic grammars.

Regression analysis
As shown in Table 2, fixed (i.e., non-lexical) effects alone explain more variance in the two informal registers than in the two formal registers (as evidenced by the R² values). The σ²_ThemeLemma values indicate that the random effect for theme lemma covers most variance in the spoken informal register, and least variance in the spoken formal register.

Table 1: Overview of significant and non-significant constraints in the English dative alternation based on logistic mixed-effects regression analysis of four registers. Significance codes: '***' p < .001; '**' p < .01; '*' p < .05; 'ns' p ≥ .05. C-index values range between .5 and 1 and indicate the goodness-of-fit of the regression models: the closer to 1, the better the fit.
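The C-index mentioned in the table caption can be illustrated with a minimal sketch. It assumes the standard definition of the concordance index for binary outcomes: the share of all pairs of one observed "1" and one observed "0" in which the model assigns the higher predicted probability to the "1", with tied predictions counting as half. The function name `c_index` and the toy data are hypothetical and are not taken from the study.

```python
def c_index(outcomes, probabilities):
    """Concordance index for a binary outcome (equivalent to the area under the ROC curve)."""
    positives = [p for y, p in zip(outcomes, probabilities) if y == 1]
    negatives = [p for y, p in zip(outcomes, probabilities) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                concordant += 1.0   # pair ordered correctly
            elif pos == neg:
                concordant += 0.5   # tied predictions count as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical toy data: observed variant choice (1/0) and predicted probabilities.
outcomes = [1, 0, 1, 1, 0, 0]
probabilities = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]
score = c_index(outcomes, probabilities)  # 8 of 9 pairs are concordant
```

A value of .5 corresponds to chance-level discrimination between the two variants, and a value of 1 to perfect discrimination.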

Conditional inference tree analysis
Conditional inference tree analysis also reveals complexity differentials. Judging by the sheer number of constraints, informal registers (Figures 1 and 2) seem to be less complex than formal registers (Figures 3 and 4).

Table 2: Relative importance of lexical conditioning in the English dative alternation. Lexical random effect (intercept adjustment) subject to analysis: THEME LEMMA (RECIPIENT LEMMA did not turn out to have substantial explanatory power). R² marginal: % variance explained by fixed effects; R² conditional: % variance explained by both random and fixed effects; σ²: mean random effect variance of the model; SD: standard deviation; N: number of theme lemmas distinguished in the model (infrequent lemmas were binned into an 'other' category).
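To make the distinction between marginal and conditional R² concrete, the following sketch shows one common way of deriving both quantities for a logistic mixed model (in the style of Nakagawa and Schielzeth). The text does not specify how the R² values were computed, so this is an assumption. The function name and the variance components are hypothetical. The residual variance of π²/3 is the theoretical value for the logit link.

```python
import math

# Theoretical residual variance on the latent scale for a logit link.
LOGIT_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def r2_marginal_conditional(var_fixed, var_random):
    """Return (marginal R2, conditional R2) for a logistic mixed model.

    var_fixed:  variance of the fixed-effects linear predictor
    var_random: summed variance of the random effects (e.g. a lexical random intercept)
    """
    total = var_fixed + var_random + LOGIT_RESIDUAL_VARIANCE
    r2_marginal = var_fixed / total                    # fixed effects only
    r2_conditional = (var_fixed + var_random) / total  # fixed plus random effects
    return r2_marginal, r2_conditional


# Hypothetical variance components, for illustration only.
r2_m, r2_c = r2_marginal_conditional(var_fixed=2.0, var_random=1.0)
lexical_share = r2_c - r2_m  # share of variance attributable to the random effect
```

Under this definition, the gap between conditional and marginal R² grows with the random-effect variance, which is why both quantities can serve as gauges of lexical conditioning.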


Regression analysis
Results from regression models are summarized in Table 3. The model with the largest number of significant constraints (six, to be exact) is the model for the spoken formal register. The model for the written formal register includes five significant constraints, while the models covering the informal registers include four significant constraints. As for the specific constraints involved, we observe some overlap between the spoken formal, written informal, and written formal registers.

Table 3: Overview of significant and non-significant constraints in the English FTR alternation based on logistic mixed-effects regression analysis of four registers. Significance codes: '***' p < .001; '**' p < .01; '*' p < .05; 'ns' p ≥ .05. C-index values range between .5 and 1 and indicate the goodness-of-fit of the regression models: the closer to 1, the better the fit.

Written informal (C = .)
Written formal (C = .) Lexical conditioning seems to be most powerful in the spoken formal register, and least important in the spoken informal register, where the lexical random effect is actually not significantly improving model fit, as shown in Table 4. The power of lexical conditioning is about equal in both written registers.

Conditional inference tree analysis
Conditional inference tree analysis suggests that there are only two interactions (nodes) in the spoken informal register (Figure 5). In the two formal registers (Figures 6 and 8), conditional inference tree analysis uncovers four interactions each, while the conditional inference tree for the written informal register has five interactions (Figure 7).

Table 4: Relative importance of lexical conditioning in the English FTR alternation. Lexical random effect (intercept adjustment) subject to analysis: LEXICAL VERB LEMMA. R² marginal: % variance explained by fixed effects; R² conditional: % variance explained by both random and fixed effects; σ²: mean random effect variance of the model; SD: standard deviation; N: number of theme lemmas distinguished in the model (infrequent lemmas were binned into an 'other' category).

Regression analysis
Mixed-effects logistic regression analysis (Table 5) reveals that the models covering the informal registers both include five significant predictors, while the model for the spoken formal register includes six significant predictors. The model for the written formal register has seven significant predictors. We further note that WEIGHT RATIO and THEME DEFINITENESS are significant in all registers. However, THEME ANIMACY, THEME COMPLEXITY, THEME FREQUENCY, THEME THEMATICITY, and RECIPIENT GIVENNESS are not significant in any register.

Both RECIPIENT LEMMA and THEME LEMMA lexically condition the Dutch dative alternation, although THEME LEMMA accounts for more variance than RECIPIENT LEMMA. Across registers, the random effect for RECIPIENT LEMMA explains most variance in the spoken formal register, and least variance in the written formal register (see Table 6). The random effect for THEME LEMMA explains most variance in the written informal register, and least variance in the spoken formal register. Lexical conditioning via THEME LEMMA is generally more important in the written registers than in the spoken registers.

Table 5: Overview of significant and non-significant constraints in the Dutch dative alternation based on logistic mixed-effects regression analysis of four registers. Significance codes: '***' p < .001; '**' p < .01; '*' p < .05; 'ns' p ≥ .05. C-index values range between .5 and 1 and indicate the goodness-of-fit of the regression models: the closer to 1, the better the fit. Note that because of data sparsity, the model covering the spoken informal register is unreliable.

Constraint                 Spoken informal   Spoken formal   Written informal   Written formal
RECIPIENT ANIMACY          ns                *               ns                 ***
THEME ANIMACY              ns                ns              ns                 ns
RECIPIENT PRONOMINALITY    ns                ns              ***                ns
THEME PRONOMINALITY        *                 ns              ns                 ns
RECIPIENT DEFINITENESS     *                 ns              ns                 *
THEME DEFINITENESS         ***               *               ***                ***
RECIPIENT GIVENNESS        ns                ns              ns                 ns
THEME GIVENNESS            ns                **              **                 ns
RECIPIENT COMPLEXITY       ns                ns              ns                 ***
THEME COMPLEXITY           ns                ns              ns                 ns
WEIGHT RATIO (LENGTH)      *                 ***             **                 ***
RECIPIENT FREQUENCY        ns                **              ns                 *
THEME FREQUENCY            ns                ns              ns                 ns
RECIPIENT THEMATICITY      *                 ns              ns                 ns
THEME THEMATICITY          ns                ns              ns                 ns
VERB SENSE                 ns                ***             *                  ***

Conditional inference tree analysis
The conditional inference tree for the spoken informal register includes three interactions (Figure 9), the tree for the spoken formal register includes six interactions (Figure 10), and the trees for the written informal (Figure 11) and the written formal registers (Figure 12) include five and seven interactions, respectively.

Discussion
Recall that in Section 3, we defined three criteria to assess the complexity of register-specific probabilistic grammars:
1. Sheer number of constraints on variation
2. Number of interactions between constraints
3. Relative importance of lexical conditioning
In what follows, we evaluate the regression models and conditional inference trees reported in the preceding section with regard to these criteria. We begin with the dative alternation in English. The relevant information is summarized in Table 7. The overall complexity ranking, based on all criteria, is displayed in (25):²

(25) Written formal > spoken informal ≈ written informal > spoken formal

The fairly uncontested "winner" in terms of overall complexity is the written formal register. At the other end of the continuum, we find the spoken formal register, which is overall the least complex register (though we note that the spoken formal register does exhibit a fairly high number of interactions). Spoken informal and written informal take the middle road and are fairly similar in terms of overall complexity.

² In determining the overall complexity ranking, all criteria were weighted equally. Registers were assumed to have the same rank when the rank aggregation differentials were smaller than 0.5.
Information about the FTR alternation in English is provided in Table 8. The overall complexity ranking is shown in (26):

(26) Spoken formal ≈ written formal > written informal > spoken informal

Spoken formal and written formal are overall about equally complex, with spoken formal having a slight edge. The clearly least complex register is spoken informal. Written informal exhibits medium complexity.
Lastly, Table 9 assesses the dative alternation in Dutch. The overall complexity ranking is shown in (27):

(27) Written formal ≈ spoken formal ≈ written informal > (spoken informal)

Written formal, spoken formal, and written informal all have about the same level of overall complexity, though written formal is slightly ahead of the other two. The spoken informal register is substantially less complex than the other registers. That said, we bracket the spoken informal register because of unreliability due to data sparsity.

Table 7: The dative alternation in English: number of constraints (according to regression analysis), number of interactions (according to conditional inference tree analysis), and relative importance of lexical conditioning (according to regression analysis). The overall complexity ranking (rightmost column) is determined by calculating the arithmetic mean of the complexity ranks according to individual criteria. Lower-numbered ranks indicate more complexity.
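The rank aggregation described in the table caption (arithmetic mean of per-criterion ranks, with lower ranks indicating more complexity) can be sketched as follows. For illustration, the sketch uses only the constraint and interaction counts reported for the Dutch dative alternation in the Results section. The lexical-conditioning criterion is left out because its values are not given in the running text, so the output is not the published ranking. Giving tied registers the average of the ranks they span is an assumption of this sketch, and all names are hypothetical.

```python
from statistics import mean


def complexity_ranks(scores):
    """Rank registers by score: rank 1 = highest score = most complex.

    Tied registers share the average of the ranks they span (an assumption).
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


def mean_rank(criteria):
    """Equally weighted arithmetic mean of the ranks across all criteria."""
    per_criterion = [complexity_ranks(scores) for scores in criteria]
    return {reg: mean(ranks[reg] for ranks in per_criterion) for reg in per_criterion[0]}


# Counts reported for the Dutch dative alternation (criteria #1 and #2 only).
constraints = {"spoken informal": 5, "spoken formal": 6,
               "written informal": 5, "written formal": 7}
interactions = {"spoken informal": 3, "spoken formal": 6,
                "written informal": 5, "written formal": 7}

aggregated = mean_rank([constraints, interactions])
```

In this two-criterion illustration, written formal obtains the lowest mean rank and spoken informal the highest. Adding the third criterion would shift the means accordingly.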

In the big picture, then, we conclude that there is quite a bit of variability regarding the overall complexity rankings across case studies, in that the complexity hierarchies are far from identical. (In addition, of course, there is also a fair amount of inter-criterion variability in each case study.) That said, the following three generalizations emerge from the hierarchies in (25), (26), and (27):
1. Written formal is the most complex register in two of the three case studies covered in this paper (the dative alternation in English as well as in Dutch). In the English FTR case study, written formal and spoken formal actually share the top complexity rank.
2. When we restrict attention to the endpoints of the hierarchies, formal registers are consistently the most complex ones across the three case studies.
3. Also consistently across the three case studies, spoken registers are the least complex ones.
What does it all mean? Recall first that one of our theoretical points of departure in this paper was the prediction that variational differences between styles and, by extension, registers are "quantitatively simple" (Guy 2005: 562; see also e.g., Labov 2010: 265; Rickford 2014: 596). In other words, probabilistic grammars should be largely identical across registers. Based on our data we can assert fairly confidently that in terms of the probabilistic conditioning of grammatical variation, differences between registers are not quantitatively simple: registers do differ in terms of probabilistic complexity.
Second, contra quantitative simplicity claims, one could have plausibly hypothesized that the complexity of particular registers should be proportional to the extent to which (a) particular registers are harder to master than others, (b) production is unconstrained by on-line processing limitations, and (c) production is subject to normative pressures. And these hypotheses are more in line with our results. A case in point is our finding that formal registers are consistently the most complex ones in the case studies investigated here. This pattern certainly ties in with gut feelings that many register analysts share about complexity variance. Consider face-to-face conversation: Biber et al. (2021b: 3) note how "most native speakers can fluently produce conversational discourse […] with no special effort at all, in real time with no planning, revision, or editing"; thus, compared to a demonstrably formal register such as academic prose (which requires training to master and effort to produce), conversation would seem to be comparatively noncomplex. We submit that the formal registers we cover in our study (parliamentary speeches and quality newspaper writing) are, like academic prose, characterized by acquisitional complexity and production complexity, implying that not all language users equally master and/or acquire them.
As to differences between spoken and written registers, our analysis shows that the written formal register (quality newspaper writing) tends to be (one of) the most complex registers under study here, while spoken registers are consistently the least complex ones. Why is that? We know that spoken registers "are produced and comprehended in real-time, setting a cognitive ceiling for the syntactic and lexical complexity typically found in these [registers]" (Biber 1988: 163). The online nature of spoken language (especially conversation) is why the production of spoken language is subject to processing and production constraints and biases (e.g., Hawkins 1994; MacDonald 2013) in a way that the production of written language is not. Monitoring is another important factor here: while speech (and especially informal, vernacular speech) is "the style in which the minimum attention is given to the monitoring of speech" (Labov 1972: 208), written language is more "governed by prescription" (D'Arcy and Tagliamonte 2015: 255). Prescriptivist pressures, and the additional monitoring that comes with them, in turn presumably induce probabilistic complexities in written registers and also, of course, in more formal registers.

Conclusion
In this paper, we have investigated the comparative probabilistic/variationist complexity of four registers at the intersection of formality and mode. We operationalized complexity as a function (a) of the extent to which variation is regulated by more language-internal constraints, (b) of the extent to which constraints interact with each other, and (c) of the extent to which variation is lexically conditioned. Based on datasets covering grammatical alternations in two Germanic languages (English and Dutch), we demonstrated that formal registers are consistently the most complex ones, while spoken registers are the least complex ones. The most complex register under study is written-formal quality newspaper writing. On the interpretational plane, we have argued that complexity differentials are a function of acquisitional difficulty, of on-line processing limitations, and of normative pressures.
The methodology that we have adopted in this study is novel in that the vast majority of previous studies on the register-complexity nexus have adopted text-linguistic methodologies in the spirit of Biber (1988). This line of research has yielded a large body of invaluable results, some of which we reviewed in the Introduction section. That said, what we did in this study is to focus not on texts or text frequencies, but on choice contexts and on the probabilistic conditioning of grammatical variation: when speakers and writers have a choice between different ways of expressing the same meaning or function, how complex is the choice-making process? To be clear, we believe that this is a complementary (not competing) perspective which is, however, well worth exploring more in future research. According to our results, register differences and complexity differentials between registers are not only about text frequencies and cooccurrence patterns of linguistic features. Rather, register also shapes the arguably deeper probabilistic conditioning of linguistic variation. In other words, register turns out to be an even more fundamental determinant of linguistic variation and of language complexity than has been traditionally recognized.
Limitations of this study and directions for future research include the following. First, our comparatively coarse operationalization of register may have failed to capture finer-grained complexity differentials within individual registers. What is more, the registers we have selected for investigation in this paper may also vary in terms of communicative purposes, a fact that may be confounded with the concept of formality that we rely on (see e.g., Biber and Egbert 2018). Future research should investigate these issues in more detail. Second, more research examining other alternations (beyond the dative alternation and the future temporal reference alternation) is required for the sake of obtaining a fuller picture of register-induced complexity variation. Last but not least, the investigation needs to be extended to other languages beyond English and Dutch. This kind of analysis will open up exciting avenues for new research agendas at the intersection of variationist sociolinguistics, register analysis, and cross-linguistic typology.