November 30, 2023
Abstract
Despite extensive research on the ba-construction in Chinese, the diachronic change in the alternation between the ba and jiang constructions has received little attention. The present study takes a multifactorial approach to examine the factors that probabilistically condition the alternation, based on diachronic data spanning twelve centuries. The results suggest two general trends. First, the odds of the ba-construction have increased over time at the expense of the jiang-construction. Second, the significant preference for the jiang-construction in informal genres weakened in effect size from the 10th to the 19th century and has disappeared in modern times; accordingly, informal and formal genres have converged to favor the ba-construction. Regression modeling also shows that there are both stable linguistic constraints (parallelism/syntactic priming, verb type, NP2 animacy, and NP2 length) and fluid constraints (adjunct semantics and genre). This study advances our knowledge of the two disposal constructions and their evolution, sheds light on the Principle of No Synonymy (Bolinger, Dwight. 1977. Meaning and form. New York: Longman; Goldberg, Adele E. 1995. Constructions: A construction grammar approach to argument structure. Chicago: The University of Chicago Press; Goldberg, Adele E. 2002. Surface generalizations: An alternative to alternations. Cognitive Linguistics 13(4). 327–356), and makes a methodological contribution to the empirical testing of hypotheses. It can also provide insight into grammatical alternations in Mandarin.
Open Access
November 16, 2023
Abstract
The present paper focusses on the historical development of the relationship between the English core modals can, could, shall, should, will, would, may, might and must and the negator not. It explores whether semantic and morphosyntactic factors, particularly the emergence of do-support in Early Modern English, the increase in the popularity of contracted forms such as won’t in the nineteenth century and the loss of core modals in the twentieth century, had an influence on negation rates. Large-scale empirical analyses of modal use in historical corpora of British prose fiction published between ca. 1500 and 1990 reveal that many modals, particularly high-frequency will, would, can and could, indeed attract not. The establishment of the contractions n’t, ’ll and ’d had the strongest effect on the modal-negation system after 1500. The availability of the contracted modals ’ll and ’d led to a functional split whereby will and would became much more strongly associated with negation while contracted ’ll and ’d repel not-negation.
November 1, 2023
Abstract
The study of crosslinguistic variation in word meaning often focuses on representational and concrete meanings. We argue that other kinds of word meanings (e.g., abstract and (inter)subjective meanings) can be fruitfully studied in translation corpora, and present a quantitative procedure for doing so. We focus on the cross-linguistic patterns for lemmas pertaining to truth and reality (English true and real), as these abstract meanings have been found to frequently colexify with particular (inter)subjective meanings. Applying our method to a corpus of translated subtitles of TED talks, we show that (1) the abstract-representational meanings are colexified in patterned ways that are, however, more complex than previously observed (some languages do not split ‘true’-like from ‘real’-like terms; many languages display further splits of representational meanings); (2) some non-representational meanings strongly colexify with representational meanings of ‘truth’ and ‘reality’, while others also often colexify with other fields.
Open Access
October 30, 2023
Abstract
Language change is a cultural evolutionary process in which variants of linguistic variables change in frequency through processes analogous to mutation, selection and genetic drift. In this work, we apply a recently-introduced method to corpus data to quantify the strength of selection in specific instances of historical language change. We first demonstrate, in the context of English irregular verbs, that this method is more reliable and interpretable than similar methods that have previously been applied. We further extend this study to demonstrate that a bias towards phonological simplicity overrides that favouring grammatical simplicity when these are in conflict. Finally, with reference to Spanish spelling reforms, we show that the method can also detect points in time at which selection strengths change, a feature that is generically expected for socially-motivated language change. Together, these results indicate how hypotheses for mechanisms of language change can be tested quantitatively using historical corpus data.
Open Access
August 24, 2023
Abstract
Loanwords are lexical terms borrowed from foreign languages by transliterating the original sound of the borrowed words with the recipient language’s consonants and vowels. This paper focuses on lexical borrowing in the Korean language from a diachronic perspective. Based on approximately 9,500 Korean loanwords extracted from a corpus of articles from the residential sections of women’s magazines (the Korean Contemporary Residential Culture Corpus), we investigated changes in loanword usage from 1970 to 2015. Having introduced our definition of Korean loanwords in phonological and morphological terms, we performed statistical analysis, particularly of type/token frequency and cultural/core loanwords, along with semantic analysis using the Period Representative Loanword (PRL). We argue that, in addition to its gradual and rapid increase over time, Korean loanword usage underwent a remarkable evolution in the 1990s.
Open Access
August 3, 2023
Abstract
In many languages, the present perfect has grammaticalized, gradually displacing the preterit. Within Spanish, this has been documented with the grammaticalization of the present perfect in Peninsular Spanish. To examine this possibility in two Latin American varieties, this study examined present perfect/preterit variation of 36 speakers from Lima and Mexico City from the PRESEEA corpus. While Lima Spanish presented overall more present perfect than Mexico City Spanish, a similar internal constraint hierarchy is predictive of present perfect use in both speech communities. However, Lima Spanish demonstrated a change in progress toward an expansion of the preterit among younger speakers with the indeterminate temporal reference as locus of change. The findings suggest that present perfect grammaticalization may not always be the most common cross-linguistic pathway but rather is subject to source constraints, which may lead to another pathway in which the preterit expands at the expense of the present perfect.
June 15, 2023
Abstract
Although there is a long tradition of research analyzing the grammatical complexity of texts (in both linguistics and applied linguistics), there is surprisingly little consensus on the nature of complexity. Many studies have disregarded syntactic (and structural) distinctions in their analyses of grammatical text complexity, treating it instead as if it were a single unified construct. However, other corpus-based studies indicate that different grammatical complexity features pattern in fundamentally different ways. The present study employs methods that are informed by structural equation modeling to test the goodness-of-fit of four models that can be motivated from previous research and linguistic theory: a model treating all complexity features as a single dimension, a model distinguishing among three major structural types of complexity features, a model distinguishing among three major syntactic functions of complexity features, and a model distinguishing among nine combinations of structural type and syntactic functions. The findings show that text complexity is clearly a multi-dimensional construct. Both structural and syntactic distinctions are important. Syntactic distinctions are actually more important than structural distinctions, although the combination of the two best accounts for the ways in which complexity features pattern in texts from different registers.
June 12, 2023
Abstract
Conversion is a common feature of present-day English, leading to many ‘heterosemous’ words that express related meanings across multiple word classes. Especially common is verb/noun heterosemy, as in flow or hand, both of which can be used as verbs or as nouns. The prevalence of verb/noun heterosemy sets English apart from closely related Germanic languages and is one respect in which English behaves as a language with high boundary permeability. This paper investigates how verb/noun heterosemy has been evolving in Recent English (1920s–2010s). Using quantitative analysis within a large sample of 877 heterosemous words, it is shown that associations between specific words and word classes have been weakening over the last century. More precisely, within our sample, heterosemous words on average tend to develop towards more balanced heterosemy, whereby their association to either one word class or another becomes less pronounced. The findings suggest that English is in the process of a long-term drift towards greater boundary permeability. As high boundary permeability has been associated with low reliance on inflectional morphology in a language, this could be a long-term consequence of the overall loss of inflections earlier in the history of the language.
June 8, 2023
Abstract
In the absence of a diachronic corpus or a synchronic corpus tagged for speakers’ age, substantiating the presence of semantic change and the stage of change (initial or advanced) are challenging tasks. In the present study I introduce three methods for overcoming such difficulties by extracting various kinds of evidence from a synchronic corpus not tagged for speakers’ age. All three methods are based on speakers’ metalinguistic activity. Two of them are of a psycholinguistic nature and the third is of a sociolinguistic nature. Not only do these methods provide data hitherto overlooked by researchers for detecting semantic change, but they can also minimize the researchers’ need for interpretative interventions with regard to speakers’ communicative intentions, thus improving the quality of the analysis.
May 8, 2023
Abstract
As a cognitive ability to construe events in alternate ways, aspectuality has attracted considerable scholarly attention; however, the concatenation of aspect markers in a clause remains understudied. The present paper follows a bidimensional approach to aspect to conduct a corpus-based aspectual analysis of verb concatenation with imperfective markers zhe (henceforth VCIMs zhe) in Mandarin. Specifically, to construe the cognitive inference mechanism of aspect, a multifactorial analysis of VCIMs zhe is carried out using multiple correspondence analysis, conditional inference trees and conditional random forests. The analysis explores the prototypical temporal features of verbs in the two slots, predicts the aspectual meanings of the two imperfective markers zhe, and discusses the conditional importance of factors such as durativity, dynamicity, telicity, boundedness, and slot in identifying the situation types of the two verbs or verb phrases in VCIMs zhe. Methodologically, a usage-based multifactorial analysis of VCIMs zhe complements previous introspective studies on aspect marking. Theoretically, a corpus-based aspectual account of VCIMs zhe, one type of complex viewpoint aspect, expands traditional studies on the Chinese aspect system, supplies evidence for cross-linguistic aspect typology, and provides a reference for second language acquisition of usage patterns of zhe by non-native speakers.
Abstract
We investigate the optional omission of the infinitival marker in a Swedish future tense construction. During the last two decades the frequency of omission has been rapidly increasing, and this process has received considerable attention in the literature. We test whether the knowledge which has been accumulated can yield accurate predictions of language variation and change. We extracted all occurrences of the construction from a very large collection of corpora. The dataset was automatically annotated with language-internal predictors which have previously been shown or hypothesized to affect the variation. We trained several models in order to make two kinds of predictions: whether the marker will be omitted in a specific utterance and how large the proportion of omissions will be for a given time period. For most of the approaches we tried, we were not able to achieve a better-than-baseline performance. The only exception was predicting the proportion of omissions using autoregressive integrated moving average models for one-step-ahead forecast, and in this case time was the only predictor that mattered. Our data suggest that most of the language-internal predictors do have some effect on the variation, but the effect is not strong enough to yield reliable predictions.
April 27, 2023
Abstract
The methodological debates surrounding keyword analysis have given rise to a wide range of keyness metrics. The present paper delineates four dimensions of keyness, which distinguish between frequency- and dispersion-related perspectives. Existing measures are then organized according to these dimensions and evaluated with regard to their performance on a specific keyword analysis task: the identification of key verbs in academic writing. To this end, the rankings produced by 32 different metrics are evaluated against an established academic word list. Further, the reliability of measures is assessed to determine whether they produce stable rankings across repeated studies on the same pair of text varieties. We observe notable differences among metrics with regard to these criteria. Our findings provide further support for the superiority of the Wilcoxon rank sum test and text-dispersion–based measures, and allow us to identify, within each dimension of keyness, metrics that may be given preference in applied work.
March 28, 2023
Abstract
Classification trees and random forests offer a number of attractive features to corpus data analysts. However, the way in which these models are typically reported – a decision tree and/or set of variable importance scores – offers insufficient information if interest centers on the (form of) relationship between (multiple) predictors and the outcome. This paper develops predictive margins as an interpretative approach to ensemble techniques such as random forests. These are model summaries in the form of adjusted predictions, which provide a clearer picture of patterns in the data and allow us to query a model on potential nonlinear associations and interactions among predictor variables. The present paper outlines the general strategy for forming predictive margins and addresses methodological issues from an explicitly (corpus) linguistic perspective. For illustration, we use data on the English genitive alternation and provide an R package and code for their implementation.
Open Access
March 27, 2023
Abstract
This paper describes patterns of number use in spoken and written English and the main factors that contribute to these patterns. We analysed more than 1.7 million occurrences of numbers between 0 and a billion in the British National Corpus, including conversational speech, presentational speech (e.g., lectures, interviews), imaginative writing (e.g., fiction), and informative writing (e.g., academic books). We find that four main factors affect number frequency: (1) Magnitude – smaller numbers are more frequent than larger numbers; (2) Roundness – round numbers are more frequent than unround numbers of a comparable magnitude, and some round numbers are more frequent than others; (3) Cultural salience – culturally salient numbers (e.g., recent years) are more frequent than non-salient numbers; and (4) Register – more informational texts contain more numbers (in writing), types of numbers, decimals, and larger numbers than less informational texts. In writing, we find that the numbers 1–9 are mostly represented by number words (e.g., ‘three’), 10–999,999 are mostly represented by numerals (e.g., ‘14’), and 1 million–1 billion are mostly represented by a mix of numerals and number words (e.g., ‘8 million’). Altogether, this study builds a detailed profile of number use in spoken and written English.
February 13, 2023
Abstract
English verbs can combine with an object-like (or Objoid) element consisting of a possessive and a superlative. These Superlative Objoids do not add a participant to the event but function like manner adverbs (they work their hardest, i.e. they work extremely hard). This paper is the first to use diachronic evidence from a corpus of Late Modern American English to trace the recent history of Superlative Objoid Constructions (SOC). In particular, it aims to assess whether the construction has become entrenched to the extent that it can give rise to analogical extension. Secondly, the evidence is used to model, within the framework of Construction Grammar, the horizontal and vertical links between the SOC and its (potential) relatives in the constructional network of transitivity-changing constructions.
February 13, 2023
Abstract
Nepali is typologically rare in terms of nominal classification systems, as it is one of the few languages of the world having simultaneously two gender systems (human/non-human, masculine/feminine) and one numeral classifier system (distinguishing features such as human, round-shaped objects, and long objects among others). Such a rare co-occurrence of different nominal classification systems is highly relevant for investigating linguistic complexity, as languages generally do not have several systems of the same type fulfilling the same functions. However, no corpus-based quantitative analyses have been conducted on the productive use of nominal classification systems in Nepali. The current paper aims at filling this gap by providing a token-based study from the Nepali National Corpus (∼20 million words). Our preliminary results show that there is in fact little formal overlap between the classifier and the gender systems.
December 12, 2022
Abstract
One way to resolve the actuation problem of metaphorical language change is to provide a statistical profile of metaphorical constructions and generative rules with antecedent conditions. Based on arguments from the view of language as a complex system and the dynamic view of metaphor, this paper argues that metaphorical language change qualifies as a Self-Organized Criticality state and that the linguistic expressions of a metaphor can be profiled as a fractal with spatio-temporal correlations. Synchronically, these metaphorical expressions self-organize into a self-similar, scale-invariant fractal that follows a power-law distribution; temporally, long-range interdependence constrains the self-organization process by way of transformation rules that are intrinsic to a language system. This argument is verified in the paper with statistical analyses of twelve randomly selected Chinese verb metaphors in a large-scale diachronic corpus.
March 30, 2017
Abstract
This paper addresses the issue of coalescence of frequent collocations and its consequences for their realization and mental representation. The items examined are ‘semi-modal’ instantiations of the type V-to-Vinf, namely have to, used to, trying to and need to, in American English. We explore and compare their realization variants in speech, considering the effects of speech-internal and extra-linguistic factors (speech rate, stress accent, phonological context, speech situation, age of the speaker), as well as possible effects of analogy with established contractions like gonna, wanna. Our findings show a high degree of coalescence in the items under study, but no clear pattern of contraction. The propensity for contraction in analogy to gonna/wanna is strongly affected by phonological properties: it is inhibited by the presence of a fricative in have/used to. Moreover, the most frequent reduced realizations are conservative in terms of transparency and still allow morphological parsing of the structure. More radical contractions are restricted to rapid and informal speech, and less entrenched as variants. This shows the limitations of reduction as a frequency effect in light of the balance between articulatory ease and explicitness in speaker–hearer interaction. Even in highly frequent and strongly coalesced items, reduction (articulatory ease) is restricted by a tendency to retain cues to morphological structure (explicitness). Finally, we propose a network of pronunciation variants that includes representation strengths as well as analogy relations across constructional types.