Dutch features several morphemes with “privative” semantics that occur as left-hand members in compounds (e.g., imitatieleer ‘imitation leather’, kunstgras ‘artificial grass’, nepjuwelen ‘fake jewels’). Some of these “fake” morphemes display great categorical flexibility and innovative adjectival uses. Nep, for instance, is synchronically attested as an inflected adjective (e.g., neppe cupcake ‘fake cupcake’). In this paper, we combine an extensive corpus study of eight Dutch “fake” morphemes with statistical methods in distributional semantics and collexeme analysis in order to compare their semantic and morphological properties and to find out which factors are the driving forces behind their exceptional “extravagant” morphological behavior. Our analyses show that debonding and adjectival reanalysis are triggered by an interplay of two factors, i.e., type frequency and semantic coherence, which allow us to range the eight morphemes on a cline from more schematic to more substantive “fake” constructions.
In this study, we present a corpus-based statistical analysis of eight Dutch “fake” morphemes that display, to a greater or lesser degree, “extravagant” morphological properties. We combine an extensive corpus study with statistical methods in distributional semantics and collexeme analysis in order to compare the semantic and morphological properties of these morphemes and to find out which factors are the driving forces behind the exceptional morphological behavior of some of these “fake” morphemes.
Let us first define what we understand by “fake” morphemes. We adopt a broad definition in order to include all morphemes that, at least in some of their uses, allow a “privative” meaning. This means that the compounds and phrases they form part of allow the proposition “(a) fake X is not (an) X” (Cappelle et al. 2018: 9). A neppistool ‘fake gun’, for instance, is not a type of gun, even though it shares certain properties with a true gun (Van der Wouden 2019). A neppistool may physically resemble a true gun, but may not be used to shoot metal bullets, for instance. Obviously, depending on the number of properties shared with a “true” gun, some fake guns will be “faker” than others (think of a colorful plastic toy gun compared to a real-looking gun as used in movies). With respect to English fake, although often considered a textbook example of a privative adjective, Cappelle et al. (2018: 9) go so far as to contest “that fake is an across-the-board privative adjective, since a fake article, for instance, is most definitely still an article”. A fake article is called fake because it lacks some of the prototypical properties of an authentic newspaper article, especially the property of providing true (i.e., fact-checked) content. Cappelle et al. (2018: 39) even show that fake is used in a privative sense in only half of the most frequent [fake + noun] combinations. Their study clearly demonstrates that the interpretation of fake is highly context-dependent. Not only the semantic properties of the noun play a crucial role in the interpretation of the combination of fake with a noun but also the speakers’ cultural knowledge, as the following case illustrates:
To give another example, when there is discussion of fake handbags, what is meant is not things that look like handbags but that in fact lack the function of handbags; rather, fake handbags is more likely to be used in the sense of “real handbags that are passed off as being manufactured by a well-known designer brand”. This is based on our world knowledge about handbags, namely that they are not just functional objects but, perhaps in the first place, accessories whose make is important. (Cappelle et al. 2018: 40)
As stated above, we concentrate on a set of “fake” morphemes that are prone to privative use in contemporary Dutch, although they are not necessarily used privatively across the board. We selected eight “fake” morphemes that are used in nominal or adjectival compounds, excluding privative adjectives (e.g., vals ‘false’, schijnbaar ‘apparent’, kunstmatig ‘artificial’) and affixes (e.g., pseudo-, niet- ‘not-’, -achtig ‘-ish, -like’). Below, we provide an example and the etymological source item for each of these morphemes. Nep (1) originally comes from the German noun Nepp (‘unfair deal, fake copy’) or the German verb neppen ‘cheat’, but was already attested as a verb and noun in Dutch at the beginning of the twentieth century. Imitatie (2) was borrowed into Dutch from the French noun imitation around 1650. The items from (3) to (7) are indigenous Dutch morphemes. Lastly, we include the English loan fake since it can be used in Dutch compound-like sequences, as shown in (8).
As emerges from their etymologies, all these Dutch left-hand components originally derive from autonomous nouns and/or verbs. Fake was probably borrowed into Dutch as a noun, even though it already existed as an adjective and verb in English before that date.
Interestingly, in synchronic corpus data several of these morphemes can be found in typically adjectival contexts as well (see Section 2). Since these adjectival uses are hardly mentioned in the dictionaries, we consider them “innovative”, or even “extravagant” (see Section 2), uses. For instance, in the Van Dale dictionary of Dutch kunst, namaak and imitatie are only listed as nouns, while fop and lok are only listed as verbs, and schijn occurs both as a verb and a noun. Only fake and nep have separate entries as adjectives. The following examples, all taken from the Dutch 2014 TenTen Web Corpus (nlTenTen14) (Kilgarriff et al. 2014), illustrate the synchronic orthographic and morphological variation of certain “fake” constructions (i.e., the combination nep ‘fake’ + herinneringen ‘memories’):
In (9) nepherinneringen forms a compound, and in (10) nep is orthographically separated from its head noun. Example (11) shows that nep can even be used as an inflected adjective. This synchronic variation is suggestive of a cline from compounding to adjectival use, in which the compound construction acts as a bridging context for the reanalysis of the nominal/verbal compound constituent into an adjectival modifier (cf. Van Goethem and De Smet 2014; Van Goethem and Koutsoukos 2018). This form of reanalysis has been identified before as an instance of “debonding” (Norde 2009) and is typically accompanied by category change (see further in Section 2).
The aim of this study is twofold. First, we compare the eight “fake” morphemes with respect to their semantic-distributional and morphological profiles to identify the specific functional and formal properties of each morpheme. Our second aim is to explore which factors trigger the “extravagant” behavior, i.e., the debonding and categorical flexibility, of some of these “fake” morphemes. As mentioned before, the English loan fake is included in this study because it forms part of privative compound-like sequences in Dutch. However, since it was already attested as an adjective in English before being borrowed into Dutch (cf. footnote 2), we do not claim a parallel process of debonding and category change in the latter case, and include this morpheme mainly as a point of comparison for the semantic and distributional analysis of its Dutch counterparts.
On the methodological level, we will show how the combination of corpus analysis (extraction of comparable data samples, manual annotation of the semantic and morphological properties) with up-to-date statistical techniques allows us to detect both robust and more subtle differences between the “fake” morphemes. In addition, this mixed method enables us to identify the factors responsible for the recent morphological changes affecting some of them.
The results of our analysis will be theoretically embedded in the frame of Construction Grammar (cf. Hoffmann and Trousdale 2013). Crucial to this model is the concept of “constructions”, i.e., conventionalized form-meaning pairings, as the basic units of language. Constructions vary in size and complexity and exist at different levels of schematicity (i.e., levels of abstraction): a distinction can be made between fully schematic constructions (abstract grammatical patterns), semi-schematic constructions combining lexically filled positions with open slots, and substantive micro-constructions or fully idiomatic expressions. Based on Booij’s (2010) theory of Construction Morphology and the general principle in Construction Grammar that words are constructions (e.g., Goldberg 2006), we argue that our eight “fake” morphemes form part of semi-schematic morphological constructions with a filled and an open slot. The schema in (12) represents the compounding construction in which the “fake” morpheme is the left-hand member of a nominal or adjectival compound.
As a rule, Dutch compounds are spelled as one word. However, in order to improve readability, a hyphen must be used in particular contexts, for instance to avoid a sequence of vowels at the constituent boundary (e.g., imitatie-eiken ‘imitation oak’) or in the case of special characters (capital letters, numbers, symbols, etc.) on either side of the constituent boundary (e.g., namaak-Vespa ‘fake Vespa’). Nevertheless, actual language use contains many deviations from these rules, such that the spelling of compounds often seems to be a matter of free orthographic variation – compare namaakeieren ‘fake eggs’ in (12a) and namaak-eieren in (12b). In the schema in (13), on the other hand, the “fake” morpheme is written as a separate word, which may be considered another spelling variant, but which may also imply that the “fake” morpheme is used as an attributive adjective (13a), or as an adjective-modifying adverb (13b). Unambiguous examples of adjectivehood are inflected adjectives (such as (11) above), or cases with adverbial modification, such as 100% as in (13a).
In Section 2, we first discuss the concepts of “debonding” and “extravagance”. In Section 3, we present our case study of Dutch “fake” morphemes. The results of our analysis and our conclusions are presented in Sections 4 and 5, respectively.
“Fake” morphemes are examples of evaluative morphology, which is a well-established field in morphological theory (e.g., Bauer 1997; Grandi and Körtvélyessy 2015). The concept of evaluation involves “a mental process by which objects of extra-linguistic reality are assessed from the point of view of quantity (big vs small) and quality (good, bad, nasty, nice, nasty, etc.)” (Körtvélyessy 2014: 296). Thus far, the study of evaluative morphology has been largely restricted to the morphological expression of diminution and augmentation, and their pejorative or ameliorative connotations (e.g., Dressler and Barbaresi 1994), whereas other crucial functions such as the expression of “approximation” or “privative semantics” have been largely ignored. In the case of “fake” semantics, an object is lacking one or more of its prototypical features (see Section 1), which may result in a negative evaluation (bad quality). Therefore, it should also be considered an instance of evaluative semantics.
Evaluative morphology (partially) overlaps with “expressive morphology” (Szymanek 1988; Zwicky and Pullum 1987) and “extragrammatical morphology” (Dressler and Barbaresi 1994; Dressler and Karpf 1995; Mattiello 2013). While expressive morphology refers to word-formation with playful or poetic effects, the notion “extragrammatical” highlights that the morphological operations in this field often deviate from productive rules of word-formation that allow a prediction of a regular output. Mattiello (2013), for instance, considers abbreviations (e.g., BRB = Be Right Back), blends (e.g., beaulicious < beautiful + delicious) and reduplicatives (e.g., okey-dokey) as instances of extragrammatical morphology, as these word-formation processes stand out as less predictable and therefore more creative and salient compared to derivation and compounding. This recalls Keller’s (1994) maxim “talk in such a way that you are noticed”, recast as the “maxim of extravagance” by Haspelmath (1999). In this vein, the term “extravagant morphology” has been coined to refer to any attention-attracting word-formation process (Eitelmann and Haumann 2019).
Within the domain of evaluative morphology, it has already been shown that intensifying morphemes display “extravagant” morphological properties. Affixoids, i.e., morphemes “which look like parts of compounds, and do occur as lexemes, but have a specific and more restricted meaning when used as part of a compound” (Booij 2009: 208), are a case in point. In their study of three Germanic intensifying prefixoids (Dutch kei ‘boulder’, German Hammer ‘hammer’ and Swedish kanon ‘cannon’), Norde and Van Goethem (2018) showed that these morphemes are subject to categorical flexibility to different degrees. For instance, German Hammer is not only used as an intensifying prefixoid (14) but may also act as an intensifying adverb (15) and as an (inflected) evaluative adjective (16) in colloquial usage (examples from COW14, cf. Schäfer 2015):
The preceding examples illustrate the process of “debonding”, which is “a composite change whereby bound morphemes (clitics, affixes, affixoids) in a specific context develop into free morphemes” (Norde 2009: 186). Several studies have demonstrated that debonding of intensifying prefixoids is a creative process of lexical innovation in Germanic languages, which may lead to the emergence of new intensifying adverbs or evaluative adjectives (cf. Battefeld et al. 2018; Norde and Van Goethem 2014, 2015, 2018; Van Goethem and De Smet 2014; Van Goethem and Hiligsmann 2014; Van Goethem and Hüning 2015). The present paper investigates to what extent the adjectival or adverbial uses of Dutch morphemes with “fake” semantics may be considered instances of debonding. We return to this issue in the next section, but first, let us have a look at the following examples, all drawn from the Dutch nlTenTen14 web corpus (Sketch Engine, cf. Kilgarriff et al. 2014) (see also Section 3.1):
In (17) namaak is found in attributive position, and in (18) imitatie is used in coordination with the adjective echt ‘real’. In example (19) nep is modified by the degree adverb te ‘too’, and in (20) the item features regular adjective inflection (neppe). Examples (21) and (22) show that nep can also be used as an adverbial modifier of an adjective or a verb. Finally, the comparative form of the loanword fake in example (23) indicates that fake may be used as an adjective in Dutch. All these examples show that certain “fake” morphemes may undergo debonding and categorical change to a greater or lesser degree. The only possible exception is example (23), as the adjectival usage may have been borrowed from English directly (cf. footnote 2).
Since debonding involves the shift from a bound to a free morpheme, it is counterdirectional to grammaticalization, and can therefore be considered a case of degrammaticalization. Although debonding is relatively rare where affixes and clitics are concerned (for examples see Norde 2009: 190–227), it appears to be more frequent in affixoid constructions (Norde and Van Goethem 2018). The privative morphemes discussed in the present paper are not affixoids (they do not have a meaning different from the free morpheme they derive from), but their use as compounding members is a prerequisite for the adjectival and adverbial constructions. As such, the novel usages of privative morphemes represent a type of debonding that has not been covered in previous studies.
In this paper, we argue that debonding (like degrammaticalization generally) is a type of extravagant morphology, which has two theoretical implications for the relation between extravagance and (de)grammaticalization. Firstly, as we argue below, it is degrammaticalization, not grammaticalization, that is extravagant. Secondly, this means that extravagance cannot be invoked as an argument for the unidirectionality of change.
This challenges two major claims in a seminal paper by Haspelmath (1999), which introduces “the maxim of extravagance”. Referring to Keller’s (1994) maxim “talk in such a way that you are noticed”, Haspelmath (1999: 1043) argues that speakers introduce innovations in language to be noticed and, ultimately, to achieve social success. In Haspelmath’s view, grammaticalization is a “side effect of the maxim of extravagance, that is, speakers’ use of unusually explicit formulations in order to attract attention” (Haspelmath 1999: 1043).
Haspelmath introduces his maxim of extravagance in order to be able to explain why the two conflicting tendencies toward optimal clarity, on the one hand, and optimal economy, on the other “do not cancel each other out, leading to stasis rather than change” (Haspelmath 1999: 1052). In other words: if speakers want to get their message across in the most effective way, but at the lowest possible articulatory cost, why would languages change at all? After all, a more elaborate expression implies more effort, whereas a reduced expression is less effective. Therefore, Haspelmath argues that speakers innovate because they want to be “socially successful”, following the maxim of extravagance. One example of such extravagant innovations is the use of motion verbs in future contexts, which eventually results in new periphrastic futures (such as to be going to in English or aller in French). Haspelmath goes on to argue that “degrammaticalization is by and large impossible because there is no counteracting maxim of ‘anti-extravagance’” (Haspelmath 1999: 1043).
Although we agree with Haspelmath (1999) that there exists an asymmetry between grammaticalization and degrammaticalization in the sense that grammaticalization is more frequently attested and more cross-linguistically regular (Norde 2009), we do not think this asymmetry is due to the maxim of extravagance. Quite the reverse, we would argue that the pervasiveness and regularity of grammaticalization changes are not compatible with a conscious effort to be noticed (see Geurts 2000a, 2000b for a careful argument why grammaticalization can be explained as the cumulative effect of communicative maxims and interindividual variation, or Von Mengden and Kuhle (2020), who convincingly argue that grammaticalization is an emergent phenomenon). Instead, we consider the changes involved in degrammaticalization as noticeable and extravagant, precisely because they run counter to the “predictable” path of grammaticalization. Instead of arguing that degrammaticalization is incompatible with extravagance, we claim, on the contrary, that it is particularly innovative and extravagant. Debonding of bound items and their categorical flexibility, in particular, are bound to be noticed.
Our data were drawn from the Dutch 2014 TenTen Web Corpus (nlTenTen14), available on the SketchEngine platform (cf. Kilgarriff et al. 2014). This corpus belongs to the TenTen corpus family, currently available in more than 30 languages, and consists of texts collected from the Internet. The nlTenTen14 corpus contains 2,253,777,579 words. About three quarters of the Dutch corpus is extracted from the .nl domain; thus, the corpus mainly represents Netherlandic Dutch.
We extracted random samples of 2000 tokens of each “fake” morpheme, making sure to include both its bound (compound) and free (debonded/adjectival) use. The CQL query allowed us to exclude a range of false positives a priori, such as derivatives beginning with the fake morpheme (e.g., schijnbaar ‘apparent’), homonymous verbs (e.g., schijnen ‘appear’) or homonymous nouns used in frequent collocations (e.g., schijn bedriegt ‘appearances are deceptive’, geen schijn van kans ‘no chance’). An example of the query we used to extract the schijn- constructions is given in (24):
Subsequently, we selected the first 500 relevant tokens per morpheme, with the exception of lok, for which only 388 relevant instances were found. Each occurrence was manually annotated for its morphological and semantic properties in Excel. The statistical analyses were conducted in R (Levshina 2015; R Core Team 2016).
When annotating the corpus examples, we noticed that quite a few occurrences contained sequences of different “fake” morphemes (25), and that some of them even appear to be interchangeable (26):
In order to compare the meaning and distribution of our eight “fake” morphemes, we make use of two different distributional approaches to semantics: Multiple distinctive collexeme analysis (3.2.1) and Semantic Vector Spaces (SVS) (3.2.2).
In order to compare the meanings of our eight “fake” morphemes, we first performed a distributional-semantic analysis, namely, Multiple Distinctive Collexeme analysis (Levshina 2015: 248–251; Stefanowitsch and Gries 2003). This method is based on the co-occurrence frequencies of words that occur in near-synonymous constructions.
More specifically, the Multiple Distinctive Collexeme analysis compares the observed frequency of a specific slot filler (R1 in our case study, i.e., the item that the “fake” morpheme modifies: e.g., wasabi in ) to the expected frequency of that R1, and automatically computes if specific R1s are attracted to one of the constructions. We computed the Fisher exact test p-values to measure the attraction score between each morpheme and each R1. As a cut-off value for distinctiveness, we set 1.3 for log-transformed values (with 10 as the logarithm base), which approximately corresponds to the p-value of 0.05 (cf. Levshina 2015: 245).
Table 1 shows the top 20 distinctive collexemes of nep (of a total of 46 distinctive collexemes). The most distinctive collexeme, wimper ‘eyelash’, occurs 19 times in combination with nep, only once in combination with kunst, and never with the remaining “fake” morphemes. This makes it a significant distinctive collexeme of nep (logp > 15). Although, as we will see in the next section, nep and fake form a cluster based on an SVS analysis and indeed share a number of collexemes (e.g., profiel ‘profile’, tweet, site), Table 1 also indicates that nep has a large number of distinctive collexemes with respect to fake (e.g., wimper ‘eyelash’, nagel ‘nail’, brief ‘letter’).
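The attraction score for a single morpheme and R1 can be illustrated with the wimper counts just given. The following is only a minimal sketch, not the R workflow used for the analysis. It rests on three assumptions: the eight-way comparison is collapsed into a 2×2 table (nep versus all other morphemes), the p-value is a one-sided Fisher exact test computed from the hypergeometric distribution, and all sampled tokens (500 per morpheme, 388 for lok) enter the computation. The function names are hypothetical.

```python
from math import comb, log10

def fisher_upper_tail(k, K, n, N):
    """One-sided Fisher exact p-value: P(X >= k) under the hypergeometric null.

    k: tokens of the R1 with the target morpheme
    K: tokens of the R1 across all morphemes
    n: tokens of the target morpheme
    N: tokens of all morphemes together
    """
    upper = min(K, n)
    favourable = sum(comb(n, i) * comb(N - n, K - i) for i in range(k, upper + 1))
    return favourable / comb(N, K)

def attraction_score(k, K, n, N):
    """Log-transformed p-value (base 10, sign flipped): larger means more distinctive."""
    return -log10(fisher_upper_tail(k, K, n, N))

# Cut-off from the text: a score of about 1.3 corresponds to p = 0.05.
CUTOFF = -log10(0.05)

# Assumed sample sizes: 7 morphemes x 500 tokens, plus 388 tokens for lok.
TOTAL_TOKENS = 7 * 500 + 388

# wimper: 19 tokens with nep, 1 with kunst, 0 with the other morphemes.
wimper_nep = attraction_score(k=19, K=20, n=500, N=TOTAL_TOKENS)
print(round(CUTOFF, 2), wimper_nep > CUTOFF)
```

Under these assumptions the score for wimper should come out well above the cut-off, in line with the reported logp > 15. A different test direction or a different way of collapsing the table would give somewhat different values.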
Table 2 shows the top 10 of distinctive collexemes (with a logp > 1.3) for all “fake” morphemes. All in all, we found 56 distinctive collexemes for fake, 22 for fop, 42 for imitatie, 46 for nep, 16 for kunst, 46 for schijn, 22 for lok and 40 for namaak. However, these figures cannot be compared at face value, since they need to be checked against the type frequencies of each “fake” morpheme. We return to this issue in Section 3.2.3.
|Rank||fake||fop||imitatie||nep||kunst||schijn||lok||namaak|
|1||fur||speen ‘teat’||leer ‘leather’||wimper ‘eyelash’||stof ‘material’||heilig ‘holy’||vogel ‘bird’||product ‘product’|
|2||profiel ‘profile’||duik ‘dive’||bont ‘fur’||nagel ‘nail’||gras ‘grass’||beweging ‘movement’||fiets ‘bike’||geneesmiddel ‘medicine’|
|3||speech||gesprek ‘interview’||suède ‘suede’||geld ‘money’||mest ‘manure’||veiligheid ‘security’||eend ‘duck’||artikel ‘article’|
|4||account||zwam ‘fungus’||leren ‘leather’||brief ‘letter’||gebit ‘teeth’||zekerheid ‘certainty’||puber ‘adolescent’||sigaret ‘cigarette’|
|5||foto ‘photo’||neus ‘nose’||spiritualiteit ‘spirituality’||wapen ‘weapon’||licht ‘light’||constructie ‘construction’||jood ‘Jew’||goederen ‘goods’|
|6||advertentie ‘advertisement’||winkel ‘shop’||kaviaar ‘caviar’||factuur ‘invoice’||aas ‘bait’||huwelijk ‘marriage’||homo ‘gay’||versie ‘version’|
|7||tekst ‘text’||cadeau ‘gift’||parel ‘pearl’||borst ‘breast’||hars ‘resin’||zelfstandigheid ‘independence’||auto ‘car’||god ‘god’|
|8||naam ‘name’||lijn ‘(telephone) line’||vuurwapen ‘firearm’||beveiligingssoftware ‘security software’||lens ‘lens’||vertoning ‘performance’||tiener ‘teenager’||Ugg|
|9||gedoe ‘stuff’||mop ‘joke’||marmer ‘marble’||brood ‘bread’||maan ‘moon’||zwangerschap ‘pregnancy’||duif ‘pigeon’||Viagra|
|10||name||opdracht ‘task’||lederen ‘leather’||herinnering ‘memory’||vezel ‘fibre’||oplossing ‘solution’||agent ‘policeman’||medicijn ‘medicine’|
The SVS analysis (Levshina 2015: 323–332; Levshina and Heylen 2014) starts from the idea that words that occur in similar contexts tend to have comparable meanings, following Firth’s (1957) adage that “you shall know a word by the company it keeps” (Levshina and Heylen 2014: 22). The SVS method is a radically data-driven technique that can be seen as an extension of collocational analysis, such as the one presented in the previous section. Semantic vectors are information strings that indicate “the weighted co-occurrence frequencies between target words and their contextual features” (Levshina 2015: 323). These frequencies are transformed into vectors, and the cosine of the angle between these vectors can be used as a measure of semantic proximity (Levshina 2015: 328–331; Levshina and Heylen 2014: 23–24). There are many variants of SVS analysis, depending on the number of target words and context size, as well as on the type of dependency relation between target word and context words (see Levshina and Heylen 2014: 25–29 for details). Our study is a small-scale type of SVS with only eight target words and a minimal context (R1). The advantage of small contexts in SVS analysis is that they tend to result in tighter taxonomic relations (Levshina and Heylen 2014: 26). Applied to our case study, SVS analysis is used to identify clusters of (near-)synonyms, based on the R1s of each “fake” morpheme.
Table 3 is a similarity matrix that indicates the degree of similarity between the semantic vectors of each “fake” morpheme. The cosine values were computed using the cossim function in the Rling package (Levshina 2015). The resulting scores range from 0 (no overlap at all between the R1s) to 1 (complete overlap of the R1s). Obviously, every morpheme has a perfect collocational overlap with itself. This is shown by the diagonal scores in the matrix (Table 3) that are all equal to 1. The matrix furthermore reveals some interesting patterns:
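The cosine measure can be sketched in a few lines. This is an illustration only: the R1 counts below are hypothetical (they are not the corpus figures), raw counts stand in for the weighted co-occurrence frequencies, and the function name is an assumption rather than the Rling implementation.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as {feature: weight} dicts."""
    dot = sum(w * v[f] for f, w in u.items() if f in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector shares nothing with any other vector
    return dot / (norm_u * norm_v)

# Hypothetical R1 counts, for illustration only.
vectors = {
    "nep":   {"wimper": 19, "profiel": 4, "site": 3},
    "fake":  {"profiel": 6, "site": 2, "account": 5},
    "kunst": {"gras": 12, "stof": 20},
}

# Pairwise similarity matrix over all target words.
names = sorted(vectors)
matrix = {a: {b: cosine_similarity(vectors[a], vectors[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 3) for b in names])
```

With non-negative counts the cosine lies between 0 and 1: two morphemes that share no R1 at all score 0, and each morpheme scores 1 against itself.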
fake/nep have the highest similarity scores (>0.06);
nep/namaak (>0.05) and imitatie/namaak (>0.04) are relatively similar to each other;
the morphemes fop, kunst, lok and schijn display low cosine similarity toward the other morphemes (<0.03);
no collocational overlap at all is found between the pairs fop/kunst, fop/lok, imitatie/lok and kunst/fop.
Once these cosine similarity measures have been established, they can be transformed into distances, using the pam function in the cluster package (Maechler et al. 2019; for details on this method see Levshina 2015: 330–331). This method enables us to draw a cluster dendrogram (Figure 1) that provides a visualization of the results of the SVS analysis. In order to obtain the optimal number of clusters, this function computes the “average silhouette width”, which ranges from 0 (no clusters at all) to 1 (perfect separation of all clusters) (Levshina 2015: 311). The highest average silhouette width is obtained with seven clusters, although it is still quite low (0.25). The resulting cluster dendrogram in Figure 1 is drawn using the ward.D2 algorithm, which minimizes the increase in the variance in the distances between the members of the clusters (Levshina 2015: 311). As Figure 1 shows, fake and nep cluster together, whereas imitatie and namaak belong to the same overarching branch. The other four morphemes, kunst, lok, fop and schijn, all form individual branches, which indicates that they have unique semantic-distributional properties.
The semantic-distributional analyses explained in the preceding sections have first of all shown that fake and nep are most similar semantically: their semantic-distributional profiles are most alike, although with still a relatively high number of specific collexemes. These morphemes stand out as the most “neutral” “fake” morphemes, and nep could be considered the Dutch equivalent of English fake. A closer look at the dataset shows that fake and nep are indeed quite often interchangeable (27–28), but that – unsurprisingly – fake more often collocates with English loanwords than nep (e.g., fake fur, fake account, fake speech).
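The role of the average silhouette width in choosing a number of clusters can be illustrated with a small sketch. Several things here are assumptions rather than details taken from the analysis: distances are obtained as 1 minus the cosine similarity, the similarities are hypothetical toy values (not those of Table 3), singleton clusters receive a width of 0 by convention, and the function name is invented. The partitions are supplied by hand and are not computed by a clustering algorithm.

```python
def average_silhouette_width(dist, clusters):
    """Mean silhouette width of a partition (needs at least two clusters).

    dist: {item: {item: distance}}, a symmetric distance lookup
    clusters: list of lists of items, together covering all items once
    """
    widths = []
    for own in clusters:
        for item in own:
            if len(own) == 1:
                widths.append(0.0)  # convention for singleton clusters
                continue
            # a: mean distance to the other members of the item's own cluster
            a = sum(dist[item][o] for o in own if o != item) / (len(own) - 1)
            # b: mean distance to the nearest other cluster
            b = min(
                sum(dist[item][o] for o in other) / len(other)
                for other in clusters if other is not own
            )
            widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)

# Hypothetical cosine similarities between four target words.
sim = {
    ("w1", "w2"): 0.9, ("w3", "w4"): 0.9,
    ("w1", "w3"): 0.1, ("w1", "w4"): 0.1, ("w2", "w3"): 0.1, ("w2", "w4"): 0.1,
}
items = ["w1", "w2", "w3", "w4"]
dist = {i: {j: 0.0 for j in items} for i in items}
for (i, j), s in sim.items():
    dist[i][j] = dist[j][i] = 1.0 - s  # assumed similarity-to-distance transformation

two = average_silhouette_width(dist, [["w1", "w2"], ["w3", "w4"]])
three = average_silhouette_width(dist, [["w1", "w2"], ["w3"], ["w4"]])
print(round(two, 3), round(three, 3))
```

In this toy case the two-cluster partition scores higher than the three-cluster one, so it would be preferred. The same comparison across candidate partitions is what selects the number of clusters.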
A second conclusion is that imitatie and namaak also have relatively comparable distributional-semantic profiles in terms of semantic vectors. A comparison of their distinctive collexemes (see Table 2), however, reveals that imitatie primarily refers to faithful copies of expensive natural materials (leather, fur, caviar, marble, etc.), whereas namaak more often has a connotation of fraud (medicine, article, cigarettes or brand names such as Uggs).
Finally, the remaining four “fake” morphemes have more specific and unique semantic-distributional profiles. Based on their most distinctive collexemes (Table 2), their constructional meanings can be described as follows:
kunst-constructions refer to an “artificial, synthetic, not natural X” (e.g., kunststof ‘plastic’, kunstgras ‘artificial turf’);
schijn-constructions refer to an “apparent X”; schijn- expresses a superficial resemblance (e.g., schijnzwangerschap ‘phantom pregnancy’), often accompanied by a pejorative connotation (e.g., schijnheilig ‘hypocritical’, schijnveiligheid ‘false safety’);
fop-constructions refer to a “fake X with the intention to fool somebody, meant as a joke” (e.g., fopspeen ‘dummy’, fopneus ‘fake party nose’, fopgesprek ‘joke interview’);
finally, lok-constructions refer to a “fake X in order to lure somebody, intended as a trap” (e.g., lokvogel ‘decoy (bird)’, lokfiets ‘bait bike’).
These conclusions are corroborated by the distinctive collexeme ratios of each morpheme, i.e., its number of distinctive collexemes divided by its number of types in the corpus sample. Figures 2 and 3 show that fake, imitatie, nep and namaak have the lowest distinctive collexeme ratios, while fop, kunst, lok and schijn have the highest scores. This means that the former morphemes have relatively few distinctive R1s with respect to their (high) number of types, whereas the latter morphemes have a relatively high proportion of distinctive R1s on their total (low) number of types. The former morphemes can therefore be considered to have more “neutral” “fake” semantics, while the latter four have more specific semantic-distributional profiles. In the case of kunst and lok, for instance, more than half of their number of types are distinctive with respect to the other “fake” morphemes.
Having examined the semantic-distributional behavior of the eight “fake” morphemes, we now turn to their morphological properties. More specifically, we are interested in the extent to which each morpheme allows debonding and categorical flexibility. We consider debonding here as a purely orthographic criterion; it implies that the compound is written apart (e.g., in schijn democratie ‘fake democracy’). Sequences written as one word (e.g., kunstgebit ‘false teeth’) and hyphenated sequences (e.g., imitatie-Vespa ‘counterfeit Vespa’) have been considered instances of bound use (see Section 1 on the use of hyphenation in Dutch compounds). In our comparison, we include the English loan fake, even though it may be argued that sequences such as fake tattoo were borrowed directly and hence have never been written together. This implies that, strictly speaking, we cannot be certain whether free fake is indeed an instance of debonding, but it may nevertheless be interesting to compare the spelling variation of fake constructions to the other seven. The fake frequencies are therefore included in the tables and figures below, with the proviso that separate fake does not necessarily reflect debonding.
Categorical flexibility, on the other hand, refers to all morphological or syntactic features that flag “adjectivehood”: predicative use, gradation, inflection, coordination with another adjective, scope over an NP, verb or adjective modification (in the case of adverbial use) (cf. examples 17─23).
| Morpheme | Debonding ratio | Adjectivehood ratio |
|---|---|---|
| fake | 372/500 = 0.744 | 77/500 = 0.154 |
| fop | 15/500 = 0.030 | 0 |
| imitatie | 197/500 = 0.394 | 40/500 = 0.080 |
| lok | 3/388 = 0.008 | 0 |
| namaak | 152/500 = 0.304 | 44/500 = 0.088 |
| nep | 150/500 = 0.300 | 93/500 = 0.186 |
| schijn | 10/500 = 0.020 | 2/500 = 0.004 |
Strikingly, we observe a similar dichotomy between two groups of morphemes as we found in the semantic-distributional analysis: fake, imitatie, nep and namaak have relatively high degrees of debonding and/or adjectivehood, whereas fop, schijn, lok and kunst do not show this “extravagant” morphological behavior. The latter morphemes are (almost) exclusively found in their bound forms (e.g., fopwetenschapper ‘pseudo-scientist’, schijnbeweging ‘feint (movement)’, lokagent ‘fake policeman’, kunstgebit ‘false teeth’) and do not show any signs of adjectivehood.
Imitatie and namaak have a relatively high degree of debonding (>30%). This may be due to their length and the fact that they form relatively complex compounds (e.g., imitatie sneeuwkettingen ‘imitation snow chains’, namaak merkartikelen ‘counterfeit branded goods’). When they combine with (capitalized) proper names referring to brands, debonding also occurs: for instance, namaak Gucci ‘counterfeit Gucci’, imitatie iPad ‘imitation iPad’. However, both morphemes show few signs of adjectivehood: mostly some predicative use (29), but no inflection at all.
When the spelling criterion is applied to fake, this morpheme turns out to be written separately most often (74.4%), but, as we argued above, this is due to the fact that it is an English loanword and is not bound in English “fake” constructions, such as fake account. When the R1 is a Dutch word, it is bound in about a quarter of the occurrences in the sample (e.g., fake-advertentie ‘fake advertisement’). Its degree of adjectivehood in Dutch, mostly manifested through predicative use, as in (30), is however rather low (15.4%).
An important proviso is in order at this point. Since fake ends in -e, which is the ending of inflected Dutch adjectives, we have been unable to determine the presence of inflection of fake in our written data. In spoken Dutch, however, fake may be inflected (in which case it is pronounced as two syllables, not one), as shown in the study by Goublomme (2019) of the integration of fake in both written and spoken Dutch. Her analysis of survey data from 185 native speakers revealed that almost half of them inflect fake in spoken discourse in syntactic contexts where Dutch adjective inflection is required, as in (31). Based on these facts, we may assume that the degree of adjectivehood of fake in our corpus sample is in reality somewhat higher than the reported ratio.
Finally, nep stands out as the Dutch “fake” morpheme with the highest degree of adjectivehood (18.6%). More than 60% of its debonded uses involve signs of categorical flexibility, including inflection and gradation, as in (32–33).
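The ratios discussed above are simple proportions of the sample size, and they can be rechecked with a few lines of code. The following Python sketch recomputes the debonding and adjectivehood ratios for fake and nep from the reported counts, and then the share of debonded nep tokens that show adjectival features. The function name and data layout are illustrative choices of ours, and the last step assumes (as the figure of “more than 60%” suggests) that the 93 adjectival nep tokens all fall within the 150 debonded ones.

```python
# Token counts as reported in the table above: (debonded, adjectival, sample size).
COUNTS = {
    "fake": (372, 77, 500),
    "nep": (150, 93, 500),
}

def ratios(debonded, adjectival, sample_size):
    """Return (debonding ratio, adjectivehood ratio) for one morpheme."""
    return debonded / sample_size, adjectival / sample_size

for morpheme, (debonded, adjectival, size) in COUNTS.items():
    debonding_ratio, adjective_ratio = ratios(debonded, adjectival, size)
    print(f"{morpheme}: debonding = {debonding_ratio:.3f}, "
          f"adjectivehood = {adjective_ratio:.3f}")

# ASSUMPTION: all adjectival nep tokens are among the debonded ones,
# so the share is simply 93 out of 150.
nep_debonded, nep_adjectival, _ = COUNTS["nep"]
nep_share = nep_adjectival / nep_debonded
print(f"nep: adjectival share of debonded uses = {nep_share:.2f}")
```

Under that assumption the share for nep comes out at 0.62, which matches the statement that more than 60% of its debonded uses show categorical flexibility.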
To sum up, we have seen so far that the “fake” morphemes with the most neutral “fake” semantics undergo debonding more often and show more categorical flexibility than the ones with more specific semantic-distributional profiles. In the next section, we analyze whether the same dichotomy applies to the productivity of the morphemes under study.
In order to establish whether there exists a correlation between debonding and productivity, we computed both the type/token ratio (TTR) and the potential productivity (PP) of all eight “fake” morphemes.
| Morpheme | Type/token ratio (TTR) | Potential productivity (PP) |
|---|---|---|
| fake | 282/500 = 0.564 | 218/500 = 0.436 |
| fop | 67/500 = 0.134 | 43/500 = 0.086 |
| imitatie | 245/500 = 0.490 | 192/500 = 0.384 |
| kunst | 29/500 = 0.058 | 10/500 = 0.020 |
| lok | 43/388 = 0.111 | 21/388 = 0.054 |
| namaak | 284/500 = 0.568 | 223/500 = 0.446 |
| nep | 318/500 = 0.636 | 249/500 = 0.498 |
| schijn | 135/500 = 0.270 | 88/500 = 0.176 |
The productivity ratios tally with the figures for distinctive collexemes discussed in Section 3.2. On the one hand, nep, fake, namaak and imitatie show relatively high productivity ratios, both regarding the number of different types they create (TTR) and the number of hapax legomena (PP) (e.g., nepzwanger ‘falsely pregnant’, namaakwielrenner ‘fake (amateurish) cyclist’). Schijn, fop, lok and kunst, on the other hand, are the least productive morphemes. Nep is the most productive “fake” morpheme and kunst the least productive one. Illustrative of this difference in productivity is the fact that nep- forms 318 different types among which 249 are hapaxes in the data set of 500 occurrences (TTR = 0.64, PP = 0.50), while kunst- features only 29 different types, among which 10 are hapaxes (TTR = 0.06, PP = 0.02). Moreover, the most frequent type kunststof ‘synthetic material’ covers 64.4% (322/500) of its tokens. These differences in frequency distributions are illustrated in Figure 6.
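To make the two productivity measures concrete, the sketch below computes them from a list of tokens: TTR as the number of distinct types divided by the number of tokens, and PP as the number of hapax legomena divided by the number of tokens. The function name is ours, and the small token list is a hypothetical sample constructed for illustration, not corpus data; the final lines simply restate the counts reported for nep and kunst.

```python
from collections import Counter

def productivity(tokens):
    """Return (TTR, PP) for the compound tokens of one morpheme.

    TTR = number of distinct types / number of tokens
    PP  = number of hapax legomena (types occurring once) / number of tokens
    """
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    n_hapaxes = sum(1 for freq in counts.values() if freq == 1)
    return n_types / n_tokens, n_hapaxes / n_tokens

# HYPOTHETICAL mini-sample (not corpus data): one dominant type, few hapaxes.
toy_sample = ["kunststof"] * 6 + ["kunstgras"] * 2 + ["kunstgebit", "kunstleer"]
toy_ttr, toy_pp = productivity(toy_sample)
print(f"toy sample: TTR = {toy_ttr:.2f}, PP = {toy_pp:.2f}")

# The same ratios from the counts reported for nep and kunst (500 tokens each).
nep_ttr, nep_pp = 318 / 500, 249 / 500
kunst_ttr, kunst_pp = 29 / 500, 10 / 500
print(f"nep:   TTR = {nep_ttr:.2f}, PP = {nep_pp:.2f}")
print(f"kunst: TTR = {kunst_ttr:.2f}, PP = {kunst_pp:.2f}")
```

A sample dominated by a single frequent type, as in the toy list, yields low values on both measures, which is the pattern described above for kunst.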
Given these differences, it can be useful to plot potential productivity against type frequency, a method proposed in Baayen and Lieber (1991: 818–819), which they term “global productivity”. Global productivity of the eight “fake” morphemes is shown in Figure 7, with potential productivity on the y-axis and type frequency on the x-axis. This figure shows a clear cline from nep, which has the highest global productivity, to kunst, which has the lowest.
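The cline can also be read off the productivity table without the plot. As a minimal sketch, the code below pairs each morpheme's type frequency (x) with its potential productivity (y) and ranks the morphemes on both dimensions. The names are illustrative, and one caveat applies: the lok sample contains 388 rather than 500 tokens, so its raw type frequency is not strictly comparable to the others.

```python
# (types, hapaxes, tokens) per morpheme, taken from the productivity table.
DATA = {
    "fake": (282, 218, 500),
    "fop": (67, 43, 500),
    "imitatie": (245, 192, 500),
    "kunst": (29, 10, 500),
    "lok": (43, 21, 388),  # smaller sample than the other morphemes
    "namaak": (284, 223, 500),
    "nep": (318, 249, 500),
    "schijn": (135, 88, 500),
}

def global_productivity(types, hapaxes, tokens):
    """Return the (x, y) point: type frequency and potential productivity."""
    return types, hapaxes / tokens

points = {morpheme: global_productivity(*row) for morpheme, row in DATA.items()}

# Rank the morphemes separately on each axis, from highest to lowest.
by_types = sorted(points, key=lambda m: points[m][0], reverse=True)
by_pp = sorted(points, key=lambda m: points[m][1], reverse=True)

print("by type frequency:        ", by_types)
print("by potential productivity:", by_pp)
```

With the reported counts, both rankings give the same order, running from nep through namaak, fake, imitatie, schijn, fop and lok down to kunst.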
From the preceding analyses, it is clear that two groups of “fake” morphemes can be distinguished based on their semantic-distributional profile, their morphological properties and their productivity. Fake, nep, namaak and imitatie are semantically close to each other, allow a relatively high degree of debonding and – albeit to a lesser extent – of categorical flexibility. Moreover, they are productive morphemes. Schijn, fop, lok and kunst have semantically distinctive profiles, hardly allow debonding or adjectival use and show low productivity. In the next three sections, we will explore the relation between these three parameters in a stepwise fashion. Section 4.1 focuses on the relation between productivity and debonding, Section 4.2 studies how the semantic profile relates to productivity, and Section 4.3 investigates the relationship between the three parameters.
In order to answer the question whether increasing productivity significantly correlates with increasing debonding, we performed a correlation analysis with linear regression modeling (Levshina 2015: 115–170) between TTR and debonding ratio. Since debonding of fake was already predicted by its English origin, this item was excluded from the analysis.
Figure 8 indicates a significant correlation between productivity (TTR) and the extent of debonding of the seven morphemes (p = 0.004396 (**); R² = 0.8288; cor (Pearson’s) = 0.910). This result implies that an increase in productivity is accompanied by an increase in debonded use. Again, kunst on the one hand and nep on the other can be identified as the two opposite endpoints of the cline.
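The strength of this correlation can be approximated from the rounded ratios given in the two tables above. The following Python sketch is only an illustration and not the procedure used for the analysis itself: it fits an ordinary least-squares line by hand and computes Pearson's r for the seven morphemes. Because the debonding table lists no row for kunst, its debonding ratio is set to 0 here as an assumption; the function and variable names are likewise ours.

```python
from math import sqrt

# (TTR, debonding ratio) per morpheme, rounded as printed in the tables.
# fake is excluded, as in the analysis described above.
# ASSUMPTION: kunst has no row in the debonding table; its ratio is set to 0.
POINTS = {
    "fop": (0.134, 0.030),
    "imitatie": (0.490, 0.394),
    "kunst": (0.058, 0.000),
    "lok": (0.111, 0.008),
    "namaak": (0.568, 0.304),
    "nep": (0.636, 0.300),
    "schijn": (0.270, 0.020),
}

def simple_regression(xs, ys):
    """Return (slope, intercept, Pearson's r) for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

ttr_values = [ttr for ttr, _ in POINTS.values()]
debonding_values = [debonding for _, debonding in POINTS.values()]
slope, intercept, r = simple_regression(ttr_values, debonding_values)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"Pearson's r = {r:.3f}, r squared = {r * r:.3f}")
```

With these inputs, r comes out at roughly 0.91 and its square at roughly 0.83, in line with the values reported above; small deviations are to be expected because the inputs are rounded to three decimals.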
In order to explain this positive correlation between productivity and debonding, we need to take into account a third parameter, namely, semantic coherence. Barðdal (2008) claims that there exists an inverse correlation between type frequency and semantic coherence of a construction, as represented in Figure 9:
[…] the higher the type frequency of a construction, the lower the degree of semantic coherence is needed for a construction to be productive. Conversely, the lower the type frequency of a construction, the higher degree of semantic coherence is needed for a construction to be extendable. (Barðdal 2008: 35)
Partially filled constructions (or constructional idioms) typically impose formal and/or semantic restrictions on the items that can fill their open slots, and these restrictions constrain the productivity of the construction (see, for instance, Gyselinck and Colleman (2016) on the pseudo-reflexive resultative construction in Dutch).
We argue that this inverse correlation between type frequency and semantic coherence also holds for our “fake” morphemes. Nep, fake, namaak and imitatie have the highest type frequency. Additionally, the semantic-distributional analysis in Section 3.2 indicated that nep and fake are semantically the most neutral “fake” morphemes, followed by imitatie and namaak, which display some privileged connotations. These four morphemes show the lowest distinctive collexeme ratios, which we consider a proxy for their (relatively low) semantic coherence. Schijn, fop, lok and kunst have been shown to display low type frequency and high distinctive collexeme ratio, which suggest a relatively high semantic coherence: in Section 3.2.3 we have identified the specific semantic restrictions each of these four morphemes imposes on its R1. Consequently, the eight “fake” morphemes can be ranged on Barðdal’s cline according to their degree of semantic coherence and type frequency. However, Barðdal’s correlation in Figure 9 is schematic, so it might be interesting to explore whether an actual regression analysis yields a similar picture. Therefore, to visualize the relation between type frequency and semantic coherence based on our corpus data, we performed a second regression analysis, in which we took the distinctive collexeme ratio (see Figure 3 in Section 3.2.3) as a measure for semantic coherence. The regression line in Figure 10 bears a strong resemblance to Barðdal’s schematization, reflecting a strong inverse relationship (the correlation is negative): p = 0.0004905 (***); Adjusted R² = 0.8664; cor (Pearson’s) = −0.9410214.
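For a regression with a single predictor, the adjusted coefficient of determination follows directly from Pearson's r and the number of observations. As a small consistency check (a sketch with a function name of our own, using the standard adjustment formula), the code below derives the adjusted value from the reported correlation for the eight morphemes.

```python
def adjusted_r_squared(r, n_observations, n_predictors=1):
    """Adjusted R squared for a least-squares fit, derived from Pearson's r.

    Uses the standard adjustment:
    1 - (1 - r^2) * (n - 1) / (n - k - 1), with k predictors.
    """
    r_squared = r ** 2
    penalty = (n_observations - 1) / (n_observations - n_predictors - 1)
    return 1 - (1 - r_squared) * penalty

# Reported correlation between distinctive collexeme ratio and type frequency,
# computed over all eight morphemes.
adjusted = adjusted_r_squared(-0.9410214, n_observations=8)
print(f"adjusted R squared = {adjusted:.4f}")
```

The result agrees with the adjusted value of 0.8664 given for Figure 10, so the reported statistics are mutually consistent for a sample of eight data points.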
In the preceding sections, we have shown that the type frequency of the “fake” morphemes positively correlates with their degree of debonding and that there exists an inverse correlation between their distinctive collexeme ratio and type frequency, the former taken as a proxy of semantic coherence. This implies that high type frequency goes together with low semantic coherence and high degree of debonding, and, conversely, that low type frequency correlates with high semantic coherence and low degree of debonding.
Earlier research suggests an inverse correlation between semantic coherence and debonding. For instance, Battefeld et al. (2018) show that evaluative prefixoids such as German Spitze ‘top’, Dutch top ‘top’ or Swedish skit ‘shit’ debond far more often than compounding elements that have a specific meaning. Similarly, Norde and Van Goethem (2018: 489) found that evaluative prefixoids have a far higher debonding ratio than their equivalents in simile constructions. For example, Dutch kei (literally ‘boulder’) is separated from the adjective with which it collocates in 48% of all compounds where kei has intensifying meaning (e.g., kei vatbaar ‘very prone’), but only in 12% of collocations where it has simile meaning (e.g., keihard ‘boulder hard’, i.e., ‘hard as a boulder’). In both these studies, the evaluative/intensifying prefixoids have lower semantic coherence due to substantial semantic bleaching, which goes hand in hand with higher debonding ratios. We assume that the same holds true for the Dutch “fake” morphemes. Low semantic coherence, as in the case of the constructions with nep, fake, namaak and imitatie, may result in weak morphological coherence and trigger debonding, potentially followed by adjectival reanalysis. Conversely, high semantic coherence, as attested in the constructions with schijn, fop, lok and kunst, may correlate with strong morphological coherence and prevent debonding of the “fake” morpheme.
Adopting a constructionist point of view (cf. Section 1), we claim that the eight Dutch “fake” constructions can be ranged on a cline from [+schematic] to [+substantive]. The nep-construction, for instance, shows low semantic coherence and high type frequency, and can therefore be considered more “schematic” than, for instance, the kunst-construction. The latter forms part of semantically coherent compounds with a limited number of types and can therefore be considered more “substantive”. Interestingly, only the most schematic constructions of the cline undergo debonding: they form productive schemas with few semantic restrictions on their open slot and therefore resemble syntactic patterns of the attributive [ADJ N]NP type (e.g., neppaspoort/vals paspoort ‘fake/false passport’). This similarity may trigger the adjectival reanalysis of a morpheme such as nep. The least schematic constructions of the cline occur as coherent, often lexicalized compounds (e.g., fopspeen ‘dummy’, kunststof ‘synthetic material’): they clearly belong to the domain of lexical morphology – and not of syntax – and hence do not undergo debonding or reanalysis into a syntactic construction.
Figure 11 summarizes this relation between semantic coherence (distinctive collexeme ratio), type frequency and debonding.
In this paper, we have concentrated on the “extravagant” properties of evaluative morphology. More specifically, we have argued that debonding and categorical flexibility of a set of Dutch “fake” morphemes should be considered an instance of extravagant morphology. Consequently, we refute Haspelmath’s (1999) claims that extravagance is a property of grammaticalization and that, as a result, degrammaticalization cannot occur.
Through a combination of an extensive corpus analysis and statistical methods (SVS, Multiple Distinctive Collexeme analysis, Regression), we have explored the factors that trigger the debonding and categorical flexibility of some Dutch “fake” morphemes, and that prevent others from debonding.
Our analyses show that this “extravagant” behavior is triggered by an interplay of two factors: type frequency and semantic coherence. These parameters allow us to range the eight “fake” morphemes on a cline from [high type frequency]/[low semantic coherence] to [low type frequency]/[high semantic coherence]. We argue that morphemes belonging to the upper part of the cline (nep, fake, namaak, imitatie) undergo debonding more easily due to their low degree of semantic and morphological coherence and their resemblance to the syntactic/schematic attributive construction (cf. the cline nepherinneringen/nep herinneringen/neppe herinneringen/valse herinneringen ‘fake/false memories’). Conversely, morphemes belonging to the lower part of the cline display strong semantic and morphological coherence and the (lexicalized) compounds they form are not subject to debonding or further reanalysis into adjectives.
Funding source: Belgian National Research Fund (F.R.S.-FNRS)
Award Identifier / Grant number: Research Credit FNRS-CDR J.0211.20 ACCROSS
Baayen, Harald. 2009. Corpus linguistics in morphology: Morphological productivity. In Anke Lüdeling & Merja Kytö (eds.), Corpus linguistics. An international handbook, 900–919. Berlin & New York: Mouton de Gruyter.
Barðdal, Jóhanna. 2008. Productivity: Evidence from case and argument structure in Icelandic. Amsterdam & Philadelphia: John Benjamins.
Battefeld, Malte, Torsten Leuschner & Gudrun Rawoens. 2018. “Evaluative Morphology” in German, Dutch and Swedish: Constructional networks and the loci of change. In Kristel Van Goethem, Muriel Norde, Evie Coussé & Gudrun Vanderbauwhede (eds.), Category change from a constructional perspective, 229–262. Amsterdam & Philadelphia: John Benjamins.
Booij, Geert. 2009. Compounding and construction morphology. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford handbook of compounding, 201–216. Oxford: Oxford University Press.
Booij, Geert. 2010. Construction morphology. Oxford: Oxford University Press.
Cappelle, Bert, Pascal Denis & Manuela Keller. 2018. Facing the facts of fake: A distributional semantics and corpus annotation approach. Yearbook of the German Cognitive Linguistics Association 6(1). 9–42. https://doi.org/10.1515/gcla-2018-0002.
Dressler, Wolfgang Ulrich & Lavinia Merlini Barbaresi. 1994. Morphopragmatics. Diminutives and intensifiers in Italian, German, and other languages. Berlin & New York: Mouton de Gruyter.
Dressler, Wolfgang Ulrich & Annemarie Karpf. 1995. The theoretical relevance of pre- and protomorphology in language acquisition. In Geert Booij & Jaap van Marle (eds.), Yearbook of morphology, 99–122. Dordrecht: Kluwer.
Eitelmann, Matthias & Dagmar Haumann. 2019. Extravagant morphology. Workshop organized at the 52nd Annual Meeting of the Societas Linguistica Europaea (SLE 2019), Leipzig University, 21–24 August 2019.
Firth, John Rupert. 1957. A synopsis of linguistic theory 1930–1955. In John Rupert Firth (ed.), Studies in linguistic analysis, 1–32. Oxford: Blackwell.
Goldberg, Adele. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Goublomme, Florette. 2019. Het gebruik van het adjectief “fake” in het Nederlands. Corpusstudie en enquête bij moedertaalsprekers Nederlands over het gebruik van “fake” in geschreven en gesproken taal [The use of the adjective “fake” in Dutch. Corpus study and survey of native Dutch speakers on the use of “fake” in written and spoken language]. Bachelorpaper supervised by K. Van Goethem, Université catholique de Louvain.
Grandi, Nicola & Lívia Körtvélyessy (eds.). 2015. Edinburgh handbook of evaluative morphology. Edinburgh: Edinburgh University Press.
Gyselinck, Emmeline & Timothy Colleman. 2016. Je dood vervelen of je te pletter amuseren? Het intensiverende gebruik van de pseudoreflexieve resultatiefconstructie in hedendaags Belgisch en Nederlands Nederlands [To be bored to death or to enjoy oneself tremendously? The intensifying use of the pseudo-reflexive resultative construction in contemporary Belgian and Dutch Dutch]. Handelingen: Koninklijke Zuid-Nederlandse Maatschappij voor Taal- en letterkunde en geschiedenis 69. 103–136.
Hoffmann, Thomas & Graeme Trousdale (eds.). 2013. The Oxford handbook of Construction Grammar. Oxford: Oxford University Press.
Keller, Rudi. 1994. Language change: The invisible hand in language. London: Routledge.
Kilgarriff, Adam, Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář, Jan Michelfeit, Pavel Rychlý & Vít Suchomel. 2014. The Sketch Engine: Ten years on. Lexicography 1. 7–36. https://doi.org/10.1007/s40607-014-0009-9.
Körtvélyessy, Lívia. 2014. Evaluative morphology. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford handbook of derivational morphology, 296–316. Oxford: Oxford University Press.
Levshina, Natalia. 2015. How to do linguistics with R. Data exploration and statistical analysis. Amsterdam & Philadelphia: John Benjamins.
Levshina, Natalia & Kris Heylen. 2014. A radically data-driven Construction Grammar: Experiments with Dutch causative constructions. In Ronny Boogaart, Timothy Colleman & Gijsbert Rutten (eds.), Extending the scope of construction grammar (Cognitive Linguistics Research 54), 19–46. Berlin & New York: Mouton de Gruyter.
Mattiello, Elisa. 2013. Extra-grammatical morphology in English: Abbreviations, blends, reduplicatives, and related phenomena. Berlin & Boston: Mouton de Gruyter.
Maechler, Martin, Peter Rousseeuw, Anja Struyf, Mia Hubert & Kurt Hornik. 2019. cluster: Cluster analysis basics and extensions. R package version 2.0.9.
Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press.
Norde, Muriel & Kristel Van Goethem. 2014. Bleaching, productivity and debonding of prefixoids. A corpus-based analysis of “giant” in German and Swedish. Lingvisticae Investigationes 37(2). 256–274. https://doi.org/10.1075/li.37.2.05nor.
Norde, Muriel & Kristel Van Goethem. 2015. Emancipatie van affixen en affixoïden: Degrammaticalisatie of lexicalisatie? [Emancipation of affixes and affixoids: Degrammaticalization or lexicalization?]. Nederlandse Taalkunde 20(1). 109–148. https://doi.org/10.5117/nedtaa2015.1.nord.
Norde, Muriel & Kristel Van Goethem. 2018. Debonding and clipping of prefixoids in Germanic: Constructionalization or constructional change? In Geert Booij (ed.), The construction of words (Studies in Morphology 4), 475–518. Cham etc.: Springer.
Schäfer, Roland. 2015. Processing and querying large web corpora with the COW 14 architecture. In Piotr Bański, Hanno Biber, Evelyn Breiteneder, Marc Kupietz, Harald Lüngen & Andreas Witt (eds.), Proceedings of the 3rd Workshop on Challenges in the Management of Large Corpora (CMLC-3), 28–34. Mannheim: Institut für Deutsche Sprache.
Stefanowitsch, Anatol & Stefan Th. Gries. 2003. Collostructions: Investigating the interaction of words and constructions. International Journal of Corpus Linguistics 8(2). 209–243. https://doi.org/10.1075/ijcl.8.2.03ste.
Szymanek, Bogdan. 1988. Categories and categorization in morphology. Lublin: Redakcja Wydawnictw Katolickiego Uniwersytetu Lubelskiego.
Van Goethem, Kristel & Hendrik De Smet. 2014. How nouns turn into adjectives. The emergence of new adjectives in French, English and Dutch through debonding processes. Languages in Contrast 14(2). 251–277. https://doi.org/10.1075/lic.14.2.04goe.
Van Goethem, Kristel & Philippe Hiligsmann. 2014. When two paths converge: Debonding and clipping of Dutch reuze ‘lit. giant; great’. Journal of Germanic Linguistics 26(1). 31–64. https://doi.org/10.1017/s1470542713000172.
Van Goethem, Kristel & Matthias Hüning. 2015. From noun to evaluative adjective: Conversion or debonding? Dutch top and its equivalents in German. Journal of Germanic Linguistics 27(4). 366–409. https://doi.org/10.1017/s1470542715000112.
Van Goethem, Kristel & Nikos Koutsoukos. 2018. Morphological transposition as the onset of recategorization. The case of luxe in Dutch. Linguistics 56(6). 1369–1412.
von Mengden, Ferdinand & Anneliese Kuhle. 2020. Recontextualization and language change. To appear in Folia Linguistica Historica 41.
Wouden, Ton van der. 2019. nep-. Taalportaal. Retrieved from https://taalportaal.org/taalportaal/topic/pid/topic-13998813294731184 (accessed 05 December 2019).
Zwicky, Arnold M. & Geoffrey K. Pullum. 1987. Plain morphology and expressive morphology. General session and parasession on grammar and cognition. In Jon Aske, Natasha Beery, Laura Michaelis & Hana Filip (eds.), Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society, 330–340. Berkeley: Berkeley Linguistics Society.
© 2020 Walter de Gruyter GmbH, Berlin/Boston