Multilingual lexical transfer challenges monolingual educational norms: not quite!

Abstract: Foreign language learners frequently use words from their previously acquired language(s) in the target language, especially if these languages are related (Ringbom, Håkan. 2001. Lexical transfer in L3 production. In Jasone Cenoz, Britta Hufeisen & Ulrike Jessner (eds.), Cross-linguistic influence in third language acquisition: Psycholinguistic perspectives, 59–68. Clevedon: Multilingual Matters). Such insertions are referred to as 'lexical transfer', commonly divided into 'transfer of form' and 'transfer of meaning' (Bardel, Camilla. 2015. Lexical cross-linguistic influence in third language development. In Hagen Peukert (ed.), Transfer effects in multilingual language development, 111–128. Amsterdam: John Benjamins; Ringbom, Håkan. 2001. Lexical transfer in L3 production. In Jasone Cenoz, Britta Hufeisen & Ulrike Jessner (eds.), Cross-linguistic influence in third language acquisition: Psycholinguistic perspectives, 59–68. Clevedon: Multilingual Matters). Lexical transfer challenges the monolingual habitus prevailing in foreign language classes, which requires students to rely exclusively on the target language and inhibit other influences. Thus, in such English classes, students should avoid the use of different languages and ideally only produce monolingual English output. In this context, the current study investigates the use of lexical transfer instances in short English texts written by bilingual (Russian/Turkish-German) and monolingual (German) secondary school students (initially attending year 7) from a longitudinal perspective. It assesses i) whether the students increasingly adhere to the imposed normative rules and ii) what influence as of


Introduction
The English foreign language classroom in Germany, and also in other countries, is characterized by various explicit and implicit norms that guide the education process and at least partially determine its failure or success. First, current norms mostly prescribe British English as the standard variety, although American English has gained considerably in acceptability over recent decades (Syrbe and Rose 2018: 158). Other regional norms, however, are by and large considered unacceptable (Siemund et al. 2012). Second, the main target register systematically developed during English studies is written academic English, in spite of the current emphasis on communicative success rather than grammatical correctness (KMK Bildungsstandards 2012). All written tests, especially in the Gymnasiale Oberstufe (the higher academic track), presuppose a high proficiency in this register, and students are penalized for using non-norm-abiding language. Third, the German English foreign language classroom, especially in the upper grades, displays a strong monolingual orientation in that English serves both as the target language and the language of instruction, despite the multilingual turn in language education (Berthele 2020; Fuller 2020; García and Li Wei 2014; May 2014; Melo-Pfeifer 2018). The use of German or other languages carries the stigma of incompetence and both students and teachers make every effort to avoid it (Fuller 2020: 173). Language mixing is avoided (Fuller 2020: 167). This philosophy increases the time spent on the task of learning English by using it as much as possible, which is believed to enhance educational success. It represents the dominant philosophy in EFL classrooms around the world (see Malabarba 2019: 244-246, who views this as the legacy of Krashen 1985).
Against this background, it can come as no surprise that any form of cross-linguistic influence, be it in the form of code-switching or learner errors due to transfer, is heavily stigmatized and taken as diagnostic of an insufficient language learning process (Fuller 2020). We find this interesting, as code-switching in the opposite direction can be widely encountered in spoken and also written German, there being innumerable lexical borrowings (e.g. Teenager, flirten, etc.). The use of English words and phrases in German is associated with prestige, modernity, and stylishness. The speech of adolescents is replete with English (lost, cringe, no front, random, smooth, etc.). The lexical item lost was designated the Jugendwort (youth word) in 2020. 1 We also believe that the use of English lexical items in the German studying classroom is considerably more acceptable than that of German in the English classroom. Apparently, the inherent deficit philosophy targets certain areas, but spares others. This certainly reveals something about the relative social prestige of the two languages (de Swaan 2001), but also about the unfortunate reproduction mechanisms of power relations in the education system (Bourdieu 1991; Gogolin 2008). Moreover, since contemporary research suggests that the languages in the multilingual's mind enrich and support one another, rather than standing in one another's way, the normative monolingual approach is likely to forfeit important potential (Bonnet and Siemund 2018; Cummins 2007).
In the initial phases of foreign language acquisition students cannot but make errors and code-switch (Bardel 2015: 121). This is a natural response to the new communicative demands imposed on them. The goal of the foreign language classroom is to work towards the target language norms including the avoidance of code-switches. A natural expectation, then, is that the number of code-switches decreases along the learning trajectory.
The present study follows up the use of German lexical material in English learner data ('lexical transfer') over a period of two and a half years. We compare three learner groups, namely (i) learners of English with a monolingual German background, (ii) bilingual Russian-German learners of English, and (iii) bilingual Turkish-German learners. The students with a bilingual background effectively learn English as their third language (L3 acquisition). With 1,553 lexical transfer tokens drawn from sixty students during four measurement points (MP), the sample used here is certainly meaningful. Additionally, the study monitors English proficiency and controls for the background variables of age, gender, school type, and socio-economic status (SES).
The present study's research interest concerns the impact of the above background variables on lexical transfer over time. We approach the data with the following hypotheses in mind. First, there should be a general decrease of lexical transfer tokens over time (H1). This follows from the normative considerations introduced above and the assumption that the students show general learning progress. Second, lexical transfer primarily originates in German and not the heritage languages (H2), as the heritage students are unbalanced bilinguals and German is their dominant language. Third, we expect bilingual learners of English to be more norm-conscious (H3). This hypothesis is based on the observation that bilingualism is known to foster metalinguistic awareness (Herdina and Jessner 2002). Sensitivity to lexical transfer should rise with metalinguistic awareness. Accordingly, bilingual learners can be expected to produce fewer instances of lexical transfer in the school context and they should also show a stronger decrease over time. Fourth, female students should manifest lower lexical transfer ratios than male students (H4), since female gender is widely associated with higher levels of norm-sensitivity and norm-conformity. Generally, women appear to respond more strongly to overt rather than covert prestige (Cameron 2003; Eckert 1989). Fifth and finally, we expect effects of school type and socio-economic status such that students at the more prestigious institutions from high SES homes reveal higher levels of compliance with curricular norms (H5) (see Kultusministerkonferenz 2019; Stanat et al. 2016).

Lexical transfer refers to the effects of vocabulary knowledge in one language on the lexical system in another language. An important distinction drawn in research on lexical transfer is that between form-based and meaning-based lexical transfer (Bardel 2015: 117-120; Ringbom 2001: 60). 2 Form-based lexical transfer occurs when complete lexical bundles (form plus meaning) of a learner's background language are transferred to the target language. They are usually easy to identify. For Ringbom, transfer of form occurs in "complete language switches and in the use of deceptive cognates" (2001: 60). These words may be modified to fit the formal characteristics of the target language. Meaning-based cross-linguistic influence involves lexical items which form part of the target language lexicon but are not semantically appropriate in a given context (Lindqvist 2012: 257). Ringbom (2001: 60) subcategorizes semantic transfer "in calques, i.e. loan translations of multi-word units, and in semantic extension on the basis of the pattern of another language."

2 Ringbom's classification (1987) is probably the best-known attempt to study lexical transfer, as he was among the first to draw this distinction.

Factors influencing lexical transfer
Transfer can be observed between all language systems known to language users in any direction (Jarvis and Pavlenko 2008: 21-22) and it can either be positive or negative (Jarvis and Pavlenko 2008: 25; Odlin 2003: 438-439). During L3 production, "lexical elements seem to be easily mixed into an L3 from all previously acquired or learnt languages" (Bardel 2015: 113). Such (co)activation of previously acquired languages is governed by several factors.
To begin with, (psycho)-typological similarity is considered to be a significant factor in lexis acquisition. For example, Ringbom (2001: 65) stressed that typology is one of the most important prerequisites for cross-linguistic influence. Similarly, De Angelis and Selinker (2001: 49-56) suggest that the connection between L2 and L3 is stronger than between L3 and L1, especially when L2 and L3 are typologically similar. Moreover, subjective similarities and differences between languages have been shown to be important for transfer as well (De Angelis 2005; Odlin and Jarvis 2004; Williams and Hammarberg 1998). In such cases, one speaks of psychotypology (Kellerman 1983), which cannot easily be disentangled from typology.
In addition, Dewaele (1998) demonstrates that the order of language acquisition and the level of proficiency may be crucial in defining the source language. He studied non-target-like lexemes ("lexical inventions") in the advanced English/French interlanguage of Dutch L1 speakers (Dewaele 1998: 471). One group of the participants spoke French as their L2 and English as their L3, another had English as their L2 while French was their L3. Dewaele's (1998: 488) results show that during L3 production both L1 and L2 may be activated, but speakers of several languages tend to rely on the language(s) they are highly proficient in. Hence, it may be a learner's L2 that is most actively used, thus causing lexical transfer.
Furthermore, increasing proficiency in the target language has been shown to result in a decrease of non-target lexemes (Lindqvist 2009: 294). De Angelis and Selinker (2001: 50-56) investigated L3 English written production by two adult multilinguals and demonstrate that lack of linguistic knowledge during early stages of language acquisition tends to be compensated by relying on previously acquired languages. Bardel (2015: 121) reports that "in the beginning, when the vocabulary is restricted, lexical gaps need to be filled" which justifies why language learners use their word knowledge from other languages during early stages of language acquisition. Agustín-Llach in her recent research on spontaneous vocabulary production by intermediate EFL learners has shown that cognate use is the most frequent type of cross-linguistic influence for Spanish monolinguals and bilinguals who speak other languages as L1 in addition to Spanish (2019a: 57).

Methodology
The data for this study come from the large-scale longitudinal project on multilingual development Mehrsprachigkeitsentwicklung im Zeitverlauf (MEZ, 'Multilingual development: a longitudinal perspective'; Gogolin et al. 2017), which was carried out at the University of Hamburg from 2014 to 2019, sampling various secondary schools across Germany (see also Brandt et al. 2017; Gogolin 2021; Gogolin et al. 2021). The research aim of MEZ is to investigate how linguistic and non-linguistic factors influence the multilingual development of secondary school students in Germany.

Participants
The participants of the current study represent a stratified subgroup of the MEZ project. We randomly selected 20 Russian-German and 20 Turkish-German heritage speakers and a monolingual German group (n = 20) along a number of defined conditions. In every group, half of the participants attended Realschule (the lower or vocational-oriented secondary school track) and another half Gymnasium (the higher or university-bound school track). 3 We here exclusively focus on the younger cohort, which, at the beginning of the MEZ study, attended school year 7 (age 12 and 13). Data from all four measurement points (MP) are included, that is, MP 1 (middle of school year 7), MP 2 (beginning of school year 8), MP 3 (end of school year 8), MP 4 (end of school year 9).
The bilingual participants of this study are students with a migration background 4 who speak a heritage language, namely Russian or Turkish, and German, the majority language. These students can be categorized as unbalanced bilinguals (see Gogolin 2021; Lorenz et al. 2020). They fit Montrul's definition of heritage speakers (2010: 4, 2016: 17), because the language they use most frequently is German, and the use of the heritage languages is restricted to communication with family members, typically their parents or grandparents. Furthermore, in many households, German is used along with the heritage languages Russian or Turkish, as reported by 33 out of the 40 bilinguals in the current study.

3 In Germany, there are typically four years of primary education. Secondary education is divided into different types of schools, usually depending on the teachers' assessment based on school grades obtained in the final year of primary education. Normally, students have to reach a certain threshold to be allowed to attend the university-bound school track (for more detailed information see Kultusministerkonferenz 2019).
4 A person is considered to have a migration background if they or at least one parent did not acquire German citizenship by birth (Statistisches Bundesamt 2018: 4).
All MEZ participants learn English as a first foreign language in school. For the bilingual heritage speakers, this is the third language. All participating students had studied English for at least two years at the beginning of the project and had reached at least beginner/intermediate levels in English.

Instruments of data collection
All students were presented with a picture description task during each of the MPs to elicit comparable picture descriptions. The picture story Breakfast in Germany was offered during MP 1, Picnic Party during MP 2, and A Trip to Hamburg during MP 3. In MP 4, the participants were randomly assigned one of the above-mentioned tasks. Ideally, this would have yielded four texts per student, but two Russian-German heritage speakers did not participate in the last measurement point. Thus, the learner corpus of the study consists of 238 English texts, and a total of 32,433 word tokens. 5 Background information, such as gender, type of school, and family socioeconomic background (HISEI) 6 was gathered from all participants with questionnaires completed by the students, parents, and teachers. Moreover, MEZ also includes attitudinal and motivational variables but these did not turn out to contribute to the regression model fitted for the current study and are therefore not considered here. To assess their proficiency levels in English, the participants had to take a gap-filling C-test at every measurement point. These English C-tests were based on those used in the DESI-study (Klieme et al. 2006) and consisted of four short English texts each time.

Analysis and coding
We compiled a machine-readable learner corpus from the written output of the four picture description tasks. During the coding process, we went through all files at least twice to ensure consistency. On noticing a case of lexical transfer from the background languages German (GER), or the heritage languages Russian (RUS) and Turkish (TUR), we marked it and calculated the number of types and tokens of this lexeme to assess the type-token-ratios of lexical transfer among different groups of participants.
Lexical transfer was coded according to the classification scheme developed in Rahbari et al. (2019). We differentiated between cases of form- and meaning-based transfer. Form-based transfer was analyzed along three different dimensions: integration, motivation, and transfer of orthographic conventions. The following examples illustrate phonemic, morphological, and orthographic integration: 7 ballong from GER Luftballons 'balloons', <ng> for the [ŋ] sound (=phonemic integration); deked from GER decken 'cover', past tense ending <ed> was added (=morphological integration); salz from GER Salz 'salt', not capitalized [in German all nouns are capitalized] (=orthographic integration).
In addition, it was marked whether lexical transfer was motivated or not. In the former case, we differentiated between false and real cognates though not in the latter: salat from GER Salat 'lettuce' or from RUS cалат (salat) 'lettuce' 8 (=false cognate); Brötchen from GER Brötchen 'buns' (=no motivation). Then, transfer of orthographic conventions from the background languages into the English texts (capitalization, joint spelling, grapheme replacement or other) was marked in the data: Table (=capitalization); orangejuice (=joint spelling); glaß <glass> 'glass' (=grapheme replacement); appel from GER Apfel 'apple' (=other orthographic convention).
Under the category of meaning-based transfer, we grouped the cases of semantic extension and loan translation. The main difference between these two labels lies in the number of lexemes: cases of semantic extension include one lexeme and cases of loan translation consist of two or more lexemes or of compound nouns, 9 for example: many cheese from GER viel 'much/many' or from RUS много (mnogo) 'much/many' (=semantic extension); much things from GER viel 'much/many' or from TUR çok 'much/many' (=semantic extension); you must pay it all from GER alles bezahlen 'it all pay' or from RUS оплатить это все (oplatit' eto vse) (=loan translation). 10 Naturally, the learner corpus contains many cases where form- and meaning-based transfer can be observed at the same time, for instance: breads from GER Brote literally 'breads', here 'buns' (=false cognate, GER Brote is a false cognate with ENG breads, or semantic extension, the meaning of the noun bread is extended); cock from GER kochen 'boil/cook' (=false cognate, GER kochen is a false cognate with ENG cook, or semantic extension, the meaning of the verb cook is extended); you have all from GER alles 'everything' (=false cognate, GER alles is a false cognate with ENG all, or semantic extension, the meaning of all is extended) or from RUS все (vse) 'everything' (=semantic extension, the meaning of all is extended). 11 Moreover, we excluded cases where it was unclear whether it was a learner mistake or influence from any of the background languages.

7 The cases of lexical transfer are marked in the dimension of integration if the supplier language lexemes follow the phonemic, morphological, and orthographic constraints of the English language.
8 The picture story shows lettuce but not salad.
9 Note that some cases of loan translation instances involve both lexical and syntactic features. Their non-target-like use can be motivated by lexical and morpho-syntactic features of the background languages.
Also, we decided not to analyze sentences that were written in German or where code-switches as in example (1) could be observed.

(1) Ihr könnt in the Hamburg laden gehen und erinnerungen kaufen
    you can in the Hamburg store go and souvenirs buy
    'You can go to the Hamburg store and buy souvenirs.'

The matrix language in such sentences is German, visible here in the position of the verb phrase. The auxiliary verb occupies the second position (könnt) whereas the main verbs follow the noun phrases (gehen, kaufen).

Procedure
In the current study, we exclusively distinguish between type and token frequency of lexical transfer instances. We further disregard different subcategories of form-based or meaning-based lexical transfer. Types are unique instances of lexical transfer per text, whereas tokens represent the absolute number of lexical transfer instances, including repetitions. In order to assess the degree of normativity of the English writing, the main analysis is based on lexical transfer tokens instead of types. We normalized these to the basis of 100 words because the student texts differ in length. 12 This ensures comparability. In a first step, we investigate how these frequencies develop over time and how they interact with proficiency in English (based on the C-test results). Second, we relate these frequencies to a number of social variables, namely language group, type of school, gender, and socio-economic status. After considering each variable separately, they are combined in a generalized linear regression model to provide a more comprehensive view and to control for their individual effects.

10 The word order in Russian could be slightly different (это все оплатить (eto vse oplatit') 'it all pay'), however, both options exclude the use of a preposition.
11 Note that there are some cases of lexical transfer in the RUS-GER sub-corpus that could be analyzed as form- and meaning-based lexical transfer from German, but only as meaning-based transfer from Russian, as in such cases German and English share more cognates than Russian and English.
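The distinction between types and tokens and the normalization to a basis of 100 words can be illustrated with a minimal Python sketch. The function names and the example items are hypothetical and are not taken from the MEZ data or tooling; the sketch only assumes that the transfer instances of one text have already been identified and listed.

```python
from collections import Counter


def transfer_counts(transfer_items):
    """Return (types, tokens) for the lexical transfer instances of one text.

    Types are the unique items; tokens are all items including repetitions.
    """
    counts = Counter(transfer_items)
    return len(counts), sum(counts.values())


def per_100_words(n_tokens, text_length):
    """Normalize a raw count to a basis of 100 words."""
    if text_length <= 0:
        raise ValueError("text_length must be positive")
    return 100.0 * n_tokens / text_length


# Hypothetical text of 140 words with four marked transfer instances,
# one of which ("salat") is repeated.
items = ["salat", "Brötchen", "salat", "ballong"]
n_types, n_tokens = transfer_counts(items)
ratio = per_100_words(n_tokens, 140)
```

Because the normalized ratio depends only on the token count and the text length, texts of very different lengths (37 to 373 word tokens in this corpus) become directly comparable.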

Results
The following subsections correspond to the five hypotheses stated in Section 1 and present the results in the order given there. In addition, the final subsection includes a regression analysis which combines the previously introduced variables.

Decrease of lexical transfer tokens over time (H1)
Table 1 provides the descriptive frequency overview of the longitudinal development of the absolute numbers of lexical transfer as well as the mean ratio per 100 words for each language group separately. We removed outliers with extremely high lexical transfer token ratios (three standard deviations above the mean, i.e. more than 14 lexical transfer tokens per 100 words). In total, 24 out of 238 texts were removed from the final data set. 13 By and large, even after outlier removal, there remains a considerable amount of internal variation (visible in the standard deviations, which range between 1.71 and 3.62) across all language groups and all measurement points. Over time, i.e. from the first measurement point (MP 1) to the last (MP 4), we observe a decrease in all groups. Moreover, whereas the Turkish-German bilinguals started out with overall lower lexical transfer token ratios, this difference diminishes towards the later points of data collection. This means that over the course of two and a half years, the students gradually produce fewer lexical transfer tokens and the groups become more similar.
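The outlier criterion (more than three standard deviations above the mean) can be sketched as follows. This is an illustration only: the data are invented, the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev


def remove_high_outliers(ratios, n_sd=3.0):
    """Drop ratios lying more than n_sd standard deviations above the mean.

    Returns the retained values and the cutoff that was applied.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    cutoff = mean(ratios) + n_sd * stdev(ratios)
    kept = [r for r in ratios if r <= cutoff]
    return kept, cutoff


# Invented example: 19 texts with 2 transfer tokens per 100 words
# and one text with an extremely high ratio.
ratios = [2.0] * 19 + [30.0]
kept, cutoff = remove_high_outliers(ratios)
```

Only the upper tail is trimmed here, mirroring the description above; unusually low ratios are left untouched.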
12 The shortest text had 37 word tokens, whereas the longest consisted of 373 word tokens. The mean length across the corpus is 142, the median equals 140, and the standard deviation is 43.6.
13 Texts from all language groups (n GER = 10; n RUS-GER = 8; n TUR-GER = 6) as well as all measurement points (MP 1: n = 9; MP 2: n = 8; MP 3: n = 3; MP 4: n = 4) were considered outliers. These texts contained more than 14 and up to 34 lexical transfer tokens per 100 words.

Moreover, we ran a correlation analysis using Pearson's r for the scores of the English C-test. Without distinguishing language group membership, correlations between the C-test scores and the ratios of lexical transfer tokens reach statistical significance for all measurement points (MP 1: r = −0.27, p = 0.030; MP 2: r = −0.56, p < 0.001; MP 3: r = −0.53, p < 0.001; MP 4: r = −0.47, p < 0.001). With increasing C-test score (i.e. higher proficiency in English), the ratio of lexical transfer tokens decreases. A closer look at each language group separately reveals some interesting patterns. In the texts of the German monolingual students, all correlations are moderately or strongly negative and significant (MP 1: r = −0.54, p = 0.019; MP 2: r = −0.79, p < 0.001; MP 3: r = −0.81, p < 0.001; MP 4: r = −0.86, p < 0.001). The same, though with weaker correlational strengths, can be reported for the Turkish-German bilinguals, except for MP 4, where statistical significance is not reached (MP 1: r = −0.45, p = 0.028; MP 2: r = −0.61, p = 0.003; MP 3: r = −0.46, p = 0.028; MP 4: n.s.). However, none of the correlations in the texts of the Russian-German bilinguals reaches statistical significance, neither across the entire group, nor at any of the measurement points individually.
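For readers who want to see how the correlation coefficient is obtained, the following sketch computes Pearson's r from paired C-test scores and normalized transfer ratios. The numbers are invented for illustration and do not come from the study; the function name is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Invented data: higher C-test scores paired with lower transfer ratios.
c_test = [10, 20, 30, 40, 50]
transfer_ratio = [6.0, 5.0, 3.5, 3.0, 1.0]
r = pearson_r(c_test, transfer_ratio)
```

A negative r, as in this invented example, corresponds to the pattern reported above: the higher the C-test score, the lower the ratio of lexical transfer tokens.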

Source language of lexical transfer (H2)
The results show that German is the main source language of lexical transfer (here based on the absolute type frequency, see Figure 1) in both the Russian-German (77.1%, n = 384) and the Turkish-German (91.0%, n = 352) texts. A minor proportion can be accounted for by transfer from Russian (0.2%, n = 1) or Turkish (1.6%, n = 6), respectively, and a slightly higher number of lexical transfer instances can be analyzed as influence from Russian and German (22.3%, n = 111) or Turkish and German (7.5%, n = 29). Moreover, two lexemes (0.4%) in the Russian-German group may come from German or the foreign language French (recherche/cherche instead of English search, from German recherchieren or French recherche), and one out of the 410 lexical transfer types identified in the texts of the German monolinguals was coded as influence from French (supermarche instead of English supermarket, from French supermarché); the remaining types originate in German.

Norm consciousness of bilingual and monolingual students (H3)
When comparing each group internally, there is a statistically significant difference in the mean ratio of lexical transfer tokens over time in the group of the German monolinguals (t(30.81) = 1.78, p = 0.042) as well as the Russian-German bilinguals (t(32.90) = 1.84, p = 0.037) based on one-tailed t-tests comparing MP 1 with MP 4. The same effect cannot be observed in the Turkish-German group, although the means also decrease over time (t(34.93) = 1.38, p = 0.089). Furthermore, comparisons within one measurement point reveal that some of the observed differences are significant in MP 2 (between the monolingual German and the Russian-German students (t(32.00) = −3.22, p = 0.002), as well as between the Russian-German and Turkish-German students (t(32.61) = 2.46, p = 0.019)) and MP 3 (between the Russian-German and the Turkish-German students (t(33.89) = 2.66, p = 0.012)), but only marginally significant in MP 1 (between the Russian-German and Turkish-German bilinguals (t(33.52) = 1.89, p = 0.067)), and not significant at the latest measurement point MP 4.
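The fractional degrees of freedom reported for these comparisons are consistent with Welch's unequal-variance t-test. The text does not name the variant, so treating it as Welch's test is an assumption. Under that assumption, the test statistic and the Welch-Satterthwaite degrees of freedom can be sketched as follows, using invented data and a hypothetical function name.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    Assumption: this is the variant behind the fractional degrees of
    freedom in the text; the source does not state it explicitly.
    """
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of mean(a)
    vb = variance(b) / nb  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Invented normalized ratios for two groups of texts.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_value, df_value = welch_t(group_a, group_b)
```

Turning t and df into a p-value requires the cumulative distribution function of Student's t, which the Python standard library does not provide; a statistics package would be needed for that step.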

Gender effects (H4)
When looking at gender (female, male), none of the group internal differences in the mean ratios of the lexical transfer tokens, neither across the entire cohort nor subdivided into the four different measurement points and language groups, turns out to be significant. The only exceptions are the comparisons of female and male German monolingual students in the first measurement point (t(12.86) = −2.05, p = 0.031), where females produced significantly fewer lexical transfer instances than their male peers, and between the female and male Turkish-German bilinguals in the third measurement point (t(17) = 1.94, p = 0.035), where the mean ratio of lexical transfer tokens is lower among the males.

School type and SES effects (H5)
We ran several t-test comparisons with respect to type of school and normalized lexical transfer tokens, namely across the entire population in general and over time, as well as within each language group over time. There is a statistically significant difference between Gymnasium and Realschule when all participants and all MPs are considered (t(191.03) = −3.99, p < 0.001). This means that those attending Gymnasium have a significantly lower mean ratio of lexical transfer tokens than their peers attending Realschule. When testing each MP separately, again across the entire sample, the difference between the two school types is only significant at MP 2 (t(42.96) = −2.72, p = 0.005) and MP 3 (t(45.73) = −3.39, p = 0.001), with the same trend as before. Within each language group, a more diverse picture emerges. Among the monolingual German students, all t-tests are significant except for MP 1 (MP 2: t(10.32) = −2.64, p = 0.012; MP 3: t(14.86) = −3.14, p = 0.003; MP 4: t(8.29) = −2.50, p = 0.018). When repeating the same tests with the Russian-German or the Turkish-German students, the differences are not statistically significant, with the exception of MP 2 in the Russian-German group (t(14.47) = −2.04, p = 0.030) and MP 3 in the Turkish-German group (t(11.93) = −1.86, p = 0.044). Correlation analysis (Pearson's r) reveals a weak negative association between SES and the ratio of lexical transfer tokens when all participants over all four measurement points are considered (r = −0.233, p = 0.002). The higher the socio-economic status, the fewer lexical transfer tokens appear per text. When calculating the correlational strength for each measurement point separately, the moderate negative relationship is only significant for MP 2 (r = −0.407, p = 0.009). Further subdividing by language group while retaining the longitudinal perspective returns no significant negative correlations, except for MP 1 in the German monolingual group (r = −0.642, p = 0.017). This can most likely be explained as an effect of the resulting low numbers, with at times only 10 to 12 students per group, as SES is unknown for some MEZ participants.

Regression analysis
In order to control for the previously introduced background variables, we ran a linear regression analysis in R (R Core Team 2020). We used a Poisson regression, a special type of a generalized linear model, as we are dealing with frequencies (Gries 2013: 324-327; Levshina 2015: 170, 257). The final regression model includes the variables language group (group), measurement point (MP), type of school (school), English C-test score (C-test), and socio-economic status (SES). This model was selected via the best subset method (see Levshina 2015: 151-152) with the function regsubsets implemented in the package leaps (Lumley 2020). The data were checked visually to ensure linearity (via crPlots). Homoscedasticity of variance was verified with the Breusch-Pagan test, which returned a p-value above 0.05 (p = 0.10). None of the Variance Inflation Factors (VIF scores) was above 1.5, which signals that there is no multicollinearity. The Durbin-Watson test returned a p-value of 0.17, indicating no autocorrelation. Finally, the normal distribution of the residuals was assured with the Shapiro-Wilk normality test (p = 0.39).
The regression model (Table 2) includes 146 data points, which represent the normalized lexical transfer tokens of 146 texts. This is a considerable reduction of the original data set (n = 238) caused by the relatively high number of outliers as well as missing cases in the background data (for example missing SES values). The model explains 54% of the variance, which is an acceptable value. Four out of the five variables add significantly to explaining the variance, namely language group (Turkish-German), the measurement point (MP), the English C-test score, as well as the socio-economic status. Once all variables are controlled for, the Turkish-German bilinguals used significantly fewer lexical transfer tokens than their monolingual peers. Although the estimate is positive for the Russian-German bilinguals, it does not reach statistical significance. Moreover, in a longitudinal perspective (i.e. with increasing measurement points), we observe a decrease in lexical transfer tokens. Both the English C-test score and the socio-economic status point in the same direction: with increasing C-test score or with increasing socio-economic status, the ratio of lexical transfer tokens per 100 words decreases when all other variables are kept constant. Surprisingly, the type of school (Gymnasium vs. Realschule) does not significantly add to explaining the variance in the model as soon as the English C-test score is included.

Discussion
In the following sections, we discuss the above results in light of the five hypotheses initially stated.

[Table 2: Regression model]

Decrease of lexical transfer tokens over time (H1)
All students in the sample exhibit a decrease in lexical transfer tokens over the four measurement points. In other words, they develop compliance with the monolingual norms imposed by the classroom setting (Fuller 2020; Wiese et al. 2017). The result is double monolingualism, but it is bound to a specific institutional context, as code-switching continues outside the classroom (see Section 5.3).
Although the immediate normative pressure is exerted by the education system, what is really at stake is an imagined identity with the (mainly standard British) English-speaking monolingual community. Furthermore, English serves as "an emblem of global identity" (Fuller 2020: 176). Moreover, we observed a negative relationship between the C-test scores and the number of lexical transfer tokens per text across all learners. This relationship further supports the longitudinal learning progress of the students, where higher proficiency in English (as measured with the C-tests) can be associated with lower rates of lexical transfer.

Source language of lexical transfer (H2)
That German was identified as the main source language of transfer may have several explanations. To begin with, both English and German are Germanic languages, while Russian is a Slavic language and Turkish a Turkic language. Typological similarity between German and English results in the use of false and real cognates, as well as in numerous cases of semantic extension and loan translation. This is in line with several studies on lexical influence between two or more non-native languages (De Angelis and Selinker 2001; Dewaele 1998; Odlin and Jarvis 2004; Ringbom 1987; Williams and Hammarberg 1998). As argued above, psychotypology cannot easily be disentangled from typology, and it is very likely that we see influence from psychotypology here as well.
Typological distance could have helped the Turkish heritage speakers to start out with the lowest ratio of lexical transfer compared to the two other groups. There are only a few cases of co-activation of Turkish and German visible in the data, and Turkish heritage speakers relied proportionally more on their previously acquired German than their Russian-German peers. Simultaneous co-activation of Russian and German in the texts of Russian heritage speakers, by contrast, resulted in proportionally higher lexical transfer ratios in the texts of these participants.
Another factor that cannot be easily detached from typological proximity is language dominance. German is typologically the closest language to the target language English and also enjoys a dominant status. As pointed out in Section 3.1, the bilingual heritage speakers in this study are dominant in German and use their heritage language less frequently, as it is often restricted to the family context.
A further point worth mentioning is teaching style and, broadly speaking, the educational context. The monolingual habitus encountered in schools does not offer much leeway for the bilinguals. The use of heritage languages is typically discouraged (Siemund and Lechner 2015: 148), so that these languages are most likely not perceived as a linguistic option, or at least not as the most obvious one. German is used predominantly if not exclusively in the school context and can be expected to suppress other potential source languages (see also Tomoschuk et al. 2021 on the effects of language of instruction).
Moreover, one needs to bear in mind the particular stimuli (picture description tasks) that the study participants received during the four measurement points. The topics Breakfast in Germany and A Trip to Hamburg might have triggered certain German lexemes and filtered out potential lexemes from the heritage languages. Perhaps the topics Breakfast in Russia/Turkey or A Trip to Moscow/Istanbul would have triggered more instances of lexical transfer from Russian or Turkish.
Finally, bilinguals do not randomly transfer lexemes from their previously acquired languages but adjust to their interlocutors qua accommodation (Aalberse et al. 2019: 82-85; Gardner-Chloros 2009: 78-81; Sachdev and Giles 2004; see also Muysken 2013 on bilingual optimization strategies). The study participants might have assumed that the audience (normally the teachers; here German-speaking researchers) is very likely to have a high command of German, as the study was carried out in Germany. However, they probably did not assume that their audience could have knowledge of Russian or Turkish. It is plausible that the heritage language speakers wanted their texts to be comprehensible for their readers, and that they therefore suppressed the use of their heritage languages.

Norm consciousness of bilingual and monolingual students (H3)
We initially expected that the bilinguals would enjoy an advantage over their monolingual peers, manifesting in a lower ratio of lexical transfer tokens and a stronger decrease over time. It may thus appear surprising at first that we cannot detect such differences between the monolinguals and the bilinguals in terms of lexical transfer usage. What is more, there are pronounced differences between the two bilingual groups. The data reveal the highest ratio of lexical transfer for the Russian-German bilingual group and the lowest for the group of Turkish-German bilinguals. Students with a German monolingual background are placed in between. This trend is consistent with the exception of measurement point two (where Turkish-German bilinguals rank above German monolinguals). Accordingly, we cannot corroborate our initial hypothesis that the bilingual groups show a greater norm-sensitivity due to their increased levels of metalinguistic awareness. If true, this only holds for the Turkish-German group. Nevertheless, it is well-known that this group code-switches substantially in the home context, there being important differences between communication amongst peers and communication between children and their parents (Auer 2010: 463; Küppers et al. 2015; Şahingöz 2014: 9). They can be expected to be very sensitive to this issue, however unconsciously. In addition, even though the German monolinguals and the Russian-German bilinguals start out with the highest ratios of lexical transfer tokens compared to the Turkish-German bilinguals, there seems to be a trend for this difference to become smaller. Thus, the participants converge over time. This is also an unexpected finding, as we predicted lower rates or a more pronounced decrease among the bilinguals. Yet, from an educational perspective this is reassuring, as English language instruction seems to have a comparable effect for all three language groups. In other words, differences surface more strongly in the beginning, i.e. at an earlier phase of language acquisition, but they decrease over time, after a considerable amount of formal language instruction. This supports findings from other studies, equally set in the German context, which compared the English performance of monolingual German students with their bilingual peers. They reported an increasing uniformity with advancing age or years of schooling because of the unifying effect of formal education (see Siemund and Lechner 2015 on cross-linguistic influence; Maluch et al. 2016 on general achievement in English; see also Agustín-Llach 2019b for cross-linguistic influence in the Spanish context). In addition, advantages, if any, were only found in the younger cohorts.

Gender effects (H4)
Regarding the stipulated lower frequency of lexical transfer tokens amongst female students, this could only be corroborated in the monolingual German group and only at the first measurement point. We believe that it is no coincidence to find significant differences only among younger students, as the normative pressure imposed by the education system in the foreign language classroom is likely to level out differences with increasing age. However, we cannot explain why this difference is only visible in the monolingual group. Moreover, we failed to detect any obvious trends at the subsequent measurement points. Sometimes the male students show higher lexical transfer rates (significant in the Turkish-German group at MP 3), sometimes the female students do. Having said that, it needs to be borne in mind that the original observations regarding a greater norm-orientation of females by now have a long history (Labov 1966; Trudgill 1972). The world has changed substantially since, and the greater empowerment of women (equivalent to a decreasing linguistic insecurity and perhaps less reliance on overt prestige forms) may begin to show up in the data. In addition, any such differences emanated primarily from studies on certain phonological variables (see Cameron 2003). Lexical transfer may be an area of language where these differences simply do not surface to a similar extent.

School type and SES effects (H5)
As far as the influence of school type is concerned, the data basically confirm our initial expectations. Students attending Gymnasium consistently display lower ratios of lexical transfer, which can be plausibly related to the higher norm-orientation prevalent at this institution. However, only for the German monolingual group do these differences reach statistical significance at most measurement points. Apparently, the students with a German background are more strongly stratified in relation to norm-orientation. This may be related to the fact that the average socio-economic status of the bilingual heritage groups ranks considerably below that of the monolingual German group (German monolingual: 67; Russian-German: 40; Turkish-German: 44). Consequently, the German group contains more students with a high socio-economic background, which correlates with a strong norm-orientation.

Conclusion
Lexical transfer, or lexical insertions from previously acquired languages into the target language production, is viewed as not adhering to the norms defined in the context of foreign language learning in school. Despite the multilingual turn in language education and the emphasis on communicative success rather than grammatical correctness, the English language classroom is governed by a monolingual orientation and favors, or in fact requires, the use of English only. This study reveals that type of school (Gymnasium vs. Realschule), age, and SES impact instances of lexical transfer. Moreover, we believe that the norm orientation present in the English language classroom further intensifies the effect of these variables. We also observe differences between Russian-German bilinguals, Turkish-German bilinguals, and monolingual German secondary school students. Also, bilinguals tend to transfer from the majority language German. This may be explained by the typological similarity between English and German, the dominant status of German, the educational context, the particular stimuli, and the concomitant inhibition of heritage languages. All in all, the use of lexical transfer decreases over a period of two and a half years, and the groups become more similar.
Research funding: The study was supported by the German Federal Ministry of Education and Research under the grant "MEZ (01JM1406)".