Language pedagogies and late-life language learning proficiency

Late-life language learning has gained considerable attention in recent years. Strikingly, additional language (AL) proficiency development is underinvestigated, despite it potentially being one of the main drivers for older adults to learn an AL. Our study investigates whether Dutch older adults learning English for three months significantly improve their AL skills, and if explicit or implicit language instruction is more beneficial. Sixteen learners participated in online weekly group lessons, five days of 60-min homework, and pre-post-retention tests. Half were randomly assigned to the mostly explicit condition and half to the mostly implicit condition. Data includes language proficiency measures and 201 dense-data spoken homework samples. Results show improvements in several areas for both conditions. For structural errors in homework, we found implicitly taught participants to make significantly more mistakes. Our exploratory data show that older adults significantly develop AL proficiency after a short language training, and, as we only found differences between conditions on one construct, that teaching pedagogies do not play a substantial role.


Introduction
With global life expectancy increasing, late-life language learning, or LLLL for short, has gained much attention in recent years. Although most studies have defined LLLL's starting age to be 65+, it is hard to reduce ageing to a number because it is a highly individual experience in which social, cognitive and physiological factors interact (Christopher 2013). The work that has been done in the field of LLLL mostly comprises language learning in an instructed setting in the form of additional language learning (Singleton and Pfenninger 2019). The majority of LLLL studies, however, have investigated the potential cognitive benefits ensuing from learning a language later in life (Bak et al. 2016; Bubbico et al. 2019; Ramos et al. 2017; Ware et al. 2017), and fewer but still a substantial number of researchers have investigated the socio-emotional (related to well-being) effects that follow from language learning (Klimova and Pikhart 2020; Klimova et al. 2021; Pikhart and Klimova 2020; Pot et al. 2019; Valis et al. 2019). In contrast, and counterintuitively, very few studies to date have looked at actual language learning outcomes for older adults in instructed second language acquisition (ISLA) contexts. Hence, it is not clear how older adults learn or what works best for them. Indeed, most second language acquisition (SLA) theories are based on younger learners and this body of research has not been extended to older learners. However, late life is generally viewed as a period for self-exploration, personal achievement, and self-fulfilment (Laslett 1991), in which language learning fits perfectly. Therefore, this study dives deeper into instructed LLLL by investigating proficiency gains following a three-month English course for Dutch older adults.

Language outcomes in LLLL
In one of the earliest investigations (Gruneberg and Pascoe 1996), 40 healthy older adult females (Mage = 70.45) participated in a single session in which they learnt word pairs either using a so-called keyword method (i.e., a word is linked to a similar-sounding English word by means of an image) or by merely being presented with the word pairs. This vocabulary learning was done by listening to an audio tape consisting of 20 English words and their Spanish translation. Results showed that both receptive and productive recall were significantly better in the keyword method group when being lenient with rating (i.e., substantially correct showing some knowledge of what the Spanish should be; the so-called soft criterion). When being less lenient in rating correctness of responses (i.e., almost correct, but not strictly phonetically correct for the middle criterion and totally phonetically correct for the hard criterion) there were no differences between the groups. Additionally, it needs to be pointed out that listening to 20 word pairs in a single session is not representative of an actual language course. Other studies, too, have focused on very specific language outcomes, for instance on question formation. In Mackay and colleagues' (2012) study, nine older Spanish L1 speakers (Mage = 72) met individually with a native speaker on a weekly basis for five weeks. This included testing sessions as well, meaning that there were three language sessions in the study during which participants and native speakers used communicative tasks to elicit question forms. The duration of each training session is not included in the study. Results showed that question formation in participants improved after the sessions but only two out of nine adults managed to sustain this improvement. Similarly, older adults have been shown to frequently pose questions in the language classroom, but a large portion of these questions were unrelated to the progressivity of the lesson. Instead, they asked "wonderment questions": questions on something they wonder about (van der Ploeg et al. 2022).
In one of the few studies to date that cover a longer period than a few sessions, the time course of improvement could be monitored more closely. In Pfenninger and Polz's (2018) study, 12 German-speaking older monolinguals and bilinguals (6 German/Slovenian sequential bilinguals, Mage = 84.33; 6 German monolinguals, Mage = 68.20) were enrolled in a four-week English course. Although participants did significantly better on overall proficiency, no significant improvements were observed on receptive tasks. Interestingly, the bilinguals did not outperform the monolinguals. Taking an even longer language instruction perspective, Kliesch and Pfenninger (2021) followed 28 German older adults (Mage = 68.5) learning Spanish for 30-32 weeks in small groups. Instruction comprised both in-person classroom instruction and Duolingo practice at home. The authors found that the increase in AL mainly occurred during the initial 10-20 week interval of learning.
Finally, some studies have investigated factors that influence AL learning in older adulthood: the similarity of known languages to English, L1 skills, and English language exposure predicted English skills in older adults (N = 19; M = 72.92) (Blumenfeld et al. 2017). This is corroborated by Kliesch and colleagues (2018), who found higher L1 verbal fluency to be related to faster language learning rates.
The scant evidence to date suggests that successful language learning in older adulthood is possible. Although older adults in comparison studies have in general been found to be slower in learning a new language than their young adult counterparts (cf. Marcotte and Ansaldo 2014), the learning attested is promising and fits in well with the self-fulfilment and self-achievement that are said to characterise the third age (see above).
But these studies do collectively raise the question of what can be done to promote language learning success in late life. Other than the study by Gruneberg and Pascoe (1996), the studies to date have not investigated different language teaching pedagogies. In applied linguistics studies on younger learners, one of the most fiercely debated topics is the effectiveness of more explicit versus more implicit language instruction methods (cf. Andringa and Rebuschat 2015). To date, this question has never been applied to an older adult context. This study is the first to do so, in an exploratory fashion, in a small sample of older adults. The main objective in doing so is to shed light on the optimal language learning trajectory for an emergent but substantial group of language learners: older adults. As a second aim, the study outcome may reciprocally aid theorising in the realm of applied linguistics. To do that, it is pivotal to present the state of the art of applied linguistic investigations into implicit versus explicit language teaching pedagogies.

Implicit/explicit language teaching pedagogies
At their core, implicit and explicit language instructional methods differ in the metalinguistic input learners do or do not receive (Ellis 1994; Norris and Ortega 2000). This distinction is complemented by the notions of focus on form (FoF), focus on forms (FoFs) and focus on meaning (FoM; Long 1991). On a continuum these methods range from FoFs, which is mostly described as traditional and explicit grammar-based language teaching where meaningful interaction is largely absent, to FoM, which tends to focus exclusively on communicative meaning (Long and Robinson 1998) and which is often equated with implicit language instruction: learners are expected to learn the language based on natural and abundant exposure and input. In between FoFs and FoM is FoF, where a focus on conveying a message is typically prioritised but the teacher also pays explicit attention to the form of the message.
Over the years there have been many studies investigating the effectiveness of these different language teaching methods. However, a consensus on which method is optimal is hard to reach (Andringa and Rebuschat 2015). The meta-analyses by Norris and Ortega (2000), Spada and Tomita (2010), and Goo and colleagues (2015) seem to point in the direction of a clear advantage of explicit language instruction. The studies that formed the basis for their syntheses, however, have been criticised on three aspects: (1) one specific grammatical structure is targeted (Rousse-Malpat and Verspoor 2012), (2) training is short (DeKeyser 2013), and (3) task procedures and test circumstances favour explicit knowledge (Andringa and Rebuschat 2015). Indeed, a more recent meta-analysis comparing implicit and explicit teaching approaches showed that outcome measures were significant moderators in the effects of implicit and explicit instruction: in spontaneous AL use, implicit instruction showed stronger effects (Kang et al. 2019), whereas previous meta-studies (Norris and Ortega 2000; Spada and Tomita 2010) mainly incorporated controlled outcome measures that relied on metalinguistic knowledge.
Partly to address the critique of short training, several longitudinal classroom-based intervention studies have examined the difference between implicit and explicit language instruction (Gombert 2023; Piggott 2019; Rousse-Malpat and Verspoor 2012). Results showed that on usage-based holistic measures related to speaking and writing fluency, the implicitly taught group tended to outperform their explicitly taught peers. Whereas the explicit groups initially tended to outperform their implicitly taught peers on complexity, accuracy and fluency (CAF) measures, this difference disappeared with time. The state of the art in this domain has so far relied solely on young (predominantly secondary school) learners.
Importantly, many studies have referred to the difference in terms of mostly implicit and mostly explicit conditions. Indeed, implicit versus explicit language instruction is a continuum and fully implicit or explicit instruction does not exist (Van den Branden 2016; Verspoor et al. 2015).

Implicit/explicit language teaching pedagogies in LLLL
Translating these findings on explicit versus implicit instruction to learners in the older adult life stage is impossible because effective methods of late-life language learning and teaching are greatly under-researched. Tentative evidence comparing different age groups of language learners has pointed to younger adults benefiting more from explicit grammar instruction than older adults do. This was attested in two studies that both targeted the learning of Latin. Cox and Sanz (2015) taught basic Latin morphosyntax to younger (N = 10; 19-27) and older (N = 11; 60+) English/Spanish bilinguals. In three sessions (duration unclear) over four weeks, participants completed, amongst other things, explicit grammar instruction and practice that provided them with correct/incorrect feedback without an explanation (i.e., no explicit instruction). Younger participants benefited more from explicit instruction, which the authors ascribe to the task's memory demands but which may also be due to the younger participants being much more used to this type of instructional setting. Significant differences previously observed between the age groups disappeared two weeks after the grammar lesson and practice session. The authors ascribe this to the fact that older adults' language developmental process operated over a longer time compared to younger learners. Although Lenet and colleagues (2011) also investigated Latin (semantic function assignment), they did incorporate an explicit and a less explicit condition. Younger (N = 20; Mage = 18.7) and older (N = 20; Mage = 72.3) adults were provided with feedback that either just told them their answer was incorrect (less explicit), or told them why their answer was incorrect (explicit). As in Cox and Sanz's (2015) study, the duration of the sessions is not mentioned in the article. Results showed that the younger group did better in the explicit condition whereas the older adults did better in the less explicit condition.
Finally, Midford and Kirsner (2005) studied older (N = 22; Mage = 65.9) and younger (N = 22; Mage = 20.6) adults on an artificial grammar-learning paradigm. Grammar instruction varied in complexity and use of rules, leading to four conditions. After this training phase, of unknown duration, participants were asked to complete a grammar-judgement task. In the simple-grammar condition with rules, the younger group performed better than the older group; however, in the complex-grammar condition without rules, these differences disappeared. The authors conclude that less explicit grammar instruction works better for older adults.
Collectively, these studies show a tendency for implicit instruction to be more beneficial for older adults. At the same time, however, it needs to be pointed out that this conclusion is not without problems: groups were small and, most importantly, instruction and training sessions were very brief (and the precise duration was unclear) and did not resemble language course set-ups since they took place in a lab setting rather than a classroom environment, limiting the generalisability of the studies' results. Similarly, the language of instruction was either an artificial language or Latin. Finally, but crucially, the studies differentiated between no versus explicit instruction, albeit that explicit language instruction was operationalised differently (grammar explanation vs. feedback), but at no point were more explicit versus more implicit learning conditions introduced. Lacking are studies that build on the tentative evidence (detailed in 2.1) that older adults can learn a new language, but that look specifically at the optimal conditions for doing so in a naturalistic classroom setting where older adults are learning a new language over a longer period of time (DeKeyser 2013; Lambert and Kormos 2014; Larsen-Freeman 2015; Rebuschat 2015).

The present study
Following the caveats identified in earlier work, we designed a three-month English as an additional language (AL) course for older adults, specifically comparing more implicit versus more explicit language instruction in an actual late-life language classroom setting. We aim to answer the following research questions: (1) Does older adults' overall AL proficiency improve after a three-month language course and to what extent is this dependent on language teaching pedagogy? (2) To what extent do implicit versus explicit language teaching pedagogies differ in CAF and holistic rating scores of spoken assignments?
Based on earlier work, we hypothesise that older adults' overall AL proficiency does show improvements after three months of language instruction (cf. Kliesch et al. 2021), but that the groups show differences in several areas of development. As there is more room to practise speaking in the implicit grammar condition (see below for details), we expect this group to show more substantial improvements on speaking tasks. Regarding the second research question, if the findings of our study resemble those attested in studies on younger learners, we hypothesise that explicitly taught older adults outperform implicitly taught older adults on holistic grammar and accuracy ratings. The implicitly instructed group, however, is expected to do better on measures of fluency (Piggott 2019; Rousse-Malpat and Verspoor 2012).

Participants
Sixteen Dutch older adults (M = 71;9, SD = 6;3, 13 female) participated in our longitudinal study till the end. Educational levels varied from secondary school to university degrees. All participants were retired at the time of testing; former professions included police officer, secretary, teacher, and business owner. Based on self-reports, the average number of languages/dialects spoken per participant was 4 (this included languages learnt at school or other places and regional languages/dialects). An additional 14 participants initially started with the English course but dropped out due to health issues (N = 6), difficulty with online technology (N = 4), perceived difficulty of the course (N = 3), and not liking the teaching pedagogy to which they were assigned (N = 1; see below for further details).

Study design
Our study followed a pre-test, post-test, retention test design where the retention test was administered three months after the post-test. Between the pre-test and post-test, participants followed a three-month English course that included weekly group lessons, and homework with a daily diary. The full experimental design and timeline of the study are detailed in Figure 1.
Language pedagogies in later life

Course design and language learning procedure
Participants took part in a three-month English course specifically developed for, and targeted towards, older adults. This course consisted of twelve two-hour lessons (see Appendix A for topics and features for each lesson). Due to the COVID-19 pandemic, the course was held online. In order to investigate the implicit/explicit dichotomy, the course was offered in two versions to which participants were randomly assigned: one with more explicit language instruction (N = 8), operationalised mainly as explicit grammar instruction, and the other with mostly implicit instruction (N = 8), devoid of explicit grammar rule instruction. In line with previous research (Van den Branden 2016; Verspoor et al. 2015), in our course design we adhere to the premise that 'purely' implicit or explicit language teaching is difficult to accomplish (Long 1991). The explicit condition was not fully explicit because grammar was not explained in the absence of any other linguistic input. Additionally, in the implicit condition, explicit attention was sometimes paid to word meaning or pronunciation.
Both classroom materials and homework assignments were adapted to fit either implicit or explicit instruction. In the explicit condition, grammatical constructs were explicitly explained (i.e., when to use a tense and how to form it), whereas the implicit condition included example sentences and exercises with the target structure, and was thus essentially input only. Figure 2 provides an example of the different exercises in the conditions when it comes to grammar explanations.
Other than the explicitness of grammar instruction, the two courses were identical. This meant that exercises and topics were also kept as similar as possible, except for the explicitness of the grammatical structure. In another example in Figure 3, the explicit condition is prompted to use the present simple, whereas the implicit condition is provided with more example sentences without changing the essence of the exercise on describing habits.
Two rounds of pilots were carried out before data collection commenced (respectively N = 2, M = 77;9, SD = 3;6, 2 male; and N = 8, M = 70;9, SD = 4;3, 6 male) on both implicit and explicit conditions (explicit N = 6; implicit N = 4). The main purpose of the pilots was to ascertain feasibility of the testing sessions (including materials) and the course itself (i.e., whether the instructional conditions could be maintained as desired). As a result of the pilots, minor alterations were made which mainly pertained to homework and online formats, such as switching to Google Docs (instead of Google Classroom).
Both during the pilot rounds and the actual data collection, teachers were instructed on what to say in case participants in the implicit group asked explicit grammar questions. These answers followed along the lines of 'this is not important right now; we are focusing on other things now'. However, none of the participants in the implicit condition ever asked such questions.

Language tests
During all three testing moments (pre, post and retention test), several language proficiency tests were administered via Google Meet, comprising both productive and receptive language tasks.

Productive language tasks
For productive language tasks, an IELTS-inspired speaking task was used (van der Ploeg et al. submitted). This test consisted of part A, a conversation in which the researcher asked participants questions on a certain topic (e.g., their village, their house, and their favourite food). Next, in part B, participants were given a topic (hobbies, famous people, and books) and, after a few minutes of preparation, asked to briefly present on this topic without being prompted by the researchers with focused questions. Both parts were recorded and rated by three independent raters using the IELTS rubric (fluency, grammar, lexical resources, and pronunciation), from which the overall IELTS score was also calculated, and were also given an overall CEFR score. Ratings showed good interrater reliability (α = 0.67; Kline 1999); see below for details. We opted for two different frameworks to see if they would show different results, as close to no research on LLLL proficiency has been done.
Additionally, a letter verbal fluency test was used (cf. Nijmeijer et al. 2021). Here participants were asked to name as many words as possible starting with a certain letter within the time span of 1 min (versions: CFL, PWR and FAS). The final score is the number of correct words; versions were randomly assigned across testing sessions and participants.
Finally, in addition to the measures described above, which were administered at pre-, post- and retention tests, participants also recorded themselves weekly as part of their homework exercises. These recordings were sent to the course instructor via WhatsApp audio messages. Participants were asked to give their opinions on different topics or tell the teacher about something including (a) ideal holidays, (b) should public transport be free, and (c) a typical day in their life. A total of 22 assignments divided over twelve weeks formed part of the homework. However, not all participants completed all homework speaking assignments. Additionally, three participants were excluded from analysis as one of them had only completed one speaking assignment and the other two wrote down text for the assignment and read that prepared text out loud. This left us with a total of 201 recordings divided over thirteen participants; participants thus individually completed between 8 and 21 speaking assignments over the course of three months (M = 15.09, SD = 4.00). These recordings were then holistically rated by three independent raters each on a rating scale developed by Piggott (2019; Appendix B). Raters had been trained to standardise the rating process prior to listening to the assignments. The average, minimum and maximum of the SDs between raters can be found in Table 1 below. As none of the SDs were exceptionally high, we can assume agreement between raters.
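As an illustration of the rater-agreement summary described above, the following minimal Python sketch computes the standard deviation across raters for each recording and then the average, minimum and maximum of those SDs. The function name, the example ratings, and the use of the sample (n - 1) standard deviation are assumptions made for illustration; the text does not specify how the SDs were calculated.

```python
from statistics import mean, stdev


def rater_sd_summary(ratings_per_recording):
    """Summarise disagreement between raters.

    ratings_per_recording: one tuple of rater scores per recording
    (hypothetical layout). Returns (mean, min, max) of the
    per-recording sample standard deviations.
    """
    sds = [stdev(scores) for scores in ratings_per_recording]
    return mean(sds), min(sds), max(sds)


# Hypothetical holistic ratings from three raters for three recordings
example_ratings = [(3, 3, 3), (2, 3, 4), (4, 4, 5)]
average_sd, lowest_sd, highest_sd = rater_sd_summary(example_ratings)
print(round(average_sd, 2), lowest_sd, highest_sd)
```

A small average and maximum SD relative to the width of the rating scale would be in line with the agreement reported in the text.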
Homework recordings were also transcribed orthographically and rated on Complexity, Accuracy and Fluency (CAF). For complexity, the moving-average type-token ratio (MATTR) was used to measure lexical complexity (Covington and McFall 2010), as this measure has been claimed to be least affected by excerpt length (Zenker and Kyle 2021). Transcripts with fewer than 50 tokens were, however, not included in the analysis (Zenker and Kyle 2021), meaning that the analysis was run on 134 audio files.
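To illustrate how MATTR works, the sketch below slides a fixed-size window over a transcript, computes the type-token ratio within each window, and averages the results. The window size of 50 tokens is an assumption (chosen here only because it matches the 50-token exclusion threshold), as is the simple lowercase whitespace tokenisation; neither is stated in the text, and the function names are illustrative.

```python
def tokenize(text):
    """Very simple tokeniser (assumption): lowercase, split on whitespace."""
    return text.lower().split()


def mattr(tokens, window=50):
    """Moving-average type-token ratio.

    Computes the type-token ratio for every window of `window`
    consecutive tokens and returns the mean of those ratios.
    """
    if len(tokens) < window:
        raise ValueError("transcript is shorter than the window size")
    ratios = []
    for start in range(len(tokens) - window + 1):
        chunk = tokens[start:start + window]
        ratios.append(len(set(chunk)) / window)
    return sum(ratios) / len(ratios)


# Toy example with a window of 2 tokens
print(mattr(tokenize("a a b b"), window=2))
```

Because every window has the same length, the resulting value does not shrink mechanically as transcripts get longer, which is the property that motivates using MATTR rather than a plain type-token ratio.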
Syntactic complexity was measured by means of the number of coordinate phrases per T-unit using TAASCC (Kyle 2016). In this analysis, pruned transcripts were used that excluded self-corrections and repetitions. To measure accuracy, the numbers of structural and lexical errors were counted. Structural errors do not change the meaning of the sentence and include things such as verb use and verb form errors (e.g., 'When I are in England…'). Lexical errors, on the other hand, are meaning-based errors such as L1 interference errors (e.g., 'It is a shopping evening' which is a literal translation from Dutch). A second rater assessed 10% of the data and both raters showed substantial reliability for structural and lexical errors respectively (α = 0.78; α = 0.73) (Kline 1999).
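For readers unfamiliar with reliability coefficients, the sketch below shows one common way of obtaining an α from the counts of two raters. It assumes that the reported α is Cronbach's alpha with raters treated as items; the text does not state which coefficient was used, so this is an illustration of the general idea rather than a reconstruction of the analysis. The error counts are invented.

```python
from statistics import variance


def cronbach_alpha(ratings_by_rater):
    """Cronbach's alpha, treating each rater as an 'item'.

    ratings_by_rater: list of equal-length lists, one list of scores
    per rater, aligned by sample (hypothetical layout).
    """
    k = len(ratings_by_rater)
    totals = [sum(scores) for scores in zip(*ratings_by_rater)]
    summed_item_variance = sum(variance(r) for r in ratings_by_rater)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))


# Hypothetical structural-error counts for five transcripts
rater_a = [2, 0, 1, 3, 1]
rater_b = [1, 0, 2, 3, 1]
print(round(cronbach_alpha([rater_a, rater_b]), 2))
```

Values closer to 1 indicate that the raters order the samples more consistently.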
Finally, fluency was operationalised by means of a number of separate measures: mean number of words per minute; number of silent pauses per minute; mean length of silent pauses; number of filled pauses per minute; number of repairs/repetitions per minute; and, speech rate. These measures have been used in previous studies on fluency in older adults (van der Ploeg et al. submitted) and were calculated using a PRAAT script (De Jong and Wempe 2009).
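The fluency measures listed above are simple rates once the underlying counts and durations are available. The sketch below shows how they could be derived for a single recording. It is not the PRAAT script used in the study: the `Recording` structure, its field names, and the definition of speech rate as syllables per second of total recording time are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Recording:
    """Hypothetical per-recording annotations."""
    duration_s: float        # total length of the recording in seconds
    n_words: int             # number of words produced
    n_syllables: int         # number of syllables produced
    silent_pauses_s: list    # duration in seconds of each silent pause
    n_filled_pauses: int     # e.g. 'uh', 'um'
    n_repairs: int           # repairs and repetitions


def fluency_measures(rec):
    """Return the six fluency measures as a dictionary."""
    minutes = rec.duration_s / 60
    pauses = rec.silent_pauses_s
    return {
        "words_per_min": rec.n_words / minutes,
        "silent_pauses_per_min": len(pauses) / minutes,
        "mean_silent_pause_s": sum(pauses) / len(pauses) if pauses else 0.0,
        "filled_pauses_per_min": rec.n_filled_pauses / minutes,
        "repairs_per_min": rec.n_repairs / minutes,
        # Assumed definition: syllables per second of total time
        "speech_rate_syll_per_s": rec.n_syllables / rec.duration_s,
    }


# Hypothetical two-minute homework recording
example = Recording(120.0, 200, 260, [0.5, 1.0, 1.5], 4, 2)
for name, value in fluency_measures(example).items():
    print(name, round(value, 2))
```

Normalising by duration keeps recordings of different lengths comparable, which matters here because the homework samples varied in length.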
No written measures were included as a needs analysis found that older adults want to (re)learn language in order to be able to speak the language (Authors submitted).

Receptive language tasks
Receptive tasks included the IELTS listening test where participants randomly received one of three versions. Scores on this task could vary between zero and a maximum of twenty. Participants received the paper form at home and were told to fill it out while listening to the audio played over their device by the experimenter during the online test session. As the participants received the listening task at home and testing was repeated, using three versions of the task was deemed necessary in order to obtain objective results. The different versions were similar in design and difficulty. Additionally, both Dutch and English receptive vocabulary tests were administered. For Dutch this was Lextale (Lemhöfer and Broersma 2012), which was administered via the Lextale website (scoring range between 1 and 100). For English the Peabody Picture Vocabulary Test (PPVT; Dunn and Dunn 2007) was used in the form of a PowerPoint presentation and the non-standardised scores were used, as English was not participants' first language.

Analysis
Linear mixed model analyses were run to analyse the data using the lme4 package (Bates et al. 2015) in R (R Core Team 2021). Language test scores were included as dependent variables; test moment (pre-, post-, and retention test or week/day) and condition (implicit and explicit) were included as independent variables with an interaction. Participant, finally, was included as a random effect, as older adults have been shown to exhibit substantial individual variation (Christensen 2001; Grotek 2018). Additionally, in the homework models, task was also included as a random effect as the topics of the tasks differed and this may have influenced the results. Conditional R2 was computed using the mgcv package (Wood 2017) and the scores for all language tests were centred around zero, based on the pre-test scores. Graphs were created using ggplot2 (Wickham 2016). Model output for all models can be found in Appendix C.
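To make the model structure described above concrete, the test-score models can be written out as follows. This is a sketch only: it assumes treatment coding with the pre-test and the explicit condition as reference levels and a by-participant random intercept. The coding scheme and all symbols are illustrative assumptions and are not stated in the text.

```latex
\begin{align*}
y_{ij} &= \beta_0 + \beta_1\,\mathrm{Post}_{ij} + \beta_2\,\mathrm{Ret}_{ij} + \beta_3\,\mathrm{Impl}_{i} \\
       &\quad + \beta_4\,(\mathrm{Post}_{ij} \times \mathrm{Impl}_{i})
              + \beta_5\,(\mathrm{Ret}_{ij} \times \mathrm{Impl}_{i})
              + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align*}
```

Here $y_{ij}$ is the centred score of participant $i$ at test moment $j$, $\mathrm{Post}$ and $\mathrm{Ret}$ are indicators for the post-test and retention test, $\mathrm{Impl}$ is an indicator for the implicit condition, and $u_i$ is the participant-level random intercept. Under this coding, a "main effect for retention test" corresponds to $\beta_2$ and a condition-by-time interaction to $\beta_4$ or $\beta_5$. For the homework models, the test-moment indicators would be replaced by a time variable (week/day) and a second random intercept for task would be added.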

Speaking
In the overall IELTS rating of the speaking tasks, we found a significant effect of time on speaking scores (conditional R2 = 0.78). Our model showed a significant main effect for retention test (b = 0.52, SE = 0.17, t = 3.04, p < 0.01, CI[0.18, 0.87]), which means that, regardless of condition, older adults spoke better English overall in the retention test than in the pre-test. Subsetting showed that there was no significant difference between post-test and retention test (p = 0.071). Figure 4 shows both the explicitly and implicitly taught groups' performance on the speaking task.
As a next step, we investigated the IELTS ratings in more detail and found a significant effect of time on IELTS speaking lexical resources scores (conditional R2 = 0.72). Our model showed a significant main effect for retention test (b = 0.50, SE = 0.23, t = 2.22, p < 0.05, CI[0.04, 0.96]), so, similar to overall IELTS speaking scores, older adults, regardless of condition, did better on the retention test on lexical components in speaking than in the pre-test. Subsetting showed that there was no significant difference between post-test and retention test (p = 0.68). In Figure 4 participants' lexical speaking scores are visualised, split per explicit and implicit teaching condition.
For IELTS speaking grammar scores in interaction with time we also found a significant effect (conditional R2 = 0.81). Our model showed a significant main effect for retention test (b = 0.44, SE = 0.21, t = 2.09, p < 0.05, CI[0.01, 0.86]), where, again, older adults, regardless of condition, did better on the retention test on grammar components in speaking than in the pre-test. Subsetting showed that there was no significant difference between post-test and retention test (p = 0.109). See Figure 4 for a visualisation.
Finally, for IELTS speaking pronunciation, the trend was the same as for the other three IELTS test components: we found a significant effect of time (conditional R2 = 0.76). Our model showed a significant main effect for retention test (b = 0.59, SE = 0.21, t = 2.82, p < 0.01, CI[0.17, 1.02]): regardless of condition, older adults' English pronunciation performance was significantly better on the retention test compared to the pre-test. Subsetting showed that in the retention test, participants did significantly better than in the post-test (p < 0.01). Figure 4 shows the performance of both groups.
Although this has to be treated with caution, Figure 4 does visualise different trends in the two conditions (though at no point resulting in significant differences). Whereas the implicit group mainly stays stable in their development before and after the course, the explicit group shows an increase in their scores. This increase is sustained in the retention test which, in all four cases, shows even higher scores than the post-test.
For the CEFR rating of the speaking task we found significant effects of time on speaking scores (conditional R2 = 0.68). Our model showed a significant main effect for post-test and retention test (b = 0.32, SE = 0.12, t = 2.71, p = 0.01, CI[0.08, 0.56]; b = 0.46, SE = 0.12, t = 3.93, p < 0.001, CI[0.22, 0.70]): older adults, regardless of their assigned teaching method condition, performed better in both post-test and retention test compared to pre-test. No significant difference in performance was found between post-test and retention test (p = 0.23).
In contrast to the overall IELTS rating, its lexical, grammar and pronunciation components, and the CEFR rating based on the IELTS speaking test, IELTS fluency ratings did not reveal any improvement over time, nor were any effects of condition found.

Vocabulary
Turning to receptive vocabulary, we found a significant effect of time for English vocabulary (i.e., performance on the Peabody Picture Vocabulary Test) (conditional R2 = 0.73). Our model showed a significant main effect for retention test (b = 12.75, SE = 5.42, t = 2.35, p < 0.05, CI[1.79, 23.71]; b = 24.62, SE = 5.42, t = 4.54, p < 0.001, CI[13.67, 35.58]), indicating that older adults, regardless of condition, showed the best receptive command of English words at the time of the retention test compared to their performance on the pre-test. Subsetting showed that participants did better on the retention test than on the post-test (p < 0.05). In Figure 5 both groups' performance is plotted.

Listening
For IELTS listening, no significant effects were found for time or condition. Participants averaged 9.66/20 in the pre-test.

Verbal fluency
For English verbal letter fluency, we found a significant effect of time on scores (conditional R2 = 0.66). However, our model only showed a significant main effect for the retention test (b = 7.75, SE = 2.44, t = 3.17, p < 0.01, CI[2.82, 12.68]). Regardless of condition, and similar to receptive English vocabulary, older adults named more correct words in the retention test. No significant differences were found between groups between post-test and retention test (p = 0.68). Figure 6 shows participants' performance.

CAF versus holistic scores of spoken homework samples

CAF
For complexity, covering both lexical complexity (as measured by the MATTR) and syntactic complexity (as measured by coordinate phrases per T-unit), no significant effects were found across time or between the conditions. For accuracy, we found no significant effects for lexical errors. For structural errors, however, we found a significant effect (conditional R2 = 0.41). Our model showed a significant main effect of condition (b = 0.02, SE = 0.01, t = 2.77, p < 0.01, CI[0.01, 0.04]) and an interaction between time and condition (b = −0.00, SE = 0.001, t = −2.92, p < 0.01, CI[−0.00, −0.00]). More specifically, our results indicate that, even though the implicitly taught group made more structural errors than the explicitly taught group overall, the implicit group showed a significant decrease in structural errors over time compared to the explicit group (see Figure 7).
For fluency, only one of the indicators under investigation showed a significant main effect: for mean number of words per minute, number of silent pauses per minute, mean length of silent pauses, number of filled pauses per minute, and number of repairs/repetitions per minute, no main effect was found. For speech rate (conditional R2 = 0.72), however, the model showed a significant main effect of time (b = 0.03, SE = 0.01, t = 2.78, p < 0.01, CI[0.01, 0.05]). This indicates that older adults' overall speech rate improved over the course regardless of condition. Indeed, we did not find any significant main effect of condition (p = 0.21) or interaction between condition and time (p = 0.97). See Figure 8 for a visualisation of these results.
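To make the lexical complexity measure concrete, the following is a minimal sketch of a moving-average type-token ratio (MATTR): the type-token ratio is computed for every window of a fixed number of consecutive tokens and then averaged. The window size of 50, the whitespace tokenisation, the fallback for short samples and the function name `mattr` are assumptions made for illustration; the settings used for the homework samples are not stated in this section.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over a sliding window of tokens.

    Assumption: samples shorter than one window fall back to a plain
    type-token ratio over the whole sample.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    if len(tokens) < window:
        return len(set(tokens)) / len(tokens)
    # One type-token ratio per window position, then the mean of all of them.
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Hypothetical learner utterance, tokenised naively on whitespace.
sample = "i like my garden and i like my dog and my cat".split()
print(round(mattr(sample, window=5), 3))
```

Because every window has the same length, the measure is less sensitive to sample length than a plain type-token ratio, which matters when spoken samples differ in duration.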

Holistic scores
For holistic grammar scores we found a significant effect (conditional R2 = 0.44). The model showed a significant main effect of time on grammar scores (b = 0.04, SE = 0.02, t = 2.29, p < 0.05, CI[0.01, 0.08]), meaning that, regardless of condition, our participants' holistic grammar scores improved over the timespan of the course. The model did not show any significant main effect of condition (p = 0.62) or interaction between condition and time (p = 0.58). This is visualised in Figure 9.
For vocabulary, fluency, and functional adequacy, no significant effects were found.

Discussion
This longitudinal study investigated the language development of older adults learning English for three months in either a mostly implicit or a mostly explicit language instruction setting. More specifically, we aimed to answer 1) whether older adults' proficiency increases after a three-month language course and 2) to what extent this depends on language teaching pedagogy. Underlying these questions is the extension of applied linguistics work to include a growing group of older adult language learners. In our study we investigated L1 Dutch older adults learning English as an additional language.
Regarding the first part of this research question, we hypothesised older adults to develop their English language skills (cf. Kliesch et al. 2021) and we found them to do so in a number of areas. For IELTS speaking (including the subcomponents lexical resource, grammatical range and accuracy, and pronunciation) and overall CEFR scores, we found older adults to significantly improve over the three months of the language course. Similar results were found for receptive vocabulary (i.e., PPVT) and verbal fluency as a measure of productive vocabulary. Furthermore, our dense spoken data showed speech rate and holistic grammar scores to improve over the course of twelve weeks, as assessed by independent raters. Our results thus showed that even a short late-life language learning experience may lead to improvements in AL proficiency. This has been attested in earlier work too (Kliesch et al. 2021; Pfenninger and Polz 2018).
Interestingly, the older adults in our study did not improve on listening scores and neither did participants in Pfenninger and Polz's (2018) study. A potential reason why listening scores did not improve after the course might be hearing loss and reduced processing speed evidenced in older adults (Kliesch et al. 2018). Additionally, because testing took place online in our study, the listening test was also administered online, and it may very well be that internet connection strength and other technology-related factors influenced older adults' performance on this task. Related to this, multiple participants mentioned that the audio was too shrill. Hence, all of the above-mentioned issues need to be kept in mind when providing older adults with listening tests and tasks. However, we do believe it is important to incorporate listening skills in LLLL studies even though Kliesch and colleagues (2018) advocate for using the written modality for testing general AL skills. Indeed, older adults themselves have named communication as one of the reasons for wanting to pick up LLLL (van der Ploeg et al. submitted) and listening is an integral part of communication skills. The same argument can be made for including speaking skills.
Other results in our study are not in line with previous research: whereas Kliesch and Pfenninger (2021) found lexical complexity to improve during initial language learning, we did not replicate this finding in our sample. However, the language taught differed (English vs. Spanish), which means that the initial level of proficiency also differed, as English is a language encountered frequently in the linguistic landscape of the Netherlands whereas this is not the case for Spanish in Austria/Switzerland. Hence, initial language proficiency development will be faster compared to the later language-learning stages.
Furthermore, our study and Kliesch and Pfenninger's (2021) differ in terms of the errors the older adults were found to make. Kliesch and Pfenninger focused on morphosyntactic accuracy and transfer errors of a lexical nature in their study. Although the categories are not fully comparable to our structural and lexical errors, there is of course overlap. For both of their error categories, the authors found improvements during the initial language learning stages. In our study, however, we did not find such effects, except that the learners in the implicit condition produced significantly fewer structural errors over time (see below). Indeed, in our study most of the improvements are found at the end of the course, often even after the language learning is over. This is in line with Cox and Sanz (2015), who found older adults' language development to operate over a longer time course compared to younger adults. An explanation as to the origin of these differences might be the type of elicitation task: Kliesch and Pfenninger collected spoken data during semi-guided 5-min interviews whereas our data originated from tasks completed by the participants themselves on specific topics.
Finally, our global spoken data (pre-post-retention data) showed older adults to improve in their scores, but only in the retention test. Hence, there was no significant improvement from pre-test to post-test but there was from pre-test to retention test. This is not in line with Kliesch and Pfenninger (2021), who found that the increase in AL mainly occurred during the initial 10-20-week interval of learning. It is, however, in line with Cox and Sanz's (2015) longer time course of language development. Again, the reason why Kliesch and Pfenninger (2021) found initial language learning effects might simply be the language that was offered as part of the course and its corresponding starting level. We do not believe that there is a test-retest effect, as participants completed a different version of the tests in the pre-, post- and retention tests.
The other language measures we wish to touch upon are the measures that did not show a significant development over time. Most of the measures that did not show effects over time (or per condition) are the homework speaking tasks. As these tasks took the form of monologues, this might be the reason we did not find effects: monologues are radically different from naturalistic conversation-type interactions. Compared to dialogue, producing a monologue places greater demands on multiple cognitive functions: planning, monitoring, retrieval and working memory (cf. Garrod and Pickering 2004). Precisely these cognitive functions have been found to decline as a function of ageing (Cox and Sanz 2015; Mackey and Sachs 2012). Hence, a lack of effects on such a monologue task might be explained by decreased cognitive functioning in older adults and by the fact that it is a specific skill that is not trained to the same extent in all individuals. The fact that we did find an effect of speech rate might simply be because our older adults became more comfortable speaking, allowing them to increase their speech rate. An additional explanation could be that their lexical access skills had improved, leading them to find words more quickly and, therefore, talk at a faster rate.
Our second research question focused on the extent to which language development differs for different language teaching pedagogies. Here we hypothesised the groups to show differences in several areas of development (e.g. for the implicit group to show bigger improvements on speaking tasks). Our findings, however, did not show this difference between conditions. The one difference that was attested between the implicit and explicit condition in our study was that the implicit group made significantly more structural errors and yet at the same time also significantly declined in this type of error during the language course compared to the learners in the explicit group. A possible explanation for this is offered by Piggott (2019), who argues that the implicit group might not mind making mistakes as much as the explicitly taught group: "it is conceivable that the implicit group was less aware of or restricted by errors such as using their L1" (p. 117). This might seem contradictory to the fact that they improved more rapidly, but underlines the need for time-trajectory data.
Interestingly, even though we found an effect of time on holistic grammar scores and IELTS grammar scores, there was no effect of group. Hence, the explicit group did not outperform the implicit group on grammar scores, contrary to expectations and contrary to robust evidence in younger learner demographics. Indeed, these two findings are not in line with previous longitudinal implicit/explicit grammar instruction studies with younger participants. In Piggott's (2019) study the explicit group made more verb form and verb use errors (part of our structural errors) in the first year while, at the same time, the implicit group received higher holistic speaking scores. Both Rousse-Malpat and Verspoor (2012) and Gombert (2023) also found their implicitly instructed groups to outperform the explicit group on general proficiency holistic scores (as measured by the SOPA; Rhodes 1996).
Even though previous studies have demonstrated clear differences in AL development when comparing implicit and explicit approaches, our study has not yielded such an effect. There are several reasons why this might be the case. First of all, our training was shorter than that in the abovementioned studies: three months (our study) versus 21 months up to six years. Notably, effects of implicit grammar approaches take a longer time as more input is needed (Hulstijn 2015), which might be especially true for older adults (Marcotte and Ansaldo 2014). Additionally, our language learning conditions might have been too similar: both incorporated task-based language teaching with a focus on speaking (as this was indicated as a wish by older adults; van der Ploeg et al. submitted) as opposed to, for example, Piggott (2019), whose conditions differed more. However, for older adults to complete a language course, it also needs to be enjoyable enough. Overall, however, language learning might just take more time for older adults (cf. Cox and Sanz 2015).
In addition to the longitudinal studies described above, the studies incorporating implicit/explicit language learning in older adults have shown implicit grammar instruction to be more effective for older adults (Lenet et al. 2011; Midford and Kirsner 2005). The results in our study are not in line with these previous studies. However, participants in Midford and Kirsner's (2005) study were younger than our participants, which might explain why they found implicit language learning to work better (Cherry and Stadler 1995; Howard and Howard 1997). Nonetheless, these studies are vastly different from our study, making direct comparisons hard: training was short, conducted in a lab setting, focussing on a language that is not used in daily life, and without a fully implicit condition but rather a less explicit condition.
In relation to our research questions and hypotheses, individual variation also needs to be addressed. As shown by the large SEs in our models, our dataset showed substantial individual variation, something that is generally accepted in gerontology: "this age group is characterised by the largest diversity of any age groups involved in education" (Grotek 2018, p. 128). Research has shown that both cognitive and L2 performance can fluctuate within an individual on a day-to-day basis (Christensen 2001; Neupert and Allaire 2012; Strauss et al. 2002), and inter- and intra-individual differences are even known to increase over the lifespan (Christensen 2001). Hence, generalising across participants is hard, and it is even harder to speak of "the" typical late-life language learner.

Limitations
There are several limitations to our study that we wish to touch upon. First of all, our exploratory study's sample size was small. This means that the statistical outcomes need to be interpreted with caution. In addition, we did not control for participants' English starting level as there was not enough power in our models to do so. However, we urge future studies to include this variable in their models. Secondly, staying on the topic of statistics, our homework data consisted of different tasks. Even though we controlled for this potential task effect in our statistical models, future research might want to incorporate a single measure over time. This, of course, comes with new challenges, such as fatigue and potential drop-out, something that is especially present in older adults. Another factor that needs to be taken into account is that three older adults had to be removed from the analysis of homework data as they had read their spoken assignments out loud. For these participants it was very clear that they read their assignments out loud (i.e., turning papers can be heard in the recordings), but there might of course be more older adults who did this, influencing the results. A third limitation is the fact that both testing sessions and actual teaching and homework had to be organised in an online setting. Although our study has shown that such a set-up is feasible for older adults, it also might have influenced the results, as listening tasks in such a set-up are much harder to carry out and technical issues with, for example, internet connections will always be present. Finally, there are two limitations we already touched upon when discussing the results: language training might have been too short to demonstrate differences between the two conditions, and the implicit and explicit condition might have been too similar.

Conclusions and implications
Our small-scale longitudinal dense data study showed that it is possible for older adults to develop their AL proficiency, even during a relatively short course of three months. Moreover, such language proficiency gains can even be attained in an online language course. Regarding implicit and explicit grammar instruction, we found no noteworthy differences: the implicit condition showed more structural errors yet also showed a significant decrease in these errors during the course.
From an SLA-theory perspective this means that previous research into implicit and explicit grammar instruction has been biased towards younger learners and that it remains to be seen whether the results regarding younger learners hold true once older language learners are incorporated into research designs.
Practically speaking, seeing that there were no substantial differences between the two teaching pedagogies, and the fact that participant attrition was quite high in this study, it might be best to adopt a teaching pedagogy that meets older adults' language learning needs best. A needs analysis revealed older adults to want explicit explanation of grammatical constructs (van der Ploeg et al. submitted), and one of our participants dropped out of the course due to a lack of explicit grammar instruction. Hence, a communicative-focused teaching pedagogy that does include some explicit grammar instruction might be most suitable for older adults.
Finally, our study showed the late-life language learner (as far as one can talk about 'the' late-life language learner) to be very invested in the course. The fact that multiple participants had to be excluded from part of the analysis due to them reading assignments out loud shows that they want to do a good job and that they put a lot of energy and effort into their language learning. By doing so, they show agency over their language learning process.
Appendix A: Topics and features of the language course

Figure 1: Study design with test moments and design of the language course.

Figure 4: IELTS speaking scores per condition over time (left to right: overall, lexical resources, grammar, pronunciation). Note: scores are centred around zero.

Figure 5: Peabody scores per condition over time. Note: scores are centred around zero.

Figure 6: English verbal fluency scores per condition over time. Note: scores are centred around zero.

Figure 7: Structural errors corrected for length over time per condition including SEs.

Figure 8: Speech rate over time per condition including SEs.

Figure 9: Holistic grammar scores over time per condition including SEs.

Table: Average, minimum, and maximum SDs of raters on the four holistic rating scales.

Table C: Model output for IELTS speaking overall. *Notes significance, **notes not significant after subsetting.

Table C: Model output for IELTS speaking lexical resources. *Notes significance.

Table C: Model output for IELTS speaking grammar. *Notes significance.

Table C: Model output for IELTS speaking fluency. *Notes significance.

Table C: Model output for IELTS speaking pronunciation. *Notes significance, **notes not significant after subsetting.

Table C: Model output for CEFR speaking overall. *Notes significance.

Table C: Model output for Peabody. *Notes significance.

Table C: Model output for IELTS listening. *Notes significance.

Table C: Model output for English verbal fluency. *Notes significance.

Table C: Model output for MATTR. *Notes significance.

Table C: Model output for coordinate phrases per T-unit. *Notes significance.

Table C: Model output for lexical errors. *Notes significance.

Table C: Model output for structural errors. *Notes significance.

Table C: Model output for number of words per minute. *Notes significance.

Table C: Model output for number of silent pauses per minute. *Notes significance.

Table C: Model output for mean length of silent pause. *Notes significance.

Table C: Model output for number of filled pauses per minute. *Notes significance.

Table C: Model output for number of repairs/repetitions per minute. *Notes significance.

Table C: Model output for speech rate. *Notes significance.

Table C: Model output for holistic grammar scores. *Notes significance.

Table C: Model output for holistic vocabulary scores. *Notes significance.

Table C: Model output for holistic fluency scores. *Notes significance.

Table C: Model output for holistic functional adequacy scores. *Notes significance.