The effectiveness of corpus-based training on collocation use in L2 writing for Chinese senior secondary school students


 Corpus tools are known to be effective in helping L2 learners improve their writing, especially regarding their use of words. Most corpus-based L2 writing research has focused on university students while little attention has been paid to secondary school L2 students. This study investigated whether senior secondary school students in China, upon receiving corpus-based training under the framework of data-driven learning (DDL), could improve their vocabulary use, especially the use of collocations, in their writing for the International English Language Testing System (IELTS) test. Twenty-two students aged 16–18 in a senior secondary school in Nanchang, China who were planning to take the IELTS exam participated in the study. Corpus of Contemporary American English (COCA) and Word and Phrase were the main corpora that the participants used to learn various search functions. Pre-writing and post-writing tests were administered to measure the effect of corpus training. In addition, a questionnaire and interviews were used to collect students’ perspectives and attitudes. The results indicate that students made improvement in word selection after three corpus training sessions, and their attitudes towards corpus use were positive even though they were restricted from using computers to access corpora inside their school.


Introduction
In view of rapid globalization, it is not surprising that the number of English language learners continues to increase as English is embraced as the Traditionally in China, students are encouraged to produce error-free sentences from elementary school to university, and teachers focus on delivering grammar rules and analyzing students' academic performance based on dictation and quizzes (Duan & Yang, 2016). Under this learning approach, students have limited opportunities to learn content-based knowledge where unknown words are taught through context. Despite the focus on vocabulary knowledge, previous research has revealed that Chinese students have difficulty in converting their receptive vocabulary to productive vocabulary in their writing (Wen, 2018; Zhai, 2016; Zhou, 2010); they also lack skills and strategies to write in English (Sang, 2017). Conveying a clear written message using appropriate words is a major challenge for Chinese learners in their academic writing (Clark & Yu, 2020). Chinese students may have acquired a passive knowledge of many lexical items throughout a long period of English learning, but they still fail to access this vocabulary knowledge to express their thoughts clearly (Li & Schmitt, 2009).
In recent years, a growing number of Chinese students have been studying abroad, most of them taking IELTS as a qualifying standard. It is estimated that 50% of IELTS test takers in Asia are from mainland China. Among the four language skills tested in IELTS, Chinese students tend to score lower in writing than in the other skills (British Council, 2020b). In IELTS Writing Task 2, test takers are required to write at least 250 words in a formal, academic style, which is difficult for many Chinese students (An, 2013; Hao, 2016; Mayor, 2006). Despite a large vocabulary and a solid foundation in grammar, Chinese students' scores on IELTS academic writing are below average (Yao, 2014). A common error in IELTS writing is inappropriate collocation use (Futagi, Deane, Chodorow, & Tetreault, 2008).
Research reveals that increasing vocabulary competence requires a greater focus on vocabulary learning, especially collocations (Conzett, 2000; Men, 2015; Webb, Newton, & Chang, 2013). Knowledge of collocations is important for fostering language learning (Nation & Webb, 2011); however, Chinese learners' writing competence has been found to be weak in this area (Hou, 2014; Zou, 2019). Zou (2019) summarized six major collocation types in Chinese student writing, namely noun-noun, noun-verb, verb-noun, adjective-noun, verb-adjective, and adjective-adverb, where more than half of the errors belong to verb-noun collocations. Hou (2014) also identified verb-noun as well as adjective-noun collocations as the two most frequent collocation error types. Zou (2019) explained that these collocation errors can be attributed to factors such as first language transfer and misuse of synonyms. Several studies have demonstrated that exposing students to concordance lines from corpora is a useful approach to improve their collocation performance (Li, 2017; Saeedakhtar, Bagerin, & Abdi, 2020; Vyatkina, 2017; Wu, 2016).

Data-driven learning
Beginning with the work by Johns (1991) and Sinclair (1991), the application of DDL has been advocated by scholars because corpora provide rich authentic examples regarding how words are used in real contexts. Gilquin and Granger (2010) also promoted DDL as a field with pedagogical value for language learners. DDL allows students to learn and internalize frequency patterns and contextual information regarding the appropriate use of words. Under this approach, learners are described as "language detectives" (Johns, 1997:101) as they can actively detect how words are used in context. Research has revealed considerable benefits from using corpora as a pedagogical tool in L2 writing (Chang, 2014; Crosthwaite, 2017; Kennedy & Miceli, 2010; Yoon & Hirvela, 2004).
One of the most important roles of corpus use is error correction (Crosthwaite, 2017; Vyatkina, 2017; Yoon & Hirvela, 2004). Corpus-led DDL instruction has been found to help L2 students identify and correct errors in their writing (Crosthwaite, 2017; Li, 2017; Saeedakhtar et al., 2020; Todd, 2001; Vyatkina, 2017; Wu, 2016). Wu (2016) noted two broad types of collocation errors, (1) adjective + noun and (2) verb + noun, and found that Chinese learners with both high and low proficiency levels improved their collocation knowledge with the help of corpus tools. Li (2017) focused on verb-preposition collocations and found that Chinese postgraduates who had no previous knowledge of corpora displayed a significant improvement in using this type of collocation in their academic writing.
Furthermore, after being introduced to the basic functions of corpora, learners can use corpora as a reference/editing tool without receiving any comments or error correction from their teachers (Charles, 2012, 2014; Kennedy & Miceli, 2010; Yoon & Hirvela, 2004). Kennedy and Miceli (2010), for example, designed a longitudinal training program and found that their students were able to use various search functions when facing collocation difficulties, and subsequently built their own corpus for later consultation. Studies have also revealed that students hold positive attitudes and perceptions towards corpus use (Charles, 2012, 2014; Crosthwaite, 2017; Yoon & Hirvela, 2004). Charles (2012) investigated 40 English for Academic Purposes (EAP) students' evaluations of using their self-created corpora to solve collocation errors and found that 70% of the students wanted to use corpora in their EAP studies over the long term.
However, most research on DDL for improving students' use of collocations and word choice in their writing has been conducted at the university level or above (Charles, 2012, 2014; Crosthwaite, 2017; Kennedy & Miceli, 2010; Wu, 2016; Yoon & Hirvela, 2004). Boulton and Cobb (2017) conclude that "unfortunately, there is little [DDL] research with high school learners" (p. 357). Crosthwaite (2019), therefore, recommends more research on the application of DDL to pre-tertiary students. Because many Chinese secondary students need to perform well on the IELTS for acceptance into overseas universities, the present study investigates whether secondary school students can correct collocation errors and use appropriate vocabulary in their IELTS-oriented writing after being trained to use corpus tools. These secondary students' attitudes towards the corpus resources were also collected and analyzed. The study addresses the following research questions: (1) In what ways does corpus training benefit secondary school students' collocation competence and lexical quality in IELTS academic writing? (2) How do senior secondary school students perceive corpus training, and what attitudes do they hold towards corpora use in EAP writing?

Participants
Twenty-two grade eleven Chinese students volunteered to participate in the study. They were studying at a senior high school in Nanchang, Jiangxi, and wished to pursue higher education overseas. The participants were aged 16-18 and had been learning English as a foreign language for about 14 years. The average English proficiency level among the participants was 5.0 on the IELTS exam. For confidentiality, participants are identified by letter pseudonyms (A to V) throughout the paper. Normal ethical procedures were followed throughout the study.

Research instruments
Three instruments were designed for data collection in the current study: (1) one pre-test and one post-test, (2) one questionnaire (see Appendix II), and (3) interviews (see Appendix III). An intervention (corpus training) took place between the two writing tests. The two tests were compositions written by students before and after the intervention (three training sessions); both were assessed to investigate whether students were able to reduce their error frequency, and they also served as a formative exercise.

Procedure
Students were first asked to complete the pre-test. Following Wu (2016), students were not allowed to refer to any tools so that the researcher could evaluate their baseline collocation knowledge in writing. Then, by examining errors in the students' pre-test scripts, the researcher designed the teaching materials for the corpus training sessions. Three corpus training sessions were then conducted (see Section 4.3.1) to demonstrate how corpora function as an alternative reference tool for academic writing. Following the training sessions, students completed the post-test with guidance from the corpora to measure the change in their collocation knowledge. Apart from consulting COCA and Word and Phrase Concordance, students were also allowed to use dictionaries, which are their habitual reference tools in writing. Finally, a questionnaire was administered (see Appendix II), and interviews were conducted to determine the students' attitudes as well as their overall evaluation of corpus use. The prompts for the two compositions focused on educational matters and were taken from the IELTS writing database on Task 2: Pre-test prompt: "In the past, the role of teachers was to provide information. Today, students have access to wide sources of information. There is, therefore, no role of teachers in modern education. To what extent do you agree or disagree?" Post-test prompt: "In the past, lectures were used as a way of teaching large numbers of students, but now with the development of technology for education, many people think there is no justification for attending lectures. To what extent do you agree or disagree?"

Training sessions
Three corpus training sessions (see Appendix IV) were designed to introduce corpus-based reference tools for students to consult as they wrote. Three major types of collocation errors identified in the literature review were addressed in this study, namely adjective + noun, verb + preposition, and verb + noun. These collocations were addressed in the three training sessions (30 min each), using errors collected from the students' pre-test scripts. During the three training sessions, COCA and Word and Phrase Concordance were used as the main reference tools. Participants were taught to use various search functions from the two corpora and were asked to examine the correctness of collocations as well as to provide suggested answers to the extension tasks either in class or at home (see Table 1). The aim of the training sessions was to provide alternative tools for enhancing the students' writing competence, i.e., the collocates function of COCA and the frequency list and analyze texts functions of Word and Phrase Concordance. Meanwhile, concordance lines from the corpora could provide students with rich authentic examples from which to discover language patterns (see Tables 1 and 2). Each corpus training session was divided into three parts: a lead-in activity (to check students' comprehension and increase their awareness of the usage of collocations); a hands-on corpus search (to show students how to consult corpora to check the correctness of collocations and to explore the synonyms of the target words); and extension tasks (to provide students with more opportunities to work individually). Each corpus training session focused on two search functions of COCA: (1) Collocation and (2) Synonyms. Because the students had no background knowledge of corpora, the teaching guide included some simple search functions suited to their current writing competence (see Appendix IV for a sample of instructional materials design).
In IELTS writing, students are required to use different expressions to avoid repetition as well as to prevent monotonous-sounding writing. Therefore, synonym learning was chosen in the design of the teaching guides. The collocates function from COCA was introduced to help students examine collocation accuracy. The frequency lists function from Word and Phrase was presented to help learners generate language patterns, and the analyze texts function was an alternate way to search for synonyms. An inductive method was adopted for instruction to activate learners' schemata to become familiar with corpora along with the searching functions. The three corpus training sessions were delivered by one of the researchers at a computer lab in a large classroom setting.
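To make the idea of a collocates search concrete, the following minimal Python sketch counts the words that appear immediately to the right of a node word in a handful of concordance lines. It is an illustration only: the sentences are invented for this example, the function name `right_collocates` is hypothetical, and the simple adjacent-word count is an assumption that does not reproduce how COCA or Word and Phrase compute their results.

```python
import re
from collections import Counter


def right_collocates(lines, node, span=1):
    """Count the words found within `span` positions to the right of `node`."""
    counts = Counter()
    for line in lines:
        # Lowercase and keep alphabetic tokens only (a deliberately naive tokenizer).
        tokens = re.findall(r"[a-z']+", line.lower())
        for i, token in enumerate(tokens):
            if token == node:
                counts.update(tokens[i + 1 : i + 1 + span])
    return counts


# Invented concordance lines, not taken from any corpus.
concordance_lines = [
    "Students apply for scholarships every year.",
    "The same rules apply to all candidates.",
    "She decided to apply for the position.",
    "These findings apply to secondary schools.",
    "Many graduates apply for jobs abroad.",
]

counts = right_collocates(concordance_lines, "apply")
for word, freq in counts.most_common():
    print(f"apply + {word}: {freq}")
```

In this toy sample, "for" follows "apply" more often than "to" does. Reading the lines behind each count (what is applied for, and what a rule applies to) is the kind of pattern-spotting the hands-on searches were meant to encourage.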

Data collection and analysis

Assessing the students' writing
It is generally accepted that students' progress can be adopted as a measure of the effectiveness of a given teaching innovation. In this case, a pre-writing task (the pre-test) before the corpus training intervention and a post-writing task (the post-test) were conducted to measure learners' performance and progress using IELTS-oriented prompts. Four marking criteria were used in the assessment of the two tests: "task achievement, coherence and cohesion, lexical resource, and grammatical range and accuracy" (British Council, 2020a). In addition, the raters scored lexical quality separately based on the IELTS rubric with reference to British Council (2020a) (see Appendix I). Students' writing samples were independently rated by one of the authors and one writing teacher in the school. Disagreements were resolved via discussion. Mean scores and standard deviation measures were generated.

Questionnaire and interviews
The questionnaire and the interviews were designed to measure learners' perceptions of and overall attitudes towards corpus usage after the training sessions. Two parts were included in the questionnaire: (1) Background information and an examination of students' prior knowledge of corpora and (2) An evaluation and investigation of their attitudes towards corpora use. Fifteen items from the evaluation part were designed with a 7-point Likert preference scale, ranging from "strongly agree" (7 points) to "strongly disagree" (1 point) (see Appendix II). Inspired by Yoon and Hirvela (2004) and Crosthwaite (2017), the questionnaire data were categorized into four dimensions: (1) Reflection and perception of corpus use after corpus training; (2) Evaluation of a corpus; (3) Learners' autonomy in choosing a corpus to consult when writing; and (4) Attitudes towards corpus use. Semi-structured interviews (Appendix III) were conducted to gain insight into students' attitudes towards corpora; most questions were taken from Chang (2014).
The interview data were analyzed to answer research question 2 regarding the learners' attitudes and overall evaluation of the corpus training and usage. Eleven of the 22 participants were interviewed individually for 10-15 min each. The interview responses were recorded and then transcribed and translated by the authors. Each participant viewed the transcripts and approved the contents. The interview data were then coded independently by two of the researchers, and an inter-rater reliability of 85% was reached. The two sets of codes were then reduced to one set via discussion between the two raters.
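A reliability figure of this kind can be illustrated with a simple percent-agreement calculation. The sketch below is a hedged example rather than the authors' procedure: the paper does not state which reliability statistic was used, so percent agreement is an assumption, and the coded segments and code labels are invented for illustration.

```python
def percent_agreement(codes_a, codes_b):
    """Return the percentage of segments to which both raters assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both raters must code the same number of segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Invented codes for 20 hypothetical interview segments.
rater_1 = ["benefit"] * 12 + ["difficulty"] * 8
rater_2 = ["benefit"] * 10 + ["difficulty"] * 9 + ["benefit"] * 1

agreement = percent_agreement(rater_1, rater_2)
print(f"Percent agreement: {agreement:.1f}%")
```

Percent agreement does not correct for agreement expected by chance, which is one reason disagreements are usually also resolved through discussion, as was done here.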

Results
The results are mainly presented in two parts based on the two research questions: (1) The effectiveness of corpus training and its usage as measured by frequency of errors and quality of writing and (2) The students' overall attitudes towards corpora. The effectiveness of corpus training was divided into two categories: (1) Collocation error frequency and (2) Lexical quality scores.

Collocation error frequency
The effectiveness of corpus training and its usage was measured based on two aspects: (1) the frequency of collocation errors in the two writing tests and (2) the lexical quality scores in the two tests. The results are displayed in Tables 3-6. As the study took place in the evenings during the students' self-study period and all students participated on a voluntary basis, some students missed either the pre-test or the post-test for various reasons. Seven students took both the pre-test and the post-test, so the analysis focuses on the scripts of these seven students.
Table : T-test of error frequency in collocation. Note: Only a certain number of students' data were used as they had a full data set. Due to practical constraints, some students' data was missing.

The mean error frequency of the seven students in their pre-test and post-test indicates that learners made a minor improvement of 0.72 in the post-test. However, the t-test result (t = 1.94; p = 0.09 > 0.05) showed no statistically significant difference in error frequency between the pre- and post-writing tasks, possibly due to the relatively small sample size.
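For readers who wish to see how such a comparison can be computed, the sketch below calculates a paired-samples t statistic for seven learners. It rests on assumptions: the paper reports only a "t-test", so the paired design is inferred from the pre-test/post-test setup, and the error counts are invented and are not the study's data. The critical value 2.447 is the standard two-tailed 5% threshold for 6 degrees of freedom.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on the pre - post differences."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Invented collocation error counts for seven hypothetical learners.
pre_errors = [6, 4, 7, 5, 3, 6, 4]
post_errors = [4, 4, 5, 5, 2, 5, 3]

t_stat, df = paired_t(pre_errors, post_errors)
CRITICAL_T_DF6 = 2.447  # two-tailed, alpha = 0.05, df = 6
significant = abs(t_stat) > CRITICAL_T_DF6
print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

With only seven pairs, the statistic must exceed a fairly high threshold before a difference counts as significant, which fits the authors' caution about the small sample size.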

Lexical quality and writing scores
The mean writing score of the post-test (7.5) (Table 5) indicates that the students' quality of writing increased by an average score of 0.64 over the pre-test. In addition, the t-test on the writing scores (t = 1.94; p = 0.00 < 0.05) suggested that there was a statistically significant difference between the pre- and post-writing tasks, which showed that learners made slight improvements in writing scores after three corpus training sessions.

Questionnaire
The questionnaire was divided into two categories, (1) The background information of students' prior knowledge in writing and (2) The evaluation of their attitudes towards the corpora. Students indicated in the survey that the e-dictionary was the most popular tool to consult when encountering any vocabulary problems in writing before receiving any corpus training. However, after receiving the corpus instruction, most participants expressed that they would seek help from the corpora to solve any problems they encountered.
The latter part of the questionnaire was divided into four dimensions (see Table 7). The results indicate that learners' overall attitude towards corpus training was positive, with a mean score of 5.87 out of 7. Their perceptions and reflections about corpus use after corpus training were positive, with a mean of 5.88. This result indicates that students believed that the three corpus training sessions were useful for learning new words and collocations when writing academic essays. Students highly appreciated using the corpus (mean = 5.93), and their attitude towards the corpus received the highest score (mean = 5.94). They expressed a strong willingness to continue using corpora, such as COCA and Word and Phrase, in their future English learning and academic writing. However, learner autonomy (mean = 5.67) received a relatively low score. This is understandable since full learner autonomy in corpora use is likely to be developed only after sustained usage.
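Dimension means of this kind are averages over groups of questionnaire items. As an illustration of that aggregation, and not a reconstruction of the study's analysis, the sketch below averages 7-point Likert responses within each dimension. The item labels, the item-to-dimension mapping, and the responses are all hypothetical; the actual questionnaire (Appendix II) has fifteen items.

```python
def dimension_means(responses, dimensions):
    """Average Likert scores over all respondents and items within each dimension."""
    means = {}
    for name, items in dimensions.items():
        scores = [response[item] for response in responses for item in items]
        means[name] = sum(scores) / len(scores)
    return means


# Hypothetical responses: one dict per student, item label -> score (1 to 7).
responses = [
    {"q1": 6, "q2": 7, "q3": 5, "q4": 6},
    {"q1": 5, "q2": 6, "q3": 6, "q4": 7},
    {"q1": 7, "q2": 5, "q3": 4, "q4": 6},
]

# Hypothetical mapping of items to dimensions.
dimensions = {
    "perception": ["q1", "q2"],
    "autonomy": ["q3"],
    "attitude": ["q4"],
}

means = dimension_means(responses, dimensions)
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

Because dimensions can contain different numbers of items, an overall mean computed across all items need not equal the simple average of the dimension means.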

Interview
Coding of the interviews generated two main codes: benefits and difficulties of corpus use. These are elaborated here with excerpts from the students.

Corpus as a beneficial reference tool in writing
The search functions of the corpora (COCA and Word and Phrase Concordance) motivated the students to actively incorporate their search results into their academic writing, especially the "collocates" and "analyze texts" functions, which they could consult to find appropriate synonyms to replace the target words. Most students claimed that the corpora facilitated and guided the development of their writing abilities. They were more willing to use corpora to find suitable synonyms and to examine how to use words and collocations appropriately. Prior to the corpus training, learners tended to translate directly from Chinese when writing English essays, as noted in the literature (Zheng & Chang, 2014). For example:
The size of my vocabulary is small. If I encounter any questions in choosing which word to use in IELTS writing, I will think in Chinese first, and then I will translate it into English. (A)
I would look up the Chinese meaning in an electronic dictionary first and then choose a word that I had not used before in my writing so that my essay would be more academic. (B)
After being introduced to the concepts of corpora, many students showed interest in using corpora to help them overcome difficulties in collocating words. Students indicated that they appreciated the teaching guides from the corpus training sessions, which were practical and closely matched the needs of the IELTS writing topics for which they were preparing. Moreover, the use of collocation errors drawn from students' previous writing motivated them to find ways to solve their own problems. During the training aimed at collocations and synonyms, students enjoyed being given more opportunities to do individual work by using corpora to identify appropriate words to collocate with target words. Some students looked in their notebooks and used corpora to check whether the phrases they had previously written were correct.
IELTS candidates are also required to use a variety of words and expressions in the writing section of the test according to the marking criteria; this requirement motivated students to turn to corpus searches to reduce monotony. Students C and E mentioned that:
I have benefited from Word and Phrase Concordance in finding synonyms. Meanwhile, I also used both Word and Phrase and COCA to find the words which can be juxtaposed with these synonyms. Furthermore, the collocation and synonym learning through Word and Phrase and COCA provided me a new learning opportunity for using different expressions to avoid repeating the same word and enhanced language accuracy, which improved the quality of my writing.
(2) The opportunity to practice summarizing the usages of collocations. Specifically, summarizing requires students to generate different language patterns for two similar collocations, such as "apply to" and "apply for." Student B reported that summarizing required him to think critically, while student D stated that concordance lines are more efficient than dictionaries in terms of self-discovery:
Summarizing requires my deep knowledge of vocabulary. For example, when we had the training to find out the difference between "apply for" and "apply to" through the concordance lines, I was not able to figure it out without the help from the teacher. After I tried two to three times, I found out that the concordance lines provided me examples to analyze what kinds of words can be added after the target word. (B)
Concordance lines require me to self-discover the language patterns while dictionaries present all the patterns for learners. In this case, learners will just copy the phrases without knowing the usage. However, I can find the language patterns through concordance lines and use them in my writing which can make me feel successful. Thus, I think I can improve with the help of corpora. (D)

Practical problems of corpus use in classrooms with short training sessions
The interview responses also indicated that it was difficult for learners to search for information through the corpora due to limited training. During the training session on verb + preposition collocations, the sheer abundance of concordance lines in their second language overwhelmed the students, preventing them from promptly deciphering language patterns. Having only three 30-min training sessions was also insufficient for learners to become fully familiar with the search functions. The excerpts below illustrate:
It takes me a long time to search for one word because I am not familiar with the searching functions. (E)
We had never heard of corpora before, so it was hard for us to remember all the searching steps in such a short time. (C)
When I use these two corpora, it is easy for me to mix up the order of searching for words. For example, I always forget when I should change the order of the words to either the left side or the right side. Thus, I think it will be better if I can receive more corpus-training. (F)

Constraints of electronic device usage in classrooms
Due to the school policy, students' use of electronic devices during class was restricted, which reduced their capacity to become familiar with corpora. Although some students were able to use their mobile phones to look up collocations, the small screen proved to be a challenge for them to continue using corpora to search for words. It was also found that some students did not have access to corpora for their post-writing task. Therefore, the reference tool they could use was a dictionary.
I did not use corpora because I cannot use laptops at school. I used my mobile phone to search for words, but the words and screens are too small. We are studying for IELTS, and it required handwriting. Thus, we are not allowed to use computers to write essays at school. (H)
I really appreciated the training sessions because I learned something new that can be applied to my academic writing. However, the sessions were too short, and our writing teachers never mentioned corpora before. So, I do not think I will use corpora at this moment, but I will use it later when I enter university. At that time, I will have my own laptop. Moreover, we are learning IELTS, which requires us to write essays in a written form. If I am learning TOEFL (Test of English as a Foreign Language), then I think it is definitely a good resource, and our teacher will allow us to use computers in school because TOEFL requires candidates to use the computer to write the essay. (G)

Discussion and implications

Benefits of using corpus tools to facilitate writing
The higher scores on IELTS writing task 2 show that the students improved their lexical use and tended to reduce errors in their writing after corpus-based training. According to the questionnaire, the students had barely heard of the concept of collocations and had no prior knowledge of corpora in English vocabulary learning.
In this sense, the study achieved two aims, namely, to familiarize students with collocations and provide them with tools to improve their writing. Similar to existing studies on DDL (Crosthwaite, 2017;Li, 2017;Saeedakhtar et al., 2020;Vyatkina, 2017;Wu, 2016;Yoon & Hirvela, 2004), students made noticeable improvement on collocations and vocabulary use. We therefore conclude that Chinese secondary students can benefit from DDL in their language learning, especially their collocation competence.
This finding contributes to the application of DDL for international English learners. According to Futagi et al. (2008), collocation errors are not only common among Chinese IELTS examinees, but also frequently appear in academic writing by other L2 learners, including college students in Saudi Arabia and international students in American universities. Like Futagi et al. (2008), our study highlights the need for learners to focus on collocation strings and distinguish synonyms based on their collocations. Smirnova (2017) also found that the experience of using corpora proved effective for EFL learners in a Russian university. She claimed the use of corpora improved students' understanding of usage patterns, helped them correct collocation errors autonomously, and more importantly, increased their scores in the IELTS writing section. In addition, students in the current study expressed their willingness to use corpora when practicing for other tests, such as TOEFL and CET (College English Test). Therefore, it is suggested that non-native English learners refer to corpora to improve their writing competence and increase their scores in English tests.
The inductive thinking involved in searching for linguistic items and looking for language use patterns in corpora helps language learners take charge of their own learning beyond the examples provided in textbooks (Crosthwaite, 2019). In the current study, the secondary students also showed the ability to correct errors on their own. Searching and examining language use in corpora provided them with the opportunity to correct writing errors. Crosthwaite (2019) suggests that the more pre-tertiary students interact with corpus data, the more they develop problem-solving skills and autonomous learning, which are important learning strategies required after graduation. However, corpus-using strategies take time to develop. Charles (2014), for example, showed that 70% of students practiced for 12 months before getting accustomed to using corpora. Likewise, Yoon and Hirvela (2004) found that learners needed a long time to develop their abilities in editing lexicogrammatical errors using corpora. This explains why some students in the present study claimed that the limited number of training sessions prevented them from becoming fully familiar with the various search functions embedded in the corpus websites. Early studies by Thurstun and Candlin (1998) and Sun (2000) revealed that students may not be confident in reading concordance lines in English. This concern was also evident in our study. Thus, it appears that if corpus training were to become a part of the English curriculum, students would require prolonged training to become more skillful and confident in using search functions and to benefit more from corpus use.

Attitudes and overall evaluations of corpora

Student attitudes and overall evaluations of corpora
The results reveal that students held positive attitudes towards corpora use in academic writing. Even though they were not able to fully familiarize themselves with the search functions, they regarded corpora as beneficial alternative tools to better understand language patterns as well as to enhance their confidence and ability in writing. This finding is consistent with Yoon and Hirvela (2004), who stated that learners were positive towards corpora as they regarded them as helpful reference tools not only for obtaining word usage but also for developing writing skills. Chang (2014) also showed that learners appreciated the benefits corpora provide, even though they had not yet developed a deep knowledge of corpus use.
There appear to be several reasons for the students' positive attitudes towards corpora usage as indicated by responses to the questionnaire. The training drew students' attention to the lexical mistakes presented by the researcher, and some students went further, using corpora after the training session to check whether the phrases previously written in their notebooks were appropriate. The questionnaire responses implied that dictionaries were commonly used by students before the corpus training sessions; however, after consulting corpora, they found that words from dictionaries were not always appropriate for certain contexts. Consistent with Chan and Liou (2005), students were able to transfer collocation patterns in examples they learned during the lesson to new sentences they wrote outside of class. In addition, after training, many learners asked the researcher to teach them more search functions, such as "all forms" and "parts of speech" from the Word and Phrase Concordance (see Figure 1).
Students showed their appreciation for learning about collocations through corpora even though the training sessions were limited. Some students stated that using various expressions can enhance one's ability to use lexical resources. Most of the students were fully engaged in learning about collocations during the three training sessions, especially the hands-on activities, which encouraged them to further practice, leading to a highly positive attitude towards using corpora.
Nevertheless, some negative aspects were evident. Self-discovery plays an important role in corpora usage, particularly for the pattern "verb + preposition." However, some students were unable to work out language patterns from concordance lines on their own without hints from their teacher, even after discussions with their groupmates. Also, some students had a difficult time locating the information they wanted when faced with such an abundance of concordance lines, which caused them to give up. These reasons probably explain why learners' autonomy was rated the lowest in the survey among all dimensions.

Technology issues related to corpora use
One salient factor that caused learners to have a negative evaluation of corpus usage was the unstable speed of the Internet. Due to the school policy of banning computers or electronic devices, the training sessions were conducted in the school's computer lab, where the speed of the Internet was relatively slow, thus reducing the students' practice time. A few of the students explained that the slow speed of the Internet was one possible factor inhibiting the effectiveness of the training. It prevented students from having more opportunities to practice search functions. In the early days of corpora usage, Sun (2000) also noted that network speed and stability are two major problems. This challenge thus remains.
Although students were generally positive about corpora usage, they had not yet developed the habit of using them. According to Charles (2014), developing a habit of using corpora means that students can continue to use them after the training is completed. However, because the students could access computers only during the training sessions or at home on the weekend, the opportunities for students to consult corpora and develop a behavioral pattern of using them were limited.
One solution, noted by students, is to create a record of the collocations they searched for in the corpora. Some students made a record of the suggested answers from their corpus search to replace their errors during the training, which they could later refer to for further study or review. Some of them also copied some concordance lines so that they could learn the difference between two phrases in their free time. This notetaking was especially important in the present context of their English learning because students were restricted from using electronic devices in the classroom. Further, teachers need to urge their schools to provide facilities for students while encouraging students to be proactive in their learning by keeping records of the new collocations they have learned.

Conclusion
The study explored the effectiveness of corpus training sessions in IELTS academic writing among secondary school students by assessing their ability to learn from a corpus-based intervention; afterwards, their attitudes towards the corpora were gathered. The results of this research shed positive light on the efficacy of corpus training for improving senior secondary school students' lexical usage. Specifically, the findings revealed that the students used collocations more accurately in their writing after the intervention and had mostly positive views about using corpora. Moreover, the questionnaire responses suggest that students had ample opportunities to use new vocabulary or phrases in their weekly writing practice. Students also received adequate support from their writing teachers for further improvements in areas such as grammatical accuracy and coherence. However, corpus tools are relatively new to Chinese teachers and students, and not all writing teachers are able to point out collocation errors in students' texts or even know about corpora, indicating a need for bringing corpora usage into language teacher training.
The findings also revealed some limitations and difficulties that the students encountered in using corpus tools. One key element that emerged from this study is that it is difficult for Chinese secondary school students to have access to electronic devices to use corpora. Students could only use computers during the training sessions and on weekends. Therefore, we recommend that schools provide students with computers after class to access corpora and create records of concordance lines for certain language items so that they can analyze these corpus data in their free time. Although only three corpus training sessions were conducted, the findings revealed the positive effects of DDL for pre-tertiary students. Since learning corpus search functions and the analysis of concordance lines can be a very complicated process (Heather & Helt, 2012; Naismith, 2017), it is suggested that instructors provide adequate corpus training for students so that they can be more autonomous and confident in corpora use over the course of their long-term language learning.
In conclusion, the study showed that in general, secondary L2 learners can improve their writing skill and are positive towards corpora usage. However, due to practical constraints, only seven participants' full data sets were used in the quantitative scoring part of the study, which limits its generalizability; thus, the results are indicative only. The qualitative parts of the study gathered positive views from a larger group of students indicating good potential. Although the learners' writing scores improved, collocations and synonyms learned through corpora are still only one instructional element among a myriad of elements that go into L2 writing.
There are a few limitations in the study. First, this is a small-scale study that only examined limited data on student consultation of corpora as reference tools for academic writing. Second, due to some practical constraints caused by the school policy, some students did not make full use of corpora, which may have undermined the results of the study. Even so, our data are indicative of the good potential of corpus use in enhancing secondary school students' collocation learning in academic writing. Future studies of corpus training for language students could recruit a larger number of participants and include an expanded list of the types of collocation errors L2 students make.

Appendix I: Criteria for lexical quality dimension (adapted and edited from British Council, 2020a)

Band: Lexical Quality

Band 9
- "uses a wide range of vocabulary with very natural and sophisticated control of lexical features; rare minor errors occur only as 'slips'"
- 1 or below collocation error

Band 8
- "uses a wide range of vocabulary"
- "fluently and flexibly to convey precise meanings"
- "skilfully uses uncommon lexical items but there may be occasional inaccuracies in word choice and collocation"
- "produces rare errors in spelling and/or word formation"
- 2-3 collocation errors

Band 7
- "uses a sufficient range of vocabulary to allow some flexibility and precision"
- "uses less common lexical items with some awareness of style and collocation"
- "may produce occasional errors in word choice, spelling and/or word formation"
- 4-5 collocation errors

Band 6
- "uses an adequate range of vocabulary for the task"
- "attempts to use less common vocabulary but with some inaccuracy"
- "makes some errors in spelling and/or word formation, but they do not impede communication"
- "produce few collocation errors"
- 6-8 collocation errors

Band 5
- "uses a limited range of vocabulary, but this is minimally adequate for the task"
- "may make noticeable errors in spelling and/or word formation that may cause some difficulty for the reader"
- 9-12 collocation errors

Band 4
- "uses only basic vocabulary which may be used repetitively, or which may be inappropriate for the task"
- "has limited control of word formation and/or spelling; errors may cause strain for the reader"
- 13-15 collocation errors

Band 3
- "uses only a very limited range of words and expressions with very limited control of word formation and/or spelling"
- "errors may severely distort the message"
- 16-18 collocation errors

Band 2
- "uses an extremely limited range of vocabulary; essentially no control of word formation and/or spelling"
- 19-21 collocation errors

Band 1
- "can only use a few isolated words"
- 21 or more collocation errors

Band 0
- "does not attend"
- "does not attempt the task in any way"
- "writes a totally memorized response"

Questionnaire items
- Using the corpus is helpful for learning the usage of vocabulary
- Using the corpus is helpful for learning the usage of collocations
- Using the corpus improved my English academic writing skill
- When I had problems in the second writing task, I searched for help in COCA
- When I had problems in the second writing task, I searched for help in Word and Phrase Concordance
- When I search for information in a corpus, I usually get the information that I need
- I understand the purpose of using the corpus after the three training sessions
- I want to continue to use corpus in the future
- After the three training sessions, I wish our teachers can introduce corpus more
- My confidence in writing in English has increased by learning about the corpus
- I will continue to use COCA in English academic writing
- I will continue to use Word and Phrase in English academic writing
- Overall, the corpus is a very useful resource for my English academic writing

- Look at the frequency page to see whether "questions" can be placed after "meet" and "encounter".

[Screenshots: Meet, Encounter]
- Show students how to search for the word "question" with the collocates function of COCA to see which verbs commonly collocate with "questions".

[Screenshot: Questions]
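The collocate check described in this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not use COCA or its interface, the six sentences are a hypothetical mini-corpus invented for the example, and the helper names (`tokenize`, `left_collocates`, `pair_frequency`) are our own. It counts the words found immediately before a node word, which roughly mirrors looking up the verbs that precede "questions".

```python
import re
from collections import Counter

# Hypothetical mini-corpus, invented purely for illustration (not COCA data).
SENTENCES = [
    "Students often ask questions after the lecture.",
    "The committee will raise questions about the budget.",
    "She refused to answer questions from reporters.",
    "Reporters ask questions at every briefing.",
    "We encounter problems when the network is slow.",
    "Please answer questions in full sentences.",
]


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def left_collocates(sentences, node, window=1):
    """Count words occurring within `window` tokens to the left of `node`."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == node:
                counts.update(tokens[max(0, i - window):i])
    return counts


def pair_frequency(sentences, verb, noun):
    """Count how often `verb` appears immediately before `noun`."""
    return left_collocates(sentences, noun, window=1)[verb]


collocates = left_collocates(SENTENCES, "questions")
print(collocates.most_common(3))
print(pair_frequency(SENTENCES, "meet", "questions"))
```

In this toy data, "ask" and "answer" precede "questions" while "meet" and "encounter" never do, which is the kind of contrast the frequency page is meant to make visible. A real corpus query would also need lemmatization and part-of-speech filtering, which this sketch leaves out.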
Step 3: Extension task
- First, students search for "tell" and "express" with the collocates function of COCA.
- Second, they check the frequency page of these two words to see whether "opinions" can be collocated with them.
- Third, they search for "opinion" with the collocates function of COCA to see which verbs commonly collocate with "opinion".

[Screenshots: Tell, Express, Opinion]
Step 4: Synonym learning
- Show students how to find synonyms of "question" that collocate with "encounter" and synonyms of "encounter" that collocate with "question".

[Screenshots: Question, Encounter]
Step 5: Extension task (synonyms)
- Students individually search for synonyms of "opinion" that collocate with "express" and synonyms of "express" that collocate with "opinion".

[Screenshots: Opinion, Express]
Step 6: Homework
Use COCA to check the collocations in the following sentences and to find a synonym(s) to replace them.
- Although modern technology can provide lots of information, it will also bring some disadvantages.
- I said some unpleasant comments about her work. (Say comments)
- Did you have some exercise today? (Have exercise)

Education from Brigham Young University Hawaii. Her research interests include corpus linguistics, data-driven learning (DDL), and English for academic purposes (EAP).

Qing Ma
The Education University of Hong Kong, Hong Kong, China maqing@eduhk.hk Qing Ma is an associate professor at the Department of Linguistics and Modern Language Studies, The Education University of Hong Kong. Her main research interests include second language vocabulary acquisition, corpus linguistics, computer assisted language learning (CALL) and mobile assisted language learning (MALL).

Jiahao Yan
The Education University of Hong Kong, Hong Kong, China yjiahao@eduhk.hk Jiahao Yan is a Research Associate and post-graduate student at the Education University of Hong Kong. He has taken part in a number of research projects and produced many insightful research outputs. His research interests include corpus linguistics, data-driven learning (DDL) and English for academic purposes (EAP).