AERA (American Educational Research Association), APA (American Psychological Association) & NCME (National Council on Measurement in Education). 2014. Standards for educational and psychological testing. Washington, DC: AERA.
Alderson, J. Charles. 2012. Principles and practice in language testing: Compliance or conflict? Presentation at TEA SIG Conference: Innsbruck. http://tea.iatefl.org/inns.html (accessed May 2017).
Alderson, J. Charles & Jayanti Banerjee. 2002. Language testing and assessment (part 2). Language Teaching 35(2). 79–113.
Alderson, J. Charles, Caroline Clapham & Dianne Wall. 1995. Language test construction and evaluation. Cambridge: Cambridge University Press.
ALTE/Council of Europe. 2011. Manual for language test development and examining: For use with the CEFR. http://www.coe.int/t/dg4/linguistic/ManualtLangageTest-Alte2011_EN.pdf (accessed January 2017).
Bachman, Lyle F. 1990. Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, Lyle F. 2004. Statistical analysis for language assessment. Cambridge: Cambridge University Press.
Bachman, Lyle F. 2005. Building and supporting a case for test use. Language Assessment Quarterly 2(1). 1–34.
Bachman, Lyle F. 2007. What is the construct? The dialectic of abilities and contexts in defining constructs in language assessment. In Janna Fox, Mari Wesche, Doreen Bayliss, Liying Cheng, Carolyn E. Turner & Christine Doe (eds.), Language testing reconsidered, 41–71. Ottawa: University of Ottawa Press.
Bond, Trevor G. & Christine M. Fox. 2015. Applying the Rasch model: Fundamental measurement in the human sciences, 3rd edn. New York: Routledge.
Buck, Gary. 2001. Assessing listening. Cambridge: Cambridge University Press.
Cohen, Jacob. 1988. Statistical power analysis for the behavioral sciences, 2nd edn. Hillsdale, NJ: Lawrence Erlbaum Associates.
Council of Europe. 2001. Common European framework of reference for languages: Learning, teaching, assessment. Cambridge: Cambridge University Press.
Council of Europe. 2008. Recommendation CM/Rec(2008)7 of the Committee of Ministers to member states on the use of the Council of Europe’s Common European framework of reference for languages (CEFR) and the promotion of plurilingualism. Strasbourg: Council of Europe. http://www.coe.int/t/dg4/linguistic/Conventions_EN.asp (accessed January 2017).
Council of Europe. 2009. Relating language examinations to the Common European framework of reference for languages: Learning, teaching, assessment (CEFR): A manual. Strasbourg: Council of Europe. http://www.coe.int/T/DG4/Linguistic/Manuel1_EN.asp (accessed January 2017).
Council of Europe. 2017. Common European framework of reference for languages: Learning, teaching, assessment. Companion volume with new descriptors. Strasbourg: Council of Europe.
Davies, Alan & Catherine Elder. 2005. Validity and validation in language testing. In Eli Hinkel (ed.), Handbook of research in second language teaching and learning, vol. 1, 795–813. Mahwah, NJ: Lawrence Erlbaum.
Field, John. 2008. Listening in the language classroom. Cambridge: Cambridge University Press.
Field, John. 2013. Cognitive validity. In Ardeshir Geranpayeh & Lynda Taylor (eds.), Examining listening: Research and practice in assessing second language listening. Cambridge: Cambridge University Press.
Green, Rita. 2013. Statistical analyses for language test developers. London: Palgrave Macmillan.
Green, Rita. 2017. Designing listening tests: A practical approach. London: Palgrave Macmillan.
Kane, Michael. 2012. Validating score interpretations and uses. Language Testing 29(1). 3–17.
Kane, Michael. 2013. Validating the interpretations and uses of test scores. Journal of Educational Measurement 50(1). 1–73.
Kecker, Gabriele & Thomas Eckes. 2010. Putting the manual to the test: The TestDaF–CEFR linking project. In Waldemar Martyniuk (ed.), Aligning tests with the CEFR: Reflections on using the Council of Europe’s draft manual, 50–79. Cambridge: Cambridge University Press.
Kolen, Michael J. & Robert L. Brennan. 2014. Test equating, scaling, and linking: Methods and practices, 3rd edn. New York: Springer-Verlag.
Vandergrift, Larry & Christine C. M. Goh. 2012. Teaching and learning second language listening: Metacognition in action. New York: Routledge.
Linacre, John Michael. 2017. Winsteps® Rasch measurement computer program user’s guide. Beaverton, Oregon: Winsteps.com (accessed January 2017).
McNamara, Timothy Francis. 1996. Measuring second language performance. Harlow: Addison Wesley Longman Ltd.
Messick, Samuel. 1989. Validity. In Robert L. Linn (ed.), Educational measurement, 3rd edn., 13–103. New York, NY: Macmillan.
North, Brian & Neil Jones. 2009. Further material on maintaining standards across languages, contexts and administrations by exploiting teacher judgment and IRT scaling. Strasbourg: Council of Europe.
Rasch, Georg. 1960. Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
Reckase, Mark D. 2010. NCME 2009 presidential address: What I think I know. Educational Measurement: Issues and Practice 29(3). 3–7.
Shackleton, Caroline. 2018. Linking the University of Granada CertAcles listening test to the CEFR. Revista de Educación 381. 37–65.
Sick, James. 2008. Rasch measurement in language education part 2: Measurement scales and invariance. Shiken: JALT Testing & Evaluation SIG Newsletter 12(2). 26–31.
Sick, James. 2010. Rasch measurement in language education part 5: Assumptions and requirements of Rasch measurement. Shiken: JALT Testing & Evaluation SIG Newsletter 14(2). 23–29.
Wright, Benjamin D. & Mark H. Stone. 1979. Best test design. Chicago: MESA.
Wu, Margaret & Ray Adams. 2007. Applying the Rasch model to psycho-social measurement: A practical approach. Melbourne: Educational Measurement Solutions.
Xi, Xiaoming. 2008. Methods of test validation. In Elana Shohamy (ed.), Language testing and assessment, volume 7 of encyclopedia of language and education, 177–196. New York: Springer.