The International Research Foundation for English Language Education

TESTING SECOND LANGUAGE SPEAKING SKILLS: SELECTED REFERENCES

(last updated 18 August 2012)

Akiyama, T. (2003). Assessing speaking: Issues in school-based assessment and the introduction of speaking tests into the Japanese senior high school entrance examinations. JALT Journal, 25(2), 117-141.

American Council on the Teaching of Foreign Languages (1986). ACTFL Proficiency Guidelines. New York: American Council on the Teaching of Foreign Languages.

Andrews, S., & Fullilove, J. (1994). Assessing spoken English in public examination-why and how? In J. Boyle & P. Falvey (Eds.), English Language Testing in Hong Kong (pp. 57-86). Hong Kong: Chinese University.

Arter, J. A. (1989). Assessing communicative competence in speaking and listening: A consumer’s guide. Portland, OR: Northwest Regional Educational Laboratory.

Bachman, L. F. (1988). Problems in examining the validity of the ACTFL oral proficiency interview. Studies in Second Language Acquisition, 10(2), 149-164.

Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.

Bachman, L. F., Lynch, B. K., & Mason, M. (1995). Investigating variability in tasks and rater judgments in a performance test of foreign language speaking. Language Testing, 12(2), 238-257.

Bachman, L. F., & Palmer, A. S. (1981a). A multitrait-multimethod investigation into the construct validity of six tests of speaking and reading. In A. S. Palmer, P. Groot, & G. Trosper (Eds.), The construct validation of tests of communicative competence (pp. 149-165). Washington, DC: TESOL.

Bachman, L. F., & Palmer, A. S. (1981b). The construct validation of the FSI Oral Interview. Language Learning, 31, 67-86.

Bachman, L. F., & Palmer, A. S. (1982). The construct validation of some components of communicative proficiency. TESOL Quarterly, 16(4), 449-465.

Bachman, L. F., & Savignon, S. J. (1986). The evaluation of communicative language proficiency: A critique of the ACTFL Oral Interview. Modern Language Journal, 70(4), 380-390.

Bailey, K. M. (1985). If I had known then what I know now: Performance testing of foreign teaching assistants. In P. C. Hauptman, R. LeBlanc, & M. Wesche (Eds.), Second language performance testing (pp. 153-180). Ottawa: University of Ottawa Press.

Barrett, S. (2001). The impact of training on rater variability. International Education Journal, 2(1), 49-58.

Bartz, W. H. (1979). Testing oral communication in the foreign language classroom. Arlington, VA: Center for Applied Linguistics.

Berwick, R., & Ross, S. (1996). Cross-cultural pragmatics in oral proficiency interview strategies. In M. Milanovic & N. Saville (Eds.), Studies in language testing 3: Performance testing, cognition, and assessment: Selected papers from the 15th language testing research colloquium (pp. 34-54). Cambridge, UK: Cambridge University Press.

Bernstein, J., De Jong, J. H. A. L., Pisoni, D., & Townshend, B. (2000). Two experiments on automatic scoring of spoken language proficiency. In P. Delcloque (Ed.), Proceedings of InSTIL2000: Integrating speech technology in learning (pp. 57-61). Dundee, Scotland, UK: University of Abertay Dundee.

Blanche, P. (1990). Using standardized achievement and oral proficiency tests for self-assessment purposes: The DLIFLC study. Language Testing, 7(2), 202-229.

Bridgeman, B., Powers, D., Stone, E., & Mollaun, P. (2011). TOEFL iBT speaking test scores as indicators of oral communicative language proficiency. Language Testing, 29(1), 91-108.

Briggs, S., & MacDonald, C. (1978). A practical approach to testing speaking and listening skills. English Teaching Forum, 16(3), 8-15.

Brindley, G. (1989). Assessing achievement in the learner-centered curriculum. Sydney: National Centre for English Language Teaching and Research, Macquarie University.

Brooks, L. (2009). Interacting in pairs in a test of oral proficiency: Co-constructing a better performance. Language Testing, 26(3), 341-366.

Brown, A. (1993). The role of test-taker feedback in the test development process: Test-takers’ reactions to a tape-mediated test of proficiency of spoken Japanese. Language Testing, 10(3), 277-303.

Brown, A. (1995). The effect of rater variables in the development of an occupation-specific language performance test. Language Testing, 12(1), 1-15.

Brown, A. (2003). Interviewer variation and the co-construction of speaking proficiency. Language Testing, 20(1), 1-25.

Brown, A. (2004). Discourse analysis and the oral interview: Competence or performance. In D. Boxer & A. D. Cohen (Eds.), Studying speaking to inform language learning (pp. 263-282). Clevedon, UK: Multilingual Matters.

Brown, A., & Hill, K. (1998). Interviewer style and candidate performance in the IELTS oral interview. IELTS Research Reports, 1, 1-19.

Brown, D. (1983). Conversational cloze tests and conversational ability. ELT Journal, 37(2), 158-161.

Brown, H. D. (2004). Language assessment: Principles and classroom practices. London: Longman.

Button, G. (1992). Answers as interactional products: Two sequential practices used in interviews. In P. Drew & J. Heritage (Eds.), Talk at work: Interaction in institutional settings (pp. 212-231). Cambridge, UK: Cambridge University Press.

Byrnes, H. (1987). Features of pragmatic and sociolinguistic competence in the oral proficiency interview. In A. Valdman (Ed.), Proceedings of the symposium on the evaluation of foreign language proficiency (pp. 167-177). Bloomington, IN: Indiana University.

Callaway, D. R. (1980). Accent and the evaluation of ESL oral proficiency. In J. W. Oller & K. Perkins (Eds.), Research in language testing (pp. 102-115). Rowley, MA: Newbury House.

Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1-47.

Carpenter, K., Fujii, N., & Kataoka, H. (1995). An oral interview procedure for assessing second language abilities in children. Language Testing, 12(2), 157-181.

Center for Applied Linguistics (2000). BEST evolves to meet new needs. CAL Reporter, 14(1), 1, 5.

Chalhoub-Deville, M. (1995). A contextualized approach to describing oral language proficiency. Language Learning, 45(2), 251-281.

Chalhoub-Deville, M. (1995). Deriving oral assessment scales across different tests and rater groups. Language Testing, 12(1), 16-33.

Chalhoub-Deville, M., & Fulcher, G. (2003). The oral proficiency interview and the ACTFL guidelines: A research agenda. Foreign Language Annals, 36(4), 498-506.

Chalhoub-Deville, M., & Wigglesworth, G. (2005). Rater judgment and English language speaking proficiency. World Englishes, 24(3), 383-391.

Chambers, L., & Ingham, K. (2011). The BULATS Online Speaking Test. Cambridge ESOL Research Notes, 43, 21-25.

Chaudhary, S. (1997). Testing spoken English as a second language [Electronic version]. Forum, 35(2), 2.

Clankie, S. (1995). The SPEAK test of oral proficiency: A case study of incoming freshmen. In J. D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 119-125). Tokyo: JALT Applied Materials.

Clark, J. L. D. (Ed.). (1978). Direct testing of speaking proficiency. Princeton, NJ: Educational Testing Service.

Clark, J. L. D. (1979). Direct and semi-direct tests of speaking ability. In E. J. Briere & F. B. Hinofotis (Eds.), Concepts in language testing (pp. 35-49). Washington, DC: TESOL.

Clark, J. L. D. (Ed.). (1983). Language testing: Past and current status -- directions for the future. Modern Language Journal, 67(4), 431-443.

Clark, J. L. D. (1986). Development of a tape-mediated ACTFL/ILR scale-based test of Chinese speaking proficiency. In C. W. Stansfield (Ed.), Technology and language testing (pp. 129-146). Washington, DC: TESOL.

Clark, J. L. D. (1987). A study of the comparability of speaking proficiency interview ratings across three government language training agencies. In K. M. Bailey, T. L. Dale, & R. T. Clifford (Eds.), Language testing research: Selected papers from the 1986 Colloquium (pp. 132-179). Monterey, CA: Defense Language Institute.

Clark, J. L. D. (1988). Validation of a tape-mediated ACTFL/ILR-scale based test of Chinese speaking proficiency. Language Testing, 5(2), 187-205.

Clark, J. L. D., & Clifford, R. T. (1988). The FSI/ILR/ACTFL proficiency scales and testing techniques: Development, current status, and needed research. Studies in Second Language Acquisition, 10(2), 121-147.

Clark, J. L. D., Clifford, R. T., & Hooshmand, D. (1992). “Screen-to-screen” testing: An exploratory study of oral proficiency interviewing using video teleconferencing. System, 20(3), 293-304.

Cogan, D. (1998). Oral English testing in a Japanese university. In J. C. Richards (Ed.), Teaching in action: Case studies from second language classrooms (pp. 334-339). Alexandria, VA: TESOL.

Congdon, P., & McQueen, J. (2000). The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 37(2), 163-178.

Coniam, D. (1995). Towards a common ability scale for Hong Kong English secondary-school forms. Language Testing, 12(2), 182-193.

Courtney, M. (1996). Talking to learn: Selecting and using peer group oral tasks. ELT Journal, 50(4), 318-326.

Dandonoli, P., & Henning, G. (1990). An investigation of the construct validity of the ACTFL oral proficiency guidelines and oral interview procedure. Foreign Language Annals, 23(1), 11-22.

Davies, A. (1985). Communicative language testing. Applied Linguistics, 1, 22-33.

Day, E. M., & Shapson, S. (1987). Assessment of oral communicative skills in early French immersion programmes. Journal of Multilingual and Multicultural Development, 8(3), 237-260.

De Saint Leger, D. (2009). Self-assessment of speaking skills and participation in a foreign language class. Foreign Language Annals, 42(1), 158-178.

Douglas, D. (1994). Quantity and quality in speaking test performance. Language Testing, 11(2), 125-143.

Douglas, D. (2000). Assessing languages for specific purposes. Cambridge: Cambridge University Press.

Douglas, D. (2004). Discourse domains: The cognitive context of speaking. In D. Boxer, & A. Cohen (Eds.), Studying speaking to inform second language learning (pp. 25-47). Clevedon, UK: Multilingual Matters.

Douglas, D., & Selinker, L. (1992). Analyzing oral proficiency test performance in general and specific purpose contexts. System, 20(3), 317-328.

Duffy, C. (2007). An examination of test task characteristics and their effect on oral language test performance. In C. Irvine-Niakaris & A. Nebel (Eds.), 2nd language testing & evaluation forum, Teaching and testing: Opportunities for learning (pp. 20-43). Athens, Greece: Hellenic American Union.

Egbert, M. M. (1998). Miscommunication in language proficiency interviews of first-year German students: A comparison with natural conversation. In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency (pp. 149-172). Philadelphia, PA: John Benjamins.

Edwards, A. L. (1996). Reading proficiency assessment and the ILR/ACTFL text typology: A reevaluation. Modern Language Journal, 80(3), 350-361.

Elder, C., Iwashita, N., & McNamara, T. (2002). Estimating the difficulty of oral proficiency tasks: What does the test-taker have to offer? Language Testing, 19(4), 347-368.

Fall, T., Adair-Hauck, B., & Glisan, E. (2007). Assessing students’ oral proficiency: A case for online testing. Foreign Language Annals, 40(3), 377-406.

Firth, J. D. (Ed.). (1980). Measuring spoken language proficiency. Washington, DC: Georgetown University Press.

Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons. Applied Linguistics, 21(3), 354-375.

Fulcher, G. (1987). Tests of oral performance: The need for data-based criteria. ELT Journal, 41(4), 287-291.

Fulcher, G. (1996). Does thick description lead to smart tests? A data-based approach to rating scale construction. Language Testing, 13(2), 208-238.

Fulcher, G. (1996). Testing tasks: Issues in task design and the group oral. Language Testing, 13(1), 23-51.

Fulcher, G. (1996). Invalidating validity claims for the ACTFL oral rating scale. System, 24(2), 163-172.

Fulcher, G. (1998). Testing speaking. In C. Clapham (Ed.), Language testing and assessment (pp. 75-86). Encyclopedia of Language and Education. Amsterdam, The Netherlands: Kluwer Academic Publishers.

Fulcher, G., Davidson, F., & Kemp, J. (2011). Effective rating scale development for speaking tests: Performance decision trees. Language Testing, 28(1), 5-29.

Fulcher, G., & Marquez Reiter, R. (2003). Task difficulty in speaking tests. Language Testing, 20(3), 321-344.

Galaczi, E. D. (2010). Peer-peer interaction in a paired speaking test: The case of FCE. Cambridge ESOL Research Notes, 42, 22.

Gan, Z. (2010). Interaction in group oral assessment: A case study of higher and lower scoring students. Language Testing, 27(4), 585-602.

Grove, E., & Brown, A. (2001). Tasks and criteria in a test of oral communication skills for first-year health science students. Melbourne Papers in Language Testing, 10, 37-47.

Hadden, B. (1991). Teacher and non-teacher perceptions of second-language communication. Language Learning, 41(1), 1-24.

Haggstrom, M. (1994). Using a video camera and task-based activities to make classroom oral testing a more realistic communicative experience. Foreign Language Annals, 27(2), 161-175.

Halleck, G. (1992). The oral proficiency interview: Discrete point test or measure of communicative language ability? Foreign Language Annals, 25(3), 227-231.

Harlow, L. L., & Caminero, R. (1990). Oral testing of beginning language students at large universities: Is it worth the trouble? Foreign Language Annals, 23(6), 489-501.

He, A. W. (1998). Answering questions in LPIs: A case study. In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency (pp. 101-116). Philadelphia, PA: John Benjamins.

He, A. W., & Young, R. (1998). Language proficiency interviews: A discourse approach. In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency (pp. 1-24). Amsterdam, The Netherlands: John Benjamins.

Hendricks, D., Scholz, G., Spurling, R., Johnson, M., & Vandenburg, L. (1980). Oral proficiency testing in an intensive English program. In J. W. Oller & K. Perkins (Eds.), Research in language testing (pp. 77-90). Rowley, MA: Newbury House.

Henning, G. (1983). Oral proficiency testing: Comparative validities of interview, imitation, and completion methods. Language Learning, 33(3), 315-331.

Henning, G. (1992). The ACTFL Oral Proficiency Interview: Validity evidence. System, 20(3), 365-372.

Hill, K. (1998). The effect of test-taker characteristics on reactions to and performance on an oral English proficiency test. In A. J. Kunnan (Ed.), Validation in language assessment (pp. 209-229). Mahwah, NJ: Lawrence Erlbaum.

Hingle, I., & Linington, V. (1997). English proficiency test: The oral component of a primary school [Electronic version]. Forum, 35(2), 26.

Hinofotis, F. B., Bailey, K. M., & Stern, S. L. (1981). Assessing the oral proficiency of prospective foreign teaching assistants: Instrument development. In A. S. Palmer, P. J. M. Groot, & G. A. Trosper (Eds.), The construct validation of tests of communicative competence (pp. 106-126). Washington, DC: TESOL.

Hoekje, B., & Linnell, K. (1994). "Authenticity" in language testing: Evaluating spoken language tests for international teaching assistants. TESOL Quarterly, 28(1), 103-126.

Hughes, A. (1981). Conversational cloze as a measure of oral ability. ELT Journal, 35(2), 161-168.

Hughes, A., Cooper, R. L., Nevo, D., Stevenson, D. K., & Wesche, M. B. (1986). Panel discussion: The next 25 years? Language Testing, 3(2), 237-245.

Inoi, S. (1995). The validity of written pronunciation questions: Focus on phoneme discrimination. In J. D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 179-186). Tokyo: JALT Applied Materials.

Isaacs, T. (2008). Towards defining a valid assessment criterion of pronunciation proficiency in non-native English-speaking graduate students. The Canadian Modern Language Review, 64(4), 555-580.

Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (2008). Assessed levels of second language speaking proficiency: How distinct? Applied Linguistics, 29(1), 24-49.

Iwashita, N., & Grove, E. (2003). A comparison of analytic and holistic scales in the context of a specific purpose speaking test. Prospect, 18(3), 25-35.

James, R. (1996). CALL and the speaking skill. System, 24(1), 15-21.

Jendi, A. (2005). Approaches to assessing English oral communication in UAE high schools. In P. Davidson, C. Coombe, & W. Jones (Eds.), Assessment in the Arab world (pp. 173-190). Dubai: TESOL Arabia.

Jieke, G., & Yan, J. (2005). Development and preliminary validation of the CET semi-direct oral proficiency test. In A. McNeill & J. Lai (Eds.), Crosslinks in English language teaching (pp. 19-43). Hong Kong: English Language Teaching Unit, Chinese University of Hong Kong.

Johnson, M. (2001). The art of non-conversation: A reexamination of the validity of the oral proficiency interview. New Haven, CT: Yale University Press.

Johnson, M., & Tyler, A. (1998). Re-analyzing the OPI: How much does it look like natural conversation? In R. Young & A. W. He (Eds.), Talking and testing: Discourse approaches to the assessment of oral proficiency (pp. 27-51). Philadelphia, PA: John Benjamins.

Jones, R. L. (1978). Interview techniques and scoring criteria at the higher proficiency levels. In J. L. D. Clark (Ed.), Direct testing of speaking proficiency: Theory and application (pp. 89-102). Princeton, NJ: Educational Testing Service.

Jonz, J. (1990). Another turn in the conversation: What does cloze measure? TESOL Quarterly, 24(1), 61-83.

Ke, C., & Reed, D. J. (1995). An analysis of results from the ACTFL Oral Proficiency Interview and the Chinese Proficiency Test before and after intensive instruction in Chinese as a foreign language. Foreign Language Annals, 28(2), 208-222.

Kim, M. (2001). Detecting DIF across the different language groups in a speaking test. Language Testing, 18(1), 89-114.

Kim, Y-H. (2009a). An investigation into native and non-native teachers’ judgment on oral English performance: A mixed methods approach. Language Testing, 26, 187-217.

Kim, Y-H. (2009b). Exploring rater and task variability in second language oral performance assessment. In A. Brown, & K. Hill (Eds.), Language testing and evaluation, Volume 13: Tasks and criteria in performance assessment (pp. 91-109). Frankfurt, Germany: Peter Lang.

Kormos, J. (1999). Simulating conversations in oral-proficiency assessment: A conversation analysis of role plays and non-scripted interviews in language exams. Language Testing, 16(2), 163-188.

Kormos, J., & Denes, M. (2004). Exploring measures and perceptions of fluency in the speech of second language learners. System, 32(2), 145-164.

Lantolf, J. P., & Frawley, W. (1985). Oral-proficiency testing: A critical analysis. Modern Language Journal, 69(4), 337-345.

Lantolf, J. P., & Frawley, W. (1988). Proficiency: Understanding the construct. Studies in Second Language Acquisition, 10(2), 181-195.

Lazaraton, A. (1992). The structural organization of a language interview: A conversation analytic approach. System, 20(3), 373-386.

Lazaraton, A. (1996). Interlocutor support in oral proficiency interviews: The case of CASE. Language Testing, 13(2), 151-172.

Lazaraton, A. (1997). Preference organization in oral proficiency interviews: The case of language ability assessments. Research on Language and Social Interaction, 30, 53-72.

Lazaraton, A., & Riggenbach, H. (1990). Oral skills testing: A rhetorical task approach. Issues in Applied Linguistics, 1(2), 196-217.

Lazaraton, A., & Wagner, S. (1996). The revised test of spoken English (TSE): Analysis of native speaker and nonnative speaker data. TOEFL Monograph Series MS-7. Princeton, NJ: Educational Testing Service.

Lee, Y. (2006). Dependability of scores for a new ESL speaking assessment consisting of integrated and independent tasks. Language Testing, 23(2), 131-166.

Leung, C., & Mohan, B. (2004). Teacher formative assessment and talk in classroom contexts – assessment as discourse and assessment of discourse. Language Testing, 21(3), 335-359.

Lim, G., & Galaczi, E. (2010). Lexis in the assessment of speaking and writing: An illustration from Cambridge ESOL's General English tests. Cambridge ESOL Research Notes, 41, 14-19.

Lindblad, T. (1992). Oral tests in Swedish schools: A five-year experiment. System, 20(3), 279-292.

Linder, C. (1977). Oral communication testing: A handbook for the foreign language teacher. Skokie, IL: National Textbook Company.

Liski, E., & Puntanen, S. (1983). A study of the statistical foundations of group conversation tests in spoken English. Language Learning, 33(2), 225-246.

Lombardo, L. (1984). Oral testing: Getting a sample of real language. English Teaching Forum, January, 2-6.

Lumley, T. (1998). Perceptions of language-trained raters and occupational experts in a test of occupational English language proficiency. English for Specific Purposes, 17(4), 347-367.

Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54-71.

Lumley, T., & O’Sullivan, B. (2005). The effect of test-taker gender, audience and topic on task performance in tape-mediated assessment of speaking. Language Testing, 22(4), 415-437.

Luoma, S. (2004). Assessing speaking. Cambridge: Cambridge University Press.

Lynch, B. K., & McNamara, T. F. (1998). Using G-theory and Many-facet Rasch measurement in the development of performance assessments of the ESL speaking skills of immigrants. Language Testing, 15(2), 158-180.

Madsen, H. S., & Jones, R. L. (1981). Classification of oral proficiency tests. In A. S. Palmer, P. J. M. Groot & G. A. Trosper (Eds.), The construct validation of tests of communicative competence (pp. 15-30). Washington, DC: TESOL.

Magnan, S. S. (1988). Grammar and the ACTFL oral proficiency interview: Discussion and data. Modern Language Journal, 72, 266-276.

Major, R. C. (1987). Measuring pronunciation accuracy using computerized techniques. Language Testing, 4(2), 155-169.

Malvern, D., & Richards, B. (2002). Investigating accommodation in language proficiency interviews using a new measure of lexical diversity. Language Testing, 19(1), 85-104.

Manley, J. H. (1995). Assessing oral language: One school district’s response. Foreign Language Annals, 28(1), 93-102.

Matthews, M. (1990). The measurement of productive skills: Doubts concerning the assessment criteria of certain public examinations. ELT Journal, 44(2), 117-121.

May, L. (2011). Interaction in a paired speaking test. New York, NY: Peter Lang.

McClean, J. (1995). Cooperative assessment: Negotiating a spoken-English grading scheme with Japanese university students. In J. D. Brown & S. Yamashita (Eds.), Language testing in Japan (pp. 136-148). Tokyo: JALT Applied Materials.

McNamara, T. (1996). Measuring second language performance. London: Longman.

McNamara, T. F. (1997). ‘Interaction’ in second language performance assessment: Whose performance? Applied Linguistics, 18, 446-466.

McNamara, T., Hill, K., & May, L. (2002). Discourse approaches to oral assessment. Annual Review of Applied Linguistics, 22, 243-262.