ITBE Workshop, April 20, 2013

Assessing Listening Comprehension: Test Format Decisions

Brian Hampson, Purdue University Calumet

Heather Torrie, Purdue University Calumet

Test Development Stages (adapted from Hughes, 2003)

  1. Stating the problem
     - What is the purpose?
     - What abilities are to be tested?
  2. Content
     - What tasks should students perform?
     - What texts should be used?
     - What is the overall format (number of passages, number of listenings, etc.)?
     - What items should be included?
  3. Pilot items, using both native and non-native speakers

Validity: Does the test measure what it is supposed to measure? (Hughes, 2003)

 Sometimes format can affect validity

Overview of Various Item Formats

Buck (1991): “…successful listening comprehension involves an interaction between linguistic skills, knowledge of the context, background knowledge and inferencing skills. Thus, listening test items, even those written to test one particular skill, turn out on examination to be testing a number of different skills.”

Format: Multiple-choice, True/False, Matching, Selection
Research notes:
- In’nami & Koizumi (2009): meta-analysis shows MC is easier than SA
- Yi’an (1998): people often choose the right answer for the wrong reasons
Pros:
- Reliability
- Less stressful for testees
- Takes less time for testees
Cons:
- Encourages bottom-up processing
- Cognitive load: difficult for listeners to hold four options in their mind while listening
- Guessing
- Choose the right answer for the wrong reason
- Difficult to write good distracters
- Cheating
Other skills being tested:
- Reading ability
- Only word recognition (rather than true comprehension)

Format: Short Answer
Research notes:
- In’nami & Koizumi (2009): meta-analysis shows SA is more difficult than MC
- Buck (1991): supports validity, but addresses concerns with reliability
Pros:
- Less guessing
- Easier to write items
- More authentic
- More top-down, especially for main idea questions
Cons:
- Reliability issues (scoring difficult, especially for inference items)
- Takes time for them to answer
Other skills being tested:
- Writing ability
- Reading (understanding the question and determining which information to write down)

Format: Table/Outline/Chart Completion
Research notes:
- Song (2011): filling out a table is easier than blank notes
- Brindley & Slatyer (2002): found that a chart/table structure was easier than SA/blank notes, and easier than cloze
Pros:
- Constrains the contents of test takers’ notes to a given framework
- Authentic
- Emphasizes top-down processing
Cons:
- Scoring difficult; reliability issues
Other skills being tested:
- Writing ability

Format: Recall
Research notes:
- Used in research (e.g., Jung 2003; Sherman 1997); not so much in the classroom
Pros:
- Authentic
- Good measure of intake
Cons:
- More difficult to score (compile a list of key information units)
Other skills being tested:
- Writing and speaking ability
- Memory
- Note-taking ability

Format: Cloze / Dictation / Partial Dictation
Pros:
- Reliability
Cons:
- Emphasizes bottom-up listening
- Less authentic
Other skills being tested:
- Writing ability

Test Delivery Format Options:

Number of listenings (with rationale):

One time
- Authentic

Two times through
- Affective value for students
- Reflects the way listening is taught in the classroom

More than twice / student-controlled
- Also can be authentic (listening to recorded lectures, online material, conversation)

Question preview* (with rationale):

Before
- Helps students focus on particular information

Sandwiched
- Promotes top-down processing during the first listening, and bottom-up processing during the second listening

Afterwards
- Further promotes top-down processing

*Sherman (1997) and Buck (1991) suggest that question preview has more of an affective benefit than an actual performance benefit: examinees thought it helped them more than it actually did.

References

Brindley, G., & Slatyer, H. (2002). Exploring task difficulty in ESL listening assessment. Language Testing, 19(4).

Uses charts and sentence completion. One listening only.

Buck, G. (1991). The testing of listening comprehension: An introspective study. Language Testing, 8(1).

Using verbal reports, this study looked at the test-taking process of answering short-answer questions based on listening to segments of a narrative. While it suggests that short-answer items are strong in validity, the study reveals reliability concerns arising from the variety of answers test-takers give.

Cross, J. (2009). Effects of listening strategy instruction on news videotext comprehension. Language Teaching Research, 13(2), 151-176.

One of the ways listening comprehension was measured in this study was through written recalls, in which participants wrote down everything they remembered from the videotext.

Hughes, A. (2003). Testing for Language Teachers. Cambridge University Press.

This book is a great overview of test development for all language skills.

In’nami, Y., & Koizumi, R. (2009). A meta-analysis of test format effects on reading and listening test performance: Focus on multiple-choice and open-ended formats. Language Testing, 26(2).

After reviewing about 56 studies, the authors found that multiple-choice items were indeed easier than open-ended formats.

Jung, E. H. (2003). The role of discourse signaling cues in second language listening comprehension. The Modern Language Journal, 87(4), 562-577.

Learners performed a written recall task, after listening to a lecture. Assessment was based on how many key “information units” learners included in the recall.

Sherman, J. (1997). The effect of question preview in listening comprehension tests. Language Testing, 14, 185-213.

Very interesting study design! Lots of great citations too. Seems that question preview often makes students feel better, but doesn’t necessarily help them. Could interfere with processing.

Song, M. (2011). Note-taking quality and performance on an L2 academic listening test. Language Testing, 29(1).

Studies how effective note-taking is as a listening measure when test takers use a partially filled outline. “it would seem that notes taken in the outline format in particular, because it constrains the contents of test takers’ notes to a given framework, might have more potential as a listening measure than notes in the blank format.”

Yi’an, W. (1998). What do tests of listening comprehension test? - A retrospection study of EFL test-takers performing a multiple-choice task. Language Testing.

A qualitative study of 6 learners and why they chose the answers they did on a multiple-choice task. It showed that many chose the correct answer, but for the wrong reason.