BEST PLUS: LISTENING AND SPEAKING ASSESSMENT

BEST Plus:

Listening and Speaking Assessment

Guadalupe Lopez

EDUC 8540: Language Assessment

Dr. Kathleen Bailey

October 2, 2013

Introduction

For the last few semesters, I have had the privilege of teaching ESL to adults at the Peace Resource Center (PRC) in Seaside, California. My students are almost entirely Spanish-speaking, and the majority are immigrants from Mexico. Many have had interrupted and/or limited educational opportunities in Mexico; therefore, most of them lack basic literacy skills in Spanish. This is not the case for all the students. Over the summer I taught and mentored Ramón, a brilliant poet in Spanish and a faithful student of English. With his don de la palabra, I told him, he should be teaching Spanish, not cleaning hotel rooms. I am proud to say that Ramón is now attending GED preparation classes, with the goal of eventually earning a bilingual teaching credential. Students like Ramón keep me coming back for more.

Teachers at the PRC have consistently expressed frustration over teaching adult ESL learners who lack native language literacy skills. This situation, though perhaps surprising, is very common. A report by the ETS Policy Information Center (2008) revealed that, among the learners in Adult Basic Education (ABE) programs, ESL participants had the lowest literacy skills, both in the native language and in the target language. For this reason, they pose very special challenges, in terms of both teaching and assessing. The BEST Plus standardized assessment, created by the Center for Applied Linguistics (CAL), seems to have been created with my PRC students in mind. CAL also offers the BEST Literacy, which measures reading and writing. But my students are stronger in speaking and listening, and I want them to feel successful. That is why I have chosen to review the BEST Plus Listening and Speaking Assessment.

As I began my research on the BEST Plus, I found it a challenge to locate concrete documentation that would allow me to conduct a thorough review. The CAL website provides an overview of the test, with illegible thumbnail-size pages from an examinee test booklet and corresponding picture cue book. Those two tiny screen grabs gave me an idea of what the paper-and-pencil version might look like. The examinee test booklet, with its fill-in-the-bubble multiple choice format, would surely intimidate my students. I considered reviewing the CASAS test instead, since information about that test is more readily available. In fact, I immediately found “Life and Work Listening Sample Items,” available for download, so that teachers may familiarize students with the test and reduce test anxiety (CASAS, 2012). But when I saw the answer sheets that were involved, I knew the CASAS test would not work with my students either. Most of them have no experience as test takers, or as students, for that matter.

I once administered what I believed was a carefully thought out survey with straightforward questions about learner background, such as level of education and country of origin. I read the questions aloud—twice—in Spanish for the students, and I had them follow along with their finger. Then I asked them to fill in the bubbles to indicate their answer. To my dismay, when I analyzed the results, I found that some of them had filled in all three answer choices. Others had not marked any choices. Others had placed an X in random circles, clearly not understanding the connection between the question and the answer choices, even though I had read everything aloud. The structure of the BEST Plus Computer-Adaptive version, which strictly measures listening and speaking, is suited to the special needs of my students at the PRC. My review includes the history and description of the BEST Plus, how it adheres to Wesche’s (1983) framework, and how it adheres to Swain’s (1984) framework.

History and Description of the BEST Plus

The BEST Plus, launched in 2003, is approved by the National Reporting System for Adult Education (NRS), a federally funded program developed by the U.S. Department of Education’s Division of Adult Education and Literacy. The NRS provides accountability for ESL and adult education programs (CAL, 2013a).

The BEST Plus is an example of a criterion-referenced test (CRT) that measures overall English proficiency (Brown, 2005). This type of test would be useful if the PRC teachers were to one day formalize the manner in which we divide students into beginner and advanced levels. This summer, I taught the advanced class, and my co-teacher taught the beginners in another room. We selected the students simply by intuition, and since there were only two teachers, we offered only two levels. The BEST Plus, which takes only 3 to 20 minutes to administer (CAL, 2013a), would also serve as a placement test (Brown, 2005, p. 9). In other words, it would tell us whether to place students in the beginner session or the advanced session.

Wesche’s Framework

The BEST Plus may be described in further detail by following Wesche’s (1983) framework, which consists of four components: stimulus material, task posed to the learner, learner’s response, and scoring criteria.

Stimulus Material

Bailey (1998) defines stimulus material as “whatever linguistic or non-linguistic information is presented to the learners in a test to get them to demonstrate the skills or knowledge we wish to assess” (p. 13). As outlined in Figure 1, the stimulus material on the BEST Plus includes scripted questions read aloud by the interviewer, with corresponding visuals. The questions relate to daily life in the United States, presenting situations that are as authentic as possible and relevant to the student. This sort of testing falls into what Brown (2005) terms the communicative movement (p. 21). In the Computer-Adaptive Version, the tester shows the learner a picture on a screen and scores the response immediately.

Item Type / Sample Question
Photo Description / Tell me about this picture.
Entry Item / I usually get the news from the television. How about you?
Yes/No / Do you like to watch the news on TV?
Choice Question / Do you like getting the news in English or in Spanish?
Personal Expansion / Where else do you get the news?
General Expansion / Do you think it’s important to keep up with the news? Why? Why not?
Elaboration / Some people think that news reports in the United States are unreliable and show only one side of an issue. Others think that the news is accurate and unbiased. What do you think about news reports in the United States?

Figure 1. The seven item types of the BEST Plus and sample questions

Task Posed to the Learner and Learner Response

As Bailey (1998) points out, the task posed to the learner goes hand in hand with the learner response. The task posed to the learner is to understand the tester’s question and formulate an appropriate oral response to demonstrate that understanding. This task requires that the learner respond creatively to unpredictable input of the kind encountered in real life (Brown, 2005). The learner is presented with relatively easy items first, such as the entry item and the photo description. Advanced learners quickly move to the open-ended questions (Van Moere, 2009).

Scoring Criteria

BEST Plus scores are benchmarked in Student Performance Levels (SPLs), which range from 0 (no ability whatsoever) to 10 (native speaker ability) (Grognet, 1997). These rankings are based on a student’s “relative success in getting his meaning across” (Brown, 2005). The SPLs are presented in a detailed scoring rubric, with each level having three categories: general language ability, listening comprehension, and oral communication (CAL, 2008). These pre-set levels are an example of criterion-referenced score interpretation, as explained by Brown (2005). He refers to a criterion level against which each student’s performance is judged (p. 3). For example, the criterion for SPL 2 oral communication is “Expresses a limited number of immediate survival needs using very simple learned phrases” (Grognet, 1997). If a test taker answers a question using very simple learned phrases, then that person will probably receive a score of SPL 2. No reference is made to how the test taker scored relative to other test takers.

Another important factor to consider when discussing scoring is reliability. How consistently does the rating system measure students’ performance? Although I still feel unprepared to interpret reliability data with absolute confidence, I am assured that the BEST Plus does follow best practices. Here is a quote from a discussion forum hosted by the Literacy Information and Communication System (2006), in which a representative from CAL responds to a query about reliability from a BEST Plus administrator:

Any language assessment that requires test administrators to rate a language sample (rather than simply scoring a multiple choice test, for example) must have standardized administration and scoring procedures to ensure reliability. These procedures are accompanied by benchmark samples that correspond to scores on a rating scale, or a scoring rubric. (Literacy Information and Communication System, 2006)

The CAL representative then goes on to explain the rigorous process a test administrator must undergo to become certified. In addition to the time investment, becoming a certified BEST Plus administrator requires a significant financial investment (see Appendices A and B). CAL offers several training options, most of them requiring a minimum of 5 people registered for the training workshop at a sponsoring institution (CAL, 2013b). While I am volunteering at the PRC and studying at the Monterey Institute, I plan to use what I have learned from this assessment review, along with what I have learned in my Language Assessment class, to create an assessment similar to the BEST Plus.

I have previously provided detailed information on how the BEST Plus follows Wesche’s Framework. Figure 2 provides a summary of that information.

Wesche’s Framework / BEST Plus
Stimulus Material / Scripted questions read aloud, plus corresponding visuals
Task Posed to the Learner / Demonstrate listening comprehension
Learner’s Response / Provide oral response to stimulus material
Scoring Criteria / Scoring rubric with Student Performance Levels (SPLs) from 0 to 10

Figure 2. Wesche’s Framework and BEST Plus

Swain’s Framework

I have presented Wesche’s (1983) framework as a useful tool in analyzing the BEST Plus. Another useful tool for this purpose is Swain’s (1984) framework. The four principles of this framework are 1) start from somewhere, 2) concentrate on content, 3) bias for best, and 4) work for washback.

Start From Somewhere

This principle refers to the idea that a test should be theoretically grounded. The mere fact that the BEST Plus was created by the Center for Applied Linguistics, by experts in our field, assures me that the test is theoretically grounded. More specifically, however, the SPL descriptors are firmly grounded in Canale’s (1983) components of communicative competence, as summarized in Figure 3.

Grammatical Competence
∙ phonology
∙ vocabulary
∙ word formation
∙ sentence formation

Sociolinguistic Competence
∙ social meanings
∙ grammatical forms in different sociolinguistic contexts

Discourse Competence
∙ cohesion
∙ coherence

Strategic Competence
∙ grammatical difficulties
∙ sociolinguistic difficulties
∙ discourse difficulties
∙ performance factors

Figure 3. Components of communicative competence. Adapted from Canale (1983).

Concentrate on Content

This second principle in Swain’s framework implies that content should take into account the learner’s proficiency level, interests, and goals. The BEST Plus starts out with very simple prompts and increases in difficulty, taking into account the learner’s proficiency. The visuals and questions in the sample seem to take into account the learner’s general interests and goals, which align with the goals of my own students: to help their children in school, to get along in the community, to get a better job, or to obtain their high school diploma.

Bias for Best

A test that is biased for best elicits a learner’s best performance, just as a good teacher would. The BEST Plus, along with a sympathetic interviewer, is capable of eliciting optimal output. While watching a CAL (2013d) video of a sample session between an interviewer and a test taker, I noticed ways in which the interviewer facilitated the conversation. She nodded, smiled, and made eye contact, all the while inputting data into the computer. She showed her involvement in the conversation with comments such as “I see” or “Uh huh.” These conversation strategies were no doubt part of the rigorous test administrator training, in which interviewers are taught how to make a test taker feel at ease and perform optimally. Furthermore, the CAL website offers recommendations for helping adult learners succeed, such as conducting student needs assessments prior to administering the actual test (CAL, 2013c).

Work for Washback

Washback refers to the effect that a test has on teaching and learning. I hadn’t considered that a test would have an effect on me as a teacher, but in the course of researching the BEST Plus, I have indeed experienced positive washback. After studying the SPL scoring rubric, for example, I mentally ranked each of my students accordingly. Ramón, my student studying for his GED, would receive an SPL 7 from me. Perhaps after he finishes his GED preparation classes, he will receive a higher score. It is interesting to note that the SPL 7 benchmark states “Communicates on the phone on familiar subjects” (Grognet, 1997). I wonder how the interviewer could ascertain that the test taker can communicate on the phone.

At the opposite end of the SPL scale, there is Josefa, the quiet young lady who just shakes her head when I ask her something. I would give her a rating of SPL 1: “functions minimally, if at all, in English” (Grognet, 1997).

I now have my own view of what SPL rank I would give my students, but it would be interesting to administer the BEST Plus, or a similar test that I develop, to see where they actually fall. I would use Spearman’s rho to determine how closely my intuitive rankings correlate with the test results. Figure 4 summarizes how the BEST Plus adheres to Swain’s framework.

Swain’s Framework / BEST Plus
Start From Somewhere / Principles of communicative competence
Concentrate on Content / Everyday life in United States; learner goals and interests
Bias for Best / BEST Plus test administrator uses conversation strategies to make learner feel at ease and perform optimally.
Work for Washback / Teacher washback: awareness of SPL rankings affects how current students are viewed.

Figure 4. Swain’s Framework and BEST Plus
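The comparison mentioned above, between a teacher’s intuitive SPL ratings and the levels a test assigns, could be computed roughly as follows. This is only a minimal sketch in Python, not a procedure taken from CAL or from any of the sources cited here. The function names and the ratings are hypothetical, and the ratings are invented purely for illustration. Because several students may share the same SPL, tied ratings receive the average of the ranks they occupy, and rho is calculated as the Pearson correlation of the two sets of ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 (lowest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of rank positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two ratings")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when every rating in a list is the same")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical SPL ratings for five students (invented for illustration only).
teacher_spl = [7, 1, 4, 3, 5]  # levels assigned by teacher intuition
test_spl = [5, 2, 4, 2, 6]     # levels assigned by a test
rho = spearman_rho(teacher_spl, test_spl)
print(round(rho, 2))
```

A value of rho near 1 would suggest that intuition and the test order the students in much the same way, while a value near 0 would suggest little agreement. With a class as small as mine, any such figure would be descriptive at best.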

Conclusion

As a volunteer at the PRC, I will probably never have the opportunity to administer the actual BEST Plus to my students, given the expense of the test, and other limitations posed by the PRC. However, I hope that my rewarding experience with this group of students eventually leads me to an opportunity at a federally funded program that serves a similar population. This assessment is highly regarded in the field of adult basic education, and when my potential employer asks if I know about the BEST Plus, I can state with confidence that I do indeed.

References

Bailey, K. M. (1998). Learning about language assessment: Dilemmas, decisions and directions. Boston: Heinle & Heinle.

Brown, J. D. (2005). Testing in language programs: A comprehensive guide to English language assessment. New York: McGraw-Hill.

Canale, M. (1983). From communicative competence to communicative language pedagogy. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication. London: Longman.

CASAS. (2012). Life and work listening sample test items. Retrieved September 29, 2013, from

Center for Applied Linguistics. (2008). Student Performance Level (SPL) descriptors for listening and oral communication. Retrieved September 28, 2013, from

Center for Applied Linguistics. (2013a). BEST Plus: Computer-Adaptive. Retrieved September 28, 2013, from

Center for Applied Linguistics. (2013b). BEST Plus: Test administrator training options. Retrieved September 29, 2013, from

Center for Applied Linguistics. (2013c). Best Plus and Best Plus Refresher Instructional Strategies. Retrieved September 28, 2013, from
StaffPagesDocs/AdultED/Teaching%20Tools/ESL/Best%20Plus%20and%20BEST%20Plus%20Refresher%20Instructional%20Strategies.pdf

Center for Applied Linguistics. (2013d). ESL Assessment Demonstration [Video webcast]. Retrieved September 28, 2013, from

Educational Testing Service. (2008). Policy Notes: Adult education in America. Retrieved September 28, 2013, from

Grognet, A. G. (1997). Performance-based curricula and outcomes: The mainstream English language training project (MELT) Updated for the 1990s and beyond. Denver, CO: Spring Institute for International Studies.

Literacy Information and Communication System. (2006). Best Plus questions. Retrieved September 28, 2013, from

Malone, M. (2007). Oral proficiency assessment: The use of technology in test development and rater training. Center for Applied Linguistics. Retrieved September 30, 2013, from

Swain, M. (1984). Large-scale communicative language testing: A case study. In S. J. Savignon & M. Berns (Eds.), Initiatives in communicative language teaching (pp. 185–201). Reading, MA: Addison-Wesley.

Van Moere, A. (2009). Test review: BEST Plus spoken language test. Language Testing, 26, 305–313.

Wesche, M. B. (1983). Communicative testing in a second language. The Modern Language Journal, 67, 41–55.

Appendix A: Cost to Become a BEST Plus Test Administrator

Appendix B: BEST Plus Order Form, page 1

Appendix B: BEST Plus Order Form, page 2