
Test Review: Preschool Language Scale-4 (PLS-4)

Name of Test: Preschool Language Scale-4
Author(s): Irla Lee Zimmerman, Violette G. Steiner, and Roberta Evatt Pond
Publisher/Year: The Psychological Corporation; 1969, 1979, 1992, and 2002
Forms: only one
Age Range: birth to 6 years, 11 months
Norming Sample:
The authors stated the need for a stronger psychometric basis as the primary goal for revision, although changes in population, policy, and practice also motivated the need for revision. The most recent standardization reflects the increased diversity of the U.S. population as represented in the 2000 census. For example, the manual reports that ethnic minorities constitute 38% of the total PLS-4 standardization sample, representing an 8% increase over the 1992 sample. The sample was collected between March and September 2001. A national tryout study was conducted.
Total Number: 2 400 in standardization samples and related reliability and validity studies
Number and Age: 1 564 children from ages 2 days to 6 years, 11 months.
Location: 357 sites in 48 states and DC
Demographics: 50% male, 50% female; also stratified on the basis of race or ethnic origin
Rural/Urban: stratified by geographic region
SES: stratified on the basis of primary caregiver’s education level
Other: Percentages of sample were reported for additional characteristics: child’s learning environment (e.g., at home with family, at daycare), English dialects and languages spoken by children, and identified conditions/diagnoses (e.g., learning disability, TBI, and developmental delay).
Summary Prepared By: Eleanor Stewart 26 April 2007; updated 29 August 2008
Test Description/Overview:
This new edition includes more test items than the PLS-3 (AC now has 62 items and EC has 68 items, in contrast to the PLS-3, which had 48 items). The age intervals at the lowest age levels (up to 1 year, 0 months) are divided into three-month intervals. Up to and including 2 years, 0 months, spontaneous behaviours can be scored. Additional measures target articulation, spontaneous language, and caregiver information.
Theory: Bloom and Lahey’s (1978) content, form, and use model is cited as the basis for the test structure, which is meant to target relevant skills that are “related to comprehending language and to communicating ideas” (p. 191).
Purpose of Test: The test’s purpose is to identify children with language delays/disorders.
Areas Tested: A variety of receptive and expressive language skills are tested based on developmental research.
·  Oral Language: vocabulary, grammar, narratives, other
·  Listening: lexical, syntactic, supralinguistic (e.g., responses to an angry voice)
·  Other: supplemental measures not included in the AC, EC, or Total Language scores: the Articulation Screener, Language Sample Checklist, and Caregiver Questionnaire (addresses the child’s communication at home).
Comments: The Articulation Screener and the Caregiver Questionnaire have doubtful clinical utility other than to highlight areas for further investigation. For example, the Articulation Screener uses imitation, which is not generally the recommended means of eliciting productions, and there are inadequate norms for interpreting children’s performance. This additional measure probably adds little to what a clinician might gather from simply listening to children’s spontaneous productions. The Caregiver Questionnaire is useful if the child is uncooperative or otherwise does not engage in the test activities.
The PLS-4 Language Sample Checklist is useful. However, it is intended to be used with a sample of at least 50 consecutive spontaneous utterances collected during play interaction with the child. The authors provide time estimates for sampling that range from 15-20 minutes for a brief interaction to 40-60 minutes for a lengthy one. The authors cite the Gavin and Giles (1996) study, which “found that 175 utterances were needed for reliability coefficients to exceed .90” (p. 147). Therefore, a larger sample would be necessary to complete an analysis and determine productive use. The checklist can function as a marker for further investigation. The authors quite rightly note that MLU reflects only language structure and is not an indicator of overall communicative competence.
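As a rough illustration of the arithmetic behind an MLU estimate from such a sample, consider the sketch below; the utterances are invented, and words are used as a stand-in for morphemes, which is a simplification of the usual MLU-in-morphemes calculation:

```python
# Illustrative only: the arithmetic behind a mean length of utterance (MLU)
# estimate. Words stand in for morphemes here, which understates MLU for
# children using bound morphemes; the PLS-4 checklist itself is completed by hand.

def mean_length_of_utterance(utterances):
    """Average utterance length, in words, over a transcribed sample."""
    if not utterances:
        return 0.0
    return sum(len(u.split()) for u in utterances) / len(utterances)

sample = ["want juice", "mommy go car", "doggie sleeping", "no"]  # hypothetical
print(f"Utterances: {len(sample)}, MLU (words): {mean_length_of_utterance(sample):.2f}")
```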
Language sampling is an important and often unused tool in assessments. It is tempting to use the PLS-4 checklist in a less formal way as many clinicians will do. However, the value of language sampling cannot be overstated given the limitations of any single test in assessing children’s language abilities. It is often overlooked in favour of “quick and efficient” assessment tools.
Who can Administer: Speech-language pathologists, early education specialists, psychologists, educational diagnosticians, and others with training and experience in testing may administer this test. Paraprofessionals can administer and record responses with supervision.
Administration Time: Depending on the child’s age and ability to cooperate, estimates for age ranges are given as follows: birth to 11 months, 20-40 minutes; 12 months to 3 years, 11 months, 30-40 minutes; and 4 years to 6 years, 11 months, 25-35 minutes.
Comment: At younger ages, and if the child has attention or behaviour difficulties, administration can be exceedingly long, often leading to no ceiling being reached or quite possibly to errors in scoring.
Test Administration (General and Subtests):
The test kit consists of an examiner’s manual, a box of test toys, and a picture book containing colour photos and line drawings. The test record contains identifying information and the test summary on the cover. Inside, the pages list test items with brief instructions at clearly identified age intervals in months and year/month format. Start points are not identified on the record but are found in the test manual.
Examiners can choose to administer either Auditory Comprehension (AC) or Expressive Communication (EC) first. However, once a scale is commenced, it must be completed according to the guidelines given (i.e., the examiner cannot switch between scales). To determine the start point, general information about the child’s language skills is required, and guidelines for choosing the start point are provided. Breaks are encouraged, as test administration time can be lengthy. This new edition allows for observation and caregiver report for many of the test items below 2 years, 0 months. The authors caution that deviations from standardized administration, other than minor changes such as taking breaks or allowing the child to sit on the caregiver’s lap, will invalidate the test results (p. 122). Additionally, the authors note that fair comparisons can only be made if the child reasonably compares with the sample on which the test was developed (e.g., language group, ethnicity).
Development of scoring rules is described in Chapter 6.
Test Interpretation: A raw score is calculated as the ceiling item number minus the total number of errors. By consulting tables in the manual, the raw score can then be converted to a variety of scores (see below). Chapter 3 is dedicated to interpretation of the PLS-4. In this chapter, the authors define each of the standardized scores and provide examples of interpretation. The PLS-4 Record Form also provides for task analyses that profile the child’s strengths and weaknesses. Table 3.1 usefully illustrates how standard scores, standard deviations, and percentile ranks are related to each other (i.e., a standard score of 100, a deviation from the mean of zero, and a percentile rank of 50).
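To make the scoring arithmetic concrete, here is a minimal sketch, assuming a hypothetical raw-score-to-standard-score table; the actual conversions must come from the manual’s norms tables.

```python
# Hypothetical illustration of the scoring arithmetic: raw score = ceiling item
# number minus total errors, then a table lookup for the standard score.
# The lookup table below is invented; the real conversions are in the manual.

def raw_score(ceiling_item: int, total_errors: int) -> int:
    """Raw score as described in the manual: ceiling item minus total errors."""
    return ceiling_item - total_errors

# Invented fragment of a raw-score -> standard-score table for one age band.
HYPOTHETICAL_NORMS = {38: 94, 39: 96, 40: 98, 41: 100, 42: 102}

raw = raw_score(ceiling_item=45, total_errors=4)   # 41
standard = HYPOTHETICAL_NORMS.get(raw)             # 100 in this made-up table
print(f"Raw score: {raw}, standard score: {standard}")
```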
Age equivalents are discussed at length, with the common concerns about this type of score presented. The authors state, “Age equivalents do not (italics in text) provide any sort of comparative information within an age group; they only tell you the age group for which that score is the median…Because it does not give you the information about the range of scores for children in a specific age group, an age equivalent does not (italics in text) give you the information you need to determine if a child has a language disorder” (p. 125). Examples are provided. This section on age equivalents is followed by a section on “Calculating Percent Delay from an Age Equivalent” (p. 127), in which the authors note that many state agencies requiring quantitative criteria will accept such scores as percent delay from an AE. Research demonstrating how this type of calculated score is restrictive is discussed. In the end, the authors summarize by once again cautioning examiners not to use age equivalent scores to determine service eligibility.
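For context on the percent-delay figures the authors caution against, the sketch below shows the conventional arithmetic (assumed here as common agency practice, not reproduced from the manual):

```python
# Conventional percent-delay arithmetic, shown only to make the authors'
# caution concrete; agencies vary in how they define and apply this figure.

def percent_delay(chronological_age_months: float, age_equivalent_months: float) -> float:
    """Delay expressed as a percentage of chronological age."""
    gap = chronological_age_months - age_equivalent_months
    return 100.0 * gap / chronological_age_months

# Example: a 36-month-old whose age equivalent is 27 months.
print(f"{percent_delay(36, 27):.0f}% delay")  # prints "25% delay"
```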
A more useful treatment of scores is outlined in the next section, “Using Confidence Bands To Reflect Confidence In Obtained Scores” (p. 127). All tests carry a certain degree of measurement error. In this section of the manual, the standard error of measurement is explained along with its use in creating a confidence band. The confidence band encompasses the range of scores within which the child’s true score is likely to fall. An example is provided.
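As a brief illustration of how a confidence band is built from the SEM, the following sketch uses the usual normal-theory multiplier; the obtained score and SEM are invented rather than taken from the manual:

```python
# Confidence band around an obtained standard score, using the normal-theory
# multiplier for a 90% band. The score and SEM are illustrative only.

def confidence_band(obtained_score: float, sem: float, z: float = 1.64):
    """Return (lower, upper) bounds of the confidence band."""
    return obtained_score - z * sem, obtained_score + z * sem

low, high = confidence_band(obtained_score=92, sem=5.0)
print(f"90% confidence band: {low:.1f} to {high:.1f}")  # 83.8 to 100.2
```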
Confidence bands are also used in comparing performance on the two scales that constitute the PLS-4. The authors provide examples of how to compare confidence bands to determine if a real difference exists. This is especially important to consider when the child’s performance results in a higher EC than AC score. The examiner will want to know if this difference is meaningful. In such a case, the examiner should look to the confidence bands to see if there is overlap and then refer to the incidence of difference provided in Table 3.3 on page 132 to determine the percentage of the standardization sample that exhibits the difference. Only if there is no overlap, the authors explain, can the examiner be assured that the difference is meaningful. Once again, the manual contains an example to illustrate this point. In summary, the authors state, “You can consider the scores to be different only when there is no overlap in the confidence bands” (p. 129).
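The overlap check described above can be expressed as a simple interval comparison; the AC and EC scores and SEMs below are invented for illustration:

```python
# Compare AC and EC confidence bands: per the manual's guidance, treat the
# scores as different only when the bands do not overlap. Numbers are invented.

def band(score: float, sem: float, z: float = 1.64):
    return score - z * sem, score + z * sem

def bands_overlap(a, b) -> bool:
    return a[0] <= b[1] and b[0] <= a[1]

ac_band = band(88, 5.0)    # roughly 79.8 to 96.2
ec_band = band(101, 4.5)   # roughly 93.6 to 108.4
if bands_overlap(ac_band, ec_band):
    print("Bands overlap: do not treat the AC-EC difference as meaningful.")
else:
    print("No overlap: the difference may be meaningful; also check Table 3.3.")
```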
The remainder of Chapter 3 describes the PLS-4 Task Analyses, the PLS-4 Checklist, and the PLS-4 Profile.
Chapter 4, “The PLS-4 Supplemental Measures”, provides information about when and how to use these measures. The procedures for each supplemental measure are described.
Chapter 5, “After Assessment: The Next Step”, takes the examiner through the completion of the Clinician’s Worksheet found in Appendix B. This worksheet is helpful for organizing information for the child’s program plan. The clinician is encouraged to supplement PLS-4 test results with other measures such as criterion-referenced checklists and parent interview material in order to write comprehensive therapy objectives. The chapter also addresses how to develop a plan that includes follow-up with the family, determining outcomes for the child and the family, and identification of a case manager. Presumably this is information that is familiar to trained professionals.
At the end of this chapter, a list of resources for parents and teachers is provided that includes contact information, including internet addresses. Agencies serving specific populations are listed, as is the office contact for U.S. Public Law 105-17 (IDEA).
Standardization: Age equivalent scores, grade equivalent scores, percentiles, standard scores, stanines
Other: A Total Language Score is calculated by combining AC and EC standard scores; a standard score equivalent is also available. Standard errors of measurement (SEMs) allow confidence bands to be used.
Reliability:
Internal consistency of items: Coefficient alphas were calculated. AC had a range from .66 to .94 with an overall value of .86; EC had a range of .73 to .94 with an overall value of .91; and the Composite had a range of .81 to .97 with an overall value of .93. Smaller SEMs and confidence intervals were reported, indicating less variability around scores.
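For readers less familiar with coefficient alpha, the sketch below shows the generic computation; the item-score matrix is fabricated and the code is not a reproduction of the manual’s analyses:

```python
# Generic Cronbach's alpha: alpha = k/(k-1) * (1 - sum(item variances) / total variance).
# The small item-score matrix is fabricated purely to show the arithmetic.

def cronbach_alpha(scores):
    """scores: one row per examinee, each row a list of item scores (equal length)."""
    k = len(scores[0])  # number of items

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

fake_scores = [[1, 1, 0, 1], [1, 0, 0, 1], [0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 1]]
print(f"alpha = {cronbach_alpha(fake_scores):.2f}")  # 0.75 for this invented data
```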
Test-retest: A random sample of 218 children aged 2 years to 5 years, 11 months was retested (use caution when interpreting results for children under 2 years of age, as they are not represented). The interval between testings with the same examiner was 2 to 14 days, with an average of 5.9 days. Coefficients ranged between .82 and .95 for subscale scores and .90 and .97 for the Total Language Score. Specifically, the AC overall SEM = 5.94, the EC overall SEM = 4.89, and the Composite overall SEM = 4.16.
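The reported SEMs can be loosely related to reliability through the textbook identity SEM = SD × √(1 − r); the sketch below assumes the conventional mean-100, SD-15 standard-score scale and does not claim to reproduce the manual’s exact values:

```python
# Textbook relationship between reliability and the standard error of
# measurement: SEM = SD * sqrt(1 - r). Assumes a mean-100, SD-15 standard-score
# scale; the manual's reported SEMs may be derived somewhat differently.

import math

def sem_from_reliability(reliability: float, sd: float = 15.0) -> float:
    return sd * math.sqrt(1.0 - reliability)

for r in (0.82, 0.90, 0.95):
    print(f"r = {r:.2f} -> SEM approx. {sem_from_reliability(r):.2f}")
```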
Inter-rater: Fifteen elementary teachers with three weeks’ experience using the PLS-4 were trained using scoring rules and examples of correct and incorrect responses. One hundred test protocols of children aged birth to 6 years, 11 months were evaluated. Results: there was 99% agreement between scorers, and the correlation for EC was .99. This evidence indicates that the scoring rules are well developed and that consistent scoring by a variety of trained examiners is possible.
Validity: As this is the fourth edition, extensive studies are reported in the manual regarding sources of evidence for validity.
Content: Evidence was provided by a literature search, a user survey, and content, bias, and task reviews conducted as described in Chapter 6. An extensive description of content is found on pages 193 through 207 of Chapter 7. Response processes were also considered to demonstrate that task formats successfully elicited the intended skills. Pilot testing was carried out and children’s responses were analyzed; results demonstrated that desired responses were indeed elicited. Previous research demonstrated that practice items were important, so practice items were developed for the PLS-4.
Concurrent validity (“Convergent Evidence”, p. 209): Validity was reported for the Denver II and the PLS-4. The study was conducted with 37 children between birth and 11 months of age. A high level of agreement between the Denver II and the PLS-4 was evidenced by normal outcomes on the Denver II and scores within 1 standard deviation on the PLS-4. The authors also studied the correlation with the PLS-3; the AC correlation was .65 and the EC correlation was .79.
Criterion Prediction Validity: Further to the issues arising in prediction, the manual states that evidence is still needed but that “… current research addressing the predictive validity with PLS-4 will be posted on the Psychological Corporation website as this information becomes available” (p. 211).
Construct Identification Validity: The target construct, “Language Ability”, was described. A large pool of 170 test tasks with over 700 sub-items was introduced for tryout, and from this pool the best tasks and sub-items were chosen. Relationships among the items were then analyzed in two ways. Internal consistency revealed high homogeneity (see the Internal consistency section above). The correlation between AC and EC, both of which are intended to measure language ability through different aspects, was found to be .74; this is high enough to indicate that both scales tap the same underlying construct, yet not so high as to suggest the scales are redundant. Each subscale can therefore be interpreted as measuring a different aspect of the construct “language ability”.
Differential Item Functioning:
Other: Clinical validity studies were conducted for four clinical groups: children with language disorder, developmental delay, autism, or hearing impairment. A variety of tests were used: the PLS-3, the REEL, and the Rossetti Infant Toddler Scale (the latter two of which are criterion referenced). The PLS-4 successfully identified children with language disorders, both those previously identified as such and those not previously identified. The test also successfully identified children without language disorders, both those previously identified as such and those not. Children with autism and hearing impairment performed significantly differently than typically developing children, with patterns of variability in their performance on specific tasks described by the authors in the manual (pp. 215-218). Children with developmental delays demonstrated performance similar to those with hearing impairment in areas such as social communication, vocal development, and vocabulary tasks.