Chapter 1 Questions (1&2)
1) What is the difference between testing and assessment?
Testing is just one particular form of assessment. It consists of giving a certain set of questions to an individual or a group in order to obtain a score. More generally, assessment is any process that collects data for the purpose of making decisions about individuals and groups. Assessment refers to a group of activities that can include observations, tests, portfolio collection, etc.
2) Attack or defend this generalization. Different kinds of data are needed for the purpose of making different kinds of educational decisions.
I agree that different kinds of data are needed for the purpose of making different kinds of educational decisions. All assessment is guided by a specific purpose, so it makes sense that different forms of data will be required depending on the guiding purpose and the educational decisions that result. For example, if the purpose of assessment is to determine who qualifies for special educational services, then specific forms of assessment, like screening (preliminary tests) and eligibility testing, will be needed. CLASS DISCUSSION- all types of data need to be collected. The data has to answer the question and come from numerous sources.
Chapter 2 Questions 1-6
1) Identify 3 different ways to begin an assessment. Describe an optimal sequence of activities for assessing a student.
a) Assessment should begin with instructional diagnosis to determine whether a student’s difficulties result from inadequate teaching or an inappropriate curriculum. This involves examining instructional challenge (material is appropriately challenging, allowing students to succeed with some effort) and instructional environment (classroom management: time on task; learning management: providing structure for learning).
b) If instruction has been inadequate, appropriate instruction should be given to see if it alleviates the difficulty. Assessment of learning at this stage determines whether the difficulty has been alleviated.
c) Further assessment is then carried out on the student who is still experiencing difficulty despite adequate instruction. Collection of data at this stage involves obtaining existing data to document the nature of the problem and identify the student’s strengths and weaknesses. At this stage, assessment involves collecting observations, recollections, tests and professional judgements, extant information, and an understanding of the student’s current life circumstances in order to form hypotheses about the cause of the student’s difficulty.
CLASS DISCUSSION – what is the question or problem? let that guide your assessment
2.) What are two factors that may have a significant effect on a student’s performance during assessment?
a) Family life plays an essential part in ensuring a student does well during assessment. Students from families that have histories of physical, sexual, psychological or substance abuse have a more challenging time doing well in school despite the caring environment teachers provide.
b) Physical abilities and Health contribute significantly to how a student performs during assessment. Missed opportunities due to illness or lack of sensory information can cause students to perform poorly during assessment.
3.) When and Why might you want to administer a group test individually?
If a teacher is concerned with HOW the student is completing the questions on a test (qualitative data) rather than with the quantitative score, then it is best to administer a group test individually so the student can be observed carefully during the assessment.
4.) What is the difference between a Norm-referenced and a Criterion-referenced test? Cite an advantage of each.
Norm-referenced tests compare an individual’s performance to that of many peers. An advantage of using this kind of test is that it allows the tester to see how the student’s performance compares to that of his or her peers. A percentile rank allows the tester to see if the student’s performance is considered below average, average or above average in relation to others in the normative group. This is especially advantageous when a classroom group is not representative of grade ability. CLASS DISCUSSION- Used in screening, it really depends on what you’re looking for.
In contrast, a Criterion-referenced test does not compare an individual’s performance with that of others. This test is mostly concerned with what a particular individual can and cannot do (what skills or information has this person mastered). The advantage of this form of testing is that the results can be used to help guide individual program instruction because specific strengths and weaknesses are identified.
5.) How might you evaluate the extent to which students you assess are acculturated in a manner that is comparable to those in a test’s norm group?
You would have to assess the student’s working knowledge of the public culture of the normative group. You could use any assessment that evaluates an individual’s knowledge of societal mores and values, standard American English (or the language of the normative group), and the fund of general and specific cultural information. CLASS DISCUSSION- look for language and acculturation. Look at what the test norms are (who the norm sample consists of) and compare them to the test subject to see how similar they are.
6) Lupe’s parents have just moved into the area. Lupe was enrolled in second grade a few weeks before the annual standardized achievement tests are administered. The decision has been made to let her take the tests in Mr. Peno’s room, although he is not her teacher, because he speaks Spanish (the language that Lupe speaks at home). Is this sufficient to ensure test validity for Lupe? Why or why not?
No, it is not enough to ensure test validity. Not only may Lupe not be acculturated to the normative group she will be compared with, but the experiential opportunities she engaged in to acquire tested skills may not be accounted for in the test material.
CLASS DISCUSSION- the test was normed to be given in English, not Spanish. Translation also gives the student more time to answer the questions.
Chapter 4 Questions 1,2,3,5
1) After all third-grade students in the state took an achievement test, statewide norms were developed. The superintendent of public instruction reviewed the test results and in a news conference, voiced concerns for the quality of education in the state. The superintendent reported, “Half the third-grade children in the state performed below the state average.” What is foolish about that statement?
Because the grade 3 students in the state make up a large population, one would expect the distribution of scores to approximate a normal curve. Scores in a normal distribution fall equally on either side of the average score. This means that 50% of the grade 3 students’ scores would fall below the mean (average) and 50% would fall above it, so the superintendent’s statement describes what an average is, not a problem with the quality of education.
CLASS DISCUSSION-half the students will always be below average
2) What is the relationship among the mode, the median and the mean in a normal distribution?
The Mode is defined as the score most frequently obtained.
The Median is the score that divides the top 50% of cases (people, not scores) from the bottom 50% of cases. (4,5,7,8 = Median of 6)
The Mean is the arithmetic average of scores in a distribution: the sum of scores divided by the number of scores.
In a normal distribution, the curve is symmetrical, meaning that the mean, mode and median are all the same.
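The three definitions above can be checked with a short sketch using Python's standard statistics module. The score list is made up for illustration (it is not from the textbook) and is deliberately symmetrical, so the three measures should agree.

```python
from statistics import mean, median, mode

# Hypothetical, symmetrical set of test scores (illustration only).
scores = [3, 4, 4, 5, 5, 5, 6, 6, 7]

score_mean = mean(scores)      # sum of scores divided by number of scores
score_median = median(scores)  # middle case when the scores are sorted
score_mode = mode(scores)      # most frequently obtained score

# The median example from the notes: 4, 5, 7, 8 has a median of 6.
notes_median = median([4, 5, 7, 8])

print(score_mean, score_median, score_mode, notes_median)
```

With an even number of cases, the median is the midpoint of the two middle scores, which is why 4, 5, 7, 8 gives 6 even though nobody scored 6.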
3) The following statements about TEST A and TEST B are true:
-Test A and B measure the same behaviour
-Test A and B have means of 100
-Test A has a standard deviation of 15
-Test B has a standard deviation of 5
a) Following classroom instruction, the pupils in Mr. R’s room earn an average score of 130 on Test A. Pupils in Ms. P’s room earn an average score of 110 on Test B. On this basis, the local principal concludes that Mr. R’s students learn more than Ms. P’s. What is fallacious about this conclusion?
Both class averages are 2 standard deviations above the mean of their test (130 is 30 points, or 2 x 15, above 100; 110 is 10 points, or 2 x 5, above 100). This means the two results fall in the same place in a normal distribution, so the raw scores cannot be compared directly.
CLASS DISCUSSION-
b) Assuming that the pupils were equal prior to instruction, what conclusions could the principal legitimately make?
Both tests consistently and accurately measure the same behaviour.
CLASS DISCUSSION- both groups have learned the material to the same extent
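The comparison in question 3 can be sketched in Python by converting each class average to a z-score (distance from the mean in standard deviation units). The means, standard deviations and class averages come from the question; the function name z_score is just an illustrative choice.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

# Values from the question: both tests have a mean of 100.
z_test_a = z_score(130, mean=100, sd=15)  # Mr. R's class on Test A
z_test_b = z_score(110, mean=100, sd=5)   # Ms. P's class on Test B

print(z_test_a, z_test_b)
```

Both values come out the same, which is the point of the answer: a 20-point gap in raw scores disappears once each score is expressed relative to its own test's standard deviation.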
5) Discuss the relationship between correlation and causality.
Correlation quantifies relationships between variables, or the extent to which two things go together. A correlation coefficient is used in measurement to estimate both the reliability and validity of a test. For example, in a perfect correlation (1.0), if you know a person’s score on one variable, you can predict that person’s score on the second variable without error (e.g., letter identification & reading fluency).
Correlation between two variables is a necessary but not a sufficient condition for causality.
For example, intelligence and achievement are positively correlated but are not necessarily caused by each other. More than likely other variables contribute to both.
CLASS DISCUSSION- you can have a relationship between two things but one doesn’t necessarily cause the other
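As a rough illustration of what a correlation coefficient measures, here is a minimal Python sketch of the Pearson formula. The two score lists are invented so that one is exactly three times the other, giving a perfect correlation; they are not real letter identification or reading fluency data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far each pair of scores moves together around the two means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(variation_x * variation_y)

# Hypothetical scores for five students (illustration only).
letter_identification = [10, 12, 15, 18, 20]
reading_fluency = [30, 36, 45, 54, 60]  # exactly 3 x the first list

r = pearson_r(letter_identification, reading_fluency)
print(r)
```

The coefficient only describes how closely the two sets of scores move together. Nothing in the calculation says which variable, if either, causes the other.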
Chapter 5 Questions 4,5,6,7
4) Marvina takes a battery of standardized tests. The results are as follows:
Test A: Mental age =8-6
Test B: Reading grade equivalent=3.1
Test C: Developmental age=8-4
Test D: Developmental quotient=103
Test E: Percentile rank=56
What must the teacher do in order to interpret Marvina’s performances on these five scales and compare the performances with one another?
If the scores are converted into standard scores, or percentile ranks, they are easily compared within a normal distribution.
CLASS DISCUSSION-convert to standard measure
5) Andrew earns a stanine of 1 on an intelligence test. To what z-scores, percentile ranks, and T-scores does his stanine score correspond?
z-score = about -2.00, T-score = about 30, percentile rank = about 2 (stanine 1 covers the lowest 4% of scores, i.e., z-scores below -1.75)
CLASS DISCUSSION- refer to pg.119
-Mean is 100 and standard deviation is 15
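The conversions in question 5 can be sketched in Python, assuming the scores are normally distributed. The function names are illustrative; the formulas use the usual scale definitions (T-scores: mean 50, SD 10; standard scores: mean 100, SD 15; stanines: mean 5, SD 2, limited to 1 through 9).

```python
from math import floor
from statistics import NormalDist

def t_score(z):
    """T-scores have a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

def standard_score(z, mean=100, sd=15):
    """Standard score on a scale with the given mean and SD."""
    return mean + sd * z

def percentile_rank(z):
    """Percent of a normal distribution falling below z."""
    return NormalDist().cdf(z) * 100

def stanine(z):
    """Stanine band for z: each band is half an SD wide, clamped to 1..9."""
    return max(1, min(9, floor(2 * z + 5.5)))

z = -2.0  # the z-score given in the answer for Andrew
print(t_score(z), standard_score(z), round(percentile_rank(z), 1), stanine(z))
```

Because a stanine is a band rather than a single point, any z-score below -1.75 lands in stanine 1; z = -2.00 is one representative value inside that band.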
6) Distinguish among frustration level, instructional level and independent level. Why are these distinctions important?
Frustration level is reached when a child scores less than 85% correct.
Instructional level is reached when a child scores between 85% and 95% correct.
Independent level is reached when a child scores above 95% correct.
*These distinctions are important when determining if a child has achieved mastery over any skill or information. It is also important to know these distinctions to help guide instruction and begin at the student’s current level of functioning.
CLASS DISCUSSION- relate to standard deviations
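The cut-offs in question 6 can be written as a small Python function. This is only a sketch of the percentages listed in the notes; the function name and the choice to treat exactly 85% and exactly 95% as instructional level are assumptions.

```python
def performance_level(percent_correct):
    """Classify a percent-correct score using the cut-offs from the notes."""
    if percent_correct < 85:
        return "frustration"
    if percent_correct <= 95:
        return "instructional"
    return "independent"

# Hypothetical scores for three students (illustration only).
for score in (70, 90, 98):
    print(score, performance_level(score))
```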
7) Although scores are frequently reported as developmental scores, a number of problems are inherent in the use of these scores. Discuss three of these problems.
Systematic Misinterpretation
Students perform skills in different ways and may not complete questions as a child of a set age would.
Need for Interpolation and Extrapolation
Average age and grade scores are estimated for groups of children that are never tested. Test scores are extrapolated below or beyond actual scores based on the trend or curve.
Promotion of typological thinking
Developmental scores are based on the “average child”. This is problematic because the “average child” is a statistical abstraction, meaning it is made up or based on what people assume the average is (family size, economic level, etc.).
Implication of a false standard of performance
Due to the way equivalent scores are constructed, you will always see half an age or grade group performing below age/grade level because half of test takers earn scores below the median.
Tendency for scales to be ordinal, not equal interval
The number of correct answers dramatically affects the developmental score, meaning that one or two more right answers can significantly change the age/grade equivalent. For example, a reading company can offer to increase a child’s reading level by a year, but that can mean the child only has to answer 2 more questions correctly on a test.
Chapter 6 Questions 1&3
1) How might the author of a test demonstrate that its normative sample is representative of population of children attending school in the United States?
CLASS DISCUSSION- look at national census data and compare it with the norm data reported in the test manual. Look at the technical manual. Look at test reviews.
For example, check the % of people (aboriginal) represented in the normative sample to determine if it is fair to test an aboriginal child
3) Discuss three approaches to tinkering with norms that you might use as a test developer to produce better norms.
1) Oversample subjects- select more samples than are needed and then drop subjects until a representative sample has been achieved.
2) Differentially weight subjects – subjects with under-represented characteristics may be counted as more than one subject, and subjects with over-represented characteristics may be counted as less than one (for example, ½). For example, if a sample group has far more boys than girls, each boy could be counted as ½ and each girl as 1 or 2.
3) Smooth norms – extreme test scores (outliers) can be dropped
CLASS DISCUSSION- this is done to create a standard normal curve. Why do we need a perfect normal curve? What’s the point? So that when a test is given, an obtained score can be plotted within a normal distribution and we can see how a student’s performance compares to those in the population.
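Differential weighting (approach 2 above) can be illustrated with a weighted mean in Python. The scores, group sizes and weights below are hypothetical: four boys each counted as ½ and two girls each counted as 1, so both groups contribute equally. This is a sketch of the idea only, not any test publisher's actual procedure.

```python
def weighted_mean(scores, weights):
    """Mean in which each subject counts according to its weight."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

# Hypothetical sample with boys over-represented (illustration only).
boy_scores = [90, 100, 110, 100]
girl_scores = [105, 115]

scores = boy_scores + girl_scores
# Each boy counts as half a subject, each girl as one subject.
weights = [0.5] * len(boy_scores) + [1.0] * len(girl_scores)

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)

print(round(unweighted, 2), weighted)
```

The weighted mean shifts toward the girls' scores because the four boys together now count as only two subjects, matching the two girls.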
Chapter 7
1) So you can generalize the information you get to other times. Will we be able to get the same results at a later date?
2) A because it has less error
3)
1) Test Length – longer tests tend to be more reliable than short tests. There should also be both enough difficult items for very superior students and enough easy items for deficient students. A test doesn’t cover enough of the domain if it isn’t long enough. It should have enough questions to be representative of the domain.
2) Test-Retest Interval – the greater the time between administrations of a test, the more likely the true score will change (after a year, it is possible to retest to determine growth using the WJ) (2 weeks is preferred)
3) Constriction or Extension of Range – you can’t take the abilities from one small sample, generalize them to a larger sample, and assume both have the same abilities/knowledge. The test may not be reliable for all individuals given their abilities (hearing impaired or those who read Braille). If the person is outside the norm group, the test results are not necessarily reliable.
4) Guessing – it introduces error even when a guess results in a correct response
5) Variation within the Testing Situation- situational variables (headache, distraction, noise) introduce error and weaken reliability. *This is the one area that we have control over and are evaluated on: have we administered the test in a standard way?
Chapter 8
1)
2) You must have reliability in order to have validity. But having reliability does not necessarily ensure validity, because other factors are also needed to ensure validity.
3)
6) Enabling behaviour -language and culture Item Selection- Norm group-
* The test question will be a given scenario like this one. You will have to link it to textbook terms.