Formal Assessment Review
Complete a report for each test administered that includes the following information:
I. TEST
a. Title:
Test of Early Mathematics Ability, Third Edition (TEMA-3)
b. Author:
Herbert P. Ginsburg and Arthur J. Baroody
c. Publisher:
PRO-ED
d. Copyright:
2003
II. DESCRIPTION:
a. General Description:
The TEMA-3 is a norm-referenced test of early mathematical ability for children ages 3 years 0 months through 8 years 11 months.
b. Materials provided/needed:
Picture Book (Form A and Form B), Student Worksheet (Form A and Form B), Profile/Examiner Record, and manipulatives.
c. Alternate forms:
The test has two alternate forms, Form A and Form B.
III. ADMINISTRATION:
a. Age ranges:
The age range for this test is 3 years 0 months to 8 years 11 months.
b. Administration and scoring time:
The test takes 45 to 60 minutes.
c. Types of scores reported:
Each item is scored 1 if answered correctly and 0 if answered incorrectly.
d. Starting points, basal and ceiling levels:
The starting point for the test is based on the age of the student: age 3 starts at item 1, age 4 at item 7, age 5 at item 15, age 6 at item 22, age 7 at item 32, and age 8 at item 43. The basal is the highest five consecutive items answered correctly. The ceiling is five consecutive items answered incorrectly.
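The start-point, basal, and ceiling rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample responses are hypothetical, and the manual's full administration procedure (for example, what to do when no basal is established) is not covered here.

```python
# Start items by age in years, as listed in this review.
START_ITEM_BY_AGE = {3: 1, 4: 7, 5: 15, 6: 22, 7: 32, 8: 43}


def start_item(age_years):
    """Return the item number at which testing begins for a given age."""
    return START_ITEM_BY_AGE[age_years]


def runs_of(scores, target, length=5):
    """Return the first item number of every run of `length` consecutive
    items that were all scored `target` (1 = correct, 0 = incorrect)."""
    starts = []
    for item in sorted(scores):
        window = range(item, item + length)
        # Items that were never administered count as "not matching".
        if all(scores.get(i) == target for i in window):
            starts.append(item)
    return starts


def basal_start(scores):
    """First item of the highest run of five consecutive correct items."""
    starts = runs_of(scores, 1)
    return max(starts) if starts else None


def ceiling_start(scores):
    """First item of the earliest run of five consecutive incorrect items."""
    starts = runs_of(scores, 0)
    return min(starts) if starts else None


# Made-up responses for a 5-year-old (item number -> 1 or 0).
example = {15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 0,
           21: 1, 22: 0, 23: 0, 24: 0, 25: 0, 26: 0}
print(start_item(5), basal_start(example), ceiling_start(example))
```

In this made-up record, testing begins at item 15, the basal is items 15 through 19, and the ceiling is reached with items 22 through 26.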
e. Standard error of measurement:
The standard error of measurement for Form B is 3 at every age. For Form A, the standard error of measurement is 4 for 3-, 4-, and 5-year-olds and 3 for ages 6-8.
f. Confidence intervals:
For Form A at ages 3-5, the 95% confidence interval is plus or minus 8 points and the 99% confidence interval is plus or minus 12 points.
For Form B and for Form A at ages 6-8, the 95% confidence interval is plus or minus 6 points and the 99% confidence interval is plus or minus 9 points.
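The confidence interval values reported here are consistent with multiplying the standard error of measurement by 2 for 95% and by 3 for 99% (4 x 2 = 8, 4 x 3 = 12, 3 x 2 = 6, 3 x 3 = 9). That multiplier rule is an inference from the numbers in this review, not a statement of the manual's method. Assuming the values are the number of points added to and subtracted from an obtained score, a band can be sketched as follows; the function name and the obtained score of 100 are hypothetical.

```python
def score_band(obtained_score, sem, multiplier):
    """Return the (lower, upper) band around an obtained score.

    `multiplier` is the number of SEMs on each side of the score
    (assumed here to be 2 for 95% and 3 for 99%).
    """
    half_width = sem * multiplier
    return obtained_score - half_width, obtained_score + half_width


# Hypothetical obtained score of 100 on Form A at ages 3-5 (SEM of 4).
print(score_band(100, 4, 2))  # 95% band under the assumed 2 x SEM rule
print(score_band(100, 4, 3))  # 99% band under the assumed 3 x SEM rule
```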
IV. NORMING PROCEDURES:
a. Sampling procedures:
Testing sites were chosen in each of the four major geographic regions. The sites were Putnam Valley, New York; Madison, South Dakota; Austin, Texas; and Brookings, Oregon. A site coordinator was chosen for each location to supervise the test administration. The site coordinators were trained to give the test, and they then trained all the other examiners. All the children tested attended day-care centers or general education classes. Children with disabilities who were in general education classes were included in the sample. The authors also used the PRO-ED customer database to find people who might be interested in testing children. Each person was sent a letter asking for his or her participation in standardizing the test. The people who responded were sent materials to test around 10 to 20 children in their area whose demographic makeup matched that of their community.
b. Size and characteristics of sample (sex, age, geographic region, etc.)
The test sample had 1,228 students; 637 took Form A and 591 took Form B. 15 states were included in this sample: California, Connecticut, Florida, Kentucky, Massachusetts, Missouri, New Hampshire, New Mexico, New York, North Carolina, North Dakota, Oregon, Pennsylvania, Texas, Virginia, and Wisconsin. There were 104 females and 96 males tested.
c. Date of norms:
The sample was tested in the fall of 2000 and spring of 2001.
V. RELIABILITY:
a. Explanation of the procedure for each reliability type measured
The coefficient alpha reliability method shows the extent to which items correlate with one another and is found by using Cronbach’s method. Coefficient alphas were calculated at six age intervals using data from the entire normative sample.
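As a rough illustration of how coefficient alpha is computed, the sketch below applies the standard Cronbach's alpha formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), to a small made-up response matrix. The function name and the data are hypothetical and have nothing to do with the TEMA-3 normative sample.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per student),
    where each row holds that student's item scores (here 0 or 1)."""
    k = len(responses[0])  # number of items
    # Variance of each item across students.
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up data: 4 students, 3 items.
made_up = [[1, 1, 1],
           [1, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
print(cronbach_alpha(made_up))
```

For this tiny made-up matrix the result is 0.75; real tests with many items and students, such as the coefficients reported below, typically produce higher values.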
In alternate forms immediate administration, both forms are given in one testing session, and the resulting reliability index can be used to estimate content sampling error. The standard scores for the two forms were correlated at six age intervals. Forty-six students from Austin, Texas, were tested with this method.
The test-retest method of reliability looks at whether a student’s performance is consistent over time and is a measure of time sampling error. The students took the same form both times, with two weeks between the two testing sessions. Forty-nine students from Putnam Valley, New York, and 21 from Mandan, North Dakota, were tested with this method.
Alternate forms delayed administration can be used to estimate test error related to both content sampling and time sampling. Forty-six students were tested with this method; they were retested two weeks after the initial test.
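The test-retest and alternate-forms methods all come down to correlating two sets of scores from the same students. The sketch below assumes a Pearson correlation, which this review does not confirm as the manual's exact statistic; the function name and the scores are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores.
    Assumes both lists have some variation (otherwise it divides by zero)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up standard scores for five students at two testing sessions.
first_session = [85, 92, 100, 108, 115]
second_session = [88, 90, 103, 105, 118]
print(round(pearson_r(first_session, second_session), 2))
```

A coefficient near 1 means students kept roughly the same rank order across the two sessions or forms.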
b. Coefficients for each measure of reliability reported if applicable
Coefficient Alpha:
The average coefficient for Form A is .94 and for Form B is .96.
Alternate Form (Immediate):
The average coefficient for the immediate alternate form is .97.
Test-Retest:
The average coefficient for Form A was .82 and for Form B was .93.
Alternate Form (Delayed):
The average coefficient for the delayed alternate form reliability is .93.
VI. VALIDITY:
a. Explanation of each type of validity measured
Content-description validity has three parts. First, a rationale is given for the items selected for the TEMA-3. Second, a conventional item analysis was done to measure item discrimination, which is the degree to which an item differentiates correctly among test takers in the behavior that the test is meant to measure. Finally, a differential item functioning analysis was done to see whether there is any bias in the test.
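A conventional item analysis usually produces two numbers per item: a difficulty (the proportion of test takers who answer correctly) and a discrimination index. The sketch below computes discrimination as the correlation between the item score and the total score; that choice is an assumption made for illustration, since this review does not state which index the manual uses. The function names and data are hypothetical.

```python
from math import sqrt


def item_difficulty(responses, j):
    """Proportion of students who answered item j correctly."""
    return sum(row[j] for row in responses) / len(responses)


def item_discrimination(responses, j):
    """Correlation between item j (0 or 1) and each student's total score.
    Assumes the item and the totals both vary across students."""
    item = [row[j] for row in responses]
    total = [sum(row) for row in responses]
    n = len(responses)
    mean_i, mean_t = sum(item) / n, sum(total) / n
    cov = sum((a - mean_i) * (b - mean_t) for a, b in zip(item, total))
    spread_i = sqrt(sum((a - mean_i) ** 2 for a in item))
    spread_t = sqrt(sum((b - mean_t) ** 2 for b in total))
    return cov / (spread_i * spread_t)


# Made-up data: 4 students, 3 items.
made_up = [[1, 1, 1],
           [1, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
print(item_difficulty(made_up, 1))
print(round(item_discrimination(made_up, 1), 2))
```

An item with higher discrimination separates higher-scoring students from lower-scoring students more cleanly.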
Criterion-prediction validity was examined by comparing the TEMA-3 with other tests of similar abilities to see how the scores correlate. The comparison measures were the Basic Concepts and Operations composites from the KeyMath-Revised; the Applied Problems subtest from the Woodcock-Johnson III Tests of Achievement; the Mathematics Reasoning and Mathematics Calculation subtests and the Mathematics Quotient from the Diagnostic Achievement Battery-Third Edition; and the Mathematics Quotient from the Young Children’s Achievement Test.
Construct-identification validity was examined through three hypotheses about how the test should behave:
1. Since mathematical ability is developmental in nature, performance on the test should be highly related to chronological age.
2. Since the test measures mathematical ability, the results should differentiate between groups of people known to be average or below average in mathematical ability.
3. Since the items of this test measure mathematical ability, the items should be highly related to the total score.
b. Correlation coefficients reported if applicable
Content-Description Validity:
The median discriminating powers for the math ability score at six age intervals:
Age 3: Form A .66, Form B .68
Age 4: Form A .50, Form B .62
Age 5: Form A .45, Form B .54
Age 6: Form A .50, Form B .53
Age 7: Form A .57, Form B .56
Age 8: Form A .62, Form B .54
The median item difficulties:
Age 3: Form A .4, Form B .3
Age 4: Form A .25, Form B .15
Age 5: Form A .53, Form B .41
Age 6: Form A .39, Form B .47
Age 7: Form A .62, Form B .58
Age 8: Form A .67, Form B .87
Criterion-Prediction Validity:
The relationship between TEMA-3 and other tests
KeyMath-R: Basic Concepts .54, Operations .63
WJ III ACH: Applied Problems .55
DAB-3: Mathematics Reasoning .65, Mathematics Calculation .83, Mathematics Quotient .84
YCAT: Mathematics Quotient .91
Construct-Identification Validity:
Correlation with age:
Form A .91, Form B .89
VII. CLASSROOM USES:
a. Specific uses as suggested by manual/authors:
The main purposes of the test are to identify students who are significantly behind or ahead of their peers, identify specific strengths and weaknesses in mathematics, suggest instructional practices suited to individual children, document students’ progress in learning math, and serve as a measure in research projects.
b. Your opinion of appropriate uses:
I think these uses are very appropriate for this test. I think this test covers those uses.
Desirable features
I think the questions on this test are good ways to measure students’ mathematical ability, and the test items are very appropriate for the age range.
Undesirable features
One thing I don’t like about this test is that there are no subtests; it is all just one test.