Using Split-Half Reliability With Tests

INSTRUCTIONS

Split-Half Reliability is a common statistical method for estimating the reliability of a test. It is used most often for multiple-choice tests, but it can be applied to any test that can be divided in half and scored consistently. Split-Half Reliability assumes that, if a test is reliable, a student should score about equally well (or equally poorly) on two randomly selected halves of the test.

  1. Administer the test as you would normally. You will need to know how students performed on the two randomly selected halves of the test, so it is important to keep the responses each student gave for each question. A final score is not, in itself, sufficient to complete the analysis.
  2. Randomly divide the test into two parts. This is often done using an even-odd approach, with odd-numbered questions in one half and even-numbered questions in the other. Each half of the test should have approximately the same number of questions. The questions in each half should be more or less equivalent. Essay questions can be included as long as they are evenly distributed between halves in terms of content and point value.
  3. Score each half of the test for each student and record the scores in the appropriate cells of the spreadsheet, which can be accessed through the icon at the bottom of this document. There is space for data from 50 students. If you have fewer than 50 students, leave the unused rows blank. In the example spreadsheet, student 1 received a 47 on one half of the test and a 45 on the other half.
  4. The spreadsheet will calculate the reliability coefficient automatically. A reliability of 0.8 or higher is generally considered good.
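
For readers who want to check the calculation outside the spreadsheet, the steps above can be sketched in Python. This is only an illustration under stated assumptions: the document does not say which formula the spreadsheet uses, so the sketch assumes the common approach of correlating the two half scores (Pearson correlation) and then applying the Spearman-Brown correction. The function names are hypothetical, and all sample scores other than student 1's 47 and 45 are made up for demonstration.

```python
from math import sqrt


def even_odd_half_scores(item_scores):
    """Split one student's per-question scores into two halves (step 2).

    item_scores is a list of points earned on questions 1, 2, 3, ...
    Returns (total on odd-numbered questions, total on even-numbered questions).
    """
    odd_half = sum(item_scores[0::2])   # questions 1, 3, 5, ...
    even_half = sum(item_scores[1::2])  # questions 2, 4, 6, ...
    return odd_half, even_half


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Raises ZeroDivisionError if either list has no variation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r):
    """Step the half-test correlation up to an estimate for the full test."""
    return 2 * r / (1 + r)


def split_half_reliability(half_a, half_b):
    """Assumed calculation: correlate the halves, then apply Spearman-Brown."""
    return spearman_brown(pearson_r(half_a, half_b))


# One row per student, as in step 3. Only the first pair (47, 45) comes from
# the example spreadsheet; the remaining values are invented for illustration.
half_a = [47, 40, 35, 42, 30]
half_b = [45, 38, 37, 44, 28]
print(round(split_half_reliability(half_a, half_b), 2))
```

The Spearman-Brown step is included because a correlation between two half-length tests tends to understate the reliability of the full-length test. If the spreadsheet reports a different value for the same data, it may be using a different formula.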

QUESTIONS AND ANSWERS

Q: How many students should I use for this statistic?

A: The statistic does take into account the number of scores that are used. Fewer scores mean a lower likelihood of achieving good reliability. For the purposes of technical skill assessment, you should strive to have at least one full class of students (about 30). The spreadsheet will accommodate scores for up to 50 students.

Q: Can I use student scores from several classes or schools?

A: Yes, as long as you are using the same test and the method of administering the test is equivalent.

Q: How long does the test have to be?

A: It is easier to demonstrate reliability as a test gets longer. The sample was developed using a 100-point test. The test should be long enough to split into two equivalent halves but short enough to be administered in the time provided for students.
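
One way to see why longer tests help is the Spearman-Brown prediction formula, a standard result from classical test theory that is not part of the original handout. It assumes that any added questions are comparable in quality and difficulty to the existing ones. Here \rho is the reliability of the current test, k is the factor by which the test is lengthened, and \rho_k is the predicted reliability of the lengthened test.

```latex
\rho_k = \frac{k\,\rho}{1 + (k - 1)\,\rho}
\qquad \text{e.g. } \rho = 0.6,\ k = 2:\quad
\rho_2 = \frac{2(0.6)}{1 + 0.6} = 0.75
```

Under these assumptions, doubling a test with a reliability of 0.6 would be expected to raise its reliability to about 0.75.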

Q: What data do you need for demonstration of reliability?

A: You can email a copy of the completed spreadsheet as proof of reliability.

Q: Should I use this on a performance assessment?

A: No, this is not appropriate for performance assessments or portfolios.

CONTACT

For additional information on Technical Skill Assessment, contact Tom Thompson at (503) 947-5790.


RESOURCES

Oregon Department of Education, March 2010