Chapter 3: Reliability and Validity
**This chapter corresponds to chapter 6 of your book (“Just the Truth”)
What it is: Reliability and validity are terms that refer to the quality of the measures used in a research study. Reliability refers to the consistency of a measure, and validity refers to its accuracy. There are several types of reliability (test-retest, parallel forms, internal consistency, interrater) as well as several types of validity (content, criterion, construct). These different types of reliability and validity are used for different types of measures but can also work together. The more types of reliability and validity a measure demonstrates, the more confident we can be in the quality of the measure.
When to use it: Reliability and validity are important anytime you measure anything (so basically in every research study). The exact nature of a study determines which type of reliability and/or validity you should assess (see your text for details).
Using SPSS to calculate Internal Consistency Reliability (Cronbach’s Alpha) (dataset: Chapter3Example1.sav)
The rest of this chapter will focus on internal consistency reliability (assessed with Cronbach’s alpha). We are focusing on this topic specifically because it is the most commonly used type of reliability as well as the most straightforward to demonstrate in SPSS. Cronbach’s alpha typically ranges from 0 to 1 and tells you how internally consistent a group of items is. In other words, Cronbach’s alpha tells you the extent to which a group of items measures the same thing. The closer the value of Cronbach’s alpha is to 1, the more consistent the items in a measure.
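For readers who want to see what SPSS is computing, a commonly used formula for Cronbach’s alpha is given here. The symbols are our own notation and do not appear in the SPSS output: k is the number of items, s_i^2 is the variance of the responses to item i, and s_X^2 is the variance of participants’ total scores (the sum of all k items).

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^{2}}{s_X^{2}}\right)
```

When the items rise and fall together across participants, the variance of the total score is large relative to the sum of the item variances, so the fraction shrinks and alpha moves toward 1.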
In this example you will use SPSS to calculate Cronbach’s alpha for The Meaning in Life Questionnaire. Here is what that measure looks like:
The Meaning in Life Questionnaire (Steger, Frazier, Oishi, & Kaler, 2006)
Please take a moment to think about what makes your life feel important to you. Please respond to the following statements as truthfully and accurately as you can, and also please remember that these are very subjective questions and that there are no right or wrong answers. Please answer according to the scale below:
Absolutely Untrue   1   2   3   4   5   6   7   Absolutely True
1. I understand my life’s meaning.
2. My life has a clear sense of purpose.
3. I have a good sense of what makes my life meaningful.
4. I have discovered a satisfying life purpose.
5. My life has no clear purpose.
Open up the data set Chapter3Example1.sav. It should look like this:
You’ll see this data set includes the responses of 100 people to the 5 questions from the Meaning in Life Questionnaire. Each question is called an item and is labeled with a variable name that tells you which question number it refers to (e.g., MIL1 is “I understand my life’s meaning”).
Take a second and go back and look at the 5 items included in the meaning in life questionnaire. Do any of these items jump out at you? Does any item seem like it might not belong with the others?
You may have noticed that item 5 is different from the other 4 items. A person who believes their life has meaning would respond with high numbers to items 1-4 BUT with a low number to item 5. This is called a reverse coded item, meaning that it should be coded opposite to the rest of the measure. Whereas responses with high numbers on items 1-4 indicate high meaning in life, responses with low numbers on item 5 indicate high meaning in life.
How to handle reverse coded items
When you have a measure that includes reverse coded items, you must use the recode procedure in SPSS to reverse people’s responses to those items. This means that a 7 is turned into a 1, a 6 is turned into a 2 and so on…
In this example, we’ll need to reverse score one item: MIL5.
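The recoding rule can also be written as a one-line formula: new value = (highest scale value + lowest scale value) - old value. On a 1 to 7 scale that is 8 minus the old value. The short Python sketch below illustrates the rule only; it is not part of the SPSS procedure, and the function name reverse_code and the five responses are made up for illustration.

```python
def reverse_code(response, scale_min=1, scale_max=7):
    """Reverse one response on a scale that runs from scale_min to scale_max."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside the scale")
    # On a 1-7 scale this is 8 - response: 1 -> 7, 2 -> 6, ..., 7 -> 1.
    return scale_max + scale_min - response


# Hypothetical MIL5 responses from five participants (not the real data set).
mil5 = [1, 2, 4, 6, 7]
mil5r = [reverse_code(x) for x in mil5]
print(mil5r)  # [7, 6, 4, 2, 1]
```

Notice that the midpoint of the scale (4) stays the same, which is a quick way to spot-check a recoded variable.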
To recode an item, highlight the “transform” menu and then click on “recode into different variables” as shown below:
That will bring up the following window:
Highlight the item you want to recode (MIL5) and click the arrow to move it over to the window. Next type in a name for what the new recoded variable will be called. You can name it whatever you want, but a handy way to do this is to simply add an “r” (“r” stands for “recoded”) after the existing variable name (e.g. “MIL5r”). You’ll have to click the “change” button in the output variable box to make the new variable name show up in the “numerical variable -> output variable” box. Your screen should look like this:
Next click on the “old and new values” button to bring up the following window:
Type the value of the lowest number in your scale (i.e., a “1”) in the “value” box under the word “Old Value”. Then type the value of the highest number in your scale (i.e., a “7”) in the “value” box under the word “New Value”. In this example, this tells SPSS to turn 1’s into 7’s. Now click the “Add” button to make the numbers show up in the “Old -> New” box. Continue this process for each number that was a response option. The 2’s become 6’s, the 3’s become 5’s and so on. Your screen should look like this:
Once you have done this for all the numbers click “continue” and then “ok”. Take a look at the data window to confirm that your new variable is there – it will be located in the last column of the data set. Spot-check some of the values in your new recoded variable to make sure the recoding worked (e.g., does it look like the 1’s were switched to 7’s?).
Calculating Cronbach’s Alpha
Once you have reverse scored any reverse coded items, you’re ready to calculate Alpha. To begin, click “Analyze” and then highlight “Scale”. Next click on “Reliability Analysis” as shown below:
That will bring up the following window:
Next highlight the first item that is part of your scale and click the arrow to move it over to the “Items” box. Continue until you have done this for each of the items in your scale. Be careful with the reverse coded item: you only want to include the reverse scored version (MIL5r), not both versions. Your screen should look like this:
Whenever you compute Cronbach’s alpha, it’s helpful to ask SPSS for an extra bit of output that tells you what the alpha would be if you dropped any of the items. This helps you identify “bad items”. If SPSS tells you that the alpha for a scale would be quite a bit higher if an item were dropped – this is a potential bad item. It may be a reverse coded item you forgot to recode or an item that should potentially be dropped from the scale because it is poorly worded or it measures something different. To get this extra output click on the “statistics” button and then check the box in the “Descriptives for” box that is labeled “Scale if item deleted”. Your screen should look like the following:
Now click continue and then OK. Your output will look like the following:
Case Processing Summary
Cases / N / %
Valid / 100 / 100.0
Excluded (a) / 0 / .0
Total / 100 / 100.0
a. Listwise deletion based on all variables in the procedure.
Reliability Statistics
Cronbach's Alpha / N of Items
.846 / 5
Item-Total Statistics
Item / Scale Mean if Item Deleted / Scale Variance if Item Deleted / Corrected Item-Total Correlation / Cronbach's Alpha if Item Deleted
MIL1 / 19.4300 / 19.783 / .685 / .808
MIL2 / 19.2300 / 18.502 / .740 / .791
MIL3 / 18.6000 / 19.636 / .667 / .811
MIL4 / 19.3000 / 19.646 / .643 / .817
MIL5r / 18.4400 / 18.128 / .573 / .846
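If you would like to check an alpha value outside SPSS, the following is a minimal Python sketch of the usual formula: alpha = k/(k-1) × (1 - sum of the item variances / variance of the total scores), where k is the number of items. The function name cronbach_alpha and the tiny data set are made up for illustration. They are not the Chapter3Example1.sav data, so the result will not match the .846 in the output above.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    items: one list of responses per item, with participants in the same order.
    """
    k = len(items)
    n = len(items[0])
    # Total score for each participant (sum across all items).
    totals = [sum(item[p] for item in items) for p in range(n)]
    # Sample variances (n - 1 in the denominator) of each item and of the totals.
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical responses of 4 participants to 3 items.
item1 = [1, 2, 3, 4]
item2 = [2, 3, 4, 5]
item3 = [1, 3, 3, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))  # 0.981
```

Alpha is very high here because the three made-up items rise and fall together almost perfectly across the four participants.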
Interpreting the Output
We have two goals when we look at our output for Cronbach’s alpha.
- Make sure that our alpha is “good”.
- Make sure we don’t have any “bad items”.
Interpreting the value of alpha can be somewhat subjective because people may have different ideas about what’s a “good enough” value for alpha. Just remember that the closer alpha is to 1, the better. In psychology, people generally think of an alpha that is higher than .80 as “good enough” and an alpha between .70 and .80 as OK. Anything below .70 may be a reason to worry.
If you find any potential “bad items” (items whose deletion would increase alpha), go back to your scale and try to figure out what might be going on with that item. Should it have been reverse coded but wasn’t? If it was reverse coded, was that a mistake? If there’s no coding problem, does it seem like the item might be hard to understand or might be measuring something different from the rest of the scale? Sometimes, researchers delete such “bad items” from the scale for the sake of higher reliability. This is a judgment call that has to be made by the researcher (often, after years of practice).
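To make the “alpha if item deleted” idea concrete, here is a hedged Python sketch that drops each item in turn and recomputes alpha on the remaining items. The function names, the item names, and the data are all made up for illustration. Item “item4” plays the role of a reverse coded item that someone forgot to recode.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items (one list of responses per item)."""
    k = len(items)
    n = len(items[0])
    totals = [sum(item[p] for item in items) for p in range(n)]
    sum_item_variances = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


def alpha_if_item_deleted(scale):
    """Return {item name: alpha computed without that item}.

    scale: dict mapping item name -> list of responses (needs 3 or more items).
    """
    result = {}
    for name in scale:
        remaining = [col for other, col in scale.items() if other != name]
        result[name] = cronbach_alpha(remaining)
    return result


# Hypothetical responses of 4 participants; item4 runs in the opposite direction.
scale = {
    "item1": [1, 2, 3, 4],
    "item2": [2, 3, 4, 5],
    "item3": [1, 3, 3, 5],
    "item4": [4, 3, 2, 2],
}

overall = cronbach_alpha(list(scale.values()))
dropped = alpha_if_item_deleted(scale)
# The item whose removal gives the highest alpha is the first one to inspect.
suspect = max(dropped, key=dropped.get)
print(f"overall alpha = {overall:.3f}; inspect {suspect} first")
```

In this made-up example, alpha is much higher without item4 than with it, which is the pattern you would expect from an item that still needs to be recoded.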
Now what?
If you’ve decided that your alpha is “good enough” for you, the next step is usually to compute an average score for each participant. This is because what we really want to know is a person’s average meaning in life rather than their responses to 5 different items. Having an internally reliable set of items tells you that it’s ok to average people’s responses on the items because they each measure the same thing.
You may be wondering why researchers use multiple items in the first place if you have to mess with all of this alpha stuff and then average the items into a single number anyway. The reason has to do with validity. We know that asking one single item is not always a very good way to accurately measure what we’re interested in. This is based on the same logic as taking multiple exams in a single class over the course of a semester. Imagine if you only took one test in this class at the end of the semester. That might not be the most accurate way to assess how much you know about the topic: what if you were just having a bad day? In the same way that taking multiple tests makes our assessment of your knowledge of the topic more accurate, so does asking multiple items about one thing.
To compute a mean, click on “transform” and then click on “compute variable” as seen below:
This will bring up the following window:
In the Target Variable box you should type a name for the average score. For example, “avgMIL” for average meaning in life.
Next, you will want to tell SPSS how to calculate the mean by putting a command in the “Numeric Expression” box. To compute a mean, the command is Mean(item1,item2,item3…). So type “mean(” in the numeric expression box. Next highlight the first item and click the arrow to move it into the box. Continue this for each item, separating the items with a comma. For MIL5, remember to include ONLY the recoded version (MIL5r), not the original item. When you have done this for each item, type in the closing parenthesis “)”. Your screen should look like this:
Now click OK and then navigate to the data view window to make sure your new variable is there. It can be found at the end of the data set. Spot-check your new average variable (e.g., is the new variable the correct average of the five items for Participant 1?). You will be able to use this variable in many of the analyses you will learn about during the rest of this semester. For example, you might use a t-test to compare meaning in life between 2 groups or you might use a regression analysis to predict meaning in life from some predictor. Exciting times are ahead!
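The spot-check described above amounts to averaging each participant’s five item responses by hand. As a rough illustration, the Python sketch below does the same thing for two made-up participants. The responses are hypothetical, and the variable names simply mirror the ones used in this chapter (MIL1 to MIL4, MIL5r, avgMIL).

```python
from statistics import mean

# Hypothetical responses for two participants (not the real data set).
# Only the recoded version of item 5 (MIL5r) is included.
participants = [
    {"MIL1": 5, "MIL2": 6, "MIL3": 6, "MIL4": 5, "MIL5r": 7},
    {"MIL1": 2, "MIL2": 3, "MIL3": 4, "MIL4": 2, "MIL5r": 3},
]
items = ["MIL1", "MIL2", "MIL3", "MIL4", "MIL5r"]

for person in participants:
    # Same idea as mean(MIL1,MIL2,MIL3,MIL4,MIL5r) in the SPSS compute window.
    person["avgMIL"] = mean([person[item] for item in items])

print([person["avgMIL"] for person in participants])  # [5.8, 2.8]
```

If the value SPSS shows for a participant does not match the hand calculation, the most likely cause is that the original MIL5 was included instead of MIL5r.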
Practice Problem (answers in appendix)
In this problem you will assess the reliability of the openness to experience subscale of The Big Five Inventory (John, Donahue, and Kentle, 1991). People who are high in openness to experience are more intellectually curious and willing to think about different ideas. This measure is designed to tap the extent to which somebody is open to experiences.
Here is what the scale looks like:
Please choose a number next to each statement to indicate the extent to which you agree or disagree with that statement.
Disagree Strongly   1   2   3   4   5   Agree Strongly
- I see myself as someone who is original, comes up with new ideas
- I see myself as someone who is curious about many different things
- I see myself as someone who is ingenious, a deep thinker
- I see myself as someone who has an active imagination
- I see myself as someone who is inventive
- I see myself as someone who values artistic, aesthetic experiences
- I see myself as someone who prefers work that is routine
- I see myself as someone who likes to reflect, play with ideas
- I see myself as someone who has few artistic interests
- I see myself as someone who is sophisticated in art, music, or literature.
The dataset “chapter3problem1.sav” includes the responses of 100 people to these 10 questions.
- Read the items carefully and look for any reverse coded items (hint: there are two items). Use SPSS to reverse code the items you believe are reverse coded – write below which items you reverse coded.
- Use SPSS to calculate an alpha for this measure. What is the alpha?
- Do you see any potential “bad items”? What do you think might explain these bad items?
- Calculate a mean for the openness scale, then use the Descriptives procedure to find the range, mean, and standard deviation for the scale. Write these values below.