Lab Exercise/Homework-Week 3

LAB EXERCISE/HOMEWORK #2

Due: 2/13 (at START of next lab)

Section A: Reliability

As you recall, the purpose of our lab project is to examine the social and career factors that influence relationship and career satisfaction. Of course, being expert scientists, you know that before even looking at the relationships among the variables, you need to examine whether the manifest variables that assess our latent constructs demonstrate sufficient measurement properties. For our current purposes, we will limit this to an examination of the reliability of some of the variables.

Because we have collected data at two different time periods, we can examine not only internal consistency reliability, but also test-retest reliability. Notice that when we examine each form of reliability, we are making different assumptions about what is “true score” variance and what is “error.”

For present purposes, we want to examine career satisfaction and extraversion. We may pose the following questions:

What is the test-retest reliability of career satisfaction? Given that it was nearly one year between measurements, and at least the first measure occurred when they were students, is there any reliable variance across time?
What is the internal consistency reliability of our extraversion measure? In the future, we may want to reduce the number of items of this measure because some participants complained about the personality tests taking so long to complete. Therefore, let’s examine what happens to reliability when we look at different numbers of items.
A colleague suggests that we should also examine the split-half reliability of the extraversion measure. As this person is helping to fund our research, we’ll oblige and examine that as well.

Exercise 1: Test-retest reliability

Let’s say that we are interested in the test-retest reliability of career satisfaction. We measured career satisfaction twice, at time 1 (career1) and at time 2 (career2), so we can see how scores at the two times are correlated. That is, test-retest reliability is the correlation between scores at time 1 and time 2. If the correlation between the scores is not equal to 1, what does that mean?

Procedures)

1)Go to the menu bar and select Analyze → Correlate → Bivariate.

2)In the left box, click the two variables (career1 and career2), send them to the right box, and click OK.

3)See the output. What do you think about the test-retest reliability of career satisfaction? How will you interpret this? Discuss the sources of error variance and what percentage of variance is due to error?
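For readers who want to check the SPSS result by hand, the following is a minimal sketch of the same computation in Python. The scores are made up for illustration and are not the lab dataset; `pearson_r` is a helper written here, not an SPSS function. Under the classical test theory view used in this lab, the test-retest correlation is read as the proportion of reliable variance, so 1 minus that correlation is the proportion treated as error.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for six people (NOT the lab data).
career1 = [3.0, 4.0, 2.5, 5.0, 3.5, 4.5]
career2 = [3.5, 3.5, 3.0, 4.5, 3.0, 5.0]

r_tt = pearson_r(career1, career2)   # test-retest reliability estimate
error_share = 1 - r_tt               # proportion of variance treated as error

print(f"test-retest r = {r_tt:.2f}")
print(f"share of variance treated as error = {error_share:.0%}")
```

Note that "error" here includes any real change in career satisfaction between the two time points, not only random measurement noise.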

Exercise 2: Internal consistency (α)

Now we are interested in getting the internal consistency (α) of extraversion. 12 items were used to measure extraversion (ret2, ret7, ret12, ret17, ret22, ret27, ret32, ret37, ret42, ret47, ret52, and ret57).

1)Go to the menu bar and select Analyze → Scale → Reliability Analysis.

2)In the left box, click the 12 items (ret2, ret7, ret12, ret17, ret22, ret27, ret32, ret37, ret42, ret47, ret52, and ret57) and send them to the right box.

3)Under the left box, you will see “Model” in which Alpha is the default reliability. Therefore, you don’t need to change anything for Model.

4)Click the Statistics button at the bottom right, and another dialog box will open. Check the boxes for “Item,” “Scale,” and “Scale if item deleted” under Descriptives for, and “Correlations” and “Covariances” under Inter-Item. Click the Continue button → OK to see the output.

5)See the output. Check the correlations among the items, and you will find negative correlations between ret7 and the other items. This indicates that something is wrong with that item. The most frequent critical mistake people make is forgetting to recode an item (here, ret7). Remember that all items must be scored in the same direction before we can assume that people with high scores on extraversion are more extraverted and those with low scores are less extraverted. In practice, you need to recode items before checking the reliability of a measure. In this lab exercise, we (the TAs) left ret7 unrecoded on purpose to show you the consequences of failing to recode an item.

6)To recode the item (ret7), go to the menu bar and select Transform → Recode Into Same Variables.

7)In the left box, click the item (ret7) and send it to the right box.

8)Click the Old and New Values button at the bottom and you will see another dialog box. (Note: the personality items use a 0 to 4 scale.)

9)In the left box (Old value box), type in 0, move the cursor to the right box (New value box), type in 4, and click the Add button. You will now see Old value (0) → New value (4) added in the right box, which indicates that the old value (0) will be recoded to the new value (4) in the ret7 column of the SPSS dataset. Repeat this procedure for the other old values (1, 2, 3, 4).

That is,

Old values (1) New values (3) Add

Old values (2) New values (2) Add

Old values (3) New values (1) Add

Old values (4) New values (0) Add  Click Continue buttonOK.

After this, all values (scores) in the ret7 column are recoded in the SPSS Data View sheet.

10)Now get the internal consistency of extraversion again by repeating steps 1 through 4.

Examine the output, especially the directions of correlations among items. Also, check if the internal consistency has been improved. (It should be improved.)

11)Another important thing to remember is that internal consistency (coefficient alpha) tends to increase as the number of items increases. Let’s get the internal consistency with 6 items (ret2, ret12, ret22, ret32, ret42, and ret52), and note the coefficient alpha. Then, get the internal consistency with 6 more items (ret7, ret17, ret27, ret37, ret47, and ret57) added to the first 6 items, and note the coefficient alpha. Has it increased with the additional 6 items?
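The steps above can be sketched outside SPSS as well. The block below is an illustrative Python version of coefficient alpha, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The four item columns are invented toy data on the 0 to 4 scale (not the ret items), and `cronbach_alpha` and `reverse_code` are helper names chosen here for illustration.

```python
from statistics import variance

def cronbach_alpha(items):
    """Coefficient alpha. `items` is a list of columns, one list of scores per item."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]   # total score per person
    item_var_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

def reverse_code(col, low=0, high=4):
    """Flip an item so that 0 becomes 4, 1 becomes 3, and so on."""
    return [low + high - v for v in col]

# Hypothetical responses from six people (NOT the lab data).
item_a = [0, 1, 2, 3, 4, 3]
item_b = [1, 1, 2, 3, 4, 4]
item_c = [0, 2, 2, 4, 3, 3]
item_d_raw = [4, 3, 2, 1, 0, 0]   # scored in the opposite direction, not yet recoded

alpha_unrecoded = cronbach_alpha([item_a, item_b, item_c, item_d_raw])
alpha_recoded = cronbach_alpha([item_a, item_b, item_c, reverse_code(item_d_raw)])
alpha_two_items = cronbach_alpha([item_a, item_b])

print(f"alpha with the item left unrecoded: {alpha_unrecoded:.2f}")
print(f"alpha after recoding:               {alpha_recoded:.2f}")
print(f"alpha with only two items:          {alpha_two_items:.2f}")
```

In this toy data the unrecoded item drags alpha down sharply, and the four-item scale comes out slightly higher than the two-item scale. With real data, adding items raises alpha only when the new items correlate reasonably well with the existing ones.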

Exercise 3: Split-half reliability

Now we move onto the split-half reliability. The split-half reliability is the correlation between separately scored halves of a single test. Let’s get the split-half reliability of extraversion. 12 items were used to measure extraversion (ret2, ret7, ret12, ret17, ret22, ret27, ret32, ret37, ret42, ret47, ret52, and ret57).

1)Go to the menu bar and select Analyze → Scale → Reliability Analysis.

2)In the left box, click the 12 items (ret2, ret7, ret12, ret17, ret22, ret27, ret32, ret37, ret42, ret47, ret52, and ret57) and send them to the right box.

3)Under the left box, you will see “Model.” Click on the drop-down box and change the option from the default (Alpha) to Split-half.

5)See the output and check the correlation between forms on the bottom of the output.

6)Remember that each half-test is only half as long as the total test on which the scores will be computed, so the split-half reliability must be corrected using the Spearman-Brown (SB) correction. Apply the SB correction to the split-half reliability, and check whether your corrected value matches the one (Equal-length Spearman-Brown) in the SPSS output. Note) Spearman-Brown (SB) Formula:

SB = (n X rel)/(1 + (n-1)rel)

n = amount by which the test length is multiplied

rel = correlation between halves
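As a quick check on hand calculations, the SB formula above can be written as a small Python function. The half-test correlation of .60 is a made-up value for illustration, not the lab result, and `spearman_brown` is a name chosen here.

```python
def spearman_brown(rel, n=2.0):
    """SB = (n * rel) / (1 + (n - 1) * rel).

    rel: reliability (or half-test correlation) of the current test
    n:   factor by which the test length is multiplied (2 for split-half)
    """
    return (n * rel) / (1 + (n - 1) * rel)

half_r = 0.60                               # hypothetical correlation between halves
full_length = spearman_brown(half_r)        # n = 2: correct half-test to full length
half_length = spearman_brown(full_length, n=0.5)  # n = 0.5: cut the test in half

print(f"corrected split-half reliability: {full_length:.2f}")   # prints 0.75
print(f"predicted reliability at half length: {half_length:.2f}")  # prints 0.60
```

The same function with n below 1 gives a rough prediction of what happens to reliability when a measure is shortened, assuming the remaining items are comparable to the ones removed.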

HOMEWORK #2-A: Reliability

Now, it’s your turn to look at these issues with some new variables. You know that you need to recode the personality items so that all items are scored the same way.

Recode ret1 (one item for Neuroticism), ret13 (one item for Openness to Experience), ret19 (one item for Agreeableness), and ret25 (one item for Conscientiousness). (Note: remember that you already recoded ret7 (one item for Extraversion) in the lab exercise.)

Make sure that the scores have actually been recoded in the variable columns. However, do not repeat this procedure, because each time you run it the scores are recoded again! So, recode each variable only once!
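The warning about recoding only once can be seen in a small Python sketch. On a 0 to 4 scale, reverse coding is the same as computing 4 minus the old score, so applying it a second time puts every score back where it started. The scores and the `reverse_code` helper below are illustrative only.

```python
def reverse_code(scores, low=0, high=4):
    """Reverse-code scores on a low..high scale (0<->4, 1<->3, 2 stays 2)."""
    return [low + high - s for s in scores]

ret1_raw = [0, 1, 4, 2, 3]             # hypothetical original responses
recoded_once = reverse_code(ret1_raw)   # the intended result
recoded_twice = reverse_code(recoded_once)

print(recoded_once)    # [4, 3, 0, 2, 1]
print(recoded_twice)   # [0, 1, 4, 2, 3]  (back to the original, i.e. the recode is undone)
```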

Now, we want to look at the test-retest reliability of self-esteem. Some research finds that self-esteem is a very stable construct, whereas other research suggests it is more variable over time. We suspect that self-esteem is more stable, and can check this by looking at the test-retest reliability of self-esteem. However, we would not expect such a high test-retest correlation for relationship intentions, because we would theoretically expect relationship intentions to change more frequently over time. Therefore, let’s examine the test-retest reliability of self-esteem and relationship intentions.

Get the test-retest reliability of “Self-esteem” (esteem1 and esteem2 in the SPSS dataset). Also get the test-retest reliability of “Intention to remain in relationship” (intent1 and intent2 in the SPSS dataset). REPORT the test-retest reliability of each scale (self-esteem and intent to remain in relationship), and describe the sources of error variance (this answer does not have to be in APA style).

Remember our discussion of “modern” test theory? Let’s look at how it applies to the test-retest reliability of self-esteem. Assume that the internal consistency of the self-esteem items measured at Time 1 was .90, and the internal consistency of the self-esteem items measured at Time 2 was .85. Using the test-retest reliability estimate obtained in the previous question, correct this test-retest correlation for unreliability due to item content.

Get the “true” test-retest correlation, unaffected by measurement error due to item content, using the correction for attenuation formula: corrected rxy = rxy/[SQRT(rxx*ryy)]. Now, compare this test-retest correlation with the one you obtained in question 2 above. Explain why the two test-retest correlations differ. Which one is the appropriate estimate of reliability?
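A minimal Python sketch of the correction for attenuation follows, so you can verify your arithmetic. The input values are hypothetical and deliberately differ from the homework numbers; `correct_for_attenuation` is a name chosen here.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Corrected correlation = r_xy / sqrt(r_xx * r_yy).

    r_xy: observed correlation between the two measures
    r_xx: reliability of the first measure
    r_yy: reliability of the second measure
    """
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values (NOT the homework values).
observed = 0.60
corrected = correct_for_attenuation(observed, r_xx=0.80, r_yy=0.75)

print(f"observed r  = {observed:.2f}")
print(f"corrected r = {corrected:.2f}")   # prints 0.77
```

Because both reliabilities are below 1, the denominator is below 1 and the corrected value is always at least as large as the observed one.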

Now, we want to examine the internal consistency of the agreeableness personality measure.

Get the internal consistency of “Agreeableness” with 12 items used: ret4, ret9, ret14, ret19, ret24, ret29, ret34, ret39, ret44, ret49, ret54, and ret59. Check the correlations among items and the Alpha if item deleted column to confirm that everything has been done correctly. If you find any problems (e.g., negative correlations among items), check the recoding of ret19. REPORT the final internal consistency you come up with, and describe the source(s) of error variance.

One practical problem with the agreeableness measure is that it takes up so much space in the survey—12 questions, to be exact. Do we really need to ask all of these questions? Let’s find out what reducing the number of items does to the internal consistency of the agreeableness measure.

Get the internal consistency of “Agreeableness” with the first 4 items (ret4, ret9, ret14, ret19). Then, add 4 more items (ret24, ret29, ret34, ret39; 8 items in total). Finally, add another 4 items (ret44, ret49, ret54, and ret59; 12 items in total). REPORT how the internal consistency changes as the number of items increases.

Now, let’s see what kind of reliability estimate we get using the split-half method of estimating reliability:

Get the split-half reliability of “Agreeableness” with 12 items used: ret4, ret9, ret14, ret19, ret24, ret29, ret34, ret39, ret44, ret49, ret54, and ret59. REPORT the original split-half reliability and corrected reliability using the Spearman-Brown (SB) correction.

We are now ready to present the results in APA style. Let’s make a correlation table:

Complete the following table using the information from the lab exercises and homework you have done previously. Make sure the table is consistent with the APA 5.0 manual. That is, put the means, standard deviations, and internal consistency (α) as indicated in the table. Print out the table after completing it.

Note: Place the internal consistency on the diagonal (all values should be rounded to 2 decimal places), and type in the correlations below the diagonal. Leave the columns/rows above the diagonal open. Assume the internal consistency reliability for intentions at time 1 is .75 and at time 2 is .83.

Table 1

Means, Standard Deviations, and Correlations among Variables

Variablesa / Mean / SD / 1 / 2 / 3 / 4 / 5

Time 1 Measures

1. Self-esteem 1 / 4.28 / .49 / (.90)
2. Intentions 1 / 3.24 / .83 / .03 / (.75)
3. Agreeableness / 36.09 / 5.73 / .24** / .102 / (.75)

Time 2 Measures

4. Self-esteem 2 / 4.31 / .54 / .83** / .05 / .19** / (.85)
5. Intentions 2 / 3.09 / .93 / .05 / .65** / .14* / .16** / (.83)

Note. Values enclosed in parentheses represent the internal consistency reliability (α). For correlations with agreeableness, n = 332; for all other correlations, n = 334.

*p < .05. **p < .01.

APA Style Writing:

Now that you have completed the table, we can interpret it. In APA style, interpret the reliabilities of the self-esteem, intentions, and agreeableness measures. That is, report and interpret:

1)The test-retest reliabilities of the self-esteem and intentions measures.

2)The internal consistency of the agreeableness measure.

3)The sources of error present in each of these estimates.

4)Be sure to refer to the table in your discussion.

5)Are these measures sufficiently reliable to warrant their use in psychological research?

6)This entire writing exercise should be no more than 3 paragraphs.

Section B: Validity

Understanding social and career-related factors that influence relationship and career satisfaction is important. For example, a good understanding of the relationships among these factors could enable organizations to design programs that more effectively promote a healthy work-family balance. Based on a review of the literature, you recently developed the following model to account for some of these relationships:

Consistent with the research process, your next step is to start validating this model. This generally includes developing measures of relevant constructs, validating these measures, and collecting data on the relationships hypothesized.

Lab Exercise

The following is to be completed in lab (in groups of 3-4 students). You should turn this information in when you turn in your write-up for Homework #2 (next week). Be sure to record this information, so that you can turn it in next week. What you turn in does not have to be different for each member of your group.

As part of your literature review, you were able to find existing measures for Relationship Satisfaction and Intent to Remain in a Relationship. You were unable to locate existing measures for the other two constructs (Self-Esteem and Relationship Self-Efficacy). Therefore, you will have to develop measures for these constructs yourself. As a group, do the following:

Define the latent construct domains for Self-Esteem and Relationship Self-Efficacy. For each construct, write a one to two sentence conceptual definition. This definition should capture what the group believes are the important characteristics (or components) of these constructs.
Using a five-point Likert scale (1 = strongly disagree to 5 = strongly agree), generate 3-4 items (each) to assess Self-Esteem and Relationship Self-Efficacy. Remember that these items should do a good job of covering as much of the construct domain as possible (based on your conceptual definitions), while minimizing sources of irrelevant variance and adhering to the do’s and don’ts for developing items.

Critique the items the group just developed. When doing so, remember to pay attention to the conceptual definitions the group generated. When critiquing, do the following:

For each set of items, identify at least three sources of invalidity. How do the items underrepresent the constructs as defined (e.g., what’s missing)? What are potential sources of irrelevant variance? Be as specific as possible. List and briefly describe each source of invalidity.
Identify at least three sources of random error variance that could impact the reliability of either measure. Be as specific as possible (e.g., simply saying “time” or “item content” is not enough). List and briefly describe each source of random error variance.

Describe how you would go about validating the Self-Esteem measure your group generated. In technical terms, we call this a validation plan. In 2-3 paragraphs describe what steps you would take, and what you would be doing at each step. Include at least three of the ways for documenting validity talked about in lecture (e.g., content, substantive, structural, generalizability, external, or consequential).

HOMEWORK #2-B: Validity

Past research indicates that the relationships among self-esteem, self-efficacy, and satisfaction tend to be strong (rs > .50). To see if these constructs can be meaningfully differentiated, and to validate the measures you generated for Self-Esteem and Relationship Self-Efficacy, you design a multitrait-multimethod study. The three traits in your design are: a) Self-Esteem, b) Relationship Self-Efficacy, and c) Relationship Satisfaction. The three methods are: a) multi-item self-report measure (e.g., the measures you generated during the lab exercise, plus the existing measures identified from your literature review); b) partners’ ratings; c) content coding of diary entries made by each subject for 4 weeks (Note: entries were content coded for information on the three traits). Additionally, you collected data on the correlations between Self-Esteem, Relationship Self-Efficacy, and Relationship Satisfaction (using the three different methods) with Intent to Remain in a Relationship. The observed correlations and reliabilities (minus those you’ll be computing) are reported below in Table 2.

Table 2: Multitrait-Multimethod (MTMM) Matrix

Method / Trait / M1-SE   / M1-RSe  / M1-RSat / M2-SE   / M2-RSe  / M2-RSat / M3-SE   / M3-RSe  / M3-RSat / Intent (r)
M1     / SE    / (.90)   /         /         /         /         /         /         /         /         / .06
M1     / RSe   / .37     / (.87)   /         /         /         /         /         /         /         / .24
M1     / RSat  / .40     / .79     / (.75)   /         /         /         /         /         /         / .29
M2     / SE    / .38     / .44     / .51     / (.90)   /         /         /         /         /         / .19
M2     / RSe   / .32     / .38     / .43     / .76     / (.95)   /         /         /         /         / .26
M2     / RSat  / .34     / .31     / .48     / .79     / .86     / (.92)   /         /         /         / .22
M3     / SE    / .47     / .16     / .31     / .27     / .13     / .06     / (.80)   /         /         / .42
M3     / RSe   / .27     / .69     / .23     / .06     / .17     / .13     / .51     / (.81)   /         / .35
M3     / RSat  / .19     / .13     / .83     / .11     / .15     / .36     / .61     / .58     / (.70)   / .67

Note: SE = Self-Esteem; RSe = Relationship Self-Efficacy; RSat = Relationship Satisfaction; Intent = Intent to Remain in a Relationship; Method 1 (M1) = multiple item self-report measure; Method 2 (M2) = partners’ ratings; Method 3 (M3) = content coding of diary. All correlations are uncorrected.
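One way to organize the matrix before interpreting it is to pull out the same-trait/different-method correlations (convergent validities) and the different-trait/same-method correlations. The Python sketch below does only that bookkeeping, with the values copied from the lower triangle of Table 2 (reliabilities and the Intent column are left out). The variable and function names are chosen here for illustration, and the interpretation is still up to you.

```python
traits = ["SE", "RSe", "RSat"]
methods = ["M1", "M2", "M3"]
labels = [(m, t) for m in methods for t in traits]   # row/column order of Table 2

# Lower triangle of Table 2, one list per row, diagonal (reliabilities) omitted.
lower = [
    [],
    [.37],
    [.40, .79],
    [.38, .44, .51],
    [.32, .38, .43, .76],
    [.34, .31, .48, .79, .86],
    [.47, .16, .31, .27, .13, .06],
    [.27, .69, .23, .06, .17, .13, .51],
    [.19, .13, .83, .11, .15, .36, .61, .58],
]

convergent = {}    # same trait, different methods: (trait, method_a, method_b) -> r
same_method = {}   # different traits, same method: (method, trait_a, trait_b) -> r

for i, (method_i, trait_i) in enumerate(labels):
    for j in range(i):
        method_j, trait_j = labels[j]
        r = lower[i][j]
        if trait_i == trait_j and method_i != method_j:
            convergent[(trait_i, method_j, method_i)] = r
        elif trait_i != trait_j and method_i == method_j:
            same_method[(method_i, trait_j, trait_i)] = r

print("Same trait, different method:")
for key, r in convergent.items():
    print(f"  {key}: {r:.2f}")

print("Different trait, same method:")
for key, r in same_method.items():
    print(f"  {key}: {r:.2f}")
```

Comparing the two printed lists method by method is a convenient starting point for judging convergent and discriminant validity and for spotting method variance.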