Dr. Betsy Becker    CEP933 Key to    Maribel Sevilla

Dr. Christine Schram    Assignment 5: TWO-WAY ANOVA    Wei Pan

Part I. Two-way Analysis of Variance

In Assignment 4 we looked at the separate relationships between type of study preference (individual study only, review session only, or the combination of the two) and midterm score, and between anxiety level (low, medium, high) and midterm score.

For this assignment, we will first examine the main-effects relationships by putting both predictors in the model simultaneously, but we will omit the interaction. You will need to select Analyze, General Linear Model, and then either General Factorial (if you’re using version 8) or Univariate (for version 10). Put in your outcome and your two fixed factors. Click on the models button. Make the “main effects only” model by selecting custom model (not full factorial) and moving over the two separate factors. Do not construct the interaction term.

You may want to compute estimated means, view plots, etc. Use the Options button and try a few things to see what you get. Be sure to click the Plots button and plot the estimated marginal means for each model. You only need to include output from this model that is essential for answering specific questions below.

  1. State in words and symbols the (two) null and alternative hypotheses for the ANOVA with these two main effects.

The population model for the main-effects only model is:

$Y_{ijk} = \mu + \alpha_j + \beta_k + e_{ijk}$

(for i = 1 to $n_{jk}$ people in j = 1 to m groups of factor A and k = 1 to p groups of factor B)

where $Y_{ijk}$: the midterm score for person i in group j of factor A and group k of factor B

$\mu$: the grand mean of Y in the population

$\alpha_j$: the effect of being in group j (first categorical predictor: study preference, or factor A) in the population, with anxiety held constant

$\beta_k$: the effect of being in group k (second categorical predictor: anxiety level, or factor B) in the population, with study preference held constant

$e_{ijk}$: the residual associated with person i in group j of factor A and group k of factor B
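As an illustration only (the assignment itself uses SPSS), the following Python sketch computes sample analogues of the grand mean and the two sets of group effects in the model above. It assumes the midterm scores listed in Part II of this assignment, and it relies on the design being balanced (six students per cell), in which case the main-effects estimates are simply marginal means minus the grand mean. The variable names are hypothetical.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}


def mean(values):
    return sum(values) / len(values)


def marginal_means(factor_index):
    """Pool scores over the other factor and return the mean for each level."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return {level: mean(vals) for level, vals in pooled.items()}


# Sample grand mean (estimate of mu).
grand_mean = mean([y for cell in scores.values() for y in cell])

# Estimated effects: marginal mean minus grand mean (valid for a balanced design).
alpha_hat = {lvl: m - grand_mean for lvl, m in marginal_means(0).items()}
beta_hat = {lvl: m - grand_mean for lvl, m in marginal_means(1).items()}

print(f"grand mean = {grand_mean:.3f}")
for lvl, a in alpha_hat.items():
    print(f"alpha[{lvl}] = {a:+.3f}")
for lvl, b in beta_hat.items():
    print(f"beta[{lvl}] = {b:+.3f}")
```

Within each factor the estimated effects sum to zero, which mirrors the usual constraint placed on the population effects.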

And the (two) null and alternative hypotheses for the ANOVA with these two main effects are as follows:

Define $\mu$ as the population grand mean (for groups combined),

$\mu_{j\cdot}$ as the population mean for group j of factor A, with $\alpha_j = \mu_{j\cdot} - \mu$,

and

$\mu_{\cdot k}$ as the population mean for group k of factor B, with $\beta_k = \mu_{\cdot k} - \mu$.

For FACTOR A: Study Preference

Null hypothesis
In symbols: $H_0: \mu_{j\cdot} = \mu$ for all j, OR $H_0: \alpha_j = 0$ for all j
In words: The population mean midterm scores for all levels of factor A (study preference) are equal in this population with factor B held constant. (There is no study preference effect; the differences between the midterm score means of the study preference groups (review only, study only, and review and study) have arisen by chance.)

Alternative hypothesis
In symbols: $H_1: \mu_{j\cdot} \neq \mu$ for at least one j, OR $H_1: \alpha_j \neq 0$ for at least one j
In words: At least one of the population mean midterm scores for the levels of factor A (study preference) is different in the population with factor B held constant. (Study preference affects the midterm score in the population; the differences between the midterm score means of the study preference groups (review only, study only, and review and study) have not arisen by chance, but are due to the effect of study preference in the population.)

For FACTOR B: Anxiety Level

Null hypothesis
In symbols: $H_0: \mu_{\cdot k} = \mu$ for all k, OR $H_0: \beta_k = 0$ for all k
In words: The population mean midterm scores for all levels of factor B (anxiety level) are equal in this population with factor A held constant. (There is no anxiety level effect in the population; the differences between the midterm score means of the anxiety level groups (low, medium, and high) have arisen by chance.)

Alternative hypothesis
In symbols: $H_1: \mu_{\cdot k} \neq \mu$ for at least one k, OR $H_1: \beta_k \neq 0$ for at least one k
In words: At least one of the population mean midterm scores for the levels of factor B (anxiety level) is different in this population with factor A held constant. (Anxiety level affects the midterm score in the population; the differences between the midterm score means of the anxiety level groups (low, medium, and high) have not arisen by chance, but are due to the effect of anxiety level in the population.)
  2. Calculate the ANOVA table for the two-way model with main effects only. What are your decisions about the hypotheses in question #1?

(1) We may look at the significance level of each factor (using $\alpha$ = .01 or some other $\alpha$), or

(2) We may compare the F-observed with the F-critical to test the hypothesis.

For FACTOR A: Study Preference

(1) p-value = .006 < .01 = $\alpha$, so we reject the null hypothesis that “The population mean midterm scores for all levels of factor A (study preference) are equal in this population”, and conclude that study preference affects the midterm score, i.e. the differences between the midterm score means of the study preference groups (review only, study only, and review and study) have not arisen by chance, but are due to the effect of study preference in the population.

(2) F-observed = 5.640 > F-critical (0.01,2,48)=5.08, so we reject the null hypothesis, and conclude that there is an effect of study preference on the midterm score, and it is statistically significant at the 0.01 level (p-value < 0.01).

P.S. To be more conservative, we used F-critical (0.01,2,48) instead of F-critical (0.01,2,49) because the table only presents values for df-error = 48 and df-error = 50.

For FACTOR B: Anxiety Level

(1) p-value = .134 > .05 = $\alpha$, so we fail to reject the null hypothesis that “The population mean midterm scores for all levels of factor B (anxiety level) are equal in this population”, and conclude that we have no evidence that anxiety level affects the midterm score, i.e. the differences between the midterm score means of the anxiety level groups (low, medium, and high) are consistent with chance variation.

(2) F-observed = 2.097 < F-critical (0.05,2,48)=3.19, so we fail to reject the null hypothesis, and conclude that the effect of anxiety level on the midterm score is not statistically significant at the 0.05 level (p-value > 0.05).

P.S. To be more lenient, we used F-critical (0.05,2,48) instead of F-critical (0.01,2,49); even so, F-observed is still less than F-critical.
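The F statistics and p-values quoted above can be reproduced outside SPSS. The sketch below is one way to do it in plain Python, assuming the midterm scores listed in Part II. Because the design is balanced, the sum of squares for each factor can be computed from its marginal means; with unequal cell sizes this shortcut would not match SPSS's default Type III sums of squares. The helper `p_value_df1_2` is a hypothetical name, and its closed form holds only when the numerator df is 2.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}

all_y = [y for cell in scores.values() for y in cell]
n_total = len(all_y)
grand = sum(all_y) / n_total


def ss_between(factor_index):
    """Sum of squares for one factor, from its marginal means (balanced design)."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in pooled.values())


def p_value_df1_2(f, df2):
    """Upper-tail p-value for an F(2, df2) statistic (closed form for df1 = 2 only)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


ss_total = sum((y - grand) ** 2 for y in all_y)
ss_a, ss_b = ss_between(0), ss_between(1)
df_a = df_b = 2
df_error = n_total - 1 - df_a - df_b          # 54 - 1 - 2 - 2 = 49
ms_error = (ss_total - ss_a - ss_b) / df_error  # interaction is pooled into error

f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
p_a = p_value_df1_2(f_a, df_error)
p_b = p_value_df1_2(f_b, df_error)

print(f"Study preference: F(2,{df_error}) = {f_a:.3f}, p = {p_a:.3f}")
print(f"Anxiety level:    F(2,{df_error}) = {f_b:.3f}, p = {p_b:.3f}")
```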

  3. How has your eta-squared for the main-effects two-way model changed from using study preference only as the single main effect? You’ll have to look at your Assignment 4 output to answer this question or rerun the one-way ANOVA.

Comparing the Assignment 5 table with the Assignment 4 table, we observe that eta-squared has increased from $\eta^2$ = .175 (Adjusted $\eta^2$ = .143) to $\eta^2$ = .240 (Adjusted $\eta^2$ = .178). Though the anxiety effect is not significant, adding it has added a little to the explanatory power of the model.
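For readers who want to see where these values come from, here is a minimal Python sketch, assuming the midterm scores listed in Part II. Eta-squared is taken as the model sum of squares divided by the total sum of squares, and the adjusted value uses the usual degrees-of-freedom correction; the function name `eta_squared` is hypothetical.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}

all_y = [y for cell in scores.values() for y in cell]
n = len(all_y)
grand = sum(all_y) / n
ss_total = sum((y - grand) ** 2 for y in all_y)


def ss_between(factor_index):
    """Sum of squares for one factor, from its marginal means (balanced design)."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in pooled.values())


def eta_squared(ss_model, df_model):
    """Return (eta-squared, adjusted eta-squared) for a model with df_model df."""
    eta2 = ss_model / ss_total
    adjusted = 1 - (1 - eta2) * (n - 1) / (n - 1 - df_model)
    return eta2, adjusted


# One-way model: study preference only (2 df).
one_way = eta_squared(ss_between(0), df_model=2)
# Main-effects model: study preference + anxiety level (2 + 2 df).
main_effects = eta_squared(ss_between(0) + ss_between(1), df_model=4)

print("one-way:      eta2 = %.3f, adjusted = %.3f" % one_way)
print("main effects: eta2 = %.3f, adjusted = %.3f" % main_effects)
```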

  4. Examine the plot of means for this model. Do the means appear to show any interaction?

The plot of means shows no interaction between factor A and factor B. However, this plot cannot show an interaction even if one exists, because our MODEL did not include an interaction term: the estimated means from a main-effects-only model are forced to be parallel. This plot is NOT a good indication of whether an interaction exists in our data!

  5. Now run the full two-way model, which includes the study preference by anxiety interaction. Using the same set up as above, simply click the Models button and select the full factorial option. SPSS will construct the model with the interaction for you. Include your output for this question.

The output is on the following page.

  6. What additional null hypothesis is tested in this new model? Give the null hypothesis in both words and symbols. From the output in #5, what do you decide and conclude about this hypothesis? In your answer, be sure you explain which line of the output you used, and include the value of the test statistic. To what critical value is this statistic being compared?

This model also includes the interaction term so we can test for that. It is on the line labeled STUDY * ANXIETY.

For FACTOR A x FACTOR B: Interaction between Study Preference and Anxiety level

Null hypothesis
In symbols: $H_0: (\alpha\beta)_{jk} = 0$ for all jk
In words: Factor A (study preference) and factor B (anxiety level) do not interact in this population.

Alternative hypothesis
In symbols: $H_1: (\alpha\beta)_{jk} \neq 0$ for at least one jk
In words: Factor A (study preference) and factor B (anxiety level) do interact in this population. The effects of factor A depend on the level of factor B, OR the effects of factor B depend on the level of factor A.

(1) We may look at the significance level of the interaction term (we will use the criterion $\alpha$ = .05):

p-value = .04 < .05 = $\alpha$, so we reject the null hypothesis that “There is no interaction effect between factor A (study preference) and factor B (anxiety level)” at a significance level of .05, and conclude that the effect of study preference on the midterm score depends on anxiety level (and vice versa) in the population.

(2)We may also compare the F-observed with the F-critical to test the hypothesis:

F-observed = 2.733 > F-critical (0.05,4,44)=2.58, so we reject the null hypothesis, and conclude that the interaction effect of study preference and anxiety level on the midterm score is statistically significant at the 0.05 level (p-value < 0.05).

P.S. To be more conservative, we used F-critical (0.05,4,44) instead of F-critical (0.05,4,45), because the table only presents values for df-error = 44 and df-error = 46.
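The interaction F statistic can also be checked by hand. The Python sketch below assumes the midterm scores listed in Part II and a balanced design, so the interaction sum of squares is what remains of the between-cell sum of squares after removing the two main effects. The critical value 2.58 is the one quoted above from the F table; it is not computed here.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}

all_y = [y for cell in scores.values() for y in cell]
n_total = len(all_y)
grand = sum(all_y) / n_total


def ss_between(factor_index):
    """Sum of squares for one factor, from its marginal means (balanced design)."""
    pooled = {}
    for key, cell in scores.items():
        pooled.setdefault(key[factor_index], []).extend(cell)
    return sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in pooled.values())


ss_total = sum((y - grand) ** 2 for y in all_y)
ss_cells = sum(len(c) * (sum(c) / len(c) - grand) ** 2 for c in scores.values())
ss_a, ss_b = ss_between(0), ss_between(1)
ss_ab = ss_cells - ss_a - ss_b        # interaction: what the main effects leave over
ss_error = ss_total - ss_cells        # within-cell variation

df_a, df_b = 2, 2
df_ab = df_a * df_b                   # 4
df_error = n_total - len(scores)      # 54 - 9 = 45
ms_error = ss_error / df_error

f_a = (ss_a / df_a) / ms_error
f_b = (ss_b / df_b) / ms_error
f_ab = (ss_ab / df_ab) / ms_error

F_CRITICAL_05_4_44 = 2.58             # table value quoted in the text
print(f"STUDY:           F(2,{df_error}) = {f_a:.3f}")
print(f"ANXIETY:         F(2,{df_error}) = {f_b:.3f}")
print(f"STUDY * ANXIETY: F(4,{df_error}) = {f_ab:.3f}")
print("reject H0 for interaction:", f_ab > F_CRITICAL_05_4_44)
```

Note that the main-effect F statistics change slightly from the main-effects-only model because the error term no longer contains the interaction variation.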

  7. Did the addition of the interaction term change your decisions about either of the two main effects? How do you know? Compare your output from #2 to your output from #5, and describe similarities and differences between the two sets of output.

No, I still conclude that factor A (study preference) is significant and factor B (anxiety level) is not significant. The p-values for factor A and factor B in the two models (the main-effects-only model and the full factorial model) do not differ much.

  8. How has your eta-squared for the full-factorial model changed from the value in #3?

Eta-squared has increased from $\eta^2$ = .240 (Adjusted $\eta^2$ = .178) to $\eta^2$ = .389 (Adjusted $\eta^2$ = .280).

We appear to be explaining a lot more variation in midterm scores with the full factorial model.

  9. Plot the cell means for all 9 groups on one graph. You can use SPSS (get the marginal means from the Plots button, or do an error bar plot) or do it by hand; just be sure to label your plot correctly. Here we give the marginal plots too, but they are not necessary.

  10. What does this plot tell you about the study preference by anxiety interaction? What do you see in the plot? Compare this plot to the means plot from #4.

For the review-only group, factor B (anxiety level) does not seem to have a strong effect on the midterm score, since students in the three anxiety sub-groups had relatively similar mean scores. For the study-only and review+study groups, however, the three anxiety groups (factor B) show differences. The medium-anxiety sub-group shows a very high midterm score in the study-only group. There, too much or too little anxiety is not good. In the review+study group, level of anxiety seems inversely related to score: low-anxiety students did best, high-anxiety students did worst, and the “mediums” were in the middle. There appears to be an interaction effect between factors A and B that affects the means.

For students with low anxiety levels, the review-and-study preference group performed much better than the other two study preference groups, while for the medium anxiety level, students in the study-only group achieved higher scores than the review-only group.

There appears to be a disordinal interaction between the levels of factor A (study preference), and the levels of factor B (anxiety level).
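The pattern described above can be read directly off the nine cell means. As a rough sketch (assuming the midterm scores listed in Part II), the Python below tabulates the cell means and, for each anxiety level, ranks the study preference groups from lowest to highest mean. A rank order that changes from one anxiety level to another corresponds to lines that cross in the plot, which is what "disordinal" refers to.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}

study_levels = ["review only", "study only", "review+study"]
anxiety_levels = ["low", "medium", "high"]

# Mean midterm score in each of the 9 cells.
cell_means = {key: sum(cell) / len(cell) for key, cell in scores.items()}

# Print a 3 x 3 table: rows are study preference, columns are anxiety level.
print(f"{'':<13}" + "".join(f"{a:>8}" for a in anxiety_levels))
for study in study_levels:
    row = "".join(f"{cell_means[(study, a)]:8.2f}" for a in anxiety_levels)
    print(f"{study:<13}{row}")

# For each anxiety level, order the study groups from lowest to highest mean.
ranking = {
    a: sorted(study_levels, key=lambda s: cell_means[(s, a)])
    for a in anxiety_levels
}
for a in anxiety_levels:
    print(f"{a:>6} anxiety: " + " < ".join(ranking[a]))
```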

  11. Finally, you need to check the assumptions of the ANOVA model. Use the Options button and check the homogeneity-of-variance box and the residual-plot box. Are the assumptions satisfied? Be sure to give a complete and thorough answer to this question.

a) Scores are independent. This depends on the research design. In our case, we assume that there is no problem with the design of the study, that the subjects in each cell represent a random sample of the population defined by that cell, and that there was no problem with the data-collection process. Also, the two factors (study preference and anxiety level) are completely crossed. The levels of the two independent variables (study preference and anxiety level) exhaust the possible levels of interest to the researcher (so we have a fixed-effects ANOVA).

b)Normal distribution within groups. We assume normality. The scores of a particular cell of the design are assumed to be sampled from a population of scores normally distributed. We can check this by saving our residuals and making a histogram.

c) Homoskedasticity: equal variances across cells. We assess this using the results of Levene’s test below, with which we fail to reject the hypothesis that the variances across cells are equal. The F-test value is smaller than the F-critical (0.05,8,44)=2.16.

Also, the plot below shows the relationship between the predictors and residuals. Considering that we have such a small sample, the variances appear relatively equal, supporting the assumption of homoskedasticity. Also we do not see any notable correlation between the predicted values (group means) and the residuals.
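The SPSS output for Levene's test is not reproduced in this key, so the following Python sketch shows one way to compute a comparable statistic from the midterm scores listed in Part II. It assumes the mean-centered form of the test: take the absolute deviation of each score from its cell mean, then run a one-way ANOVA on those deviations across the 9 cells. The variable names are hypothetical, and the result has not been checked against the SPSS output.

```python
# Midterm scores as listed in Part II, keyed by (study preference, anxiety level).
scores = {
    ("review only", "low"): [76, 88, 75, 75, 56, 73],
    ("review only", "medium"): [78, 72, 76, 68, 74, 74],
    ("review only", "high"): [82, 69, 72, 73, 71, 74],
    ("study only", "low"): [78, 74, 76, 77, 75, 78],
    ("study only", "medium"): [89, 91, 88, 90, 78, 86],
    ("study only", "high"): [78, 77, 78, 84, 56, 82],
    ("review+study", "low"): [84, 90, 87, 88, 79, 77],
    ("review+study", "medium"): [78, 81, 76, 82, 85, 77],
    ("review+study", "high"): [84, 82, 67, 83, 78, 71],
}


def mean(values):
    return sum(values) / len(values)


# Absolute deviation of every score from its own cell mean.
abs_dev = {
    key: [abs(y - mean(cell)) for y in cell] for key, cell in scores.items()
}

all_z = [z for cell in abs_dev.values() for z in cell]
grand_z = mean(all_z)
k = len(abs_dev)        # number of cells (9)
n = len(all_z)          # number of students (54)

# One-way ANOVA on the absolute deviations.
ss_between = sum(len(c) * (mean(c) - grand_z) ** 2 for c in abs_dev.values())
ss_within = sum((z - mean(c)) ** 2 for c in abs_dev.values() for z in c)
df1, df2 = k - 1, n - k
levene_w = (ss_between / df1) / (ss_within / df2)

print(f"Levene statistic = {levene_w:.3f} on ({df1}, {df2}) df")
```

A small statistic relative to the F critical value is consistent with the conclusion that the cell variances are not detectably different.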

Part II. Analysis of Covariance

For this part of the assignment, you will add a new variable, homework score, as a covariate in the model you have used above. Below is the original data, with the homework score appearing below the original data. Add this new variable to your data set. Be sure to copy them precisely in this order. (The homework score of each student is listed below the student’s midterm score.)

The midterm score for students who only attended a review session:

          Low Anxiety          Medium Anxiety       High Anxiety
Midterm:  76,88,75,75,56,73    78,72,76,68,74,74    82,69,72,73,71,74
HWK:      81,76,90,88,67,87    67,76,90,72,80,83    88,73,75,72,84,81

The midterm score for students who only studied by themselves:

          Low Anxiety          Medium Anxiety       High Anxiety
Midterm:  78,74,76,77,75,78    89,91,88,90,78,86    78,77,78,84,56,82
HWK:      87,72,82,83,80,83    94,95,88,94,74,84    90,77,82,76,79,78

The midterm score for students who studied and attended a review session:

          Low Anxiety          Medium Anxiety       High Anxiety
Midterm:  84,90,87,88,79,77    78,81,76,82,85,77    84,82,67,83,78,71
HWK:      90,97,71,95,76,83    87,84,80,78,98,94    87,73,69,74,80,78

  1. First, find out whether homework score is related to midterm score.
     a. Plot the scatterplot and run the correlation between the two variables. What do you conclude about the relationship? Based on this plot, do you think homework score might make a good covariate? Explain.

The scatterplot above shows that there is a moderate positive relationship between midterm score and homework score. As shown in the correlation table, the relationship between midterm score and homework score is significant (r = .504). In a real data analysis, we would want to examine the two cases with very low midterm scores that look a little like outliers.
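For readers without SPSS, the correlation can be approximated with a few lines of Python. This is only a sketch: it assumes the midterm and homework scores exactly as listed above, paired in the order given, and the value it prints may differ slightly in the third decimal from the SPSS output if the data file used for the key differed in any entry.

```python
import math

# Scores as listed above, in order: review only, study only, review+study,
# each with low, medium, then high anxiety (6 students per cell).
midterm = [
    76, 88, 75, 75, 56, 73, 78, 72, 76, 68, 74, 74, 82, 69, 72, 73, 71, 74,
    78, 74, 76, 77, 75, 78, 89, 91, 88, 90, 78, 86, 78, 77, 78, 84, 56, 82,
    84, 90, 87, 88, 79, 77, 78, 81, 76, 82, 85, 77, 84, 82, 67, 83, 78, 71,
]
homework = [
    81, 76, 90, 88, 67, 87, 67, 76, 90, 72, 80, 83, 88, 73, 75, 72, 84, 81,
    87, 72, 82, 83, 80, 83, 94, 95, 88, 94, 74, 84, 90, 77, 82, 76, 79, 78,
    90, 97, 71, 95, 76, 83, 87, 84, 80, 78, 98, 94, 87, 73, 69, 74, 80, 78,
]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(homework, midterm)
print(f"r = {r:.3f} (n = {len(midterm)})")

# The two unusually low midterm scores mentioned above as possible outliers.
low_cases = [(h, m) for h, m in zip(homework, midterm) if m < 60]
print("cases with midterm < 60 (homework, midterm):", low_cases)
```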

     b. Now make another scatterplot of midterm and homework, but this time Set Markers by study preference. Edit the plot and use chart options to add the subgroup fit lines. Does this plot change your ideas about using homework score as a covariate? Explain.

Yes, the scatterplot above shows that there are different effects of homework score depending on the study preference group membership. Even though the relationship is still positive, the midterm scores for students in the “study only” group seem to correlate differently with homework score than the midterm scores for students in both the “review and study” and “review only” group.

This scatterplot raises questions about the use of homework score as a covariate, since it shows different relationships between midterm scores and homework scores depending on group membership.
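The subgroup fit lines in the plot are separate least-squares regressions of midterm score on homework score within each study preference group. A minimal Python sketch of those three fits follows, assuming the scores as listed above; the dictionary layout and names are hypothetical.

```python
# (homework, midterm) scores for each study preference group, as listed above
# (low, medium, then high anxiety within each group).
groups = {
    "review only": (
        [81, 76, 90, 88, 67, 87, 67, 76, 90, 72, 80, 83, 88, 73, 75, 72, 84, 81],
        [76, 88, 75, 75, 56, 73, 78, 72, 76, 68, 74, 74, 82, 69, 72, 73, 71, 74],
    ),
    "study only": (
        [87, 72, 82, 83, 80, 83, 94, 95, 88, 94, 74, 84, 90, 77, 82, 76, 79, 78],
        [78, 74, 76, 77, 75, 78, 89, 91, 88, 90, 78, 86, 78, 77, 78, 84, 56, 82],
    ),
    "review+study": (
        [90, 97, 71, 95, 76, 83, 87, 84, 80, 78, 98, 94, 87, 73, 69, 74, 80, 78],
        [84, 90, 87, 88, 79, 77, 78, 81, 76, 82, 85, 77, 84, 82, 67, 83, 78, 71],
    ),
}


def fit_line(x, y):
    """Least-squares slope and intercept for y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


slopes = {}
for name, (hw, mid) in groups.items():
    slope, intercept = fit_line(hw, mid)
    slopes[name] = slope
    print(f"{name:<13} midterm = {intercept:6.2f} + {slope:.3f} * homework")
```

All three slopes are positive, but the study-only slope is roughly twice the size of the other two, which is the visual difference the answer describes.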

  2. Next, test whether the covariate and the main effect (in this case, study preference) interact. Select the general linear model analysis, but remove the anxiety factor from your previous run. Use midterm for the outcome, study preference as a fixed factor and homework as a covariate. Select the Model button, and click custom. Move the factor study preference and homework (the covariate) to the model box. Then highlight both study preference and homework simultaneously, and click the arrow to move them over. This will create the interaction term in the window. Run the analysis. Is the interaction between study preference and homework score significant? What evidence do you have to support your decision?


The interaction between study preference and homework score is not statistically significant at the 0.05 level, based on the p-value = .355 > 0.05.
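This test of the study preference by homework interaction is a homogeneity-of-slopes test. As a sketch of the logic (not the SPSS procedure itself), the Python below compares the error sum of squares from a model with one common homework slope against a model with a separate slope in each study preference group. It assumes the scores as listed in Part II; `p_value_df1_2` is a hypothetical helper whose closed form holds only because the interaction has 2 numerator df.

```python
# (homework, midterm) scores for each study preference group, as listed in Part II.
groups = {
    "review only": (
        [81, 76, 90, 88, 67, 87, 67, 76, 90, 72, 80, 83, 88, 73, 75, 72, 84, 81],
        [76, 88, 75, 75, 56, 73, 78, 72, 76, 68, 74, 74, 82, 69, 72, 73, 71, 74],
    ),
    "study only": (
        [87, 72, 82, 83, 80, 83, 94, 95, 88, 94, 74, 84, 90, 77, 82, 76, 79, 78],
        [78, 74, 76, 77, 75, 78, 89, 91, 88, 90, 78, 86, 78, 77, 78, 84, 56, 82],
    ),
    "review+study": (
        [90, 97, 71, 95, 76, 83, 87, 84, 80, 78, 98, 94, 87, 73, 69, 74, 80, 78],
        [84, 90, 87, 88, 79, 77, 78, 81, 76, 82, 85, 77, 84, 82, 67, 83, 78, 71],
    ),
}


def within_group_sums(x, y):
    """Return (Sxx, Sxy, Syy) computed around the group's own means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    syy = sum((b - my) ** 2 for b in y)
    return sxx, sxy, syy


def p_value_df1_2(f, df2):
    """Upper-tail p-value for an F(2, df2) statistic (closed form for df1 = 2 only)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


sums = [within_group_sums(hw, mid) for hw, mid in groups.values()]
n_total = sum(len(hw) for hw, _ in groups.values())
n_groups = len(groups)

syy_within = sum(syy for _, _, syy in sums)
# Error SS with a separate slope per group (the model with the interaction).
sse_separate = syy_within - sum(sxy ** 2 / sxx for sxx, sxy, _ in sums)
# Error SS with one pooled slope for all groups (no interaction).
sse_common = syy_within - sum(s[1] for s in sums) ** 2 / sum(s[0] for s in sums)

df_interaction = n_groups - 1              # 2 extra slopes
df_error = n_total - 2 * n_groups          # 54 - 6 = 48
f_interaction = ((sse_common - sse_separate) / df_interaction) / (
    sse_separate / df_error
)
p_interaction = p_value_df1_2(f_interaction, df_error)

print(f"F({df_interaction},{df_error}) = {f_interaction:.3f}, p = {p_interaction:.3f}")
```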