ANOVA Assignment

Due 4/30/03. A statistics instructor prepares 4 versions of an exam. Students are randomly allocated to take one of the four versions. Answer questions completely in the provided space.

The data is found at

  1. Is this an experimental or observational study?
  1. What are the explanatory and response variables? Characterize each as categorical or quantitative.
  1. Obtain boxplots of score (Y) by version (X). Attach (staple) this to this sheet.
  2. Use > Stat > ANOVA > 1-way to produce the ANOVA table. Means for the versions will be displayed. Attach (staple) this to this sheet. (Store, but do not print, the residuals and fits.)
  3. What’s the overall mean for scores? (Use > Stat > Basic Statistics.)
  1. Find the effect for each of the four versions.

Version | 1 | 2 | 3 | 4
Effect  |   |   |   |

What value (number) in the ANOVA table reflects the total (squared) size of these four effects—the total “between groups” variation?
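The effect calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores below are hypothetical (they are not the assignment's data set), and the variable names are made up for the example. The sketch takes each effect as the version mean minus the overall mean, then adds up the squared effects weighted by group size.

```python
from statistics import mean

# HYPOTHETICAL scores by exam version (NOT the assignment's data set).
scores = {
    1: [80, 85, 90],
    2: [70, 75, 80],
    3: [60, 65, 70],
    4: [85, 90, 95],
}

# Overall (grand) mean of every observation, ignoring version.
all_scores = [y for group in scores.values() for y in group]
grand_mean = mean(all_scores)

# Effect of a version = that version's mean minus the overall mean.
effects = {v: mean(ys) - grand_mean for v, ys in scores.items()}

# Between-groups sum of squares: each squared effect weighted by group size.
ss_between = sum(len(ys) * effects[v] ** 2 for v, ys in scores.items())

print("Overall mean:", grand_mean)
print("Effects:", effects)
print("Between-groups SS:", ss_between)
```

With real data, the last number printed should match one of the sum-of-squares entries in the software's ANOVA table.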

  1. Find the residual for the first observation listed for exam version 3.

What value (number) in the ANOVA table reflects the total (squared) size of the residuals for all the observations—the total “within groups” variation?
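As a rough sketch of the residual calculation, again with hypothetical scores rather than the assignment's data: in one-way ANOVA the fit for an observation is its own group's mean, and the residual is the observation minus that fit. The names `fits` and `residuals` are illustrative only.

```python
from statistics import mean

# HYPOTHETICAL scores by exam version (NOT the assignment's data set).
scores = {
    1: [80, 85, 90],
    2: [70, 75, 80],
    3: [60, 65, 70],
    4: [85, 90, 95],
}

# In one-way ANOVA the fitted value for an observation is its group mean.
fits = {v: mean(ys) for v, ys in scores.items()}

# Residual = observation minus fit, kept in the same order as the data.
residuals = {v: [y - fits[v] for y in ys] for v, ys in scores.items()}

# Residual for the first observation listed under version 3.
first_residual_v3 = residuals[3][0]

# Within-groups (error) sum of squares: all squared residuals added up.
ss_within = sum(r ** 2 for rs in residuals.values() for r in rs)

print("First residual, version 3:", first_residual_v3)
print("Within-groups SS:", ss_within)
```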

  1. Compute and interpret r².
  1. What causes all the variation within the groups? In other words: What unmeasured factors are contributing to what we pool together and simply call “error” or “residual”?
  1. Write the hypotheses for testing whether there is a relationship between the version of exam and the score. Use the format of Section 16.3 for your null and alternative.
  1. Your hypotheses should include the symbol μ1 (μVERSION1 or something like that would be fine). Write a complete sentence explaining precisely (no ambiguity) what this symbol represents. Hint: It is not the mean score for those people who actually took version 1.
  1. Produce (and staple) a normal probability plot of the residuals. Compare the within-group standard deviations. Are the relevant assumptions for using the ANOVA F-test violated? Make a statement of your conclusion here.
  1. Most importantly—in order to use the F test (or any of the other tests we’ve discussed in class) we must assess the randomization. Based on the description of the problem, this has been done properly here. Would it be OK if the versions were distributed to people as they entered the room? (For example, the first people to arrive get version 1, etc.) Why (not)?
  1. What are the F-statistic and the P-value? Is there statistically significant evidence of differences among the version means?
  1. If you were to take this exam, would you make any demands regarding the version you get? Or regarding a curve? Why (not)?
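For the r² and F-statistic questions above, the arithmetic behind the ANOVA table can be sketched as follows. The scores are hypothetical (not the assignment's data), and the sketch uses only the standard decomposition total SS = between SS + within SS. It stops at the F-statistic and its degrees of freedom; the P-value is the upper-tail area of the corresponding F distribution, which the statistics software reports.

```python
from statistics import mean

# HYPOTHETICAL scores by exam version (NOT the assignment's data set).
scores = {
    1: [80, 85, 90],
    2: [70, 75, 80],
    3: [60, 65, 70],
    4: [85, 90, 95],
}

all_scores = [y for ys in scores.values() for y in ys]
grand_mean = mean(all_scores)
k = len(scores)       # number of groups (versions)
n = len(all_scores)   # total number of observations

# Sums of squares: between groups, within groups, and their total.
ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in scores.values())
ss_within = sum((y - mean(ys)) ** 2 for ys in scores.values() for y in ys)
ss_total = ss_between + ss_within

# r-squared: share of total variation accounted for by version.
r_squared = ss_between / ss_total

# F = (between mean square) / (within mean square).
df_between = k - 1
df_within = n - k
f_stat = (ss_between / df_between) / (ss_within / df_within)

print("r-squared:", r_squared)
print("F:", f_stat, "with df", (df_between, df_within))
```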
