Stat 421, Fall 2008

Fritz Scholz

Homework 9

Due: Monday, December 1, 2008

Problem 1: For the fire station aid response data from the previous homework, perform pairwise comparisons of means using 95% confidence intervals, comparing one fire station against another for all choose(4,2) = 6 possible pairs. Do this using the Fisher protected LSD method, the Bonferroni method, the Tukey-Kramer method, and Scheffé's method.
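The four methods differ only in the multiplier applied to the standard error s*sqrt(1/n_i + 1/n_j) of a difference in means, where s is the pooled standard deviation. A minimal sketch of those multipliers in R follows. The function name ci.multipliers and the total sample size N = 40 are hypothetical placeholders for illustration; they are not taken from the fire station data or the class slides.

```r
# Sketch: 95% multipliers for pairwise comparisons among `ngroups` means.
# N = 40 is a hypothetical total sample size, NOT the size of the actual data.
ci.multipliers <- function(ngroups, N, alpha = 0.05) {
  m  <- choose(ngroups, 2)   # number of pairwise comparisons
  df <- N - ngroups          # error degrees of freedom from the one-way ANOVA
  c(LSD        = qt(1 - alpha / 2, df),
    Bonferroni = qt(1 - alpha / (2 * m), df),
    Tukey      = qtukey(1 - alpha, ngroups, df) / sqrt(2),
    Scheffe    = sqrt((ngroups - 1) * qf(1 - alpha, ngroups - 1, df)))
}

mult <- ci.multipliers(ngroups = 4, N = 40)
print(round(mult, 3))
```

Each interval is then (Xbar_i - Xbar_j) plus or minus the multiplier times s*sqrt(1/n_i + 1/n_j). The Fisher protected LSD intervals are meant to be used only after the overall F-test has rejected equality of the means.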

Write a function that computes this set of 4 confidence intervals for each of the choose(4,2) comparisons and plots them in groups next to each other, as done in the class slides, with each group labeled by the pair of means being compared. Draw a horizontal line at the zero level.

Also give the confidence interval results in tabular form. Suggestion: you may want to use write.csv(out,"out.csv"), where out is the appropriate tabular output from your function; you can then cut and paste the view of out.csv from Excel into Word.

Give the plots, state and explain your conclusion, and give your function code.
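The write.csv suggestion can be tried out on a small table first. The sketch below assumes a data frame layout with one row per interval; the column names and the numbers in out are invented for illustration, and the file is written to a temporary directory rather than the working directory.

```r
# Sketch: write a table of intervals to a CSV file and read it back.
# The rows of `out` are made-up numbers, not results from the fire station data.
out <- data.frame(pair   = c("1-2", "1-3"),
                  method = c("LSD", "LSD"),
                  lower  = c(-0.50, 0.10),
                  upper  = c(1.20, 1.90))

# Use "out.csv" instead to write into the current working directory.
csv.file <- file.path(tempdir(), "out.csv")
write.csv(out, csv.file, row.names = FALSE)  # row.names = FALSE drops the index column

back <- read.csv(csv.file)
print(back)
```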

Problem 2: The file workerdata.csv (on the web) contains daily part output for 10 workers over 20 days each.

a) Make a boxplot comparing the part output for the ten workers.

b) Do an ANOVA, testing the equality of the output means for the ten workers. How significant is the result?

c) Check the normality assumption with a qqnorm plot in conjunction with qqline, based on the residuals from the lm fit that was used in anova(lm(…)).

d) Examine the output variability for the ten workers using the modified Levene test.

e) Examine the same issue as in d) using Fmin.test. Is the result consistent with d)?

f) Plot s_i against Xbar_i on a log-log scale. You do this via plot(mvec,svec,log="xy"), where mvec is the vector of part output means and svec is the vector of part output standard deviations. Fit a line using lsfit.out=lsfit(log10(mvec),log10(svec)) and then abline(lsfit.out); lsfit.out$coef[2] gives you the slope. Is the linear relationship a strong one?

g) Use the slope from f) to suggest a simple variance stabilizing transformation of the output data, and repeat steps a)-f) on the transformed part output data, i.e., wherever you used the daily part output previously you would use the transformed daily part output.

h) Why would we prefer the second ANOVA over the first, even though both give roughly the same significance?

Document all your findings with the called-for plots and commentary, and give the code that produced the plots and the various analysis results.
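As a rough guide to the mechanics of steps a)-d) and f) in Problem 2, here is a sketch on simulated data. It does not read workerdata.csv: the column names worker and output, the invented worker means, and the simulation itself are assumptions made for illustration. The modified Levene test is coded here as a one-way ANOVA on absolute deviations from the group medians, which may differ in detail from the function used in class. Step e) relies on the course's Fmin.test function and is not reproduced.

```r
# Sketch on SIMULATED data (not workerdata.csv); names are hypothetical.
set.seed(421)
worker <- factor(rep(1:10, each = 20))            # 10 workers, 20 days each
mu     <- seq(50, 140, by = 10)                   # invented worker means
output <- rnorm(200, mean = mu[as.integer(worker)],
                sd = 0.1 * mu[as.integer(worker)])
dat    <- data.frame(worker, output)

# a) side-by-side boxplots of output by worker
boxplot(output ~ worker, data = dat, xlab = "worker", ylab = "part output")

# b) one-way ANOVA for equality of the worker means
fit <- lm(output ~ worker, data = dat)
print(anova(fit))

# c) normal QQ-plot of the residuals with reference line
res <- resid(fit)
qqnorm(res)
qqline(res)

# d) modified Levene test: ANOVA on |y - group median|
absdev <- abs(dat$output - ave(dat$output, dat$worker, FUN = median))
levene <- anova(lm(absdev ~ dat$worker))
print(levene)

# f) log-log plot of group standard deviations against group means
mvec <- tapply(dat$output, dat$worker, mean)
svec <- tapply(dat$output, dat$worker, sd)
plot(mvec, svec, log = "xy")
lsfit.out <- lsfit(log10(mvec), log10(svec))
abline(lsfit.out)                                 # line fitted on the log10 scale
slope <- lsfit.out$coef[2]
print(slope)
```

With the real data, dat would instead come from read.csv("workerdata.csv"), and the column names in the formulas would have to match those in the file.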