STAT 2607
Assignment # 5
Due: Tues. April 3
1. A researcher studied the effect of three experimental diets with varying fat contents on the total lipid (fat) level in plasma. Total lipid level is a widely used predictor of coronary heart disease. Fifteen male subjects who were within 20% of their ideal body weight were put into 5 groups according to age. Within each age group, the 3 experimental diets were randomly assigned to the 3 subjects. The reductions in lipid level (grams per litre) after the subjects had been on the diets for a fixed period of time follow.
                       Fat Content of Diet
Age      Extremely Low   Fairly Low   Moderately Low    Total
15-24        0.73           0.67           0.15          1.55
25-34        0.86           0.75           0.21          1.82
35-44        0.94           0.81           0.26          2.01
45-54        1.40           1.32           0.75          3.47
55-64        1.62           1.41           0.78          3.81
Total        5.55           4.96           2.15         12.66
where Σy² = 13.4436 is the sum of the 15 squared observations (uncorrected for the mean).
a) Set up the ANOVA table and test whether the data provide sufficient evidence to indicate a difference in the average reduction in lipid level among the three diets. Use α = .01.
b) If appropriate, use the Tukey multiple comparison method to determine which diets differ in mean lipid level reduction. Use α = .01.
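For checking hand calculations in part (a), the randomized block sums of squares can be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the helper name rbd_anova and the variable names are hypothetical, and it assumes the usual additive block model with one observation per diet in each age group. It does not look up critical values, so the F ratio still has to be compared with a tabled value at α = .01. The sum of squared observations it computes reproduces the 13.4436 quoted with the data.

```python
# Reduction in lipid level (g/L): rows are age groups (blocks), columns are diets.
lipid = [
    [0.73, 0.67, 0.15],  # 15-24
    [0.86, 0.75, 0.21],  # 25-34
    [0.94, 0.81, 0.26],  # 35-44
    [1.40, 1.32, 0.75],  # 45-54
    [1.62, 1.41, 0.78],  # 55-64
]


def rbd_anova(rows):
    """Sums of squares, df and F ratios for a randomized block design.

    rows[i][j] is the response for block i and treatment j
    (hypothetical helper, not part of the assignment).
    """
    b, t = len(rows), len(rows[0])
    grand_total = sum(sum(r) for r in rows)
    cm = grand_total ** 2 / (b * t)                 # correction for the mean
    sum_sq = sum(y * y for r in rows for y in r)    # uncorrected sum of squares
    tss = sum_sq - cm                               # corrected total SS
    trt_totals = [sum(r[j] for r in rows) for j in range(t)]
    blk_totals = [sum(r) for r in rows]
    sst = sum(x * x for x in trt_totals) / b - cm   # treatments (diets)
    ssb = sum(x * x for x in blk_totals) / t - cm   # blocks (age groups)
    sse = tss - sst - ssb                           # error, by subtraction
    df_t, df_b = t - 1, b - 1
    df_e = df_t * df_b
    mse = sse / df_e
    return {
        "sum_sq": sum_sq, "TSS": tss, "SST": sst, "SSB": ssb, "SSE": sse,
        "df": (df_t, df_b, df_e),
        "F_treatments": (sst / df_t) / mse,
        "F_blocks": (ssb / df_b) / mse,
    }


table = rbd_anova(lipid)
for name, value in table.items():
    print(name, value)
```

The error sum of squares is obtained by subtraction, which mirrors the usual hand calculation for this design.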
2. A soft-drink manufacturer uses 4 agents (1, 2, 3, 4) to handle premium distributions for its various products. The marketing director wanted to study the timeliness with which the premiums are distributed. Twenty transactions for each agent were selected at random, and the time lapse (in days) for handling each transaction was determined. The output from PROC ANOVA in SAS, which gives the analysis of variance table, is shown below along with a residual plot and horizontal bar charts. Use the SAS output to answer the following questions.
a) SSTr = ______    b) TSS = ______    c) SSE = ______
d) treatment df = ______    e) error df = ______    f) total df = ______
g) calculated F for testing H0: μ1 = μ2 = μ3 = μ4 vs HA: not all μi equal is ______
h) the p-value for the F-test is ______
l) Based on the residual plot, is there a violation of the assumption that the variances are the same for the 4 populations of agent transaction times? Why?
YES    NO
m) Based on the horizontal bar charts, would you say there were severe violations of the assumption that the 4 populations are all approximately normally distributed? Why?
YES    NO
The SAS System 4
The ANOVA Procedure
Class Level Information
Class        Levels    Values
agent             4    1 2 3 4
Number of observations 80
Dependent Variable: time

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               3    2244.537500     748.179167    106.71   <.0001
Error              76     532.850000       7.011184
Corrected Total    79    2777.387500

R-Square    Coeff Var    Root MSE    time Mean
0.808147     14.38080    2.647864     18.41250

Source             DF       Anova SS    Mean Square   F Value   Pr > F
agent               3    2244.537500     748.179167    106.71   <.0001
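The entries of a one-way ANOVA table are tied together by a few identities, which can help when reading values off the listing. The following Python lines are a minimal sketch of those identities using numbers copied from the output above. The variable names are hypothetical, and the p-value is not recomputed because that would need an F distribution routine.

```python
# Values copied from the PROC ANOVA listing for the agent data.
k, n_total = 4, 80                      # number of agents, total observations
ss_model, ss_error, ss_total = 2244.5375, 532.85, 2777.3875
time_mean = 18.4125

# Degrees of freedom for a one-way (completely randomized) design.
df_model = k - 1
df_error = n_total - k
df_total = n_total - 1

# Mean squares, F ratio and the summary statistics printed under the table.
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_value = ms_model / ms_error
r_square = ss_model / ss_total
root_mse = ms_error ** 0.5
coeff_var = 100 * root_mse / time_mean

print(df_model, df_error, df_total)
print(round(ms_model, 6), round(ms_error, 6), round(f_value, 2))
print(round(r_square, 6), round(coeff_var, 4), round(root_mse, 6))
```

With a single factor in the model, the Model line and the agent line of the listing carry the same sum of squares, which is why they agree.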
Plot of ei*agent. Symbol used is '*'.
[Residual plot: ei on the vertical axis (ticks at -10, -5, 0, 5, 10) against agent on the horizontal axis (1, 2, 3, 4). The plotted points all fall between -5 and 5.]
NOTE: 25 obs hidden.
------ agent=1 ------
Residual                                                  Cum.                Cum.
Midpoint                                          Freq    Freq   Percent   Percent
     0 |***************                              3       3     15.00     15.00
     2 |********************                         4       7     20.00     35.00
     4 |******************************               6      13     30.00     65.00
     6 |********************                         4      17     20.00     85.00
     8 |***************                              3      20     15.00    100.00
       +----+----+----+----+----+----+
            1    2    3    4    5    6
                   Frequency
------ agent=2 ------
Residual                                                  Cum.                Cum.
Midpoint                                          Freq    Freq   Percent   Percent
  -2.5 |***************                              3       3     15.00     15.00
   0.0 |*************************                    5       8     25.00     40.00
   2.5 |****************************************     8      16     40.00     80.00
   5.0 |**********                                   2      18     10.00     90.00
   7.5 |**********                                   2      20     10.00    100.00
       +----+----+----+----+----+----+----+----+
            1    2    3    4    5    6    7    8
                        Frequency
------ agent=3 ------
Residual                                                  Cum.                Cum.
Midpoint                                          Freq    Freq   Percent   Percent
 -12.5 |********************                         4       4     20.00     20.00
 -10.0 |******************************               6      10     30.00     50.00
  -7.5 |****************************************     8      18     40.00     90.00
  -5.0 |*****                                        1      19      5.00     95.00
  -2.5 |*****                                        1      20      5.00    100.00
       +----+----+----+----+----+----+----+----+
            1    2    3    4    5    6    7    8
                        Frequency
------ agent=4 ------
Residual                                                  Cum.                Cum.
Midpoint                                          Freq    Freq   Percent   Percent
   -11 |**********                                   2       2     10.00     10.00
    -9 |*************************                    5       7     25.00     35.00
    -7 |********************                         4      11     20.00     55.00
    -5 |******************************               6      17     30.00     85.00
    -3 |***************                              3      20     15.00    100.00
       +----+----+----+----+----+----+
            1    2    3    4    5    6
                   Frequency
3. A building contractor employs 3 construction engineers, A, B, and C, to estimate and bid on jobs. To determine whether one estimator tends to be more conservative (or more liberal) than the others, the contractor selected 4 projected construction jobs and had each estimator independently estimate the cost of each job. The data and the SAS output for the problem are given below. Use the output to help answer questions (a) to (k).
                     Construction Job
Estimator        1        2        3        4     Total
A            35.10    34.50    29.25    31.60    130.45
B            37.45    34.60    33.10    34.40    139.55
C            36.30    35.10    32.45    32.90    136.75
Total       108.85   104.20    94.80    98.90    406.75
a) SSTr = ______    b) SSB = ______    c) TSS = ______
d) SSE = ______    e) treatment df = ______    f) block df = ______
g) error df = ______    h) total df = ______
i) For testing H0: μA = μB = μC vs HA: not all treatment means equal,
the calculated value of F is ______
j) The overall or family significance level for the Tukey tests is ______
k) Is there evidence to conclude at α = .01 that the average cost differs between
estimators A and B? ______    estimators B and C? ______
The SAS System 14
The ANOVA Procedure
Class Level Information
Class        Levels    Values
estimator         3    A B C
job               4    1 2 3 4
Number of observations 12
Dependent Variable: cost

                              Sum of
Source             DF        Squares    Mean Square   F Value   Pr > F
Model               5    48.46895833     9.69379167     12.84   0.0037
Error               6     4.52833333     0.75472222
Corrected Total    11    52.99729167

R-Square    Coeff Var    Root MSE    cost Mean
0.914555     2.562992    0.868748     33.89583

Source             DF       Anova SS    Mean Square   F Value   Pr > F
estimator           2    10.86166667     5.43083333      7.20   0.0255
job                 3    37.60729167    12.53576389     16.61   0.0026
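As a cross-check on the listing, the sums of squares can be rebuilt from the cost table using deviations from means. The Python lines below are a rough sketch only: the variable names are hypothetical, and the layout assumes estimators are the treatments and jobs are the blocks, as in the Class Level Information above.

```python
# Cost estimates by estimator (treatment) across jobs 1-4 (blocks).
costs = {
    "A": [35.10, 34.50, 29.25, 31.60],
    "B": [37.45, 34.60, 33.10, 34.40],
    "C": [36.30, 35.10, 32.45, 32.90],
}
t = len(costs)   # number of estimators
b = 4            # number of jobs

grand_mean = sum(sum(v) for v in costs.values()) / (t * b)
est_means = {name: sum(v) / b for name, v in costs.items()}
job_means = [sum(costs[name][j] for name in costs) / t for j in range(b)]

# Sums of squares as squared deviations from the grand mean.
ss_estimator = b * sum((m - grand_mean) ** 2 for m in est_means.values())
ss_job = t * sum((m - grand_mean) ** 2 for m in job_means)
ss_total = sum((y - grand_mean) ** 2 for v in costs.values() for y in v)
ss_error = ss_total - ss_estimator - ss_job

# Error df is (t - 1)(b - 1); both F ratios use the same error mean square.
ms_error = ss_error / ((t - 1) * (b - 1))
f_estimator = (ss_estimator / (t - 1)) / ms_error
f_job = (ss_job / (b - 1)) / ms_error

print(round(ss_estimator, 8), round(ss_job, 8), round(ss_error, 8), round(ss_total, 8))
print(round(f_estimator, 2), round(f_job, 2))
```

The Model line of the listing pools the estimator and job sums of squares, so its 5 degrees of freedom are the 2 and 3 shown on the separate lines.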
Tukey's Studentized Range (HSD) Test for cost
NOTE: This test controls the Type I experimentwise error rate.
Alpha                                      0.05
Error Degrees of Freedom                      6
Error Mean Square                      0.754722
Critical Value of Studentized Range     4.33902
Minimum Significant Difference           1.8848

Comparisons significant at the 0.05 level are indicated by ***.

               Difference
estimator         Between      Simultaneous 95%
Comparison          Means      Confidence Limits
B - C              0.7000     -1.1848     2.5848
B - A              2.2750      0.3902     4.1598
C - B             -0.7000     -2.5848     1.1848
C - A              1.5750     -0.3098     3.4598
A - B             -2.2750     -4.1598    -0.3902
A - C             -1.5750     -3.4598     0.3098
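The minimum significant difference in the Tukey listing follows from the studentized range critical value, the error mean square, and the number of observations behind each estimator mean. The sketch below reproduces it in Python. The variable names are hypothetical, and the critical value 4.33902 is simply copied from the listing rather than computed, so the sketch applies to the 0.05 family level only. A comparison at α = .01 would need a different critical value.

```python
from itertools import combinations
from math import sqrt

# Quantities copied from the Tukey listing (family level 0.05, 6 error df).
q_crit = 4.33902
mse = 0.754722
n_per_mean = 4          # each estimator mean averages 4 jobs

# Minimum significant difference: q * sqrt(MSE / n).
msd = q_crit * sqrt(mse / n_per_mean)

# Estimator means from the row totals of the cost table.
means = {"A": 130.45 / 4, "B": 139.55 / 4, "C": 136.75 / 4}

comparisons = {}
for first, second in combinations(sorted(means), 2):
    diff = means[second] - means[first]
    comparisons[(second, first)] = {
        "diff": diff,
        "lower": diff - msd,
        "upper": diff + msd,
        "significant": abs(diff) > msd,   # interval excludes zero
    }

print(round(msd, 4))
for pair, result in comparisons.items():
    print(pair, {key: round(val, 4) if isinstance(val, float) else val
                 for key, val in result.items()})
```

A pair is flagged when the absolute difference in means exceeds the minimum significant difference, which is equivalent to the simultaneous confidence interval excluding zero.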