Statistics 5372: Experimental Statistics
Homework Due March 1 -- Key
Data Set/Problem Description
A 2-factor fixed effects analysis of variance was run to examine the stability of a drug product stored for 4 storage times (1, 3, 6, and 9 months) in 2 labs. At the end of each storage time, the vials were analyzed for mg/mL of the active ingredient and for pH in order to assess stability. A 2-factor ANOVA was run for each variable. There were 3 replicates for each laboratory/storage time combination. The data are given on page 935 of the text.
Variable 1: Active Ingredient (in mg/mL) at End of Storage Period
Key Results of the Analysis
The analysis of variance table (Table 1) for the analysis of mg/mL indicates that there is a significant interaction between lab and storage time (p = .0003). Thus, we do not test for main effects, but instead we compare the cell means. The LSD for comparing cell means at the $\alpha = 0.05$ level of significance is
$\mathrm{LSD} = t_{0.025,\,16}\sqrt{2\,\mathrm{MSE}/n} = 2.120\sqrt{2(0.002446)/3} = 0.086$.
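As a quick check, the following Python sketch recomputes the LSD used for the cell-mean comparisons (Table 3 lists it as .086) from the error mean square in Table 1. The critical value t = 2.11991 for 16 error degrees of freedom is copied from the SAS output in Table 5 rather than computed, and the function name `lsd` is illustrative only.

```python
import math

def lsd(t_crit, mse, n_per_mean):
    """Least significant difference for two means, each based on n_per_mean observations."""
    return t_crit * math.sqrt(2.0 * mse / n_per_mean)

T_CRIT = 2.11991       # critical t for 16 error df, copied from the SAS output (Table 5)
MSE_MGML = 0.00244583  # error mean square for mg/mL (Table 1)
N_REPS = 3             # replicates per lab/storage-time cell

lsd_cells = lsd(T_CRIT, MSE_MGML, N_REPS)
print(f"LSD for mg/mL cell means: {lsd_cells:.3f}")
```

Each cell mean is an average of only 3 observations, so n = 3 is the divisor here; marginal means would use larger divisors.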
The cell means for both variables are given in Table 2, and those for mg/mL are plotted via interaction plots in Figure 1. In Table 3 we provide the LSD multiple comparison calculations and a display of the resulting significant differences. From the interaction plots we get the impression that the choice of lab matters more for 3 and 6 month storage than it does for 1 or 9 month storage. From the multiple comparisons, it can be seen that the lab 1 reading at 3 months is significantly higher than all other readings except lab 1 at 1 month. The lowest 3 readings were from both labs at 9 months and lab 2 at 6 months, and these three do not differ significantly from one another. The lab 2 reading at 3 months was significantly different from all other readings. The 1 month readings from the two labs were not significantly different from each other or from the 6 month reading from lab 1.
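The impression from the interaction plots can be put in numbers by looking at the lab 1 minus lab 2 difference at each storage time. The sketch below is only an illustration: it uses the mg/mL cell means from Table 2 and the LSD of .086 from Table 3, and the variable names are made up for this example.

```python
# mg/mL cell means from Table 2, keyed by (storage time in months, lab).
cell_means = {
    (1, 1): 30.0900000, (1, 2): 30.0800000,
    (3, 1): 30.1700000, (3, 2): 29.9000000,
    (6, 1): 30.0066667, (6, 2): 29.8000000,
    (9, 1): 29.8066667, (9, 2): 29.8000000,
}
LSD = 0.086  # value quoted in Table 3

# Lab 1 minus lab 2 at each storage time (the simple effect of lab).
lab_gap = {t: cell_means[(t, 1)] - cell_means[(t, 2)] for t in (1, 3, 6, 9)}

for t, gap in lab_gap.items():
    flag = "significant" if abs(gap) > LSD else "not significant"
    print(f"{t} months: lab 1 - lab 2 = {gap:+.3f} ({flag})")
```

The gap exceeds the LSD only at 3 and 6 months, which is the pattern that produces the significant interaction.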
In Figure 2 we present side-by-side boxplots of the data as well as a probability plot and histogram of the residuals. The box plots show a fairly consistent variation and suggest that equality of variances is a reasonable assumption. The probability plot and especially the histogram suggest that the normality assumption may be questionable.
Conclusions in the Language of the Problem
There is very little difference between the two labs after 1 and 9 months. However, at 3 and 6 months, lab 2 storage resulted in significantly less active ingredient, suggesting that lab 2 is less desirable. Also apparent is the general decline in active ingredient as storage time increases, as would be expected. It is interesting to note that the curious increase in active ingredient from 1 month to 3 months in lab 1 is not significant.
Variable 2: pH
Key Results of the Analysis
The analysis of variance table (Table 4) for the analysis of pH shows that there is not a significant interaction between lab and storage time (p = .4038), while there is a significant difference between labs (p < .0001) and among times (p = .0001). The interaction plot in Figure 3 agrees with the lack of significant interaction in that no distinct patterns of interaction appear. Since there are only two labs, there is no need to perform multiple comparisons on labs. However, these are shown in Table 5, where it can be seen that lab 2 storage results in significantly higher pH readings. The comparisons of times given in Table 5 show that the pH at 6 months is significantly higher than at the other 3 time periods and that the 1 month reading is significantly higher than the 3 month reading.
In Figure 4 we present side-by-side boxplots of the data as well as a probability plot and histogram of the residuals. The box plots show some dramatic differences among variances. However, a definitive assessment of the equality of variances is not available because of the small number of replicates (i.e. n = 3). The histogram of the residuals looks fairly bell-shaped but the probability plot does suggest possible non-normality.
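To see how uneven the within-cell spread of pH is, the cell standard deviations can be recomputed from the raw data. The sketch below is a minimal illustration using the pH values from the DATA step in Appendix B; with only 3 replicates per cell, these estimates are very rough.

```python
from statistics import stdev

# pH replicates from the DATA step in Appendix B, keyed by the TxLy cell label.
ph = {
    "T1L1": [3.61, 3.60, 3.57], "T3L1": [3.50, 3.45, 3.48],
    "T6L1": [3.56, 3.74, 3.81], "T9L1": [3.60, 3.55, 3.59],
    "T1L2": [3.87, 3.80, 3.84], "T3L2": [3.70, 3.80, 3.75],
    "T6L2": [3.90, 3.90, 3.90], "T9L2": [3.77, 3.74, 3.76],
}

# Sample standard deviation (n - 1 divisor) within each cell.
cell_sd = {cell: stdev(values) for cell, values in ph.items()}

# List the cells from least to most variable.
for cell, sd in sorted(cell_sd.items(), key=lambda kv: kv[1]):
    print(f"{cell}: sd = {sd:.4f}")
```

The standard deviations range from 0 (lab 2 at 6 months, three identical readings) to about 0.13 (lab 1 at 6 months), in agreement with Table 2.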
Conclusions in the Language of the Problem
In general, storage in lab 2 seems to consistently result in higher pH readings than for lab 1. The storage time comparisons of pH readings give confusing results in that the 6-month reading is significantly higher than all others. There was no consistent pattern of decline or increase in pH in time as might have been expected. If lower pH is desirable (?) then these results along with those for the mg/mL of active ingredient suggest that lab 1 is the preferred storage facility.
Appendices:
A. Tables and Figures Cited in the Report
Table 1. 2-Factor ANOVA - Ex 15.41, page 935 -- mg/mL Data
The GLM Procedure
Dependent Variable: mgml
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               7        0.46740000     0.06677143      27.30    <.0001
Error              16        0.03913333     0.00244583
Corrected Total    23        0.50653333
R-Square     Coeff Var     Root MSE     mgml Mean
0.922743     0.165090      0.049455     29.95667
Source             DF       Type III SS    Mean Square    F Value    Pr > F
time                3        0.29376667     0.09792222      40.04    <.0001
lab                 1        0.09126667     0.09126667      37.32    <.0001
time*lab            3        0.08236667     0.02745556      11.23    0.0003
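The sums of squares in Table 1 can be reproduced directly from the raw data. The sketch below is not the SAS computation itself. It is a plain-Python illustration of the usual decomposition for a balanced two-factor design with replication, using the mg/mL values from the DATA step in Appendix B. The function name `two_way_anova` is hypothetical, and the subtraction used for the interaction term relies on the design being balanced.

```python
from collections import defaultdict

# mg/mL observations from the DATA step in Appendix B: (time, lab) -> replicates.
mgml = {
    (1, 1): [30.03, 30.10, 30.14], (1, 2): [30.12, 30.10, 30.02],
    (3, 1): [30.10, 30.18, 30.23], (3, 2): [29.90, 29.95, 29.85],
    (6, 1): [30.03, 30.03, 29.96], (6, 2): [29.75, 29.85, 29.80],
    (9, 1): [29.81, 29.79, 29.82], (9, 2): [29.75, 29.85, 29.80],
}

def mean(xs):
    return sum(xs) / len(xs)

def two_way_anova(cells):
    """Sums of squares for a balanced two-factor design with replication."""
    all_obs = [y for ys in cells.values() for y in ys]
    grand = mean(all_obs)
    by_a, by_b = defaultdict(list), defaultdict(list)
    for (a, b), ys in cells.items():
        by_a[a].extend(ys)  # pool observations by level of the first factor
        by_b[b].extend(ys)  # pool observations by level of the second factor
    ss_total = sum((y - grand) ** 2 for y in all_obs)
    ss_a = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in by_a.values())
    ss_b = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in by_b.values())
    ss_error = sum((y - mean(ys)) ** 2 for ys in cells.values() for y in ys)
    ss_ab = ss_total - ss_a - ss_b - ss_error  # valid only for balanced data
    return {"time": ss_a, "lab": ss_b, "time*lab": ss_ab,
            "error": ss_error, "total": ss_total}

ss = two_way_anova(mgml)
df = {"time": 3, "lab": 1, "time*lab": 3, "error": 16}
mse = ss["error"] / df["error"]
for term in ("time", "lab", "time*lab"):
    f_value = (ss[term] / df[term]) / mse
    print(f"{term:9s} SS = {ss[term]:.6f}  F = {f_value:.2f}")
```

Because the design is balanced, these sequential sums of squares coincide with the Type III values reported by PROC GLM.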
Table 2. Cell Means for Ex 15.41, page 935
The MEANS Procedure
time  lab   mgml Mean     mgml Std Dev   ph Mean      ph Std Dev
1     1     30.0900000    0.0556776      3.5933333    0.0208167
1     2     30.0800000    0.0529150      3.8366667    0.0351188
3     1     30.1700000    0.0655744      3.4766667    0.0251661
3     2     29.9000000    0.0500000      3.7500000    0.0500000
6     1     30.0066667    0.0404145      3.7033333    0.1289703
6     2     29.8000000    0.0500000      3.9000000    0
9     1     29.8066667    0.0152753      3.5800000    0.0264575
9     2     29.8000000    0.0500000      3.7566667    0.0152753
Table 3. Calculations for LSD comparisons of mg/mL Cell Means
Cell means in decreasing order (TxLy = storage time x months, lab y):
T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80
Comparison        Actual Difference (LSD = .086; X = not significant)
T3L1 vs T9L2      .37
T3L1 vs T6L2      .37
T3L1 vs T9L1      .36
T3L1 vs T3L2      .27
T3L1 vs T6L1      .16
T3L1 vs T1L2      .09
T3L1 vs T1L1      .08 X
T1L1 vs T9L2      .29
T1L1 vs T6L2      .29
T1L1 vs T9L1      .28
T1L1 vs T3L2      .19
T1L1 vs T6L1      .08 X
T1L2 vs T9L2      .28
T1L2 vs T6L2      .28
T1L2 vs T9L1      .27
T1L2 vs T3L2      .18
T6L1 vs T9L2      .21
T6L1 vs T6L2      .21
T6L1 vs T9L1      .20
T6L1 vs T3L2      .11
T3L2 vs T9L2      .10
T3L2 vs T6L2      .10
T3L2 vs T9L1      .09
T9L1 vs T9L2      .01 X
Means joined by a common underline are not significantly different:
T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80
-------------                           ---------------------
        ---------------------
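The stepwise list in Table 3 can be cross-checked by brute force over all 28 pairs of cell means. The following is a minimal sketch, assuming the Table 2 means and the LSD of .086; it simply lists every pair whose difference does not exceed the LSD.

```python
from itertools import combinations

# mg/mL cell means from Table 2; labels follow the TxLy convention of Table 3.
means = {
    "T3L1": 30.1700000, "T1L1": 30.0900000, "T1L2": 30.0800000,
    "T6L1": 30.0066667, "T3L2": 29.9000000, "T9L1": 29.8066667,
    "T6L2": 29.8000000, "T9L2": 29.8000000,
}
LSD = 0.086  # from Table 3

# Pairs whose absolute difference does not exceed the LSD are not significant.
not_significant = [
    (a, b, abs(means[a] - means[b]))
    for a, b in combinations(means, 2)
    if abs(means[a] - means[b]) <= LSD
]

for a, b, diff in not_significant:
    print(f"{a} vs {b}: difference {diff:.2f} (not significant)")
```

Every other pair differs by more than the LSD, so T3L2 stands alone and the remaining means fall into the overlapping groups {T3L1, T1L1}, {T1L1, T1L2, T6L1}, and {T9L1, T6L2, T9L2}.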
Table 4. 2-Factor ANOVA - Ex 15.41, page 935 -- pH Data
The GLM Procedure
Dependent Variable: ph
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               7        0.42016250     0.06002321      21.47    <.0001
Error              16        0.04473333     0.00279583
Corrected Total    23        0.46489583
R-Square     Coeff Var     Root MSE     ph Mean
0.903778     1.429232      0.052876     3.699583
Source             DF       Type III SS    Mean Square    F Value    Pr > F
time                3        0.11444583     0.03814861      13.64    0.0001
lab                 1        0.29703750     0.29703750     106.24    <.0001
time*lab            3        0.00867917     0.00289306       1.03    0.4038
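The summary line of Table 4 (R-Square, Coeff Var, Root MSE) follows from the sums of squares and the overall mean. The sketch below is an illustrative check using values copied from the SAS output; the variable names are not part of the original analysis.

```python
import math

# Values copied from the SAS output in Table 4 (pH).
SS_MODEL = 0.42016250
SS_ERROR = 0.04473333
SS_TOTAL = 0.46489583
DF_ERROR = 16
PH_MEAN = 3.699583

mse = SS_ERROR / DF_ERROR                # error mean square
r_square = SS_MODEL / SS_TOTAL           # share of total variation explained by the model
root_mse = math.sqrt(mse)                # estimated error standard deviation
coeff_var = 100.0 * root_mse / PH_MEAN   # root MSE as a percentage of the mean

print(f"MSE = {mse:.8f}, R-Square = {r_square:.6f}")
print(f"Root MSE = {root_mse:.6f}, Coeff Var = {coeff_var:.4f}")
```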
Table 5. LSD Multiple Comparisons for pH
t Tests (LSD) for ph
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 16
Error Mean Square 0.002796
Critical Value of t 2.11991
Least Significant Difference 0.0647
Means with the same letter are not significantly different.
t Grouping        Mean      N    time
         A     3.80167      6    6
         B     3.71500      6    1
         B
    C    B     3.66833      6    9
    C
    C          3.61333      6    3
Alpha 0.05
Error Degrees of Freedom 16
Error Mean Square 0.002796
Critical Value of t 2.11991
Least Significant Difference 0.0458
Means with the same letter are not significantly different.
t Grouping Mean N lab
A 3.81083 12 2
B 3.58833 12 1
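The two LSD values in Table 5 differ only because the marginal means average different numbers of observations (6 per storage time, 12 per lab). The sketch below is a hedged illustration, not the PROC GLM algorithm: it recomputes both values from the quantities printed in Table 5 and lists the pairs of storage times whose pH means differ by more than the LSD.

```python
import math

T_CRIT = 2.11991   # critical t for 16 error df (Table 5)
MSE_PH = 0.002796  # error mean square for pH (Table 5)

def lsd(n_per_mean):
    """LSD for two means that each average n_per_mean observations."""
    return T_CRIT * math.sqrt(2.0 * MSE_PH / n_per_mean)

lsd_time = lsd(6)   # each time mean averages 2 labs x 3 replicates
lsd_lab = lsd(12)   # each lab mean averages 4 times x 3 replicates

# pH means by storage time (months), from Table 5.
time_means = {6: 3.80167, 1: 3.71500, 9: 3.66833, 3: 3.61333}

# Pairs of storage times whose means differ by more than the LSD.
sig_pairs = sorted(
    (a, b) for a in time_means for b in time_means
    if a < b and abs(time_means[a] - time_means[b]) > lsd_time
)

print(f"LSD(time) = {lsd_time:.4f}, LSD(lab) = {lsd_lab:.4f}")
print("Significantly different time pairs:", sig_pairs)
```

The pairs that are not flagged (1 vs 9 months and 3 vs 9 months) are exactly the ones sharing a letter in the t Grouping column.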
Figure 1. Interaction Plots for mg/mL
Figure 2. Diagnostic Plots for mg/mL (boxplots, probability plot, and histogram of residuals)
Figure 3. Interaction Plots for pH
Figure 4. Diagnostic Plots for pH (boxplots, probability plot, and histogram of residuals)
B. SAS Code
DATA one;
INPUT time lab mgml ph dual$;
datalines;
1 1 30.03 3.61 T1L1
1 1 30.10 3.60 T1L1
1 1 30.14 3.57 T1L1
3 1 30.10 3.50 T3L1
3 1 30.18 3.45 T3L1
3 1 30.23 3.48 T3L1
6 1 30.03 3.56 T6L1
6 1 30.03 3.74 T6L1
6 1 29.96 3.81 T6L1
9 1 29.81 3.60 T9L1
9 1 29.79 3.55 T9L1
9 1 29.82 3.59 T9L1
1 2 30.12 3.87 T1L2
1 2 30.10 3.80 T1L2
1 2 30.02 3.84 T1L2
3 2 29.90 3.70 T3L2
3 2 29.95 3.80 T3L2
3 2 29.85 3.75 T3L2
6 2 29.75 3.90 T6L2
6 2 29.85 3.90 T6L2
6 2 29.80 3.90 T6L2
9 2 29.75 3.77 T9L2
9 2 29.85 3.74 T9L2
9 2 29.80 3.76 T9L2
;
proc print data=one;
run;
proc boxplot data=one;
plot mgml*dual;
title 'Boxplots for mg/mL Data';
run;
proc boxplot data=one;
plot ph*dual;
title 'Boxplots for pH Data';
run;
PROC GLM data=one;
CLASS time lab;
MODEL mgml=time lab lab*time;
output out=newmgml r=residm;
means time lab/lsd;
TITLE '2-Factor ANOVA - Ex 15.41, page 935 -- mg/mL Data';
run;
PROC SORT data=one; BY time lab;
PROC MEANS mean std data=one; BY time lab; OUTPUT OUT=cells MEAN=mgml ph;
Title 'Cell Means for Ex 15.41, page 935 Data';
RUN;
PROC GPLOT data=cells;
PLOT mgml*time=lab;
Title 'Interaction Plot - mg/mL Data';
SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
SYMBOL2 V=DOT I=JOIN C=BLACK;
RUN;
PROC GPLOT;
PLOT mgml*lab=time;
SYMBOL1 V=1 I=JOIN C=BLACK;
SYMBOL2 V=3 I=JOIN C=BLACK;
SYMBOL3 V=6 I=JOIN C=BLACK;
SYMBOL4 V=9 I=JOIN C=BLACK;
RUN;
proc univariate data=newmgml normal plot;
var residm;
title 'Normal Probability Plot for Residuals - mg/mL Data';
run;
proc gchart data=newmgml;
title 'Histogram for mg/mL Residuals';
vbar residm;
run;
PROC SORT data=one; BY time lab;
PROC UNIVARIATE data=one plot;
var mgml; by time lab;
run;
PROC GLM data=one;
CLASS time lab;
MODEL ph=time lab lab*time;
output out=newph r=residph;
means time lab/lsd;
TITLE '2-Factor ANOVA - Ex 15.41, page 935 -- pH Data';
run;
PROC SORT data=one; BY time lab;
PROC GPLOT data=cells;
PLOT ph*time=lab;
Title 'Interaction Plot - pH Data';
SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
SYMBOL2 V=DOT I=JOIN C=BLACK;
RUN;
PROC GPLOT data=cells;
PLOT ph*lab=time;
SYMBOL1 V=1 I=JOIN C=BLACK;
SYMBOL2 V=3 I=JOIN C=BLACK;
SYMBOL3 V=6 I=JOIN C=BLACK;
SYMBOL4 V=9 I=JOIN C=BLACK;
RUN;
proc univariate data=newph normal plot;
var residph;
title 'Normal Probability Plot for Residuals - pH Data';
run;
proc gchart data=newph;
title 'Histogram for pH Residuals';
vbar residph;
run;
PROC SORT data=one; BY time lab;
PROC UNIVARIATE data=one plot;
var ph; by time lab;
*------+
| Generated: Tuesday, March 1, 2005 20:34:37 |
| Data: C:\DOCUME~1\00013961\LOCALS~1\Temp\SAS Temporary Files\_TD768\Newmgml |
+------*;
title;
footnote;
title1 "Probability Plot for mg/mL Residuals";
*** Probability Plots ***;
goptions ftext=SWISS ctext=BLACK htext=1 cells;
symbol v=SQUARE c=BLUE h=1 cells;
proc univariate data=Work.Newmgml noprint;
var RESIDM;
probplot / caxes=BLACK cframe=CXF7E1C2 waxis= 1
hminor=0 vminor=0 name='PROB'
normal( mu=est sigma=est color=BLUE l=1
w=1);
inset normal;
run;
symbol;
goptions ftext= ctext= htext=;
*------+
| Generated: Tuesday, March 1, 2005 20:38:45 |
| Data: C:\DOCUME~1\00013961\LOCALS~1\Temp\SAS Temporary Files\_TD768\Newph |
+------*;
title;
footnote;
title1 "Probability Plot for pH Residuals";
*** Probability Plots ***;
goptions ftext=SWISS ctext=BLACK htext=1 cells;
symbol v=SQUARE c=BLUE h=1 cells;
proc univariate data=Work.Newph noprint;
var RESIDPH;
probplot / caxes=BLACK cframe=CXF7E1C2 waxis= 1
hminor=0 vminor=0 name='PROB'
normal( mu=est sigma=est color=BLUE l=1
w=1);
inset normal;
run;
symbol;
goptions ftext= ctext= htext=;
run;