Statistics 5372: Experimental Statistics

Homework Due March 1 -- Key

Data Set/Problem Description

A 2-factor fixed effects analysis of variance was run to examine the stability of a drug product stored for 4 storage times (1, 3, 6, and 9 months) in each of 2 labs. At the end of each storage time, the vials were analyzed for mg/mL of the active ingredient and for pH in order to assess stability. A 2-factor ANOVA was run for each variable. There were 3 replicates for each laboratory/storage time combination. The data are given on page 935 of the text.

Variable 1: Active Ingredient (in mg/mL) at End of Storage Period

Key Results of the Analysis

The analysis of variance table (Table 1) for the analysis of mg/mL indicates that there is a significant interaction between lab and storage time (p = .0003). Thus, we do not test for main effects, but instead we compare the cell means. The cell means for both variables are given in Table 2. The LSD for comparing cell means at the $\alpha = .05$ level of significance is

$$\mathrm{LSD} = t_{\alpha/2,\,16}\sqrt{\frac{2\,\mathrm{MSE}}{n}} = 2.120\sqrt{\frac{2(.002446)}{3}} = .086.$$

The cell means are given in Table 2 and are plotted via interaction plots in Figure 1. In Table 3 we provide the LSD multiple comparison calculations and a display of the resulting significant differences. The interaction plots give the impression that the choice of lab storage facility makes more difference for 3 and 6 month storage than it does for 1 or 9 month storage. From the multiple comparisons, it can be seen that the lab 1 reading at 3 months is significantly higher than all other readings except lab 1 at 1 month. The lowest 3 readings were from both labs at 9 months and lab 2 at 6 months. The lab 2 reading at 3 months was significantly different from all other readings. The 1 month readings from the two labs were not significantly different from each other or from the 6 month reading from lab 1.
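The LSD comparisons summarized above can be rechecked outside SAS. The following Python sketch is only an illustration and was not part of the original analysis. It takes the error mean square from Table 1, the cell means from Table 2, and the critical t value that SAS reports for 16 error degrees of freedom in Table 5. All variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Inputs copied from the report: error mean square (Table 1), cell means
# (Table 2), and the critical t value SAS prints for 16 error df (Table 5).
mse = 0.00244583
n = 3             # replicates per cell
t_crit = 2.11991  # two-sided critical value, alpha = 0.05, 16 df

cell_means = {
    "T3L1": 30.17, "T1L1": 30.09, "T1L2": 30.08, "T6L1": 30.0066667,
    "T3L2": 29.90, "T9L1": 29.8066667, "T6L2": 29.80, "T9L2": 29.80,
}

lsd = t_crit * sqrt(2 * mse / n)

# A pair of cells is declared different when the absolute difference
# between its means exceeds the LSD; collect the pairs that do not.
not_significant = [
    (a, b) for a, b in combinations(cell_means, 2)
    if abs(cell_means[a] - cell_means[b]) <= lsd
]

print(f"LSD = {lsd:.3f}")
for a, b in not_significant:
    print(f"{a} vs {b}: not significantly different")
```

Under these inputs, every pair that involves T3L2 exceeds the LSD, which agrees with the statement that the lab 2 reading at 3 months differs from all other readings.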

In Figure 2 we present side-by-side boxplots of the data as well as a probability plot and histogram of the residuals. The boxplots show fairly consistent variation and suggest that equality of variances is a reasonable assumption. The probability plot and especially the histogram suggest that the normality assumption may be questionable.
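The equal-variance assumption matters because the error mean square in Table 1 is a pooled estimate of the within-cell variance. As a rough illustration (a sketch with hypothetical variable names, not part of the original SAS run), the mg/mL cell standard deviations in Table 2 can be pooled by hand to recover the Error line of Table 1.

```python
# Cell standard deviations for mg/mL, copied from Table 2 in the order
# time = 1, 3, 6, 9 with lab 1 listed before lab 2 within each time.
cell_sds = [0.0556776, 0.0529150, 0.0655744, 0.0500000,
            0.0404145, 0.0500000, 0.0152753, 0.0500000]
n = 3  # replicates per cell

# With equal cell sizes, the error sum of squares is the sum of
# (n - 1) * s^2 over the cells, and the error df is cells * (n - 1).
sse = sum((n - 1) * s ** 2 for s in cell_sds)
error_df = len(cell_sds) * (n - 1)
mse = sse / error_df

print(f"SSE = {sse:.6f}, error df = {error_df}, MSE = {mse:.6f}")
```

The pooled value should agree with the Error sum of squares and mean square in Table 1 up to the rounding of the tabled standard deviations.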

Conclusions in the Language of the Problem

There is very little difference between the two labs after 1 and 9 months. However, at 3 and 6 months, lab 2 storage resulted in significantly less active ingredient, suggesting that lab 2 is less desirable. Also apparent is the general decline in active ingredient as storage time increases, as would be expected. It is interesting to note that the curious increase in active ingredient from 1 month to 3 months in lab 1 is not significant.

Variable 2: pH

Key Results of the Analysis

The analysis of variance table (Table 4) for the analysis of pH shows that there is not a significant interaction between lab and storage time (p = .4038), while there is a significant difference between labs (p < .0001) and among times (p = .0001). The interaction plot, shown in Figure 3, agrees with the lack of significant interaction in that no distinct patterns of interaction appear. Since there are only two labs, there is no need to perform multiple comparisons on labs. However, these are shown in Table 5, where it can be seen that lab 2 storage results in significantly higher pH readings. The comparisons of times given in Table 5 show that the pH at 6 months is significantly higher than at the other 3 time periods and that the 1 month reading is significantly higher than the 3 month reading.

In Figure 4 we present side-by-side boxplots of the data as well as a probability plot and histogram of the residuals. The boxplots show some dramatic differences among variances. However, a definitive assessment of the equality of variances is not available because of the small number of replicates (i.e., n = 3). The histogram of the residuals looks fairly bell-shaped, but the probability plot does suggest possible non-normality.

Conclusions in the Language of the Problem

In general, storage in lab 2 seems to consistently result in higher pH readings than storage in lab 1. The storage time comparisons of pH readings give confusing results in that the 6-month reading is significantly higher than all others. There was no consistent pattern of decline or increase in pH over time as might have been expected. If lower pH is desirable (?), then these results, along with those for the mg/mL of active ingredient, suggest that lab 1 is the preferred storage facility.

Appendices:

A. Tables and Figures Cited in the Report

Table 1. 2-Factor ANOVA - Ex 15.41, page 935 -- mg/mL Data

The GLM Procedure

Dependent Variable: mgml

                            Sum of
Source             DF      Squares   Mean Square   F Value   Pr > F
Model               7   0.46740000    0.06677143     27.30   <.0001
Error              16   0.03913333    0.00244583
Corrected Total    23   0.50653333

R-Square   Coeff Var   Root MSE   mgml Mean
0.922743    0.165090   0.049455    29.95667

Source     DF   Type III SS   Mean Square   F Value   Pr > F
time        3    0.29376667    0.09792222     40.04   <.0001
lab         1    0.09126667    0.09126667     37.32   <.0001
time*lab    3    0.08236667    0.02745556     11.23   0.0003
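The entries of Table 1 are tied together by simple arithmetic: each F value is the effect mean square divided by the error mean square, and R-Square is the model sum of squares divided by the corrected total. The short Python sketch below, with hypothetical variable names and not taken from the original SAS run, recomputes these quantities from the tabled values.

```python
# Values copied from Table 1 (mg/mL data).
ms_error = 0.00244583
mean_squares = {
    "time": 0.09792222,
    "lab": 0.09126667,
    "time*lab": 0.02745556,
}

# Each F statistic is the effect mean square over the error mean square.
f_values = {effect: ms / ms_error for effect, ms in mean_squares.items()}

# R-Square is the share of the corrected total sum of squares
# that is explained by the model.
ss_model, ss_total = 0.46740000, 0.50653333
r_square = ss_model / ss_total

for effect, f in f_values.items():
    print(f"{effect:9s} F = {f:.2f}")
print(f"R-Square = {r_square:.6f}")
```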

Table 2. Cell Means for Ex 15.41, page 935

The MEANS Procedure

                     mgml                        ph
time   lab         Mean     Std Dev         Mean     Std Dev
   1     1   30.0900000   0.0556776    3.5933333   0.0208167
   1     2   30.0800000   0.0529150    3.8366667   0.0351188
   3     1   30.1700000   0.0655744    3.4766667   0.0251661
   3     2   29.9000000   0.0500000    3.7500000   0.0500000
   6     1   30.0066667   0.0404145    3.7033333   0.1289703
   6     2   29.8000000   0.0500000    3.9000000   0
   9     1   29.8066667   0.0152753    3.5800000   0.0264575
   9     2   29.8000000   0.0500000    3.7566667   0.0152753
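For readers who want to confirm the mg/mL column of Table 2 without SAS, the cell summaries can be recomputed from the raw readings listed in the DATA step of Appendix B. This is a minimal Python sketch with hypothetical names. It uses the sample standard deviation (n - 1 divisor), which is also what PROC MEANS reports by default.

```python
from statistics import mean, stdev

# mg/mL readings copied from the DATA step in Appendix B,
# keyed by (storage time in months, lab).
mgml = {
    (1, 1): [30.03, 30.10, 30.14],
    (1, 2): [30.12, 30.10, 30.02],
    (3, 1): [30.10, 30.18, 30.23],
    (3, 2): [29.90, 29.95, 29.85],
    (6, 1): [30.03, 30.03, 29.96],
    (6, 2): [29.75, 29.85, 29.80],
    (9, 1): [29.81, 29.79, 29.82],
    (9, 2): [29.75, 29.85, 29.80],
}

# Sample mean and sample standard deviation for each cell.
cell_stats = {cell: (mean(vals), stdev(vals)) for cell, vals in mgml.items()}

for (time, lab), (m, s) in sorted(cell_stats.items()):
    print(f"time={time} lab={lab}  mean={m:.7f}  sd={s:.7f}")
```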

Table 3. Calculations for LSD comparisons of mg/mL Cell Means

T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80

Comparison      Actual Difference (lsd = .086)
(X marks a difference smaller than the lsd, i.e., not significant)

T3L1 vs T9L2    .37
T3L1 vs T6L2    .37
T3L1 vs T9L1    .36
T3L1 vs T3L2    .27
T3L1 vs T6L1    .16
T3L1 vs T1L2    .09
T3L1 vs T1L1    .08 X

T1L1 vs T9L2    .29
T1L1 vs T6L2    .29
T1L1 vs T9L1    .28
T1L1 vs T3L2    .19
T1L1 vs T6L1    .08 X

T1L2 vs T9L2    .28
T1L2 vs T6L2    .28
T1L2 vs T9L1    .27
T1L2 vs T3L2    .18

T6L1 vs T9L2    .21
T6L1 vs T6L2    .21
T6L1 vs T9L1    .20
T6L1 vs T3L2    .11

T3L2 vs T9L2    .10
T3L2 vs T6L2    .10
T3L2 vs T9L1    .09

T9L1 vs T9L2    .01 X

Means joined by a common line are not significantly different:

T3L1    T1L1    T1L2    T6L1    T3L2    T9L1    T6L2    T9L2
30.17   30.09   30.08   30.01   29.90   29.81   29.80   29.80
-------------                           ---------------------
        ---------------------

Table 4. 2-Factor ANOVA - Ex 15.41, page 935 -- pH Data

The GLM Procedure

Dependent Variable: ph

                            Sum of
Source             DF      Squares   Mean Square   F Value   Pr > F
Model               7   0.42016250    0.06002321     21.47   <.0001
Error              16   0.04473333    0.00279583
Corrected Total    23   0.46489583

R-Square   Coeff Var   Root MSE   ph Mean
0.903778    1.429232   0.052876   3.699583

Source     DF   Type III SS   Mean Square   F Value   Pr > F
time        3    0.11444583    0.03814861     13.64   0.0001
lab         1    0.29703750    0.29703750    106.24   <.0001
time*lab    3    0.00867917    0.00289306      1.03   0.4038

Table 5. LSD Multiple Comparisons for pH

t Tests (LSD) for ph

NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Comparisons among storage times:

Alpha                              0.05
Error Degrees of Freedom             16
Error Mean Square              0.002796
Critical Value of t             2.11991
Least Significant Difference     0.0647

Means with the same letter are not significantly different.

t Grouping       Mean     N   time
         A    3.80167     6   6
         B    3.71500     6   1
         B
    C    B    3.66833     6   9
    C
    C         3.61333     6   3

Comparisons between labs:

Alpha                              0.05
Error Degrees of Freedom             16
Error Mean Square              0.002796
Critical Value of t             2.11991
Least Significant Difference     0.0458

Means with the same letter are not significantly different.

t Grouping       Mean     N   lab
         A    3.81083    12   2
         B    3.58833    12   1
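The two least significant differences in Table 5 differ only because the time means and the lab means are averages of different numbers of readings (6 and 12, respectively). The Python sketch below is an illustrative check, not part of the original SAS output, and its names are hypothetical. It recomputes both values from the tabled error mean square and critical t, then lists the pairs of storage times whose means differ by more than the LSD.

```python
from itertools import combinations
from math import sqrt

# Values copied from Table 5.
mse = 0.002796    # error mean square for pH
t_crit = 2.11991  # critical t, alpha = 0.05, 16 error df


def lsd(n_per_mean):
    """LSD for comparing two means that each average n_per_mean readings."""
    return t_crit * sqrt(2 * mse / n_per_mean)


lsd_time = lsd(6)   # each time mean: 2 labs x 3 replicates
lsd_lab = lsd(12)   # each lab mean: 4 times x 3 replicates

# Marginal pH means by storage time (months), in the order of Table 5.
time_means = {6: 3.80167, 1: 3.71500, 9: 3.66833, 3: 3.61333}

significant_pairs = [
    (a, b) for a, b in combinations(time_means, 2)
    if abs(time_means[a] - time_means[b]) > lsd_time
]

print(f"LSD (times) = {lsd_time:.4f}, LSD (labs) = {lsd_lab:.4f}")
print("Significantly different time pairs:", significant_pairs)
```

The resulting pairs match the letter groupings in Table 5: 6 months differs from every other time, and 1 month differs from 3 months.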

Figure 1. Interaction Plots for mg/mL

Histogram for mg/mL Residuals

Figure 2. Diagnostic Plots for mg/mL

Figure 3. Interaction Plots for pH

Histogram for pH Residuals

Figure 4. Diagnostic Plots for pH

B. SAS Code

DATA one;

INPUT time lab mgml ph dual$;

datalines;

1 1 30.03 3.61 T1L1

1 1 30.10 3.60 T1L1

1 1 30.14 3.57 T1L1

3 1 30.10 3.50 T3L1

3 1 30.18 3.45 T3L1

3 1 30.23 3.48 T3L1

6 1 30.03 3.56 T6L1

6 1 30.03 3.74 T6L1

6 1 29.96 3.81 T6L1

9 1 29.81 3.60 T9L1

9 1 29.79 3.55 T9L1

9 1 29.82 3.59 T9L1

1 2 30.12 3.87 T1L2

1 2 30.10 3.80 T1L2

1 2 30.02 3.84 T1L2

3 2 29.90 3.70 T3L2

3 2 29.95 3.80 T3L2

3 2 29.85 3.75 T3L2

6 2 29.75 3.90 T6L2

6 2 29.85 3.90 T6L2

6 2 29.80 3.90 T6L2

9 2 29.75 3.77 T9L2

9 2 29.85 3.74 T9L2

9 2 29.80 3.76 T9L2

;

proc print data=one;

run;

proc boxplot data=one;
  plot mgml*dual;
  title 'Boxplots for mg/mL Data';
run;

proc boxplot data=one;
  plot ph*dual;
  title 'Boxplots for pH Data';
run;

PROC GLM data=one;
  CLASS time lab;
  MODEL mgml=time lab lab*time;
  output out=newmgml r=residm;
  means time lab/lsd;
  TITLE '2-Factor ANOVA - Ex 15.41, page 935 -- mg/mL Data';
run;

PROC SORT data=one; BY time lab;

PROC MEANS mean std data=one; BY time lab; OUTPUT OUT=cells MEAN=mgml ph;
  Title 'Cell Means for Ex 15.41, page 935 Data';
RUN;

PROC GPLOT data=cells;
  PLOT mgml*time=lab;
  Title 'Interaction Plot - mg/mL Data';
  SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
  SYMBOL2 V=DOT I=JOIN C=BLACK;
RUN;

PROC GPLOT;
  PLOT mgml*lab=time;
  SYMBOL1 V=1 I=JOIN C=BLACK;
  SYMBOL2 V=3 I=JOIN C=BLACK;
  SYMBOL3 V=6 I=JOIN C=BLACK;
  SYMBOL4 V=9 I=JOIN C=BLACK;
RUN;

proc univariate data=newmgml normal plot;
  var residm;   /* residual variable created by the OUTPUT statement in PROC GLM */
  title 'Normal Probability Plot for Residuals - mg/mL Data';
run;

proc gchart data=newmgml;
  title 'Histogram for mg/mL Residuals';
  vbar residm;
run;

PROC SORT data=one; BY time lab;

PROC UNIVARIATE data=one plot;
  var mgml; by time lab;
run;

PROC GLM data=one;
  CLASS time lab;
  MODEL ph=time lab lab*time;
  output out=newph r=residph;
  means time lab/lsd;
  TITLE '2-Factor ANOVA - Ex 15.41, page 935 -- pH Data';
run;

PROC SORT data=one; BY time lab;

PROC GPLOT data=cells;
  PLOT ph*time=lab;
  Title 'Interaction Plot - pH Data';
  SYMBOL1 V=CIRCLE I=JOIN C=BLACK;
  SYMBOL2 V=DOT I=JOIN C=BLACK;
RUN;

PROC GPLOT data=cells;
  PLOT ph*lab=time;
  SYMBOL1 V=1 I=JOIN C=BLACK;
  SYMBOL2 V=3 I=JOIN C=BLACK;
  SYMBOL3 V=6 I=JOIN C=BLACK;
  SYMBOL4 V=9 I=JOIN C=BLACK;
RUN;

proc univariate data=newph normal plot;
  var residph;   /* pH residuals from the second PROC GLM (out=newph r=residph) */
  title 'Normal Probability Plot for Residuals - pH Data';
run;

proc gchart data=newph;
  title 'Histogram for pH Residuals';
  vbar residph;
run;

PROC SORT data=one; BY time lab;

PROC UNIVARIATE data=one plot;
  var ph; by time lab;
run;

*------+

| Generated: Tuesday, March 1, 2005 20:34:37 |

| Data: C:\DOCUME~1\00013961\LOCALS~1\Temp\SAS Temporary Files\_TD768\Newmgml |

+------*;

title;
footnote;
title1 "Probability Plot for mg/mL Residuals";

*** Probability Plots ***;
goptions ftext=SWISS ctext=BLACK htext=1 cells;
symbol v=SQUARE c=BLUE h=1 cells;

proc univariate data=Work.Newmgml noprint;
  var RESIDM;
  probplot / caxes=BLACK cframe=CXF7E1C2 waxis=1
             hminor=0 vminor=0 name='PROB'
             normal(mu=est sigma=est color=BLUE l=1 w=1);
  inset normal;
run;

symbol;
goptions ftext= ctext= htext=;

*------+

| Generated: Tuesday, March 1, 2005 20:38:45 |

| Data: C:\DOCUME~1\00013961\LOCALS~1\Temp\SAS Temporary Files\_TD768\Newph |

+------*;

title;
footnote;
title1 "Probability Plot for pH Residuals";

*** Probability Plots ***;
goptions ftext=SWISS ctext=BLACK htext=1 cells;
symbol v=SQUARE c=BLUE h=1 cells;

proc univariate data=Work.Newph noprint;
  var RESIDPH;
  probplot / caxes=BLACK cframe=CXF7E1C2 waxis=1
             hminor=0 vminor=0 name='PROB'
             normal(mu=est sigma=est color=BLUE l=1 w=1);
  inset normal;
run;

symbol;
goptions ftext= ctext= htext=;
run;