Script

ANOVA – Two Factor Models

Slide 1

  • Welcome back. In this module we discuss 2-factor ANOVA models.

Slide 2

  • In 2-factor models
  • we are interested in whether two factors, which we’ll call Factors A and B, act individually or interact to affect the outcome of a certain response variable.
  • Factor A has little a levels
  • And Factor B has little b levels
  • Thus the total number of treatments, that is, combinations of the levels of Factors A and B, is a times b.
  • So consider an experiment in which each possible observation falls into one and only one of these a times b treatments. We then select, randomly and independently, the same number of observations, r, from each of these treatments; r stands for “replications”. Thus this is a 2-factor experiment with r replications for each treatment.
  • Given that there are r observations from each of the a times b treatments, this means the total number sampled, n, is r times a times b.
  • We make a familiar set of assumptions in 2-factor ANOVA
  • We assume that the distribution of responses for each of the a times b treatments is normal
  • That the standard deviations of these treatments, although unknown, are equal
  • And, as we said, we select the samples randomly and independently from each treatment. These assumptions are summarized in the model notation shown below.
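One compact way to write the setup just described is the usual two-factor effects model with interaction; this notation is not on the slide and is included only as a summary of the assumptions above.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a,\quad j = 1,\dots,b,\quad k = 1,\dots,r
```

Here the errors \(\varepsilon_{ijk}\) are independent \(N(0,\sigma^2)\), so each of the a times b treatments is normal with the same (unknown) standard deviation, and the total sample size is \(n = rab\).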

Slide 3

  • The analysis is similar to that of a randomized block experiment, except that the replications tend to reduce the variability even further, since selecting more than one observation for each treatment provides more information. Let’s see now how the sum of squares and degrees of freedom are partitioned for this case.
  • We get the Total sum of squares, SST, in the usual way of subtracting the grand mean from all observations, squaring them and summing them all up; and the total degrees of freedom, as always, is n minus 1, which is r times a times b minus 1.
  • Now these are divided into two parts: Treatment, with sums of squares due to treatment, SSTr, and degrees of freedom due to treatment equal to the number of treatments, a times b, minus 1
  • And Error, with sums of squares equal to the total sums of squares minus the treatment sums of squares, and degrees of freedom equal to the total degrees of freedom, n minus 1, minus the treatment degrees of freedom, a times b minus 1. This can be simplified to a times b times the quantity r minus 1.
  • But Treatment can be further broken down into Factor A, with sums of squares SSA and degrees of freedom a minus 1
  • Factor B, with sums of squares SSB and degrees of freedom b minus 1
  • And interaction effects, I, with sums of squares equal to the difference between the treatment sums of squares and the Factor A and Factor B sums of squares, and degrees of freedom equal to the Treatment degrees of freedom minus the Factor A and Factor B degrees of freedom. This can be shown to simplify to the quantity a minus 1 times the quantity b minus 1. These relationships are written out as equations below.
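For reference, here is the same partition written as equations. The symbol SSI for the interaction sum of squares is our shorthand; the slide simply calls it the sum of squares for I.

```latex
\begin{aligned}
SST  &= SSTr + SSE,        & df_{\text{Total}} &= n - 1 = rab - 1,\\
SSTr &= SSA + SSB + SSI,   & df_{\text{Tr}}    &= ab - 1,\\
SSE  &= SST - SSTr,        & df_{\text{E}}     &= (n - 1) - (ab - 1) = ab(r - 1),\\
SSI  &= SSTr - SSA - SSB,  & df_{\text{I}}     &= (ab - 1) - (a - 1) - (b - 1) = (a - 1)(b - 1),
\end{aligned}
```

with degrees of freedom a minus 1 for Factor A and b minus 1 for Factor B.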

Slide 4

  • So the resulting ANOVA table is constructed as follows:
  • The total sum of squares equals the treatment sum of squares plus the sum of squares due to error
  • But since treatment sum of squares is further broken down into Factor A sum of squares, Factor B sum of squares and interaction sum of squares
  • The table begins by calculating the sums of squares and degrees of freedom for each of these quantities. The mean square values for Factors A and B and for interaction are found by dividing their respective sums of squares by their degrees of freedom
  • Note that the overall sample size n is r times a times b, so that the total degrees of freedom is r times a times b minus 1.
  • Now that these are in place, the sums of squares due to error can be calculated by
  • subtracting the sums of squares for factors A and B and for interaction from the total sums of squares.
  • The degrees of freedom due to error are found by subtracting the degrees of freedom for Factors A, B, and interaction from the total degrees of freedom. Finally, the mean square due to error is found by dividing the sums of squares due to error by the degrees of freedom due to error. These table calculations are sketched in the short example below.
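As a rough illustration of how the table’s numbers fit together, here is a minimal sketch in Python. It is not Excel’s output; the values of a, b, r and the sums of squares are made-up placeholders, not numbers from the slides.

```python
# Sketch: assembling the two-factor ANOVA table from its pieces.
# a, b, r and the sums of squares below are placeholders, not data from the slides.

a, b, r = 4, 3, 4                 # levels of Factor A, Factor B, and replications
n = r * a * b                     # total number of observations

SST, SSA, SSB, SSI = 1000.0, 250.0, 300.0, 50.0   # illustrative sums of squares
SSE = SST - (SSA + SSB + SSI)                     # error SS found by subtraction

df_A, df_B = a - 1, b - 1
df_I = (a - 1) * (b - 1)
df_E = a * b * (r - 1)            # equals (n - 1) - (ab - 1)
df_T = n - 1                      # = df_A + df_B + df_I + df_E

# Mean squares are sums of squares divided by their degrees of freedom.
MSA, MSB, MSI, MSE = SSA / df_A, SSB / df_B, SSI / df_I, SSE / df_E

rows = [("Factor A", SSA, df_A, MSA), ("Factor B", SSB, df_B, MSB),
        ("Interaction", SSI, df_I, MSI), ("Error", SSE, df_E, MSE)]
for name, ss, df, ms in rows:
    print(f"{name:12s}  SS={ss:8.2f}  df={df:3d}  MS={ms:8.2f}")
print(f"{'Total':12s}  SS={SST:8.2f}  df={df_T:3d}")
```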

Slide 5

  • So here’s how we do our two-factor analysis.
  • First,
  • We ask the question whether two factors interact to affect the mean values.
  • This is an ordinary F-test of Mean Square Interaction over Mean Square Error compared to F sub alpha with degrees of freedom for interaction in the numerator and the degrees of freedom for error in the denominator
  • If the answer is YES, we stop right there. We conclude these factors interact to cause changes in the mean values and conclude our analysis with this observation.
  • But if we cannot conclude that there is interaction (notice we didn’t say we do conclude that there is not interaction) we test whether
  • Factor A individually affects the mean values
  • This is an F-test of Factor A’s Mean Square over Mean Square Error compared to F sub alpha with degrees of freedom for Factor A in the numerator and the degrees of freedom for error in the denominator
  • And we test whether Factor B individually affects the mean values
  • This is an F-test of Factor B’s Mean Square over Mean Square Error compared to F sub alpha with degrees of freedom for Factor B in the numerator and the degrees of freedom for error in the denominator. This decision procedure is sketched in the code below.
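A minimal sketch of that test-interaction-first procedure, assuming the mean squares and degrees of freedom are already in hand. The numbers here are illustrative only, and scipy’s F distribution supplies the p-values.

```python
# Sketch of the three F-tests, testing interaction first.
# The mean squares and degrees of freedom below are illustrative placeholders.
from scipy import stats

alpha = 0.05
df_A, df_B, df_I, df_E = 3, 2, 6, 36            # e.g. a = 4, b = 3, r = 4
MSA, MSB, MSI, MSE = 83.3, 150.0, 8.3, 10.0     # made-up mean squares

# Step 1: test whether the factors interact.
F_I = MSI / MSE
p_I = stats.f.sf(F_I, df_I, df_E)               # right-tail p-value
if p_I <= alpha:
    print(f"Interaction is significant (p = {p_I:.4f}); stop the analysis here.")
else:
    # Step 2: no evidence of interaction, so test each factor individually.
    F_A, F_B = MSA / MSE, MSB / MSE
    print(f"Factor A: F = {F_A:.2f}, p = {stats.f.sf(F_A, df_A, df_E):.4f}")
    print(f"Factor B: F = {F_B:.2f}, p = {stats.f.sf(F_B, df_B, df_E):.4f}")
```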

Slide 6

  • We now look at two examples. The calculations for the sums of squares are relatively straightforward but cumbersome to do by hand, so we will only illustrate the calculations and analyses using Excel. In the first example,
  • Suppose we wish to know if diet programs and exercise affect weight loss in men.
  • The experiment we will be doing is a 2-factor experiment where the two factors are:
  • Diet and Exercise programs
  • We will look at a = 4 different diet plans: no diet, a low calorie diet, a low carbohydrate diet, and a modified liquid diet
  • And we will look at b = 3 different levels of exercise: no exercise, exercising for an hour three times a week, and exercising for an hour daily.
  • We will sample four men randomly from each diet-exercise treatment – that is, four men who have no diet and do no exercise, four men who exercise 3 times a week and are on the low-carb diet, four who are on the modified liquid diet and exercise daily, and so forth. So we will have 4 replications of each of the 4 times 3, or twelve, different treatments – giving us 48 total observations.
  • The response variable we will be looking at is the weight loss in these men over a 3-month period. Note a weight gain will be recorded as a negative weight loss.

Slide 7

  • Here we see the systematic input for the four observations of each of the 12 diet-exercise combinations.
  • To perform these analyses in Excel, we select ANOVA 2-factor with replication from the DATA ANALYSIS entry on the TOOLS menu.
  • This brings up this simple dialogue box. But unlike previous dialogue boxes, we have no choice this time –
  • We MUST include one row and one column of labels. Now we have two rows of labels, rows 1 and 2, so we must start our input in row 2 with cell A2 – and we highlight all the data from that cell down to cell D18
  • “Rows per sample” is Excel’s way of asking how many replications there are for each treatment – in this case we have 4 observations for each of the 12 treatments. We also designate a cell where we wish the output to begin, in this case, cell A21. For those working outside Excel, an equivalent analysis is sketched below.
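If Excel’s Analysis ToolPak is not available, the same kind of two-factor-with-replication analysis can be run in Python with pandas and statsmodels. The column names and the randomly generated weight-loss numbers below are stand-ins for illustration; they are not the data shown on the slide.

```python
# Sketch of a two-factor ANOVA with replication outside Excel, using
# pandas + statsmodels. The data are randomly generated stand-ins with the
# same shape as the example (4 diets x 3 exercise levels x 4 men = 48 rows).
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)
diets = ["none", "low_cal", "low_carb", "liquid"]
exercise = ["none", "3x_week", "daily"]
rows = [(d, e, rng.normal(5, 2))          # fake weight loss in pounds
        for d in diets for e in exercise for _ in range(4)]
df = pd.DataFrame(rows, columns=["diet", "exercise", "loss"])

# Model with both main effects and their interaction, analogous to
# Excel's "Anova: Two-Factor With Replication".
model = ols("loss ~ C(diet) * C(exercise)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))    # SS, df, F statistics and p-values
```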

Slide 8

  • Now the output gives a lot of interesting data beginning in cell A21, but to draw conclusions about our hypotheses, we need only look at the p-values for the corresponding tests.
  • This time Excel calls the sources of variation Sample, Columns, Interaction and Within. We know that what we have in the columns are the exercise programs.
  • This means that Sample refers to the diets.
  • We know what interaction means, so “Within”, this time, is the error.
  • The first thing we check is the p-value for interaction. .486255 is a large p-value, so from this study we cannot conclude that, for men, diet and exercise interact to affect weight loss.
  • Now let’s look at the individual factors. The p-value for diet is .118354. This is also large, so we cannot conclude that diet alone affects weight loss in men either.
  • But we see that the p-value for exercise is low, .000168, so from this study the conclusion is that for men, exercise alone affects weight loss.

Slide 9

  • In the second example, suppose we wish to know if diet programs and exercise affect weight loss in women.
  • Again this is a 2-factor experiment where the two factors are:
  • Diet and Exercise programs
  • We will again look at a = 4 different diet plans
  • And b = 3 different exercise levels
  • We will sample four women randomly from each diet-exercise treatment
  • The response variable we will be looking at is the weight loss in these women over a 3-month period.

Slide 10

  • Here we see the systematic input for the women
  • We again select ANOVA 2-factor with replication from the DATA ANALYSIS entry on the TOOLS menu.
  • Fill in the dialogue box in the same way
  • Including one row and one column of labels, so that the input range is from cell A2 down to cell D18
  • Again there are four replications. And we designate cell A21 as the cell where we wish the output to begin.

Slide 11

  • Again the output gives a lot of interesting information, but we will concentrate only on the p-values for our hypothesis tests.
  • Again, Columns refers to exercise.
  • Sample refers to the diets
  • And “Within” is the Error
  • We begin again by looking at the p-value for interaction. This time it is a low p-value of .02979, which is less than an alpha value of .05. So, based on this study, we can conclude that, for women, diet and exercise interact to affect weight loss.
  • Since we do conclude that there is interaction, our analysis ends here.

Slide 12

  • Let’s review what we’ve discussed in this module.
  • In this module we have focused on 2-factor designs
  • Where the population means could be affected by Factor A or Factor B individually, or by interaction between Factors A and B.
  • We listed the assumptions for these 2-factor experiments
  • We showed how to calculate the degrees of freedom for factors A, B, Interaction, and Error
  • And the relationship between the sums of squares for these quantities.
  • And we reiterated that mean square values are obtained by dividing sums of squares by degrees of freedom
  • We said that the approach is to
  • First check for interaction; if we can conclude that the factors interact to affect the response variable, we conclude our analysis there.
  • But if we cannot conclude that there is interaction, we perform two separate F-tests to determine whether the factors individually affect the response variable.
  • These analyses were illustrated in Excel using its Anova: 2 Factor With Replication entry in DATA ANALYSIS.

That’s it for this module. Do any assigned homework and I’ll be back to talk to you again next time.