Fall, 2015          Thursday, Nov. 19

Stat 217 – Day 33

Analysis of Variance (Comparing Several Means)

Names:

Investigation 8 (To be submitted, with partner, by start of class Tuesday):

Because about two-thirds of Americans are considered overweight, weight loss is big business. There are many different types of diets, but do some work better than others? Is low fat better than low carb, or is some combination best? Gardner et al. (2007) conducted a study similar to the one we saw yesterday. Women aged 25 to 50 with body mass indices (BMIs) of 27 to 40 (overweight and obese) were randomly assigned to one of four diets: Atkins (very low in carbohydrates), Zone (40:30:30 ratio of carbs, protein, fat; low in carbohydrates), Ornish (low fat), and LEARN (Lifestyle, Exercise, Attitudes, Relationships, and Nutrition; low in fat, based on national guidelines).

The 311 women who volunteered for the program were educated on their assigned diet and were observed periodically as they stayed on the diet for a year. At the end of the year, the researchers calculated the changes (pre-post) in BMI (kg/m²) for each woman and compared the results across the four diets.

(a) Open the DietsBMI.txt file and copy and paste into the Multiple Means applet. Check the box to show boxplots as well. Include a screen capture of your graphs and the descriptive statistics. Summarize what you learn about the differences in the change in BMI among these four diets.

(b) Compare this to the results in yesterday’s handout. Do you think this study will be more or less statistically significant? Explain your reasoning.

(c) Compute the max - min statistic for these data. (Show your work.)
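If you want to check your hand calculation, the max - min statistic can be sketched in a few lines of Python. The numbers below are hypothetical placeholders, not the DietsBMI.txt values, and the function name is only for illustration.

```python
from statistics import mean

# Hypothetical changes in BMI for each diet (NOT the DietsBMI.txt data).
groups = {
    "Atkins": [2.0, 1.5, 3.0, 0.5],
    "Zone":   [0.5, 1.0, 0.0, 0.5],
    "LEARN":  [1.0, 1.5, 0.5, 1.0],
    "Ornish": [1.0, 0.0, 1.5, 0.5],
}

def max_min_statistic(groups):
    """Largest group mean minus smallest group mean."""
    means = [mean(values) for values in groups.values()]
    return max(means) - min(means)

# Show each group mean, then the statistic itself.
for name, values in groups.items():
    print(name, mean(values))
print("max - min:", max_min_statistic(groups))
```

Replacing the placeholder lists with the values from the data file should give a number you can compare with your own work.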

(d) Use the applet to carry out a randomization test to assess the statistical significance of your observed statistic:

  • Select the Max-Min from the Statistic pull-down menu and verify your calculation.
  • Check the Show Shuffle Options box.
  • Select the Plot radio button.
  • Press Shuffle Responses.

Notice that the response outcomes are shuffled and reassigned at random to the four groups. Which diet group does the lowest change in BMI (largest negative value) end up in after your reassignment? Is that the same as in the original study?

Now

  • Change the Number of Shuffles to 999 and press Shuffle Responses.
  • Use the Count Samples box to approximate the p-value for this study.

Include a screen capture of your results (null distribution and p-value).
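The shuffling steps above can be sketched in Python. This is a minimal illustration of a randomization test, not the applet's actual code: the data are hypothetical, the function names are made up for this sketch, and the p-value is taken to be the proportion of shuffled statistics at least as large as the observed one.

```python
import random
from statistics import mean

def max_min(values, labels):
    """Largest group mean minus smallest group mean."""
    by_group = {}
    for value, label in zip(values, labels):
        by_group.setdefault(label, []).append(value)
    means = [mean(vals) for vals in by_group.values()]
    return max(means) - min(means)

def randomization_p_value(values, labels, statistic, n_shuffles=999, seed=0):
    """Shuffle the responses across the groups and count how often the
    shuffled statistic is at least as large as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = statistic(values, labels)
    shuffled = list(values)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # reassign responses at random
        # Small tolerance guards against floating-point rounding.
        if statistic(shuffled, labels) >= observed - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles

# Hypothetical changes in BMI (NOT the DietsBMI.txt data).
values = [2.0, 1.5, 3.0, 0.5, 0.5, 1.0, 0.0, 0.5,
          1.0, 1.5, 0.5, 1.0, 1.0, 0.0, 1.5, 0.5]
labels = ["Atkins"] * 4 + ["Zone"] * 4 + ["LEARN"] * 4 + ["Ornish"] * 4

observed, p_value = randomization_p_value(values, labels, max_min)
print("observed statistic:", observed)
print("approximate p-value:", p_value)
```

Because the group labels stay fixed while the responses move, each shuffle mimics what random assignment alone could produce if the diets made no difference.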

(e) Now use the Statistic pull-down menu to change to the MAD statistic. Verify the calculation of this statistic. (Show your work.)
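As a check on your work, here is a small sketch of the MAD calculation. It assumes that MAD means the average of the absolute differences between every pair of group means. The four means below are hypothetical, not the values from this study.

```python
from itertools import combinations

def mad_statistic(group_means):
    """Mean of the absolute differences between all pairs of group means
    (assumed definition of the MAD statistic)."""
    diffs = [abs(a - b) for a, b in combinations(group_means, 2)]
    return sum(diffs) / len(diffs)

# Hypothetical group means (NOT the values from the diet study).
hypothetical_means = [1.75, 0.5, 1.0, 0.75]

# With four groups there are six pairwise differences.
print("MAD:", mad_statistic(hypothetical_means))
```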

(f) Now find the p-value corresponding to this statistic. How does the p-value compare? Include a screen capture of your results (null distribution and p-value)

Comparison of p-values:

Although the MAD statistic is simple to calculate, it doesn’t include all the information relevant to our comparison. This is OK because the p-value we find from simulation is based on the observed data. But we would also like a “standardized” statistic that uses information like the sample sizes involved to give us a more informative measure of how different these groups are relative to what we might expect by random chance alone. We will now explore another statistic that looks at how much the individual means deviate from the overall mean, standardized by a measure of the natural variation in the data. In fact, this statistic can be thought of as the ratio of two variances (“variance” = standard deviation squared).

(g) To explore this statistic, switch to the Descriptive Statistics applet (keeping the other one open still).

  • Type in the four group means for the original data. (Clear the existing data first and uncheck the Includes header box.)
  • Have the applet calculate the standard deviation of these four group means.

Write a one sentence interpretation of this number.

(h) To estimate the natural variation in these BMI values, we want to basically average the “within group” variability across the four groups. Take the four group standard deviations and average them. (You can use the same applet or your calculator.) Write a one sentence interpretation of this number.

(i) Now compare your answer to (g) to your answer to (h) by finding the ratio of the squares of these values (“variance” = standard deviation squared), multiplying the numerator by 77, roughly the sample size of each group.

77 × (variability between the group means) / (variability within the groups)
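The calculation in (g) through (i) can be sketched as follows. The group means and standard deviations here are hypothetical stand-ins, so substitute the values from your own descriptive statistics. The sketch follows the handout's shortcut (average the standard deviations, then square), which is only an approximation to the formal F-statistic.

```python
from statistics import mean, stdev

# Hypothetical summary statistics (NOT the values from the diet study).
group_means = [1.75, 0.5, 1.0, 0.75]
group_sds = [1.0, 0.5, 0.5, 0.6]
n_per_group = 77  # rough size of each group, as given in (i)

# (g) standard deviation of the four group means
between_sd = stdev(group_means)

# (h) average of the four within-group standard deviations
within_sd = mean(group_sds)

# (i) ratio of the squared values, with the numerator scaled by group size
ratio = n_per_group * between_sd ** 2 / within_sd ** 2

print("SD of group means:", between_sd)
print("average within-group SD:", within_sd)
print("ratio:", ratio)
```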

(j) Explain why this might be considered a more useful test statistic in helping us judge whether we have larger differences between the four diets than we might expect by random chance alone.

The statistic you have calculated is very close to what is more generally referred to as the F-statistic. And this process is called “Analysis of Variance” or ANOVA for short. (See Ch. 9 for the actual F-statistic formula.)
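For comparison with the shortcut above, here is a sketch of the standard one-way ANOVA F-statistic: the between-group mean square divided by the within-group mean square. The data are hypothetical and the function name is only for illustration. Check Ch. 9 for the notation used in this course.

```python
from statistics import mean, variance

def f_statistic(groups):
    """One-way ANOVA F-statistic: between-group mean square divided by
    within-group mean square."""
    samples = list(groups.values())
    k = len(samples)                          # number of groups
    n_total = sum(len(s) for s in samples)    # total sample size
    grand_mean = mean([v for s in samples for v in s])
    # Between groups: squared deviations of group means, weighted by size.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within groups: pooled squared deviations from each group's own mean.
    ss_within = sum((len(s) - 1) * variance(s) for s in samples)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical changes in BMI (NOT the DietsBMI.txt data).
groups = {
    "Atkins": [2.0, 1.5, 3.0, 0.5],
    "Zone":   [0.5, 1.0, 0.0, 0.5],
    "LEARN":  [1.0, 1.5, 0.5, 1.0],
    "Ornish": [1.0, 0.0, 1.5, 0.5],
}
print("F-statistic:", f_statistic(groups))
```

Unlike the shortcut, this version weights each group by its own sample size and pools the within-group variances rather than averaging the standard deviations.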

(k) Back in the Multiple Means applet, change the Statistic pull-down menu to F-statistic. You should get an observed value in the ballpark of what you just calculated. Now generate a null distribution for this statistic and use it to approximate the p-value. How does this p-value compare to what you found above with the other 2 statistics?

(l) The main advantage of this F-statistic is that its null distribution can usually be well approximated by an F-distribution (named after the famous statistician R. A. Fisher). Check the box to Overlay F distribution and display the theory-based p-value. Include a screen capture of your output. Does this appear to be a reasonable approximation of the simulation results?

Validity conditions for the ANOVA test
The F-distribution is a good approximation to the null distribution of the F-statistic as long as
  • Either the sample sizes are at least 20 for each of the groups, or the distribution of the response variable is approximately normal in each of the samples (examine the dotplots for strong skewness or outliers).
  • Also, the standard deviations of the samples are approximately equal to each other (largest standard deviation is not more than twice the value of the smallest standard deviation).
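The two numerical parts of these conditions can be checked with a short sketch like the one below. The function name and data are hypothetical. The normality condition still needs a visual check of the dotplots, which this code does not attempt.

```python
from statistics import stdev

def check_validity(groups, min_size=20, max_sd_ratio=2.0):
    """Check the sample-size and equal-spread conditions for ANOVA.
    Normality of each sample is NOT checked here."""
    sizes = {name: len(values) for name, values in groups.items()}
    sds = {name: stdev(values) for name, values in groups.items()}
    sd_ratio = max(sds.values()) / min(sds.values())
    return {
        "sizes_ok": all(n >= min_size for n in sizes.values()),
        "sd_ratio": sd_ratio,
        "sds_ok": sd_ratio <= max_sd_ratio,
    }

# Hypothetical changes in BMI (NOT the DietsBMI.txt data).
groups = {
    "Atkins": [2.0, 1.5, 3.0, 0.5],
    "Zone":   [0.5, 1.0, 0.0, 0.5],
    "LEARN":  [1.0, 1.5, 0.5, 1.0],
    "Ornish": [1.0, 0.0, 1.5, 0.5],
}
print(check_validity(groups))
```

If the sample-size check fails, the first condition can still be met when each sample looks approximately normal.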

(m) Do these validity conditions appear to be met for these data? Explain how you are deciding in each case.

(n) When you do find a statistically significant F-statistic, you will probably be curious as to which group means are significantly different. An allowable follow-up analysis to a significant F-statistic is to essentially compute all the pairwise two-sample t-confidence intervals. Check the box next to Compute 95% confidence interval(s). (On the far left if the F-statistic is selected.) Include a screen capture of your output and summarize what you learn from these intervals. In particular, would you recommend any of these diets over the others?

(o) Suppose we had less “within group variability” in our data, but the four sample means were the same. Explain what this means and how it would change the above F-test results (e.g., What would the dotplots/boxplots look like now? Would the p-value increase or decrease?).

(p) Suppose we had less “within group variability” in our data, but the four sample means were the same as in (n). Suppose we were using the MAD statistic. Explain whether, and if so how, the MAD statistic would change.

(q) Suppose the sample sizes had been on the order of 150 instead of 75 people in each diet. Would this change the value of the F-statistic? If so, larger or smaller? Would this change the value of the MAD statistic? If so, larger or smaller?