Example of 2-way ANOVA, STA240/ENV298.01, November 6-8, 2001

Example taken from Regression With Graphics by Hamilton, page 95.

Is there a relationship between bedrock type and lung cancer rates? The data come from 26 counties in Pennsylvania, New Jersey and New York. Archer (Archives of Environmental Health, 1987, 42:87-91) found a relationship between lung cancer rates and the underlying bedrock type: counties over Reading Prong granite had higher cancer rates. Since these granites emit radon, a potent carcinogen, it seems plausible that radon from the Reading Prong bedrock causes the higher cancer rates.

If radon from bedrock causes variation in county cancer rates, and if the variables are well measured, then we might expect that:

1. The cancer/bedrock relationship weakens when adjusted for radon level (since radon is the omitted third variable that really explains the bedrock/cancer correlation).

2. Radon level significantly predicts cancer rates.

Variables

County name: Total of 26 counties represented.

State: 3 states: PA, NJ, NY

Lung cancer: White female lung cancer rate per 100,000 per year, 1950-1969

Bedrock area: Divided into 3 groups: Reading Prong, Fringe Areas, and Control Areas. Reading Prong areas overlie granite bedrock that has been associated with high indoor radon concentrations. Fringe areas border the Reading Prong, and control areas lie outside it.

Mean house radon: Cohen (Archives of Environmental Health, 1988, 43:313-314) reports mean radon concentrations in pCi/L, for living areas of hundreds of individual houses within each county. Categories used here are: low (0-1.5); mid (1.6-2.4); and high (over 2.5). In five counties, Cohen's means are based on fewer than 10 houses.

            State Cancer  Bedrock.Area Mean.House
Orange         NY    6.0 Reading.Prong        Low
Putnam         NY   10.5 Reading.Prong        Mid
Sussex         NY    6.7 Reading.Prong        Mid
Warren         NJ    6.0 Reading.Prong       High
Morris         NJ    6.1 Reading.Prong        Low
Hunterdon      NJ    6.7 Reading.Prong       High
Berks          PA    5.2        Fringe       High
Lehigh         PA    5.6        Fringe       High
Northampton    PA    5.8        Fringe       High
Pike           PA    4.5        Fringe        Low
Dutchess       NY    5.5        Fringe        Mid
Sullivan       NY    5.4        Fringe        Low
Ulster         NY    6.3        Fringe        Low
Columbia       NY    6.3       Control        Mid
Delaware       NY    4.3       Control        Mid
Greene         NY    4.0       Control        Mid
Otswego        NY    5.9       Control        Mid
Tioga          NY    4.7       Control        Mid
Carbon         PA    4.8       Control        Mid
Lebanon        PA    5.8       Control       High
Lackawanna     PA    5.4       Control        Low
Luzerne        PA    5.2       Control        Low
Schuylkill     PA    3.6       Control       High
Susquehanna    PA    4.3       Control        Low
Wayne          PA    3.5       Control        Low
Wyoming        PA    6.9       Control        Mid
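
The session below assumes a data frame named bedrock already exists. A minimal sketch of how it could be typed in from the listing above follows. The factor level order for Mean.House and the use of county names as row names are assumptions, not something the handout specifies.

```r
# Rebuild the data listing as a data frame (values copied from the table above).
bedrock <- data.frame(
  State = c("NY", "NY", "NY", "NJ", "NJ", "NJ",
            "PA", "PA", "PA", "PA", "NY", "NY", "NY",
            "NY", "NY", "NY", "NY", "NY", "PA", "PA",
            "PA", "PA", "PA", "PA", "PA", "PA"),
  Cancer = c(6.0, 10.5, 6.7, 6.0, 6.1, 6.7,
             5.2, 5.6, 5.8, 4.5, 5.5, 5.4, 6.3,
             6.3, 4.3, 4.0, 5.9, 4.7, 4.8, 5.8,
             5.4, 5.2, 3.6, 4.3, 3.5, 6.9),
  # 6 Reading Prong, 7 Fringe and 13 Control counties, in listing order
  Bedrock.Area = factor(rep(c("Reading.Prong", "Fringe", "Control"),
                            times = c(6, 7, 13))),
  Mean.House = factor(c("Low", "Mid", "Mid", "High", "Low", "High",
                        "High", "High", "High", "Low", "Mid", "Low", "Low",
                        "Mid", "Mid", "Mid", "Mid", "Mid", "Mid", "High",
                        "Low", "Low", "High", "Low", "Low", "Mid"),
                      levels = c("Low", "Mid", "High")),
  row.names = c("Orange", "Putnam", "Sussex", "Warren", "Morris", "Hunterdon",
                "Berks", "Lehigh", "Northampton", "Pike", "Dutchess",
                "Sullivan", "Ulster", "Columbia", "Delaware", "Greene",
                "Otswego", "Tioga", "Carbon", "Lebanon", "Lackawanna",
                "Luzerne", "Schuylkill", "Susquehanna", "Wayne", "Wyoming")
)

# Cell counts: the design is unbalanced, so the order of terms in aov() matters.
print(table(bedrock$Bedrock.Area, bedrock$Mean.House))

# Mean cancer rate within each bedrock group
print(tapply(bedrock$Cancer, bedrock$Bedrock.Area, mean))
```

The cell counts are worth a look before fitting anything: with unequal counts per cell, the sums of squares in the two-factor models depend on which factor is entered first.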

> attach(bedrock)

Model A:

> summary(aov(Cancer~Bedrock.Area))

             Df Sum of Sq  Mean Sq  F Value       Pr(F)
Bedrock.Area  2  16.90879 8.454396 6.409624 0.006130991
   Residuals 23  30.33736 1.319016
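
As a check on how to read the table, the F Value is the ratio of the two mean squares and Pr(F) is the upper tail area of the F distribution with 2 and 23 degrees of freedom. A short sketch using only the numbers printed for Model A (the variable names are made up for illustration):

```r
# Mean squares copied from the Model A table
ms_bedrock <- 8.454396
ms_resid   <- 1.319016

# F statistic: between-group mean square over residual mean square
f_value <- ms_bedrock / ms_resid

# Upper-tail probability for F on (2, 23) degrees of freedom
p_value <- pf(f_value, df1 = 2, df2 = 23, lower.tail = FALSE)

print(c(F = f_value, p = p_value))
```

Up to rounding, this should reproduce the F Value and Pr(F) columns above.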

Model B:

> summary(aov(Cancer~Mean.House))

           Df Sum of Sq  Mean Sq   F Value     Pr(F)
Mean.House  2   2.83898 1.419490 0.7352024 0.4903395
 Residuals 23  44.40717 1.930747

Model C:

> summary(aov(Cancer~Mean.House+Bedrock.Area))

             Df Sum of Sq  Mean Sq  F Value     Pr(F)
  Mean.House  2   2.83898 1.419490 1.186555 0.3249221
Bedrock.Area  2  19.28464 9.642318 8.060040 0.0025259
   Residuals 21  25.12254 1.196311
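
Note that aov reports sequential sums of squares: in Model C the Bedrock.Area line is adjusted for Mean.House, but the Mean.House line is not adjusted for Bedrock.Area. Both adjusted quantities can be recovered from the residual sums of squares already printed for Models A, B and C. The sketch below does that arithmetic; the variable names are illustrative only.

```r
# Residual sums of squares copied from the tables above
rss_A <- 30.33736   # Cancer ~ Bedrock.Area
rss_B <- 44.40717   # Cancer ~ Mean.House
rss_C <- 25.12254   # Cancer ~ Mean.House + Bedrock.Area
ms_resid_C <- 1.196311
df_resid_C <- 21

# Bedrock adjusted for radon: drop in residual SS going from Model B to Model C
ss_bedrock_adj <- rss_B - rss_C
f_bedrock_adj  <- (ss_bedrock_adj / 2) / ms_resid_C

# Radon adjusted for bedrock: drop in residual SS going from Model A to Model C
ss_radon_adj <- rss_A - rss_C
f_radon_adj  <- (ss_radon_adj / 2) / ms_resid_C
p_radon_adj  <- pf(f_radon_adj, df1 = 2, df2 = df_resid_C, lower.tail = FALSE)

print(c(SS.bedrock.adj = ss_bedrock_adj, F.bedrock.adj = f_bedrock_adj))
print(c(SS.radon.adj = ss_radon_adj, F.radon.adj = f_radon_adj, p = p_radon_adj))
```

The first pair should match the Bedrock.Area line of Model C. Comparing it with Model A, the bedrock sum of squares goes from 16.91 unadjusted to 19.28 adjusted for radon, so in these data the bedrock effect does not weaken after adjustment.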

Model D:

> summary(aov(Cancer~Mean.House*Bedrock.Area))

                        Df Sum of Sq  Mean Sq  F Value     Pr(F)
             Mean.House  2   2.83898 1.419490 1.137682 0.3437904
           Bedrock.Area  2  19.28464 9.642318 7.728055 0.0041001
Mean.House:Bedrock.Area  4   3.91159 0.977897 0.783757 0.5512584
              Residuals 17  21.21095 1.247703

Don't forget to look at residual plots!!
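
One possible way to draw those plots is sketched here for the additive model (Model C). The choice of Model C is an assumption made for illustration; the same calls work for any of the fits. The data vectors are retyped from the listing so the block runs on its own.

```r
# Data retyped from the listing, in the same county order
Cancer <- c(6.0, 10.5, 6.7, 6.0, 6.1, 6.7,
            5.2, 5.6, 5.8, 4.5, 5.5, 5.4, 6.3,
            6.3, 4.3, 4.0, 5.9, 4.7, 4.8, 5.8,
            5.4, 5.2, 3.6, 4.3, 3.5, 6.9)
Bedrock.Area <- factor(rep(c("Reading.Prong", "Fringe", "Control"),
                           times = c(6, 7, 13)))
Mean.House <- factor(c("Low", "Mid", "Mid", "High", "Low", "High",
                       "High", "High", "High", "Low", "Mid", "Low", "Low",
                       "Mid", "Mid", "Mid", "Mid", "Mid", "Mid", "High",
                       "Low", "Low", "High", "Low", "Low", "Mid"),
                     levels = c("Low", "Mid", "High"))

# Additive two-way fit (Model C)
fit <- aov(Cancer ~ Mean.House + Bedrock.Area)
res <- residuals(fit)

# Left: residuals against fitted values; right: normal quantile plot
par(mfrow = c(1, 2))
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
```

Things to look for include unequal spread across fitted values and isolated points. Putnam's rate of 10.5 stands well above every other county in the listing, so it is a natural candidate to check.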