HW #15 #16 #17 – Correlation and Regression

1. (IN CLASS) Here are data on 14 marathons from 2004: the winning time and the number of finishers for each. All the marathons had between 100 and 2000 finishers.

X = finishers / Y = winning time (seconds) / Marathon
139 / 9892 / Journey’s Eagle River, WI
142 / 10394 / Pocatello, ID
264 / 10567 / YMCA Bismarck, ND
1079 / 9270 / Little Rock, AR
639 / 10406 / Silicon Valley, CA
247 / 9940 / Mt Rushmore, SD
457 / 10208 / Frederick, MD
156 / 9432 / TriCities, Richland, WA
1019 / 9011 / Mercedes, Birmingham, AL
1556 / 8936 / Napa Valley, CA
150 / 10314 / Eisenhower, Abilene, KS
1808 / 9334 / Lakefront, Milwaukee, WI
925 / 9141 / San Antonio, TX
119 / 9606 / Stowe, VT

1. Make a scatterplot with x = finishers and y = winning time.

2. Find r.

3. Find the least squares line.

4. Predict the winning time for a marathon with 700 finishers.

5. Interpret the slope.

6. What percentage of the variation in winning times is explained by the least squares line on finishers?

7. What do you think of the prediction in #4?

8. Predict the winning time for a marathon with 30,000 finishers.

9. What do you think of the prediction in #8?

10. Do the data provide good evidence, at the 5% significance level, that the population correlation coefficient is negative?

11. Give a 95% confidence interval for the winning time in one marathon with 700 finishers.

12. Give a 95% confidence interval for the average winning time in all marathons with 700 finishers.
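
For checking hand or calculator work on questions 2 through 6, the following is a minimal Python sketch that computes r, the least squares line, and r squared directly from the sums of squares. It uses the marathon data from the table above; the variable names and the `predict` helper are illustrative choices, not part of the assignment, and a calculator's built-in regression should give the same values.

```python
from math import sqrt

# (finishers, winning time in seconds) for the 14 marathons listed above
data = [
    (139, 9892), (142, 10394), (264, 10567), (1079, 9270),
    (639, 10406), (247, 9940), (457, 10208), (156, 9432),
    (1019, 9011), (1556, 8936), (150, 10314), (1808, 9334),
    (925, 9141), (119, 9606),
]
x = [p[0] for p in data]
y = [p[1] for p in data]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = s_xy / sqrt(s_xx * s_yy)        # sample correlation coefficient
slope = s_xy / s_xx                 # least squares slope
intercept = y_bar - slope * x_bar   # least squares intercept
r_squared = r ** 2                  # share of variation in y explained by the line


def predict(finishers):
    """Predicted winning time (seconds) from the fitted line."""
    return intercept + slope * finishers


print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"y-hat = {intercept:.1f} + ({slope:.4f}) x")
print(f"predicted time at 700 finishers: {predict(700):.0f} s")
```

Calling `predict(30000)` also returns a number, but 30,000 lies far outside the 100 to 2000 finisher range of the data, which is the point of questions 8 and 9.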

2. (ANSWER GIVEN) Fourteen of Minnesota’s sweet-corn producing counties were randomly selected, and the following information was recorded about acres planted (100s of acres) and yield rate (tons of sweet corn per acre harvested).

County / X=acres / Y=yield rate
Waseca / 60 / 5.90
Freeborn / 70 / 6.41
Martin / 20 / 5.90
Dakota / 52 / 5.00
McLeod / 22 / 5.50
Redwood / 62 / 5.90
Stearns / 10 / 6.30
Dodge / 46 / 5.80
Kandiyohi / 20 / 6.50
Olmsted / 87 / 5.60
Goodhue / 54 / 5.70
Meeker / 15 / 6.00
Nicollet / 35 / 5.91
Sherburne / 10 / 7.00

1. Make a scatterplot.

2. Find r.

3. Find the least squares line and graph it on the scatterplot.

4. Predict the yield rate in a county with 50 (100s of acres) planted.

5. Interpret the slope.

6. What percentage of the variation in yield rate is explained by the regression line on acres?

7. Do the data provide good evidence, at the 5% significance level, that the population correlation coefficient is not 0?

8. Give a 95% confidence interval for the yield rate in one county with 50 (100s of acres) planted.

9. Give a 95% confidence interval for the average yield rate in all counties with 50 (100s of acres) planted.
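
For the significance test in question 7, one standard approach (assuming the course uses the usual t test for a correlation coefficient; the notation here may differ from the class notes) is to compute the statistic below from the sample correlation r and the sample size n, then compare it with a t distribution on n - 2 degrees of freedom:

```latex
H_0:\ \rho = 0 \qquad H_a:\ \rho \neq 0

t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}, \qquad \text{df} = n - 2
```

With n = 14 counties this gives 12 degrees of freedom. The alternative here is two-sided; a one-sided alternative uses the same statistic with a one-tailed critical value.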

3. (SOLUTION GIVEN) Below are data on the amount of pesticide applied per acre and the number of cotton arthropod pests found in a trap in a cotton field. Note that the trap allows live arthropods to leave, but not dead ones.

X=pesticide applied per acre (lbs. per acre) / 7.1 / 6.4 / 8.8 / 7.4 / 10.1 / 5.2 / 7.0 / 6.0 / 6.2 / 4.6 / 8.0 / 9.0 / 8.0
Y=arthropods trapped / 35 / 36 / 60 / 38 / 55 / 15 / 16 / 17 / 18 / 18 / 56 / 55 / 53

1. Make a scatterplot.

2. Find r.

3. Find the least squares line and graph it on the scatterplot.

4. Predict the number of arthropods trapped in a field with 8 lbs of pesticide per acre.

5. Interpret the slope.

6. What percentage of the variation in arthropods trapped is explained by the regression line on pesticide applied?

7. Do the data provide good evidence, at the 5% significance level, that the population correlation coefficient is positive?

8. Give a 95% confidence interval for the number of arthropods trapped in one field with 8 lbs of pesticide per acre.

9. Give a 95% confidence interval for the average number of arthropods trapped in all fields with 8 lbs of pesticide per acre.
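
Questions 8 and 9 differ only in one term under the square root: the interval for a single field (often called a prediction interval) adds 1 to account for the scatter of individual fields around the line. The sketch below illustrates this with the pesticide data above, assuming the standard simple linear regression formulas. The function name `regression_intervals` and its parameters are hypothetical, and the critical value 2.201 is the two-sided 95% t value for 13 - 2 = 11 degrees of freedom, taken from a t table.

```python
from math import sqrt

# Pesticide applied (lbs per acre) and arthropods trapped, from the table above
x = [7.1, 6.4, 8.8, 7.4, 10.1, 5.2, 7.0, 6.0, 6.2, 4.6, 8.0, 9.0, 8.0]
y = [35, 36, 60, 38, 55, 15, 16, 17, 18, 18, 56, 55, 53]


def regression_intervals(x, y, x0, t_star):
    """Return (fit, ci_half, pi_half) at x0.

    fit     : predicted y at x0 from the least squares line
    ci_half : half-width of the interval for the mean response at x0
    pi_half : half-width of the interval for one individual response at x0
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))

    fit = intercept + slope * x0
    core = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_star * s * sqrt(core)        # mean of all fields at x0
    pi_half = t_star * s * sqrt(1 + core)    # one field at x0
    return fit, ci_half, pi_half


T_STAR = 2.201  # t table value: 95% two-sided, df = 11
fit, ci_half, pi_half = regression_intervals(x, y, 8.0, T_STAR)
print(f"one field:  {fit - pi_half:.1f} to {fit + pi_half:.1f}")
print(f"all fields: {fit - ci_half:.1f} to {fit + ci_half:.1f}")
```

The interval for one field is always wider than the interval for the average of all fields at the same pesticide level. For a different data set, `T_STAR` must be looked up again for the new n - 2 degrees of freedom.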

4. (HOMEWORK) Below are data on 23 circular plots of the same size. For each plot, the number of cottonwood stumps cut down by beavers and the number of beetle larvae clusters are given.

X=Stumps / 2 / 2 / 1 / 3 / 3 / 4 / 3 / 1 / 2 / 5 / 1 / 3 / 2 / 1
Y=Larvae / 10 / 30 / 12 / 24 / 36 / 40 / 43 / 11 / 27 / 56 / 18 / 40 / 25 / 8
X=Stumps / 2 / 2 / 1 / 1 / 4 / 1 / 2 / 1 / 4
Y=Larvae / 14 / 21 / 16 / 6 / 54 / 9 / 13 / 14 / 50

Questions 1-3 will count as HW15, 4-6 as HW16, and 7-9 as HW17.

1. Make a scatterplot.

2. Find r.

3. Find the least squares line and graph it on the scatterplot.

4. Predict the number of larvae clusters in an area with 4 stumps.

5. Interpret the slope.

6. What percentage of the variation in larvae clusters is explained by the regression line on stumps?

7. Do the data provide good evidence, at the 5% significance level, that the population correlation coefficient is positive?

8. Give a 95% confidence interval for the number of larvae clusters in one area with 4 stumps.

9. Give a 95% confidence interval for the average number of larvae clusters in all areas with 4 stumps.

5. (ALTERNATE HW) Suppose more data are found for the plots in #4. Redo #4 with the added data.

Stumps / 6 / 2 / 1 / 1 / 3
Larvae / 61 / 20 / 9 / 6 / 44