AP Stats 3.3 Correlation and Regression Wisdom

Limitations of correlation & regression:

1) only works for linear relationships

2) extrapolation can be unreliable

3) not resistant (r and the regression line are strongly affected by outliers)

Outliers & Influential Observations In Regression

outlier - an observation that lies away from the overall pattern of the other observations (outliers in the y direction have large residuals); found with the “oval rule”; not necessarily influential.

influential observation - an observation whose removal markedly changes the calculations (outliers in the x direction are often influential)

Example #1:

Height (x) : 62 64 66 70 72 68 60 67 84

Weight (y) : 125 130 140 160 180 155 105 220 270

1) scatterplot

2) regression line + graph

3) r, r²

4) possible outliers?

5) influential? redo the calculations to check
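
Steps 2, 3, and 5 of Example #1 could be sketched in Python roughly as follows. This is only an illustration, not part of the original notes: the helper name `fit_line` is made up here, and refitting after dropping each point in turn is just one possible way to "redo the calculations".

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares line of y on x; returns (slope, intercept, r)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / sqrt(sxx * syy)

# Data from Example #1
heights = [62, 64, 66, 70, 72, 68, 60, 67, 84]
weights = [125, 130, 140, 160, 180, 155, 105, 220, 270]

slope, intercept, r = fit_line(heights, weights)
print(f"all points: y-hat = {intercept:.2f} + {slope:.3f}x, "
      f"r = {r:.3f}, r^2 = {r * r:.3f}")

# Refit with each point left out in turn. A large change in the slope
# or in r suggests that the omitted point is influential.
refits = []
for i in range(len(heights)):
    xs = heights[:i] + heights[i + 1:]
    ys = weights[:i] + weights[i + 1:]
    b, a, r_i = fit_line(xs, ys)
    refits.append((heights[i], weights[i], b, a, r_i))
    print(f"without ({heights[i]}, {weights[i]}): slope = {b:.3f}, r = {r_i:.3f}")
```

Comparing each refit with the all-points line shows which observations pull the line the most; the points to look at first are (67, 220), which sits far from the pattern in the y direction, and (84, 270), which is far out in the x direction.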

Lurking Variable - a variable that is neither the explanatory nor the response variable, but that influences the relationship between them

Example #2:

Year    # of Methodist Ministers in Boston    Amt. of Cuban Rum Imported into Boston
1860    63                                    8376
1865    48                                    6406
1870    53                                    7005
1875    64                                    8486
1880    72                                    9595
1885    80                                    10643
1890    85                                    11265
1895    76                                    10071
1900    80                                    10547
1905    83                                    11008
1910    105                                   13885
1915    140                                   18559

Describe the relationship.

Is there a correlation between the number of ministers and the amount of rum imported?
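
One way to explore Example #2 numerically is to compute r for each pair of columns. The sketch below is an illustration only: the helper `pearson_r` is a made-up name, and using the year as a stand-in for a lurking variable (for example, growth of the city over time) is an assumption, since the notes do not name the lurking variable.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

# Data from Example #2
years = [1860, 1865, 1870, 1875, 1880, 1885, 1890, 1895, 1900, 1905, 1910, 1915]
ministers = [63, 48, 53, 64, 72, 80, 85, 76, 80, 83, 105, 140]
rum = [8376, 6406, 7005, 8486, 9595, 10643, 11265, 10071, 10547, 11008, 13885, 18559]

r_ministers_rum = pearson_r(ministers, rum)
r_year_ministers = pearson_r(years, ministers)
r_year_rum = pearson_r(years, rum)

print(f"ministers vs rum:  r = {r_ministers_rum:.4f}")
print(f"year vs ministers: r = {r_year_ministers:.4f}")
print(f"year vs rum:       r = {r_year_rum:.4f}")
```

A strong r between ministers and rum does not show that one causes the other; both columns also rise with the year, which is consistent with a lurking variable driving both.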

Homework: p. 238 #59; p. 242 #63-65

Review: p. 251 #77-79

1) Height (x) : 60 64 68 72 63 65

Weight (y) : 100 130 150 200 100 220

a) scatterplot

b) regression line + graph

c) what is the slope and interpret it

d) what is the y-intercept and interpret it

e) r

f) r² and explain

g) possible outliers?

h) influential? redo the calculations to check

i) residual plot

j) predict weight for height of 24 inches

k) predict height for weight of 250 lbs.
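
Parts i) through k) of problem 1 can be sketched as below. This is a rough illustration under stated assumptions: `least_squares` and `predict_weight` are made-up helper names, and the range check is simply one way to flag extrapolation. Part k) is handled by fitting a second line of height on weight, because the least-squares line of y on x is generally not the same line as x on y.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting ys from xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Data from problem 1
heights = [60, 64, 68, 72, 63, 65]
weights = [100, 130, 150, 200, 100, 220]

# Line of weight on height, and its residuals (observed minus predicted)
b, a = least_squares(heights, weights)
residuals = [y - (a + b * x) for x, y in zip(heights, weights)]
for x, e in zip(heights, residuals):
    print(f"height {x}: residual {e:+.1f}")

def predict_weight(height):
    """Predict weight, warning when the height is outside the observed data."""
    if not min(heights) <= height <= max(heights):
        print(f"warning: height {height} is outside the observed range "
              f"{min(heights)}-{max(heights)}, so this is extrapolation")
    return a + b * height

# Part j): 24 inches is far below every observed height
weight_at_24 = predict_weight(24)
print(f"predicted weight at 24 in: {weight_at_24:.1f} lbs")

# Part k): fit height on weight instead of inverting the line above.
# Note that 250 lbs is also outside the observed weights (100-220).
b_hw, a_hw = least_squares(weights, heights)
height_at_250 = a_hw + b_hw * 250
print(f"predicted height at 250 lbs: {height_at_250:.1f} in")
```

The prediction at 24 inches comes out negative, which is a concrete reminder of why extrapolation can be unreliable.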

2) For the following data, use the oval rule to determine outliers. Test each outlier to determine if it is influential or not.

SAT-M: 400 500 600 650 550 450 500 550 600 650 400 750 200

SAT-V: 450 510 450 700 500 400 520 450 600 600 750 750 220

a. Draw the scatterplot of the regression of SAT-V on SAT-M. Interpret it. Use the oval rule to determine outliers. Test each outlier to determine if it is influential or not.

b. Create a residual plot. Use it to interpret the linear fit.

c. Interpret linear fit of SAT-V on SAT-M using r.

d. Find and interpret r².

e. Find r for the regression of SAT-M on SAT-V.

f. Double each SAT-M score and find r for SAT-V on SAT-M.

g. After d) add 50 points to each SAT-V score and find r for SAT-M on SAT-V.

h. What can we say about r and linear transformations?
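
Parts e through h can be checked numerically with a short sketch like the one below. It is an illustration only; `pearson_r` is a made-up helper name, and the score lists are the SAT-M and SAT-V values from this problem.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

sat_m = [400, 500, 600, 650, 550, 450, 500, 550, 600, 650, 400, 750, 200]
sat_v = [450, 510, 450, 700, 500, 400, 520, 450, 600, 600, 750, 750, 220]

r_v_on_m = pearson_r(sat_m, sat_v)                       # SAT-V on SAT-M
r_m_on_v = pearson_r(sat_v, sat_m)                       # roles swapped
r_doubled = pearson_r([2 * m for m in sat_m], sat_v)     # SAT-M doubled
r_shifted = pearson_r([v + 50 for v in sat_v], sat_m)    # SAT-V plus 50

print(f"V on M:            r = {r_v_on_m:.4f}")
print(f"M on V:            r = {r_m_on_v:.4f}")
print(f"M doubled:         r = {r_doubled:.4f}")
print(f"V shifted by 50:   r = {r_shifted:.4f}")
```

All four values agree (up to rounding error): r does not depend on which variable is called x, and it is unchanged by adding a constant or multiplying by a positive constant.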

3) Is there a relationship between scores on the Auto Mechanic Aptitude test and the number of hours grade school children watch TV? A group of mechanics was surveyed. Their TV hours are normally distributed with a mean of 20 hours and a standard deviation of 2 hours. Their mean test score is 270 with a standard deviation of 35. If the correlation is 0.5110:

a) Find the equation of the best-fit line.

b) Find r² and explain its meaning.

c) Predict the test score for an auto mechanic who watched TV 37 hours.

4) If the best-fit line for predicting weight from height is ŷ = 5x - 120, find the correlation if x̄ = 10, sx = 2, ȳ = 100, and sy = 30.
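
Problem 4 runs the slope relation backwards: since slope = r·sy/sx, it follows that r = slope·sx/sy. A small sketch, with variable names chosen here for illustration; the two means are not needed for this calculation.

```python
# Given in problem 4
slope = 5      # from y-hat = 5x - 120
s_x = 2
s_y = 30

# Rearranged from slope = r * s_y / s_x
r = slope * s_x / s_y
print(f"r = {r:.4f}")
```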

5) Shown below is output from Minitab:

Dependent variable is: height

R squared = 98.9%    R squared (adjusted) = 98.8%

s = 0.256 with 12 – 2 degrees of freedom

Variable    Coefficient    s.e. of Coeff    t-ratio    prob
Constant    64.9283        0.5084           128        0.0001
Temp        0.634965       0.0214           29.7       0.0001

a) Find the best-fit line.

b) How many observations were used to create the output?

c) Interpret the relationship.

d) If the independent variable is “age”, find the residual for the observation with age 40 and “height” 100.
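
The pieces of problem 5 can be read off the output and combined as in the sketch below. This is an illustration under stated assumptions: the variable names are made up, and since the output labels the predictor Temp while part d) calls it “age”, the sketch simply treats 40 as the value of the predictor.

```python
from math import sqrt

# Read from the regression output
intercept = 64.9283      # coefficient in the "Constant" row
slope = 0.634965         # coefficient in the predictor row
r_squared = 0.989        # "R squared = 98.9%"
df = 12 - 2              # "12 - 2 degrees of freedom"

# Part b): simple regression has n - 2 degrees of freedom
n = df + 2

# Part c): r takes the sign of the slope, which is positive here
r = sqrt(r_squared)

# Part d): residual = observed - predicted
x_value, observed = 40, 100
predicted = intercept + slope * x_value
residual = observed - predicted

print(f"y-hat = {intercept} + {slope}x")
print(f"n = {n}, r = {r:.4f}")
print(f"predicted = {predicted:.4f}, residual = {residual:.4f}")
```

A positive residual means the observed height lies above the fitted line at that predictor value.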