Tidbits
o Test 1 mean=
o Test 2 standard deviation = s = 9.73
o N = n = 100
o What general statements can you make?
o Going to the lab UH112 to do DDXL Feb. 10
On to chapter 7.
In chapters 1-4 we dealt with 1 qualitative variable (like birth season) or 2 qualitative variables (like birth season and sex).
In chapters 5 and 6 we dealt with 1 quantitative variable (like credit hours). But what about 2 quantitative variables?
EX1. AASU professors' salaries and ages
EX2. Time of day and # of cars on campus
EX3. Distance traveled to AASU and exam grade
EX4. Amount of education and salary
EX5. Height and IQ
View scatterplots
1. Look for direction
2. Look for form
3. Look for scatter – watch for outliers
We say that 2 quantitative variables are correlated if their scatterplot shows a tight linear relationship.
We can calculate a value r called the correlation coefficient. Properties:
1. -1 ≤ r ≤ 1 (r is between -1 and +1)
2. We may use one of the variables as the independent = explanatory = predictor variable and call it X. The other is the dependent = response variable and is called Y.
EX. Amount of education vs. salary
3. r is labelless: it has no units.
4. No matter how close r is to -1 or 1, we cannot use that to infer causation.
EX. Number of students late to class vs. admits to the ER, 8:30-9:45
5. If r is near 0 we see wide scatter and we say “a big X can pair with either a big or a small Y”. If r is near 1 we say “big X’s pair with big Y’s”. If r is near -1 we say “big X’s pair with small Y’s”.
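Properties 1, 3 and 5 can be checked numerically. The following is a minimal Python sketch, not part of the course materials: the education and salary numbers are made up purely for illustration, and the helper name `correlation` is our own. It computes r as the sum of z-score products divided by (#pairs - 1), then shows that r does not change when salary is measured in different units, and that r flips sign when big X's pair with small Y's.

```python
from statistics import mean, stdev  # stdev is the sample standard deviation (divides by n - 1)

def correlation(xs, ys):
    """Correlation coefficient r: sum of z-score products over (#pairs - 1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    n = len(xs)
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

# Made-up illustrative data (NOT from the class): years of education and salary.
years_education = [10, 12, 14, 16, 18, 20]
salary_dollars = [28000, 35000, 41000, 52000, 58000, 75000]

# Same salaries in thousands of dollars: the label changes, r should not.
salary_thousands = [s / 1000 for s in salary_dollars]

r_dollars = correlation(years_education, salary_dollars)
r_thousands = correlation(years_education, salary_thousands)

# Negating Y makes big X's pair with small Y's, so r changes sign.
r_flipped = correlation(years_education, [-s for s in salary_dollars])

print(round(r_dollars, 3), round(r_thousands, 3), round(r_flipped, 3))
```

With these made-up numbers r comes out close to +1, so "big X's pair with big Y's". As property 4 says, that still tells us nothing about causation.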
CORRELATION WORKSHEET
DATA
Col. 1          Col. 2    Col. 3    Col. 4    Col. 5
X               Y         ZX        ZY        ZX*ZY
# of letters    grade
8               91
5               84
4               80
6               75
7               61
Mean = 6.0      Mean = 78.2
Standard        Standard
Deviation       Deviation
s = 1.58        s = 11.26
1. Put X’s z-scores in column 3.
2. Put Y’s z-scores in column 4.
3. Multiply column 3 and column 4 together and place in column 5.
4. Total the numbers in column 5.
5. Divide that total by (#pairs - 1). What is r’s interpretation?
Devaux p. 120
6. [Blank plotting grid: vertical axis GRADE, marked 10 to 90; horizontal axis NAME, marked 1 to 9]
Plot the points. What do you see in form and direction?
7. Does it make sense to conduct a regression analysis? Do so regardless of your answer. DDXL provides the following:
CORRELATION WORKSHEET
DATA
Col. 1          Col. 2    Col. 3    Col. 4    Col. 5
X               Y         ZX        ZY        ZX*ZY
# of letters    grade
8               91         1.266     1.137     1.439     13.0
5               84        -0.633     0.515    -0.326      5.7
4               80        -1.266     0.160    -0.202      1.6
6               75         0.000    -0.284     0.000     -3.2
7               61         0.633    -1.528    -0.967    -16.9
Mean = 6.0      Mean = 78.2
Standard        Standard
Deviation       Deviation
s = 1.58        s = 11.26
1. Put X’s z-scores in column 3.
2. Put Y’s z-scores in column 4.
3. Multiply column 3 and column 4 together and place in column 5.
4. Total the numbers in column 5.
5. Divide that total by (#pairs - 1). What is r’s interpretation? Ans: column 5 totals -0.056, so r = -0.056/(5-1) = -0.014. That is so near zero that we say there is no correlation, i.e. no linear association.
Note: Omit section ‘straightening scatterplots’ on p. 122
Data from in-class discussion
X = Miles      Y = odometer reading    Z for X    Z for Y    product of
to campus      (in 1,000 miles)                              2 prior cols
5              6                       -.9        -1.3       1.17
10             126                     -.5         .4        -.2
30             176                     1.4        1.0        1.4
15             97                       0         -.1         0
                                                             ______
mean = 15      mean = 101.2                                  2.37
stdev = 10.8   stdev = 71.4
so r = 2.37/(4-1) = .8
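Because the in-class z-scores were rounded to one decimal, the hand-computed r is slightly off from the full-precision value. The sketch below is an illustration only: the list names are our own, and the third Z for Y is read as 1.0, which is an assumption about a smudged table entry. It compares the two calculations.

```python
from statistics import mean, stdev  # stdev is the sample standard deviation

miles = [5, 10, 30, 15]          # X: miles to campus
odometer = [6, 126, 176, 97]     # Y: odometer reading, in 1,000 miles
n = len(miles)

# z-scores as rounded on the board (third Z for Y assumed to be 1.0)
zx_board = [-0.9, -0.5, 1.4, 0.0]
zy_board = [-1.3, 0.4, 1.0, -0.1]
r_board = sum(a * b for a, b in zip(zx_board, zy_board)) / (n - 1)

# same calculation without rounding the z-scores
mx, my = mean(miles), mean(odometer)
sx, sy = stdev(miles), stdev(odometer)
r_full = sum(((x - mx) / sx) * ((y - my) / sy)
             for x, y in zip(miles, odometer)) / (n - 1)

print(round(r_board, 2))  # 0.79
print(round(r_full, 2))   # 0.84
```

Either way r rounds to about .8, so the interpretation is the same: big X's tend to pair with big Y's.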