Tidbits

o Test 1 mean=

o Test 2 standard deviation = s = 9.73

o N = n = 100

o What general statements can you make?

o Going to the lab UH112 to do DDXL Feb. 10

On to chapter 7.

In chapters 1-4 we dealt with 1 qualitative variable (like birth season) or 2 qualitative variables (like birth season and sex).

In chapters 5 and 6 we dealt with 1 quantitative variable (like credit hours). But what about 2 quantitative variables?

EX1. AASU professors’ salaries and ages

EX2. Time of day and # of cars on campus

EX3. Distance traveled to AASU and exam grade

EX4. Amount of education and salary

EX5. Height and IQ

View scatterplots

1. Look for direction

2. Look for form

3. Look for scatter – watch for outliers

We say that 2 variables are correlated if their scatterplot shows a tight linear relationship.

We can calculate a value r called the correlation coefficient. Properties:

1. -1 ≤ r ≤ 1 (r is between -1 and +1)

2. We may use one of the variables as the independent = explanatory = predictor variable and call it X. The other variable is the dependent = response variable and is called Y.

EX. Amount of education vs. salary

3. r is labelless (it has no units).

4. No matter how close r is to -1 or 1, we cannot use that to infer causation.

EX. Number of students late to class vs. admits to the ER, 8:30-9:45

5. If r is near 0 we see wide scatter and we say “a big X can pair with either a big or a small Y”. If r is near 1 we say “big X’s pair with big Y’s”. If r is near -1 we say “big X’s pair with small Y’s”.
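Properties 1, 3 and 5 can be checked numerically. The sketch below is only an illustration: the education and salary figures are made up for the example (they are not class data), and `correlation` is a helper name chosen here. It computes r from z-scores, then shows that changing the salary units leaves r unchanged and that pairing big X's with small Y's makes r negative.

```python
from math import sqrt


def correlation(xs, ys):
    """Return r = (sum of products of z-scores) / (number of pairs - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical numbers, invented for illustration only.
years_of_education = [10, 12, 14, 16, 18]
salary_thousands = [28, 35, 41, 52, 60]
salary_dollars = [1000 * s for s in salary_thousands]

r_thousands = correlation(years_of_education, salary_thousands)
r_dollars = correlation(years_of_education, salary_dollars)

# Property 3: r has no units, so rescaling Y does not change it.
print(round(r_thousands, 4), round(r_dollars, 4))

# Property 5: pair the big X's with the small Y's and r becomes negative.
reversed_salary = list(reversed(salary_thousands))
r_reversed = correlation(years_of_education, reversed_salary)
print(round(r_reversed, 4))
```

Whatever units are used, the result stays between -1 and +1, as property 1 requires.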

CORRELATION WORKSHEET

DATA

Col. 1         Col. 2    Col. 3    Col. 4    Col. 5
X              Y         ZX        ZY        ZX * ZY
# of letters   grade
......
8              91
5              84
4              80
6              75
7              61

Mean = 6.0                     Mean = 78.2
Standard Deviation s = 1.58    Standard Deviation s = 11.26

1. Put X’s z-scores in column 3.

2. Put Y’s z-scores in column 4.

3. Multiply column 3 and column 4 together and place in column 5.

4. Total the numbers in column 5.

5. Divide that total by (#pairs - 1). What is r’s interpretation?

Devaux p. 120

6. Plot the points. What do you see in form and direction?

[Blank plotting grid: vertical axis GRADE, marked 10 to 90; horizontal axis NAME, marked 1 to 9]

7. Does it make sense to conduct a regression analysis? Do so regardless of your answer. DDXL provides the following:

CORRELATION WORKSHEET

DATA

Col. 1         Col. 2    Col. 3    Col. 4    Col. 5
X              Y         ZX        ZY        ZX * ZY    Residual
# of letters   grade
......
8              91         1.266     1.137     1.439      13.0
5              84        -0.633     0.515    -0.326       5.7
4              80        -1.266     0.160    -0.202       1.6
6              75         0.000    -0.284     0.000      -3.2
7              61         0.633    -1.528    -0.967     -16.9

Mean = 6.0                     Mean = 78.2
Standard Deviation s = 1.58    Standard Deviation s = 11.26

1. Put X’s z-scores in column 3.

2. Put Y’s z-scores in column 4.

3. Multiply column 3 and column 4 together and place in column 5.

4. Total the numbers in column 5. (Total = -0.056)

5. Divide that total by (#pairs - 1). (r = -0.056/4 = -0.014) What is r’s interpretation? Ans: r is so near zero that we say there is no correlation, that is, no linear association.


Note: Omit section ‘straightening scatterplots’ on p. 122

Data from in-class discussion

X = Miles      Y = odometer reading   Z for X   Z for Y   product of
to campus      (in 1,000 miles)                           2 prior cols
 5               6                    -.9       -1.3       1.17
10             126                    -.5        .4        -.2
30             176                    1.4       1.0        1.4
15              97                     0        -.1         0
                                                           ______
mean = 15      mean = 101.2                                2.37    so r = 2.37/(4-1) = .8
stdev = 10.8   stdev = 71.4
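The hand calculation above rounds every z-score to one decimal place, so its r is only approximate. The sketch below is an illustration, with variable names chosen here and the third pair's Z for Y read as 1.0 (the value its product of 1.4 implies). It repeats the calculation with the z-scores as written in class and then with unrounded z-scores, using Python's standard `statistics` module.

```python
from statistics import mean, stdev

miles = [5, 10, 30, 15]          # X: miles to campus
odometer = [6, 126, 176, 97]     # Y: odometer reading, in 1,000 miles
n = len(miles)

# z-scores as rounded in the in-class table.
zx_rounded = [-0.9, -0.5, 1.4, 0.0]
zy_rounded = [-1.3, 0.4, 1.0, -0.1]
r_rounded = sum(a * b for a, b in zip(zx_rounded, zy_rounded)) / (n - 1)

# Same steps with unrounded z-scores; stdev() divides by n - 1.
mx, sx = mean(miles), stdev(miles)
my, sy = mean(odometer), stdev(odometer)
r_exact = sum(((x - mx) / sx) * ((y - my) / sy)
              for x, y in zip(miles, odometer)) / (n - 1)

print(f"rounded z-scores:   r = {r_rounded:.2f}")
print(f"unrounded z-scores: r = {r_exact:.2f}")
```

The two results are 0.79 and 0.84, and both round to the .8 found in class.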
