251solngr2-051 4/12/05

(Open this document in 'Page Layout' view!) Graded Assignment 2 Name:

Class days and time:

Student number:

Modify the data below as follows: add the last digit of your student number to the 7 in problem 1; add the second-to-last digit to the 9 in problem 2.

1) For the following joint probability table (i) check for independence, (ii) Compute E(x) and Var(x), (iii) Compute sigma_xy or Cov(x, y) and rho_xy or Corr(x, y), (iv) Compute E(x+y) and Var(x+y) from the results in (ii) and (iii), (v) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .

2) For the following sample (i) Compute the sample mean and variance of x, (ii) Compute s_xy or Cov(x, y) and r_xy or Corr(x, y), (iii) Compute the sample mean and variance of x+y from the results in (i) and (ii). (iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .

x / y
9 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3

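Solution: Assume that Seymour Butz’s number is 555555. The last digit is 5, so the 7 in problem 1 becomes 12, and the table becomes

(rows y, columns x) / x = 3 / x = 5 / x = 12 / P(y)
y = 1 / .11 / .08 / .16 / .35
y = 3 / .10 / .14 / .10 / .34
y = 9 / .12 / .08 / .11 / .31
P(x) / .33 / .30 / .37 / 1.00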

(i) Check for independence: First you need to find the marginal probabilities P(x) and P(y). Look at the upper left-hand probability in the table. Its value is .11 and it represents P(x = 3, y = 1). If x and y are independent, we would have P(x = 3, y = 1) = P(x = 3) P(y = 1) = (.33)(.35) = .1155. Since this is not true, x and y cannot be independent. Even one place where the joint probability is not the product of the marginal probabilities is enough. If this one is not enough to convince you, how about P(x = 12, y = 1) = .16, while P(x = 12) P(y = 1) = (.37)(.35) = .1295? Notice that the second row is not proportional to the first row.

A zero covariance or correlation would be the consequence of independence, but it is not true that a zero correlation or covariance would prove independence. We have already seen one example where there is a zero correlation, but no independence.
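The independence check in (i) can be sketched in Python. This is only an illustration and not part of the original Minitab work; the names `joint`, `p_x`, `p_y` and `is_independent` are assumptions, and the probabilities are the ones printed in columns C10-C12 of the Minitab appendix.

```python
# Joint probabilities P(x, y); rows are y = 1, 3, 9 and columns are x = 3, 5, 12.
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]

# Marginal probabilities: column sums give P(x), row sums give P(y).
p_x = [sum(row[j] for row in joint) for j in range(3)]
p_y = [sum(row) for row in joint]


def is_independent(joint, p_x, p_y, tol=1e-9):
    """Return True only if every cell equals the product of its marginals."""
    return all(
        abs(joint[i][j] - p_y[i] * p_x[j]) <= tol
        for i in range(len(p_y))
        for j in range(len(p_x))
    )


# Upper left-hand cell: .11 versus (.33)(.35) = .1155.
print(joint[0][0], round(p_x[0] * p_y[0], 4))
print(is_independent(joint, p_x, p_y))
```

A single cell that fails the product test is enough to make the function return False, which mirrors the argument in the text.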


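(ii) Compute E(x) and Var(x): Looking at the marginal probabilities of x, we find E(x) = (3)(.33) + (5)(.30) + (12)(.37) = 6.93 and Var(x) = E(x^2) - [E(x)]^2 = 63.75 - (6.93)^2 = 15.7251.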

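(iii) Compute Cov(x, y) and Corr(x, y).

Summing xyP(x, y) over the three columns of the table gives E(xy) = 4.47 + 6.10 + 17.40 = 27.97.

To summarize: E(x) = 6.93, Var(x) = 15.7251, E(y) = 4.16, Var(y) = 28.52 - (4.16)^2 = 11.2144 and E(xy) = 27.97. So Cov(x, y) = E(xy) - E(x)E(y) = 27.97 - (6.93)(4.16) = -0.8588, so that Corr(x, y) = Cov(x, y)/(SD(x) SD(y)) = -0.8588/((3.96549)(3.34879)) = -0.0647. (The squared correlation is (-0.0647)^2 = .0042.) The correlation and covariance are negative, indicating a tendency of y to fall when x rises. The squared correlation hardly exists on a zero to one scale, indicating that the relationship is barely there. Note that -1 <= Corr(x, y) <= 1 always!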
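As a cross-check on the population moments, the computation can be sketched in Python. This is a minimal illustration under the assumption that the table is laid out with rows y = 1, 3, 9 and columns x = 3, 5, 12, as in the Minitab appendix; the variable names are hypothetical.

```python
from math import sqrt

x_vals = [3, 5, 12]
y_vals = [1, 3, 9]
# joint[i][j] = P(x = x_vals[j], y = y_vals[i])
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]

# Marginal distributions.
p_x = [sum(row[j] for row in joint) for j in range(3)]
p_y = [sum(row) for row in joint]

# Means and variances from the marginals: Var = E(square) - (mean)^2.
e_x = sum(x * p for x, p in zip(x_vals, p_x))
e_y = sum(y * p for y, p in zip(y_vals, p_y))
var_x = sum(x * x * p for x, p in zip(x_vals, p_x)) - e_x ** 2
var_y = sum(y * y * p for y, p in zip(y_vals, p_y)) - e_y ** 2

# E(xy) needs the joint probabilities, not just the marginals.
e_xy = sum(
    x_vals[j] * y_vals[i] * joint[i][j] for i in range(3) for j in range(3)
)
cov_xy = e_xy - e_x * e_y
corr_xy = cov_xy / sqrt(var_x * var_y)

print(e_x, e_y, var_x, var_y)
print(e_xy, cov_xy, corr_xy)
```

The printed values can be compared with Ex, Ey, varx, vary, Exy, covxy and corr in the Minitab output.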

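(iv) Compute E(x+y) and Var(x+y).

E(x+y) = E(x) + E(y) = 6.93 + 4.16 = 11.09 and Var(x+y) = Var(x) + Var(y) + 2Cov(x, y) = 15.7251 + 11.2144 + 2(-0.8588) = 25.2219.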

To check this do the computations below.

If we run down the columns of the table:

x / y / x+y / P(x, y)
3 / 1 / 4 / .11
3 / 3 / 6 / .10
3 / 9 / 12 / .12
5 / 1 / 6 / .08
5 / 3 / 8 / .14
5 / 9 / 14 / .08
12 / 1 / 13 / .16
12 / 3 / 15 / .10
12 / 9 / 21 / .11


Now collect probabilities that belong to the same value of x+y. For example, P(x+y = 6) = .10 + .08 = .18.

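x+y / P(x+y) / (x+y)P(x+y) / (x+y)^2 P(x+y)
4 / .11 / 0.44 / 1.76
6 / .18 / 1.08 / 6.48
8 / .14 / 1.12 / 8.96
12 / .12 / 1.44 / 17.28
13 / .16 / 2.08 / 27.04
14 / .08 / 1.12 / 15.68
15 / .10 / 1.50 / 22.50
21 / .11 / 2.31 / 48.51
Sum / 1.00 / 11.09 / 148.21

So E(x+y) = 11.09 and Var(x+y) = 148.21 - (11.09)^2 = 25.2219, which agrees with the results found from the formulas.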
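The "collect probabilities" step can also be sketched in Python. This is an illustrative sketch only; `p_sum` and the other names are assumptions, and the table layout (rows y, columns x) follows the Minitab appendix.

```python
from collections import defaultdict

x_vals = [3, 5, 12]
y_vals = [1, 3, 9]
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]

# Add up the joint probabilities of all (x, y) pairs with the same sum.
p_sum = defaultdict(float)
for i, y in enumerate(y_vals):
    for j, x in enumerate(x_vals):
        p_sum[x + y] += joint[i][j]

# Mean and variance of x+y computed directly from its own distribution.
e_sum = sum(s * p for s, p in p_sum.items())
e_sum_sq = sum(s * s * p for s, p in p_sum.items())
var_sum = e_sum_sq - e_sum ** 2

for s in sorted(p_sum):
    print(s, round(p_sum[s], 2))
print(e_sum, e_sum_sq, var_sum)
```

Only the value 6 is reached by two different (x, y) pairs, so the nine cells of the table collapse to eight values of x+y.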

(v) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .

Solution: 251v2out says

and , where has the value or depending on whether the product of and is negative or positive. and .

and
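The sign rule described above can be illustrated numerically. The sketch below assumes the rule in question is the standard result for linear functions, Cov(ax+b, cy+d) = ac Cov(x, y) and Corr(ax+b, cy+d) = sign(ac) Corr(x, y). The coefficients a, b, c, d are hypothetical values chosen only for illustration (they are not the ones from the assignment), and the function name `cov_corr` is an assumption.

```python
from math import copysign, sqrt

x_vals = [3, 5, 12]
y_vals = [1, 3, 9]
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]


def cov_corr(x_vals, y_vals, joint):
    """Return (Cov, Corr) for a joint table with rows indexed by y, columns by x."""
    cells = [
        (x, y, joint[i][j])
        for i, y in enumerate(y_vals)
        for j, x in enumerate(x_vals)
    ]
    e_x = sum(x * p for x, _, p in cells)
    e_y = sum(y * p for _, y, p in cells)
    var_x = sum(x * x * p for x, _, p in cells) - e_x ** 2
    var_y = sum(y * y * p for _, y, p in cells) - e_y ** 2
    cov = sum(x * y * p for x, y, p in cells) - e_x * e_y
    return cov, cov / sqrt(var_x * var_y)


# Hypothetical coefficients; a*c is negative here, so the correlation flips sign.
a, b, c, d = 2.0, 1.0, -3.0, 4.0

cov_xy, corr_xy = cov_corr(x_vals, y_vals, joint)
w_vals = [a * x + b for x in x_vals]
v_vals = [c * y + d for y in y_vals]
cov_wv, corr_wv = cov_corr(w_vals, v_vals, joint)

sign_ac = copysign(1.0, a * c)
print(cov_wv, a * c * cov_xy)       # covariance is scaled by a*c
print(corr_wv, sign_ac * corr_xy)   # correlation only changes sign
```

The additive constants b and d drop out of both results; only the product of a and c matters.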

2) For the following sample (i) Compute the sample mean and variance of x, (ii) Compute s_xy or Cov(x, y) and r_xy or Corr(x, y), (iii) Compute the sample mean and variance of x+y from the results in (i) and (ii). (iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .


The original data

x / y
9 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3

becomes

x / y
14 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3


The entire table is below with required computations.

Row x x^2 y y^2 xy

1 14 196 2 4 28

2 4 16 4 16 16

3 6 36 2 4 12

4 2 4 5 25 10

5 1 1 7 49 7

6 1 1 6 36 6

7 3 9 5 25 15

8 10 100 -3 9 -30

sum 41 363 28 168 64

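So x-bar = 41/8 = 5.125 and y-bar = 28/8 = 3.5.

Then s_x^2 = (363 - 8(5.125)^2)/7 = 152.875/7 = 21.8393 and s_y^2 = (168 - 8(3.5)^2)/7 = 70/7 = 10.

(i) x-bar = 5.125, s_x^2 = 21.8393. (s_x = 4.6733 and s_y = 3.1623).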

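(ii) s_xy = (64 - 8(5.125)(3.5))/7 = -79.5/7 = -11.3571 and r = s_xy/(s_x s_y) = -11.3571/((4.6733)(3.1623)) = -.7685 (r^2 = .5906). The correlation and covariance are negative, indicating a tendency of y to fall when x rises. r^2 is fairly large on a zero to one scale, indicating that the relationship is moderately strong. Note that -1 <= r <= 1 always!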
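The sample computations in (i) and (ii) can be sketched in Python using the same computational formulas (sums of squares and cross products divided by n - 1). This is a hedged illustration; the variable names are assumptions, and the data are the modified values for student number 555555.

```python
from math import sqrt

x = [14, 4, 6, 2, 1, 1, 3, 10]
y = [2, 4, 2, 5, 7, 6, 5, -3]
n = len(x)

# Column sums, as in the computation table.
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(v * v for v in x)
sum_yy = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

x_bar = sum_x / n
y_bar = sum_y / n

# Computational formulas for the sample variances and covariance.
var_x = (sum_xx - n * x_bar ** 2) / (n - 1)
var_y = (sum_yy - n * y_bar ** 2) / (n - 1)
cov_xy = (sum_xy - n * x_bar * y_bar) / (n - 1)

# Sample correlation and its square.
r = cov_xy / sqrt(var_x * var_y)

print(sum_x, sum_xx, sum_y, sum_yy, sum_xy)
print(x_bar, y_bar, var_x, var_y)
print(cov_xy, r, r ** 2)
```

The square roots of `var_x` and `var_y` can be compared with the StDev column of the Minitab descriptive statistics.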

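(iii) The sample mean of x+y is x-bar + y-bar = 5.125 + 3.5 = 8.625 and the sample variance of x+y is s_x^2 + s_y^2 + 2 s_xy = 21.8393 + 10 + 2(-11.3571) = 9.125.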

To check this do the computations below.

x / y / x+y / (x+y)^2
14 / 2 / 16 / 256
4 / 4 / 8 / 64
6 / 2 / 8 / 64
2 / 5 / 7 / 49
1 / 7 / 8 / 64
1 / 6 / 7 / 49
3 / 5 / 8 / 64
10 / -3 / 7 / 49
Sum: 69 / 659

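So the sample mean of x+y is 69/8 = 8.625, as above, and the sample variance of x+y is (659 - 8(8.625)^2)/7 = 63.875/7 = 9.125. Notice how much larger the variation is in x and y individually than in x+y. This is reflected in the small variance of x+y.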
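The agreement between the direct computation and the formula for the variance of a sum can be checked with Python's standard `statistics` module. This is a sketch for illustration; the variable names are assumptions.

```python
from statistics import mean, variance

x = [14, 4, 6, 2, 1, 1, 3, 10]
y = [2, 4, 2, 5, 7, 6, 5, -3]
n = len(x)

# Row-by-row sums x+y, as in the check table.
s = [a + b for a, b in zip(x, y)]

# Sample covariance from deviations about the means.
x_bar, y_bar = mean(x), mean(y)
cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

# Variance of the sum, computed two ways.
direct = variance(s)
from_parts = variance(x) + variance(y) + 2 * cov_xy

print(mean(s), direct, from_parts)
```

Because the covariance is negative, the term 2 s_xy pulls the variance of the sum below the sum of the two individual variances.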

Computer results follow. N* means missing measurements.

Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3 Maximum

x 8 0 5.13 1.65 4.67 1.00 1.25 3.50 9.00 14.00

y 8 0 3.50 1.12 3.16 -3.00 2.00 4.50 5.75 7.00

x+y 8 0 8.63 1.07 3.02 7.00 7.00 8.00 8.00 16.00

(iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .

Solution: 251v2out says

and , where has the value or depending on whether the product of and is negative or positive. and .

and

Appendix: Minitab Computations (This is mostly a reminder to me of how I checked my work – but there is enough info here to run the routines and I’m happy to give them to anyone who wants them.)

Population Correlation Problem

————— 4/12/2005 9:12:27 PM ————————————————————

Welcome to Minitab, press F1 for help.

Results for: 1gr2-051aa.MTW

MTB > WSave "C:\Documents and Settings\rbove\My Documents\Minitab\1gr2-051aa.MTW";

SUBC> Replace.

Saving file as: 'C:\Documents and Settings\rbove\My

Documents\Minitab\1gr2-051aa.MTW'

MTB > echo

MTB > Execute "C:\Documents and Settings\rbove\My Documents\Minitab\251popcorr.mtb" 1.

Executing from file: C:\Documents and Settings\rbove\My Documents\Minitab\251popcorr.mtb

MTB > #251popcorr

MTB > # Computes population covariance and correlation

MTB > # Put x in C15, y in c17

MTB > # Put a joint probability table in c10 - c14

MTB > # Fill table with zeros to make it 5 by 5.

MTB > name k10 'varx'

MTB > name k11 'vary'

MTB > name k12 'Exy'

MTB > name k13 'covxy'

MTB > name k14 'corr'

MTB > name k15 'sdx'

MTB > name k16 'sumpx'

MTB > name k17 'sdy'

MTB > name k18 'sumpy'

MTB > name k25 'Ex'

MTB > name k26 'Ex2'

MTB > name k27 'Ey'

MTB > name k28 'Ey2'

MTB > execute 'marg973'

Executing from file: marg973.MTB (Input was Probabilities in c10-c14, x in c15, y in c17)

MTB > #marg973.mtb #part of 251popcorr Computes marginal probabilities

MTB > let c18=c10+c11+c12+c13+c14

MTB > let c16(1)=sum(c10)

MTB > let c16(2)=sum(c11)

MTB > let c16(3)=sum(c12)

MTB > let c16(4)=sum(c13)

MTB > let c16(5)=sum(c14)

MTB > let k16=sum(c16) #These are sums of x and y

MTB > let k18=sum(c18) #probabilities and should be 1.

MTB > print c10-c18

Data Display

Row C10 C11 C12 C13 C14 C15 C16 C17 C18

1 0.11 0.08 0.16 0 0 3 0.33 1 0.35

2 0.10 0.14 0.10 0 0 5 0.30 3 0.34

3 0.12 0.08 0.11 0 0 12 0.37 9 0.31

4 0.00 0.00 0.00 0 0 0 0.00 0 0.00

5 0.00 0.00 0.00 0 0 0 0.00 0 0.00

MTB > end (End means the end of an exec and a return to Popcorr)

MTB > execute 'meany973'

Executing from file: meany973.MTB

MTB > #meany.973.mtb part of 251popcorr

MTB > print k16,k18

Data Display

sumpx 1.00000 (A check to see if probabilities add to 1)

sumpy 1.00000

MTB > let c20=c10*c17 (c20-c24 are products for E(xy))

MTB > let c20=c20*c15(1)

MTB > let c21=c11*c17

MTB > let c21=c21*c15(2)

MTB > let c22=c12*c17

MTB > let c22=c22*c15(3)

MTB > let c23=c13*c17

MTB > let c23=c23*c15(4)

MTB > let c24=c14*c17

MTB > let c24=c24*c15(5)

MTB > let c25=c15*c16 # xP(x)

MTB > let k25=sum(c25) # E(x)

MTB > let c27=c17*c17

MTB > let k27=sum(c27) # E(y)?

MTB > end

MTB > execute 'meanz973'

Executing from file: meanz973.MTB

MTB > #meanz973.mtb part of 251popcorr

MTB > let c27=c17*c18

MTB > let k27=sum(c27) #E(y)

MTB > end

MTB > execute 'exysq973'

Executing from file: exysq973.MTB

MTB > #exysq973.mtb part of 251popcorr

MTB > let c26=c15*c25 # xsqP(x)

MTB > let c28=c17*c27 # ysqP(y)

MTB > let k26=sum(c26) # E(xsq)

MTB > let k28=sum(c28) # E(ysq)

MTB > print c20-c28

Data Display

Row C20 C21 C22 C23 C24 C25 C26 C27 C28

1 0.33 0.4 1.92 0 0 0.99 2.97 0.35 0.35

2 0.90 2.1 3.60 0 0 1.50 7.50 1.02 3.06

3 3.24 3.6 11.88 0 0 4.44 53.28 2.79 25.11

4 0.00 0.0 0.00 0 0 0.00 0.00 0.00 0.00

5 0.00 0.0 0.00 0 0 0.00 0.00 0.00 0.00

MTB > print k25-k28


Data Display

Ex 6.93000

Ex2 63.7500

Ey 4.16000

Ey2 28.5200

MTB > end

MTB > execute 'xyvar973'

Executing from file: xyvar973.MTB

MTB > #xyvar973.mtb part of popcorr

MTB > let k10=k26-k25*k25 #variance of x

MTB > let k11=k28-k27*k27 #variance of y

MTB > print k10 k11

Data Display

varx 15.7251

vary 11.2144

MTB > let k20=sum(c20) (Column sums)

MTB > let k21=sum(c21)

MTB > let k22=sum(c22)

MTB > let k23=sum(c23)

MTB > let k24=sum(c24)

MTB > let k12 = k20+k21+k22+k23+k24 # E(xy)

MTB > print k20-k24, k12

Data Display

K20 4.47000

K21 6.10000

K22 17.4000

K23 0

K24 0

Exy 27.9700

MTB > end

MTB > execute 'cov973'

Executing from file: cov973.MTB

MTB > #cov973.mtb part of popcorr

MTB > let k13=k12-k25*k27 #Covariance

MTB > let k15=sqrt(k10) #St. dev of x

MTB > let k17=sqrt(k11) #St. dev of y

MTB > let k14=k13/k15

MTB > let k14=k14/k17 #Corr(x, y)

MTB > print k13-k18

Data Display

covxy -0.858800 Covariance

corr -0.0646707 Correlation

sdx 3.96549 Std. deviation of x

sumpx 1.00000 Sum of x probabilities

sdy 3.34879 Std. deviation of y

sumpy 1.00000 Sum of y probabilities

MTB > end

MTB > execute 'tb2973'

Executing from file: tb2973.MTB

MTB > #tb2973.mtb Part of 251popcorr

MTB > let k30=30

MTB > let k31=10

MTB > let k32=1

MTB > execute 'tb2s973' 5

Executing from file: tb2s973.MTB

MTB > #tb2s973.mtb Subroutine of tb2973

MTB > # Part of popcorr

MTB > let ck30=ck31

MTB > let ck30(6)=c16(k32)

MTB > let ck30(7)=c25(k32)

MTB > let ck30(8)=c26(k32)

MTB > let k30=k30+1

MTB > let k31=k31+1

MTB > let k32=k32+1

MTB > end

MTB > #tb2s973.mtb Subroutine of tb2973

MTB > # Part of popcorr

MTB > let ck30=ck31

MTB > let ck30(6)=c16(k32)

MTB > let ck30(7)=c25(k32)

MTB > let ck30(8)=c26(k32)

MTB > let k30=k30+1

MTB > let k31=k31+1

MTB > let k32=k32+1

MTB > end

MTB > #tb2s973.mtb Subroutine of tb2973

MTB > # Part of popcorr

MTB > let ck30=ck31

MTB > let ck30(6)=c16(k32)

MTB > let ck30(7)=c25(k32)

MTB > let ck30(8)=c26(k32)

MTB > let k30=k30+1

MTB > let k31=k31+1

MTB > let k32=k32+1

MTB > end

MTB > #tb2s973.mtb Subroutine of tb2973

MTB > # Part of popcorr

MTB > let ck30=ck31

MTB > let ck30(6)=c16(k32)

MTB > let ck30(7)=c25(k32)

MTB > let ck30(8)=c26(k32)

MTB > let k30=k30+1

MTB > let k31=k31+1

MTB > let k32=k32+1

MTB > end

MTB > #tb2s973.mtb Subroutine of tb2973

MTB > # Part of popcorr

MTB > let ck30=ck31

MTB > let ck30(6)=c16(k32)

MTB > let ck30(7)=c25(k32)

MTB > let ck30(8)=c26(k32)

MTB > let k30=k30+1

MTB > let k31=k31+1

MTB > let k32=k32+1

MTB > end

MTB > let c35=c18

MTB > let c35(6)=k18

MTB > let c35(7)=k25

MTB > let c35(8)=k26

MTB > let c36=c27

MTB > let c36(6)=k27

MTB > let c37=c28

MTB > let c37(6)=k28


MTB > print c30-c37

Data Display (The final table)

Row C30 C31 C32 C33 C34 C35 C36 C37

1 0.11 0.08 0.16 0 0 0.35 0.35 0.35

2 0.10 0.14 0.10 0 0 0.34 1.02 3.06

3 0.12 0.08 0.11 0 0 0.31 2.79 25.11

4 0.00 0.00 0.00 0 0 0.00 0.00 0.00

5 0.00 0.00 0.00 0 0 0.00 0.00 0.00

6 0.33 0.30 0.37 0 0 1.00 4.16 28.52

7 0.99 1.50 4.44 0 0 6.93

8 2.97 7.50 53.28 0 0 63.75

MTB > write c30-c37.

Data Display (WRITE)

0.11 0.08 0.16 0 0 0.35 0.35 0.35

0.10 0.14 0.10 0 0 0.34 1.02 3.06

0.12 0.08 0.11 0 0 0.31 2.79 25.11

0.00 0.00 0.00 0 0 0.00 0.00 0.00

0.00 0.00 0.00 0 0 0.00 0.00 0.00

0.33 0.30 0.37 0 0 1.00 4.16 28.52

0.99 1.50 4.44 0 0 6.93 * *

2.97 7.50 53.28 0 0 63.75 * *

* NOTE * Column lengths not equal.

MTB > end.

MTB > execute 'tb3973'

Executing from file: tb3973.MTB

MTB > #tb3973.mtb final printout for 252popcorr

MTB >

MTB > write c20-c24;

SUBC> replace.

Data Display (WRITE) (Products for E(x y) again)

0.33 0.4 1.92 0 0

0.90 2.1 3.60 0 0

3.24 3.6 11.88 0 0

0.00 0.0 0.00 0 0

0.00 0.0 0.00 0 0

MTB > end

MTB > end

Sample Correlation Problem

MTB > exec '251samcov'

Executing from file: 251samcov.MTB

MTB > #251samcov Computes sample variances

MTB > # and covariances using 'var973'

MTB > #Input is x column in c40, y column in c42

MTB > # Example in Covex

MTB > name c40 'x'

MTB > name c41 'xsq'

MTB > name c42 'y'

MTB > name c43 'ysq'

MTB > name c44 'xy'

MTB > name k40 'sumx'