251solngr2-051 4/12/05
(Open this document in 'Page Layout' view!) Graded Assignment 2 Name:
Class days and time:
Student number:
Modify the data below as follows: Add the last digit of your student number to the 7 in problem 1; add the second-to-last digit to the 9 in problem 2.
1) For the following joint probability table (i) check for independence, (ii) Compute and , (iii) Compute $\operatorname{Cov}(x,y)$ or $\sigma_{xy}$ and $\operatorname{Corr}(x,y)$ or $\rho_{xy}$, (iv) Compute $E(x+y)$ and $\operatorname{Var}(x+y)$ from the results in (ii) and (iii), (v) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .
2) For the following sample (i) Compute the sample mean and variance of $x$, (ii) Compute $\operatorname{Cov}(x,y)$ or $s_{xy}$ and $\operatorname{Corr}(x,y)$ or $r_{xy}$, (iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii). (iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .
x / y
9 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3
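Solution: Assume that Seymour Butz’s number is 555555. The table becomes
y \ x / 3 / 5 / 12 / P(y)
1 / .11 / .08 / .16 / .35
3 / .10 / .14 / .10 / .34
9 / .12 / .08 / .11 / .31
P(x) / .33 / .30 / .37 / 1.00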
(i) Check for independence: First you need to find $P(x)$ and $P(y)$. Look at the upper left hand probability in the table. Its value is .11 and it represents $P(x=3 \cap y=1)$. If $x$ and $y$ are independent, we would have $P(x=3 \cap y=1) = P(x=3)P(y=1) = (.33)(.35) = .1155$. Since this is not true, $x$ and $y$ cannot be independent. Even one place where the joint probability is not the product of the marginal probabilities is enough. If this one is not enough to convince you, how about $P(x=5 \cap y=1) = .08 \ne P(x=5)P(y=1) = (.30)(.35) = .105$? Notice that the second row is not proportional to the first row.
A zero covariance or correlation would be the consequence of independence, but it is not true that a zero correlation or covariance would prove independence. We have already seen one example where there is a zero correlation, but no independence.
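The independence check in (i) can be sketched in a few lines of Python. The table layout (x = 3, 5, 12 across the columns, y = 1, 3, 9 down the rows) is taken from the Minitab listing in the appendix; the function name `is_independent` and its tolerance are illustrative assumptions, not part of the course materials.

```python
# Joint probabilities: rows are y = 1, 3, 9; columns are x = 3, 5, 12.
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]


def is_independent(table, tol=1e-9):
    """Return True only if every cell equals the product of its marginals."""
    p_y = [sum(row) for row in table]        # row sums give P(y)
    p_x = [sum(col) for col in zip(*table)]  # column sums give P(x)
    return all(
        abs(table[i][j] - p_y[i] * p_x[j]) <= tol
        for i in range(len(p_y))
        for j in range(len(p_x))
    )


p_x = [sum(col) for col in zip(*joint)]
p_y = [sum(row) for row in joint]

# Compare the upper left cell with the product of its marginals.
print("P(x=3)P(y=1) =", round(p_x[0] * p_y[0], 4), "but the cell is", joint[0][0])
print("independent:", is_independent(joint))
```

A single failing cell is enough to reject independence, which is why the function uses `all`.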
(ii) Compute and : Looking below, we find and .
(iii) Compute $\sigma_{xy}$ and $\rho_{xy}$.
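To summarize, $E(x) = 6.93$, $E(x^2) = 63.75$, $E(y) = 4.16$, $E(y^2) = 28.52$ and $E(xy) = 27.97$.
$\sigma_x^2 = E(x^2) - [E(x)]^2 = 63.75 - (6.93)^2 = 15.7251$, $\sigma_y^2 = E(y^2) - [E(y)]^2 = 28.52 - (4.16)^2 = 11.2144$
and $\sigma_{xy} = E(xy) - E(x)E(y) = 27.97 - (6.93)(4.16) = -0.8588$. So that $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} = \frac{-0.8588}{\sqrt{(15.7251)(11.2144)}} = -0.0647$. ($\rho_{xy}^2 = .0042$) The correlation and covariance are negative, indicating a tendency of $y$ to fall when $x$ rises. $\rho_{xy}^2$ hardly exists on a zero to one scale, indicating that the relationship is barely there. Note that $-1 \le \rho_{xy} \le 1$ always!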
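As a cross-check on the population moments, here is a minimal Python sketch that recomputes them from the joint table. The probabilities and the x and y values are copied from the Minitab listing in the appendix; the variable names are illustrative assumptions.

```python
from math import sqrt

x_vals = [3, 5, 12]   # column labels
y_vals = [1, 3, 9]    # row labels
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]

p_x = [sum(col) for col in zip(*joint)]  # marginal P(x)
p_y = [sum(row) for row in joint]        # marginal P(y)

e_x = sum(x * p for x, p in zip(x_vals, p_x))
e_x2 = sum(x * x * p for x, p in zip(x_vals, p_x))
e_y = sum(y * p for y, p in zip(y_vals, p_y))
e_y2 = sum(y * y * p for y, p in zip(y_vals, p_y))

# E(xy) sums x * y * P(x, y) over every cell of the table.
e_xy = sum(
    x * y * joint[i][j]
    for i, y in enumerate(y_vals)
    for j, x in enumerate(x_vals)
)

var_x = e_x2 - e_x ** 2
var_y = e_y2 - e_y ** 2
cov_xy = e_xy - e_x * e_y
corr_xy = cov_xy / sqrt(var_x * var_y)

print(f"E(x)={e_x:.2f}  E(y)={e_y:.2f}  E(xy)={e_xy:.2f}")
print(f"Var(x)={var_x:.4f}  Var(y)={var_y:.4f}")
print(f"Cov={cov_xy:.4f}  Corr={corr_xy:.4f}  Corr^2={corr_xy ** 2:.4f}")
```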
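(iv) Compute $E(x+y)$ and $\operatorname{Var}(x+y)$.
$E(x+y) = E(x) + E(y) = 6.93 + 4.16 = 11.09$ and $\operatorname{Var}(x+y) = \sigma_x^2 + \sigma_y^2 + 2\sigma_{xy} = 15.7251 + 11.2144 + 2(-0.8588) = 25.2219$.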
To check this do the computations below.
If we run down the columns of the table:
x / y / x+y / P(x,y)
3 / 1 / 4 / .11
3 / 3 / 6 / .10
3 / 9 / 12 / .12
5 / 1 / 6 / .08
5 / 3 / 8 / .14
5 / 9 / 14 / .08
12 / 1 / 13 / .16
12 / 3 / 15 / .10
12 / 9 / 21 / .11
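Now collect probabilities that belong to the same value of $x+y$. For example, $P(x+y=6) = .10 + .08 = .18$.
x+y / P(x+y) / (x+y)P(x+y) / (x+y)^2 P(x+y)
4 / .11 / 0.44 / 1.76
6 / .18 / 1.08 / 6.48
8 / .14 / 1.12 / 8.96
12 / .12 / 1.44 / 17.28
13 / .16 / 2.08 / 27.04
14 / .08 / 1.12 / 15.68
15 / .10 / 1.50 / 22.50
21 / .11 / 2.31 / 48.51
Sum / 1.00 / 11.09 / 148.21
So $E(x+y) = 11.09$ and $\operatorname{Var}(x+y) = 148.21 - (11.09)^2 = 25.2219$.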
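The collecting step can be sketched in Python: run through every cell of the joint table, add its probability to the bucket for its value of x+y, then take the mean and variance of the resulting distribution. The table values come from the Minitab listing in the appendix; the dictionary name `dist` is an illustrative assumption.

```python
from collections import defaultdict

x_vals = [3, 5, 12]
y_vals = [1, 3, 9]
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]

# Accumulate P(x+y = s) over all cells that share the same sum s.
dist = defaultdict(float)
for i, y in enumerate(y_vals):
    for j, x in enumerate(x_vals):
        dist[x + y] += joint[i][j]

mean_sum = sum(s * p for s, p in dist.items())
e_sum_sq = sum(s * s * p for s, p in dist.items())
var_sum = e_sum_sq - mean_sum ** 2

for s in sorted(dist):
    print(s, round(dist[s], 2))
print("E(x+y) =", round(mean_sum, 2), " Var(x+y) =", round(var_sum, 4))
```

Only the value 6 is reached by two different cells (3+3 and 5+1), so it is the only row where probabilities are actually added together.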
(v) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .
Solution: 251v2out says
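$\operatorname{Cov}(ax+b, cy+d) = ac\operatorname{Cov}(x,y)$ and $\operatorname{Corr}(ax+b, cy+d) = \operatorname{sign}(ac)\operatorname{Corr}(x,y)$, where $\operatorname{sign}(ac)$ has the value $-1$ or $+1$ depending on whether the product of $a$ and $c$ is negative or positive. $\sigma_{xy} = -0.8588$ and $\rho_{xy} = -0.0647$.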
and
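The standard rule for linear functions, Cov(ax+b, cy+d) = ac Cov(x,y) and Corr(ax+b, cy+d) = sign(ac) Corr(x,y), can be verified numerically. In the sketch below the constants a, b, c and d are hypothetical values chosen only for illustration (they are not the constants from the assignment), and `cov_corr` is an assumed helper name.

```python
from math import sqrt

x_vals = [3, 5, 12]
y_vals = [1, 3, 9]
joint = [
    [0.11, 0.08, 0.16],
    [0.10, 0.14, 0.10],
    [0.12, 0.08, 0.11],
]


def cov_corr(xs, ys, table):
    """Population covariance and correlation from a joint probability table."""
    p_x = [sum(col) for col in zip(*table)]
    p_y = [sum(row) for row in table]
    e_x = sum(x * p for x, p in zip(xs, p_x))
    e_y = sum(y * p for y, p in zip(ys, p_y))
    var_x = sum(x * x * p for x, p in zip(xs, p_x)) - e_x ** 2
    var_y = sum(y * y * p for y, p in zip(ys, p_y)) - e_y ** 2
    e_xy = sum(
        x * y * table[i][j]
        for i, y in enumerate(ys)
        for j, x in enumerate(xs)
    )
    cov = e_xy - e_x * e_y
    return cov, cov / sqrt(var_x * var_y)


# Hypothetical constants for illustration only.
a, b, c, d = 2, 1, -3, 4

cov_xy, corr_xy = cov_corr(x_vals, y_vals, joint)

# Transform the values directly and recompute from scratch.
u_vals = [a * x + b for x in x_vals]
v_vals = [c * y + d for y in y_vals]
cov_uv, corr_uv = cov_corr(u_vals, v_vals, joint)

sign_ac = 1 if a * c > 0 else -1
print("direct:", round(cov_uv, 4), round(corr_uv, 4))
print("rule:  ", round(a * c * cov_xy, 4), round(sign_ac * corr_xy, 4))
```

The additive constants b and d drop out entirely, and the correlation changes only in sign, never in size.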
2) For the following sample (i) Compute the sample mean and variance of $x$, (ii) Compute $\operatorname{Cov}(x,y)$ or $s_{xy}$ and $\operatorname{Corr}(x,y)$ or $r_{xy}$, (iii) Compute the sample mean and variance of $x+y$ from the results in (i) and (ii). (iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .
The original data
x / y
9 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3
becomes
x / y
14 / 2
4 / 4
6 / 2
2 / 5
1 / 7
1 / 6
3 / 5
10 / -3
The entire table is below with required computations.
Row    x   x^2    y   y^2    xy
1     14   196    2     4    28
2      4    16    4    16    16
3      6    36    2     4    12
4      2     4    5    25    10
5      1     1    7    49     7
6      1     1    6    36     6
7      3     9    5    25    15
8     10   100   -3     9   -30
sum   41   363   28   168    64
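So $\sum x = 41$, $\sum x^2 = 363$, $\sum y = 28$, $\sum y^2 = 168$, $\sum xy = 64$ and $n = 8$.
Then $\bar{x} = \frac{41}{8} = 5.125$ and $\bar{y} = \frac{28}{8} = 3.50$.
(i) $\bar{x} = 5.125$, $s_x^2 = \frac{\sum x^2 - n\bar{x}^2}{n-1} = \frac{363 - 8(5.125)^2}{7} = 21.8393$. ($\bar{y} = 3.50$ and $s_y^2 = \frac{\sum y^2 - n\bar{y}^2}{n-1} = \frac{168 - 8(3.50)^2}{7} = 10.0000$).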
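(ii) $s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1} = \frac{64 - 8(5.125)(3.50)}{7} = -11.3571$ and $r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{-11.3571}{\sqrt{(21.8393)(10.0000)}} = -0.7685$. The correlation and covariance are negative, indicating a tendency of $y$ to fall when $x$ rises. $r_{xy}^2 = .5906$ is fairly large on a zero to one scale, indicating that the relationship is moderately strong. Note that $-1 \le r_{xy} \le 1$ always!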
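The sample computations can be reproduced with the same computational (sum-based) formulas. This is a minimal Python sketch using the modified data (first x value 14); the variable names are illustrative assumptions.

```python
from math import sqrt

x = [14, 4, 6, 2, 1, 1, 3, 10]
y = [2, 4, 2, 5, 7, 6, 5, -3]
n = len(x)

# Column sums, as in the worksheet table.
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

x_bar = sum_x / n
y_bar = sum_y / n

# Sample variances and covariance divide by n - 1.
var_x = (sum_x2 - n * x_bar ** 2) / (n - 1)
var_y = (sum_y2 - n * y_bar ** 2) / (n - 1)
cov_xy = (sum_xy - n * x_bar * y_bar) / (n - 1)
r_xy = cov_xy / sqrt(var_x * var_y)

print(f"x_bar={x_bar}  y_bar={y_bar}")
print(f"s_x^2={var_x:.4f}  s_y^2={var_y:.4f}")
print(f"s_xy={cov_xy:.4f}  r={r_xy:.4f}  r^2={r_xy ** 2:.4f}")
```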
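(iii) $\overline{x+y} = \bar{x} + \bar{y} = 5.125 + 3.50 = 8.625$ and
$s_{x+y}^2 = s_x^2 + s_y^2 + 2s_{xy} = 21.8393 + 10.0000 + 2(-11.3571) = 9.125$.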
To check this do the computations below.
x / y / x+y / (x+y)^2
14 / 2 / 16 / 256
4 / 4 / 8 / 64
6 / 2 / 8 / 64
2 / 5 / 7 / 49
1 / 7 / 8 / 64
1 / 6 / 7 / 49
3 / 5 / 8 / 64
10 / -3 / 7 / 49
69 / 659
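So $\overline{x+y} = \frac{69}{8} = 8.625$ as above and $s_{x+y}^2 = \frac{\sum (x+y)^2 - n(\overline{x+y})^2}{n-1} = \frac{659 - 8(8.625)^2}{7} = 9.125$. Notice how much larger the variation is in $x$ and $y$ individually than in $x+y$. This is reflected in the small variance.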
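The two routes to the variance of the sum, directly from the x+y column and indirectly from the variances and covariance, can be compared in a short Python sketch. The helper names `sample_var` and `sample_cov` are illustrative assumptions.

```python
x = [14, 4, 6, 2, 1, 1, 3, 10]
y = [2, 4, 2, 5, 7, 6, 5, -3]
n = len(x)


def sample_var(data):
    """Sample variance with the n - 1 divisor."""
    m = sum(data) / len(data)
    return sum((v - m) ** 2 for v in data) / (len(data) - 1)


def sample_cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


# Route 1: work directly with the column of sums.
s = [xi + yi for xi, yi in zip(x, y)]
mean_direct = sum(s) / n
var_direct = (sum(v * v for v in s) - n * mean_direct ** 2) / (n - 1)

# Route 2: combine the separate results for x and y.
mean_formula = sum(x) / n + sum(y) / n
var_formula = sample_var(x) + sample_var(y) + 2 * sample_cov(x, y)

print("direct: ", mean_direct, round(var_direct, 4))
print("formula:", mean_formula, round(var_formula, 4))
```

Because the covariance is strongly negative, the term 2 s_xy cancels most of the two individual variances.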
Computer results follow. N* means missing measurements.
Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3 Maximum
x 8 0 5.13 1.65 4.67 1.00 1.25 3.50 9.00 14.00
y 8 0 3.50 1.12 3.16 -3.00 2.00 4.50 5.75 7.00
x+y 8 0 8.63 1.07 3.02 7.00 7.00 8.00 8.00 16.00
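The descriptive statistics above can be approximated with Python's standard `statistics` module. This is a sketch under one assumption: that the "exclusive" quartile method of `statistics.quantiles`, which interpolates at position (n+1)p, matches the method behind the printed quartiles. For these three columns it does reproduce the Q1, Median and Q3 values shown. The function name `describe` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, quantiles


def describe(data):
    """Summary in the spirit of the printed table (names are illustrative)."""
    n = len(data)
    sd = stdev(data)  # sample standard deviation, n - 1 divisor
    q1, median, q3 = quantiles(data, n=4, method="exclusive")
    return {
        "N": n,
        "Mean": mean(data),
        "SE Mean": sd / sqrt(n),
        "StDev": sd,
        "Minimum": min(data),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Maximum": max(data),
    }


x = [14, 4, 6, 2, 1, 1, 3, 10]
y = [2, 4, 2, 5, 7, 6, 5, -3]
xy_sum = [a + b for a, b in zip(x, y)]

for name, column in (("x", x), ("y", y), ("x+y", xy_sum)):
    summary = describe(column)
    print(name, {k: round(v, 2) for k, v in summary.items()})
```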
(iv) Compute and using the formulas in section K4 of 251v2out or section C1 of 251var2. Note that .
Solution: 251v2out says
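$\operatorname{Cov}(ax+b, cy+d) = ac\operatorname{Cov}(x,y)$ and $\operatorname{Corr}(ax+b, cy+d) = \operatorname{sign}(ac)\operatorname{Corr}(x,y)$, where $\operatorname{sign}(ac)$ has the value $-1$ or $+1$ depending on whether the product of $a$ and $c$ is negative or positive. $s_{xy} = -11.3571$ and $r_{xy} = -0.7685$.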
and
Appendix: Minitab Computations (This is mostly a reminder to me of how I checked my work – but there is enough info here to run the routines and I’m happy to give them to anyone who wants them.)
Population Correlation Problem
————— 4/12/2005 9:12:27 PM ————————————————————
Welcome to Minitab, press F1 for help.
Results for: 1gr2-051aa.MTW
MTB > WSave "C:\Documents and Settings\rbove\My Documents\Minitab\1gr2-051aa.MTW";
SUBC> Replace.
Saving file as: 'C:\Documents and Settings\rbove\My
Documents\Minitab\1gr2-051aa.MTW'
MTB > echo
MTB > Execute "C:\Documents and Settings\rbove\My Documents\Minitab\251popcorr.mtb" 1.
Executing from file: C:\Documents and Settings\rbove\My Documents\Minitab\251popcorr.mtb
MTB > #251popcorr
MTB > # Computes population covariance and correlation
MTB > # Put x in C15, y in c17
MTB > # Put a joint probability table in c10 - c14
MTB > # Fill table with zeros to make it 5 by 5.
MTB > name k10 'varx'
MTB > name k11 'vary'
MTB > name k12 'Exy'
MTB > name k13 'covxy'
MTB > name k14 'corr'
MTB > name k15 'sdx'
MTB > name k16 'sumpx'
MTB > name k17 'sdy'
MTB > name k18 'sumpy'
MTB > name k25 'Ex'
MTB > name k26 'Ex2'
MTB > name k27 'Ey'
MTB > name k28 'Ey2'
MTB > execute 'marg973'
Executing from file: marg973.MTB (Input was Probabilities in c10-c14, x in c15, y in c17)
MTB > #marg973.mtb #part of 251popcorr Computes marginal probabilities
MTB > let c18=c10+c11+c12+c13+c14
MTB > let c16(1)=sum(c10)
MTB > let c16(2)=sum(c11)
MTB > let c16(3)=sum(c12)
MTB > let c16(4)=sum(c13)
MTB > let c16(5)=sum(c14)
MTB > let k16=sum(c16) #These are sums of x and y
MTB > let k18=sum(c18) #probabilities and should be 1.
MTB > print c10-c18
Data Display
Row C10 C11 C12 C13 C14 C15 C16 C17 C18
1 0.11 0.08 0.16 0 0 3 0.33 1 0.35
2 0.10 0.14 0.10 0 0 5 0.30 3 0.34
3 0.12 0.08 0.11 0 0 12 0.37 9 0.31
4 0.00 0.00 0.00 0 0 0 0.00 0 0.00
5 0.00 0.00 0.00 0 0 0 0.00 0 0.00
MTB > end (End means the end of an exec and a return to Popcorr)
MTB > execute 'meany973'
Executing from file: meany973.MTB
MTB > #meany.973.mtb part of 251popcorr
MTB > print k16,k18
Data Display
sumpx 1.00000 (A check to see if probabilities add to 1)
sumpy 1.00000
MTB > let c20=c10*c17 (c20-c24 are products for E(xy))
MTB > let c20=c20*c15(1)
MTB > let c21=c11*c17
MTB > let c21=c21*c15(2)
MTB > let c22=c12*c17
MTB > let c22=c22*c15(3)
MTB > let c23=c13*c17
MTB > let c23=c23*c15(4)
MTB > let c24=c14*c17
MTB > let c24=c24*c15(5)
MTB > let c25=c15*c16 # xP(x)
MTB > let k25=sum(c25) # E(x)
MTB > let c27=c17*c17
MTB > let k27=sum(c27) # E(y)?
MTB > end
MTB > execute 'meanz973'
Executing from file: meanz973.MTB
MTB > #meanz973.mtb part of 251popcorr
MTB > let c27=c17*c18
MTB > let k27=sum(c27) #E(y)
MTB > end
MTB > execute 'exysq973'
Executing from file: exysq973.MTB
MTB > #exysq973.mtb part of 251popcorr
MTB > let c26=c15*c25 # xsqP(x)
MTB > let c28=c17*c27 # ysqP(y)
MTB > let k26=sum(c26) # E(xsq)
MTB > let k28=sum(c28) # E(ysq)
MTB > print c20-c28
Data Display
Row C20 C21 C22 C23 C24 C25 C26 C27 C28
1 0.33 0.4 1.92 0 0 0.99 2.97 0.35 0.35
2 0.90 2.1 3.60 0 0 1.50 7.50 1.02 3.06
3 3.24 3.6 11.88 0 0 4.44 53.28 2.79 25.11
4 0.00 0.0 0.00 0 0 0.00 0.00 0.00 0.00
5 0.00 0.0 0.00 0 0 0.00 0.00 0.00 0.00
MTB > print k25-k28
Data Display
Ex 6.93000
Ex2 63.7500
Ey 4.16000
Ey2 28.5200
MTB > end
MTB > execute 'xyvar973'
Executing from file: xyvar973.MTB
MTB > #xyvar973.mtb part of popcorr
MTB > let k10=k26-k25*k25 #variance of x
MTB > let k11=k28-k27*k27 #variance of y
MTB > print k10 k11
Data Display
varx 15.7251
vary 11.2144
MTB > let k20=sum(c20) (Column sums)
MTB > let k21=sum(c21)
MTB > let k22=sum(c22)
MTB > let k23=sum(c23)
MTB > let k24=sum(c24)
MTB > let k12 = k20+k21+k22+k23+k24 # E(xy)
MTB > print k20-k24, k12
Data Display
K20 4.47000
K21 6.10000
K22 17.4000
K23 0
K24 0
Exy 27.9700
MTB > end
MTB > execute 'cov973'
Executing from file: cov973.MTB
MTB > #cov973.mtb part of popcorr
MTB > let k13=k12-k25*k27 #Covariance
MTB > let k15=sqrt(k10) #St. dev of x
MTB > let k17=sqrt(k11) #St. dev of y
MTB > let k14=k13/k15
MTB > let k14=k14/k17 #Corr(x, y)
MTB > print k13-k18
Data Display
covxy -0.858800 Covariance
corr -0.0646707 Correlation
sdx 3.96549 Std. deviation of x
sumpx 1.00000 Sum of x probabilities
sdy 3.34879 Std. deviation of y
sumpy 1.00000 Sum of y probabilities
MTB > end
MTB > execute 'tb2973'
Executing from file: tb2973.MTB
MTB > #tb2973.mtb Part of 251popcorr
MTB > let k30=30
MTB > let k31=10
MTB > let k32=1
MTB > execute 'tb2s973' 5
Executing from file: tb2s973.MTB
MTB > #tb2s973.mtb Subroutine of tb2973
MTB > # Part of popcorr
MTB > let ck30=ck31
MTB > let ck30(6)=c16(k32)
MTB > let ck30(7)=c25(k32)
MTB > let ck30(8)=c26(k32)
MTB > let k30=k30+1
MTB > let k31=k31+1
MTB > let k32=k32+1
MTB > end
MTB > #tb2s973.mtb Subroutine of tb2973
MTB > # Part of popcorr
MTB > let ck30=ck31
MTB > let ck30(6)=c16(k32)
MTB > let ck30(7)=c25(k32)
MTB > let ck30(8)=c26(k32)
MTB > let k30=k30+1
MTB > let k31=k31+1
MTB > let k32=k32+1
MTB > end
MTB > #tb2s973.mtb Subroutine of tb2973
MTB > # Part of popcorr
MTB > let ck30=ck31
MTB > let ck30(6)=c16(k32)
MTB > let ck30(7)=c25(k32)
MTB > let ck30(8)=c26(k32)
MTB > let k30=k30+1
MTB > let k31=k31+1
MTB > let k32=k32+1
MTB > end
MTB > #tb2s973.mtb Subroutine of tb2973
MTB > # Part of popcorr
MTB > let ck30=ck31
MTB > let ck30(6)=c16(k32)
MTB > let ck30(7)=c25(k32)
MTB > let ck30(8)=c26(k32)
MTB > let k30=k30+1
MTB > let k31=k31+1
MTB > let k32=k32+1
MTB > end
MTB > #tb2s973.mtb Subroutine of tb2973
MTB > # Part of popcorr
MTB > let ck30=ck31
MTB > let ck30(6)=c16(k32)
MTB > let ck30(7)=c25(k32)
MTB > let ck30(8)=c26(k32)
MTB > let k30=k30+1
MTB > let k31=k31+1
MTB > let k32=k32+1
MTB > end
MTB > let c35=c18
MTB > let c35(6)=k18
MTB > let c35(7)=k25
MTB > let c35(8)=k26
MTB > let c36=c27
MTB > let c36(6)=k27
MTB > let c37=c28
MTB > let c37(6)=k28
MTB > print c30-c37
Data Display (The final table)
Row C30 C31 C32 C33 C34 C35 C36 C37
1 0.11 0.08 0.16 0 0 0.35 0.35 0.35
2 0.10 0.14 0.10 0 0 0.34 1.02 3.06
3 0.12 0.08 0.11 0 0 0.31 2.79 25.11
4 0.00 0.00 0.00 0 0 0.00 0.00 0.00
5 0.00 0.00 0.00 0 0 0.00 0.00 0.00
6 0.33 0.30 0.37 0 0 1.00 4.16 28.52
7 0.99 1.50 4.44 0 0 6.93
8 2.97 7.50 53.28 0 0 63.75
MTB > write c30-c37.
Data Display (WRITE)
0.11 0.08 0.16 0 0 0.35 0.35 0.35
0.10 0.14 0.10 0 0 0.34 1.02 3.06
0.12 0.08 0.11 0 0 0.31 2.79 25.11
0.00 0.00 0.00 0 0 0.00 0.00 0.00
0.00 0.00 0.00 0 0 0.00 0.00 0.00
0.33 0.30 0.37 0 0 1.00 4.16 28.52
0.99 1.50 4.44 0 0 6.93 * *
2.97 7.50 53.28 0 0 63.75 * *
* NOTE * Column lengths not equal.
MTB > end.
MTB > execute 'tb3973'
Executing from file: tb3973.MTB
MTB > #tb3973.mtb final printout for 252popcorr
MTB >
MTB > write c20-c24;
SUBC> replace.
Data Display (WRITE) (Products for E(x y) again)
0.33 0.4 1.92 0 0
0.90 2.1 3.60 0 0
3.24 3.6 11.88 0 0
0.00 0.0 0.00 0 0
0.00 0.0 0.00 0 0
MTB > end
MTB > end
Sample Correlation Problem
MTB > exec '251samcov'
Executing from file: 251samcov.MTB
MTB > #251samcov Computes sample variances
MTB > # and covariances using 'var973'
MTB > #Input is x column in c40, y column in c42
MTB > # Example in Covex
MTB > name c40 'x'
MTB > name c41 'xsq'
MTB > name c42 'y'
MTB > name c43 'ysq'
MTB > name c44 'xy'
MTB > name k40 'sumx'