251solngr1 2/12/07 (Open this document in 'Print Layout' view!)

Graded Assignment 1

1) Problem: Using the computational formula, find the sample variance of the following data: Also find the median and the third quartile. Show your work!

Neatness counts on all assignments, as does the quality of your writeups. Staple your pages!

(in order)

4 16 1 -9 81

-3 9 2 -3 9

-9 81 3 -1 1

10 100 4 4 16

13 169 5 9 81

16 256 6 10 100

12 144 7 12 144

12 144 8 12 144

9 81 9 13 169

-1 1 10 16 256

16 256 11 16 256

19 361 12 19 361

98 1618 98 1618

Variance:

Median: The most common error in computing measures of position in ungrouped data is failing to put the numbers in order! Since the middle numbers in the ordered data are 10 and 12, the median is 11. We can, of course use . So , and

Third Quartile: The basic formulas are and

For the third quartile so . So , , and


2) Computer Assignment

To get credit for the remainder of this assignment you must turn in original spreadsheets.

This Computer Assignment has as its goal to do most of Problem G3 using Excel.

Step 1: Open up an Excel worksheet and start by filling locations A1-14 by labeling the column with: the word "Class" in A1, the labels for the classes in A2 to A13, the word "Total" in A14. If you have trouble with some of these labels, using single quotes can help.

Step 2: Put the abbreviation "Mdpt" for midpoint in B1. Fill Column B with the midpoints for the classes by putting 2.5 in B2 and 7.5 in B3. Highlight B2 and B3 and drag the fill handle down to fill the cells down to B13. (To do this point to the highlighted area until you get a black cross, then move the pointer down.) You should now have the numbers 2.5 to 57.5 in column B.

Step 3: You now need the frequency column. Put "f " in C1. Copy the frequencies in C2 through C13. Highlight cell C14 and find the AutoSum logo, which is simply a summation sign on the top. Click on the summation sign and "enter." This will give you the sum of the frequencies. Remember that .

Step 4: Now compute the cumulative frequencies in column C. Head the column with "F" and copy the one from C2 to D2. In D3 put "=D2+C3." In D4 put "=D3+C4." Now highlight D3 and D4 and use the fill handle to fill in D5 through D13. The last cumulative frequency in D13 should be 90.

Step 5: Label column E with "xf " or "fx." To compute this column, put "=B2*C2" in E2 and "=B3*C3" in E3. Use the fill handle to fill in the rest of the products down to E13. Use the AutoSum feature to put a total in E14.

Step 6: In column F you will compute or . Label the column "xsqf " or "fxsq." Now F2 will be done with "=B2*E2," and F3 with "=B3*E3" and you can fill the column down to F13 with the fill handle, and get its sum using AutoSum.

Step 7: Now compute the mean, variance and standard deviation using the computational formulas. In K1 put the words "n =" and copy into L1, perhaps by writing "=c14" in L1. In K2 put the words "Mean =," and in L2 compute by using "=E14/C14." Now you are ready to do . Put "Var =" in K3. Remember that is in F14, is in C14 and is in L2. You should compute by multiplying the mean by itself, so the formula in L3 should be "=(F14-C14*L2*L2)/(C14-1)." You should get 132.5843.

Now you need the standard deviation, which is simply the square root of the variance, so put "StDev =" in K4 and "=SQRT(L3)" in L4.


Step 8: Now we are going to use the definitional formula to compute the skewness. Start by writing the value that you got for the mean in cell M2. Do not copy this value or use an equation to get it into M2, just type it in. Excel will mess up some calculations if you do. Use the fill handle to fill the M column with the mean.

You are now going to compute , , and in columns G, H, I and J. Head these columns with "x' ," "'fx'," "fx'sq" and "fx'cu." In G2 enter "=B2-M2" and use the fill handle to put the values of the midpoints minus the mean in G2 through G13. In H2 enter "=G2*C2, " use the fill handle to fill in to H13 and use AutoSum to put the sum of column H in H14. This sum should be zero. In I2 enter "=G2*H2" and use the fill handle again. In I14 get . You can use this to check the accuracy of your variance computation. In J2 enter "=G2*I2" and again fill the column. Use AutoSum to get in J14. Then in cell K5 type "K3 =," and in L5 compute by typing "=C14*J14/((C14-1)*(c14-2))

Step 9: Use the formatting toolbar to center the first row of columns B through J. Make any other formatting changes that you think would improve the legibility if the spreadsheet and print out the results.

a) For the original solution to this problem see SolnG3A.

b) For the Excel solution see grdat1 . You have to have Excel to read this.

3) Research Assignment. Talk to about 20 students. Ask them how many hours they studied over some period of time - it could be a day, several days or a week. Use Excel to analyze your data. Put a heading in cell A1 and your data below it. Use the "Tools" pull-down menu. Pick "Data Analysis" and "Descriptive Statistics." Check "Labels in First Row." and "Summary Statistics." (If you cannot find this, use Tools and Add-Ins to put in the analysis packs.) Specify a range (like A1: A50) Comment on the output in a short paragraph. Are you close to the mean? Why or why not. The skewness statistic computed here is K3, so compute a measure of relative skewness and tell me if the data is highly skewed and in what direction. Write in brief literate English!

Possible Solution: As an example, your data in column A might be something like:

Times, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 7, 8, 8, 8, 9, 10, 11, 12, 14, 16, 16, 16

The data analysis software gave me the results below on the 4th page of my worksheet.

Column1

Mean 7.727272727

Standard Error 1.017555466

Median 7.5

Mode 4

Standard Deviation 4.772758194

Sample Variance 22.77922078

Kurtosis -0.851661267

Skewness 0.515444345

Range 15

Minimum 1

Maximum 16

Sum 170

Count 22


And I might have said:

The data above represent the weekly study time of 22 randomly selected respondents. I study 30 hours a week, so that my study time seems to be far above any of the times in my survey. The skewness coefficient of 0.515 when divided by the third power of the standard deviation, gives me a relative skewness of about 0.005, which indicates that the data is almost symmetrical (actually barely skewed to the right). This seems to be brought out by the similarity of the median and the mean, even though the mode is much smaller.

An actual copy of more recent results, set up so that the data and results are on the same page appears below.

Times
1 / Times
2
2 / Mean / 7.727273
3 / Standard Error / 1.017555
4 / Median / 7.5
4 / Mode / 4
4 / Standard Deviation / 4.772758
5 / Sample Variance / 22.77922
5 / Kurtosis / -0.85166
5 / Skewness / 0.515444
7 / Range / 15
8 / Minimum / 1
8 / Maximum / 16
8 / Sum / 170
9 / Count / 22
10
11
12
14
16
16
16


4) Extra Credit

a) The data was set up as follows in columns c1, c2 and c10:

Row f x Class

1 1 2.5 0-4.9

2 0 7.5 5-9.9

3 3 12.5 10-14.9

4 7 17.5 15-19.9

5 15 22.5 20-24.9

6 16 27.5 25-29.9

7 12 32.5 30-34.9

8 11 37.5 35-39.9

9 9 42.5 40-44.9

10 9 47.5 45.49.9

11 6 52.5 50-54.9

12 1 57.5 55-59.9

These were saved in the Minitab data file 251G3o. The three command files were run by copying from the website, with the following results. This is a highly edited version of my 2003 run. The 2005 run got identical results, but had to be done anyway because of changes in Minitab. Some blank lines have been edited out to preserve continuity. # creates instructions that are not read by Minitab and is used for student names, routine names and comments .

————— 9/4/2003 6:57:31 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > Retrieve "C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\251G3o.MTW".

Retrieving worksheet from file: C:\Documents and Settings\RBOVE.WCUPANET\My Documents\Drive D\MINITAB\251G3o.MTW

# Worksheet was saved on Thu Sep 04 2003

MTB > #Roger Bove

MTB > #grp.mtb Here’s where I copied in the first subroutine from the website.

Results for: 251G3o.MTW

MTB > let c3 = c1*c2 #You have already given these columns names

MTB > let c4 = c3*c2 #They will be designated by column numbers here.

MTB > let c5 = c4*c2

MTB > name k1 'n' #The Built-in constants k1-k9 are given names.

MTB > name k2 'mean'

MTB > name k3 'Sfx'

MTB > name k4 'Sfx2'

MTB > name k5 'Sfx3'

MTB > name k7 'Sfx^'

MTB > name k8 'Sfx^2'

MTB > name k9 'Sfx^3'

MTB > let k1 = sum(c1) #This is how we sum a column.

MTB > let k3 = sum(c3)

MTB > let k4 = sum(c4)

MTB > let k5 = sum(c5)

MTB > let k2 = k3/k1

MTB > print c10, c1-c5


Data Display - These are the columns on the first page of the solution to problem G3.

Row Class f x fx fxsq fxcu

1 0-4.9 1 2.5 2.5 6.3 16

2 5-9.9 0 7.5 0.0 0.0 0

3 10-14.9 3 12.5 37.5 468.8 5859

4 15-19.9 7 17.5 122.5 2143.8 37516

5 20-24.9 15 22.5 337.5 7593.8 170859

6 25-29.9 16 27.5 440.0 12100.0 332750

7 30-34.9 12 32.5 390.0 12675.0 411938

8 35-39.9 11 37.5 412.5 15468.8 580078

9 40-44.9 9 42.5 382.5 16256.3 690891

10 45.49.9 9 47.5 427.5 20306.3 964547

11 50-54.9 6 52.5 315.0 16537.5 868219

12 55-59.9 1 57.5 57.5 3306.3 190109

MTB > print k1-k5

Data Display

n 90.0000 #This is k1.

mean 32.5000 #This is k2.

Sfx 2925.00 #This is k3.

Sfx2 106863 #This is k4.

Sfx3 4252781 #This is k5.

MTB > let c6 = c2-k2

MTB > let c7 = c1*c6

MTB > let c8 = c7*c6

MTB > let c9 = c8*c6

MTB > let k7 = sum(c7)

MTB > let k8 = sum(c8)

MTB > let k9 = sum(c9)


MTB > print c10, c6-c9

Data Display These are the columns on the Third page of the solution to problem G3.

Row Class x^ fx^ fx^sq fx^cu

1 0-4.9 -30 -30 900 -27000

2 5-9.9 -25 0 0 0

3 10-14.9 -20 -60 1200 -24000

4 15-19.9 -15 -105 1575 -23625

5 20-24.9 -10 -150 1500 -15000

6 25-29.9 -5 -80 400 -2000

7 30-34.9 0 0 0 0

8 35-39.9 5 55 275 1375

9 40-44.9 10 90 900 9000

10 45.49.9 15 135 2025 30375

11 50-54.9 20 120 2400 48000

12 55-59.9 25 25 625 15625

MTB > print k7-k9

Data Display

Sfx^ 0 #This is k7.

Sfx^2 11800.0 #This is k8.

Sfx^3 12750.0 #This is k9.

MTB > end

MTB > #grpv.mtb Here’s where I copied in the second subroutine from the website.

MTB > let k6 = k1-1 #k6 is .

MTB > name k10 'var1' #The Built-in constants are given names.

MTB > name k11 'var2'

MTB > name k17 'stdev'

MTB > let k10 = k8/k6 #In k10

MTB > let k11 = k1*k2*k2

MTB > let k11 = k4-k11

MTB > let k11 = k11/k6 #k11

MTB > let k17 = sqrt(k11) # In k12

MTB > end

MTB > #grps.mtb Here’s where I copied in the third subroutine from the website.

MTB > name k14 'k31' #The 3rd k-statistic or skewness

MTB > name k16 'k32'

MTB > name k18 'g11' #Relative skewness.

MTB > name k19 'g12'

MTB > let k12 = k6-1 #

MTB > let k13 = k1/k6

MTB > let k13 = k13/k12 # In k13

MTB > let k14 = k13*k9 #

MTB > let k15 = 2*k1*k2*k2*k2

MTB > let k16 = 3*k2*k4

MTB > let k15 = k5-k16+k15

MTB > let k16 = k13*k15 # In k16

MTB > let k18 = k16/k11

MTB > let k18 = k18/k17 # (Got twisted.)

MTB > let k19 = k14/k11

MTB > let k19 = k19/k17 # In k19

MTB > print k1-k19

Data Display

n 90.0000

mean 32.5000

Sfx 2925.00

Sfx2 106863

Sfx3 4252781

K6 89.0000 #

Sfx^ 0

Sfx^2 11800.0

Sfx^3 12750.0

var1 132.584 #

var2 132.584 #

K12 88.0000 #

K13 0.0114913 # Intermediate computation.

k31 146.514 #

K15 12750.0 # Intermediate.

k32 146.514 #

stdev 11.5145 #

g11 0.0959714 #

g12 0.0959714 #

MTB > end #This is the end of what you were asked to do.

2