252x0721 3/19/07

ECO252 QBA2 Name

SECOND HOUR EXAM

March 23, 2007

Show your work! Make Diagrams! Exam is normed on 50 points. Answers without reasons are not usually acceptable.

I. (8 points) Do all the following. Make diagrams!

- If you are not using the supplement table, make sure that I know it.

1.

2.

3.

4. (Do not try to use the t table to get this.)


II. (22+ points) Do all the following. (2 points each unless noted otherwise.) Look them over first – the computer problem is at the end. Show your work where appropriate.

Note the following:

1. This test is normed on 50 points, but there are more points possible including the take-home. You are unlikely to finish the exam and might want to skip some questions.

2. A table identifying methods for comparing 2 samples is at the end of the exam.

3. If you answer ‘None of the above’ in any question, you should provide an alternative answer and explain why. You may receive credit for this even if you are wrong.

4. Use a 5% significance level unless the question says otherwise.

5. Read problems carefully. A problem that looks like a problem on another exam may be quite different.

6. When you do a statistical test, make sure that you state your null and alternative hypotheses, and that I know what method you are using and what your conclusion is.

1. (Anderson, Sweeney, Williams) We wish to compare miles per gallon of two similar automobiles. A random sample of 8 automobiles is chosen and 8 drivers are asked to drive the cars on identical roads. The data are as follows.

Row  Driver  Model 1  Model 2  difference
1    1       28       26        2
2    2       23       22        1
3    3       25       27       -2
4    4       23       22        1
5    5       24       23        1
6    6       26       25        1
7    7       29       27        2
8    8       24       26       -2

I have computed , , and

a. Compute the sample variance for the column – Show your work! (2)

b. Is there a significant difference between the gas consumption in the two models? State your hypotheses! (2)

c. Test to see if the variances of the two cars’ gas consumption are similar. (2) [6]
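A minimal Python sketch of the arithmetic behind problem 1, assuming the paired-sample (difference) approach for parts a and b and a simple ratio of the two sample variances for part c. The variable names and the table value `T_CRIT` (the two-sided 5% t value for 7 degrees of freedom) are illustrative choices, not part of the exam.

```python
import math
from statistics import mean, variance

# Miles per gallon from the table in problem 1.
model1 = [28, 23, 25, 23, 24, 26, 29, 24]
model2 = [26, 22, 27, 22, 23, 25, 27, 26]

# Paired differences (Model 1 minus Model 2), matching the 'difference' column.
d = [a - b for a, b in zip(model1, model2)]
n = len(d)

d_bar = mean(d)                  # mean difference
s2_d = variance(d)               # sample variance (divides by n - 1)
se_d = math.sqrt(s2_d / n)       # standard error of the mean difference
t_ratio = d_bar / se_d           # test ratio for H0: mean difference = 0

T_CRIT = 2.365                   # assumed t table value, 7 df, 2.5% in each tail
significant = abs(t_ratio) > T_CRIT

# Ratio of the larger sample variance to the smaller, to compare with an F table.
v1, v2 = variance(model1), variance(model2)
f_ratio = max(v1, v2) / min(v1, v2)

print(f"s_d^2 = {s2_d:.4f}, t = {t_ratio:.4f}, significant = {significant}")
print(f"F ratio = {f_ratio:.4f}")
```

The test ratio is compared with the t table and the variance ratio with the F table for 7 and 7 degrees of freedom; on the exam, the hypotheses and conclusions still have to be stated in words.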


Exhibit 1: A quality control engineer is in charge of manufacture of computer disks. Two different processes can be used to manufacture the disks. The engineer suspects that the Kohler method produces a greater proportion of defective disks than the Russell method. Out of a sample of 150 Kohler disks, 27 are defective. Out of a sample of 200 Russell disks, 18 are defective. If Kohler disks are sample 1 and Russell disks are sample 2, test the engineer’s suspicion at the 1% level.

2. The hypotheses that should be tested in exhibit 1 are

a. and

b. and

c. and

d. and

e. and

f. and

g. None of the above. (Write in correct answer.) [8]

3. For exhibit 1, find the value of the test ratio. (3) [11]

4. For exhibit 1, using the hypotheses in 2 and the test ratio in 3, draw an approximately Normal curve and show the ‘reject’ region by shading it. (3) [14]

5. For exhibit 1 and the hypotheses in 2, find a p-value for the test. (2) [16]

5a. For exhibit 1, find a 17% 2-sided confidence interval for the difference between the 2 proportions. (4)
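The calculations for exhibit 1 can be sketched in Python as follows. This is a hedged illustration that assumes a right-tailed alternative (the Kohler proportion exceeds the Russell proportion), a pooled proportion in the standard error of the test ratio, and an unpooled standard error in the confidence interval. All variable names are illustrative.

```python
import math
from statistics import NormalDist

# Exhibit 1: sample 1 = Kohler, sample 2 = Russell.
x1, n1 = 27, 150
x2, n2 = 18, 200

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Test ratio, using the pooled proportion under H0: p1 = p2.
p0 = (x1 + x2) / (n1 + n2)
se_pooled = math.sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
z = diff / se_pooled

# Right-tail p-value (assumes the alternative is p1 > p2).
p_value = 1 - NormalDist().cdf(z)

# 17% two-sided confidence interval for p1 - p2 (unpooled standard error).
conf = 0.17
z_half = NormalDist().inv_cdf(0.5 + conf / 2)
se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci = (diff - z_half * se_unpooled, diff + z_half * se_unpooled)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print(f"17% CI: {ci[0]:.4f} to {ci[1]:.4f}")
```

The p-value is compared with the 1% significance level given in the exhibit.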

Exhibit 2: A data entry operation sends a group of its employees to a typing course. The table below shows their speed before and after the course. d is the Before speed minus the After speed. rB and rA represent the ranks of the numbers when the before and after speeds are ranked together from 1 to 16. ‘abs d’ is the absolute value of the items in the d column. ‘rank’ drops the zero and ranks the numbers in ‘abs d’ from 1 to 7, and ‘sRank’ is the ranks with their signs added.

Row  Processor  Before  rB    After  rA    d   abs d  rank  sRank
1    1          59       5.5  57      3.5   2  2      2.5    2.5
2    2          57       3.5  62      9.0  -5  5      7.0   -7.0
3    3          60       7.5  60      7.5   0  0      *      *
4    4          66      12.0  63     10.5   3  3      4.0    4.0
5    5          68      13.0  69     14.0  -1  1      1.0   -1.0
6    6          59       5.5  63     10.5  -4  4      5.5   -5.5
7    7          72      15.0  74     16.0  -2  2      2.5   -2.5
8    8          52       1.0  56      2.0  -4  4      5.5   -5.5

6. Assume that exhibit 2 represents the scores of one sample of eight employees before and after the training. Can we say that the median speed has risen? Do an appropriate statistical test. (3) [19]
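One possible approach to question 6 is the Wilcoxon signed rank test for paired data. The sketch below assumes that method and rebuilds the ‘rank’ and ‘sRank’ columns of exhibit 2 from the raw speeds. The helper `midranks` is a hypothetical name introduced here; it gives tied values the average of the ranks they occupy.

```python
def midranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1      # first rank position held by v
        count = ordered.count(v)          # number of values tied at v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

before = [59, 57, 60, 66, 68, 59, 72, 52]
after = [57, 62, 60, 63, 69, 63, 74, 56]

# Differences (Before minus After), dropping the zero as in the exhibit.
d = [b - a for b, a in zip(before, after)]
nonzero = [x for x in d if x != 0]

ranks = midranks([abs(x) for x in nonzero])
signed_ranks = [r if x > 0 else -r for r, x in zip(ranks, nonzero)]

t_plus = sum(r for r in signed_ranks if r > 0)
t_minus = -sum(r for r in signed_ranks if r < 0)

print("signed ranks:", signed_ranks)
print(f"T+ = {t_plus}, T- = {t_minus}, n = {len(nonzero)}")
```

The smaller of the two rank sums is compared with the signed rank table for n = 7; the two sums should add to n(n + 1)/2 = 28 as a check.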

7. Assume that instead the before and after columns represent independent samples. Can we say that the median speed has risen? Do an appropriate statistical test. (3) [22]
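For independent samples, one candidate method is the Wilcoxon-Mann-Whitney rank sum test. The sketch below assumes that method and takes the rB and rA columns of exhibit 2 as given. The Normal approximation at the end is shown only for illustration; with samples of 8 the rank sum table would normally be used instead.

```python
import math

# Ranks from exhibit 2 (before and after speeds ranked together, 1 to 16).
r_before = [5.5, 3.5, 7.5, 12.0, 13.0, 5.5, 15.0, 1.0]
r_after = [3.5, 9.0, 7.5, 10.5, 14.0, 10.5, 16.0, 2.0]

n1, n2 = len(r_before), len(r_after)
n = n1 + n2

w_before = sum(r_before)
w_after = sum(r_after)

# Check: the two rank sums must add to n(n + 1)/2.
assert w_before + w_after == n * (n + 1) / 2

# Illustrative Normal approximation for the 'before' rank sum (no tie correction).
mu_w = n1 * (n + 1) / 2
sigma_w = math.sqrt(n1 * n2 * (n + 1) / 12)
z = (w_before - mu_w) / sigma_w

print(f"W(before) = {w_before}, W(after) = {w_after}, z = {z:.3f}")
```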


8. The owner of Mother Truckers (which actually moved me once) wants to prove that her firm is superior to her arch-rival, Wallflower Van Lines, and wants to use the proportion of shipments with claims filed as a way of doing that. She assembles the following data.

                                                Mother Truckers  Wallflower
Total shipments sampled                         900              750
Total number of shipments with claims over $50  162              60

Which method would be appropriate for analysing the data?

a. test for independence.

b. test for homogeneity

c. ANOVA

d. z test for comparing 2 proportions.

e. Sign test

f. The McNemar Test

g. None of the above.

9. Which is the closest to the probability that a random variable with 4 degrees of freedom will be greater than 10?

a. .01

b. .05

c. .10

d. .99

e. .95

f. .90

10. During a period of 20 days, 720 patients arrive at a hospital, an average of 1.5 per hour over 480 hours. For example, during 106 of the 480 hours there were no arrivals. See if a Poisson distribution fits these data. (6) [32]

Row    x          O    xO
1      0          106    0
2      1          140  140
3      2          125  250
4      3          106  318
5      4            3   12
6      5 or more    0    0
Total             480  720
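A chi-squared goodness-of-fit calculation for problem 10 might be sketched as follows. The sketch assumes the Poisson mean of 1.5 is estimated from the sample (720/480), so one extra degree of freedom is lost; if the mean is instead treated as given, the degrees of freedom would be 5. The critical value `CHI_SQ_CRIT` is the 5% chi-squared table value for 4 degrees of freedom, and all names are illustrative.

```python
import math

# Observed hours with x = 0, 1, 2, 3, 4 and '5 or more' arrivals.
observed = [106, 140, 125, 106, 3, 0]
n_hours = sum(observed)          # 480
lam = 720 / n_hours              # mean arrivals per hour, 1.5

# Poisson probabilities for x = 0..4, with the remainder assigned to '5 or more'.
probs = [math.exp(-lam) * lam**x / math.factorial(x) for x in range(5)]
probs.append(1 - sum(probs))

expected = [n_hours * p for p in probs]

# Chi-squared statistic: sum of (O - E)^2 / E over the six classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1 - 1       # classes - 1 - one estimated parameter (assumption)
CHI_SQ_CRIT = 9.488              # assumed table value, 5% upper tail, 4 df
reject = chi_sq > CHI_SQ_CRIT

for x, (o, e) in enumerate(zip(observed, expected)):
    print(f"x = {x}: O = {o:3d}, E = {e:7.2f}")
print(f"chi-squared = {chi_sq:.2f}, df = {df}, reject H0: {reject}")
```

Every expected frequency here is above 5, so no classes need to be combined before computing the statistic.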


11. Computer question.

a. Turn in your first computer output. Only do b, c and d if you did. (3)

b. A researcher believes that bank CEOs are paid more than utility CEOs. A random sample of eight salaries (in thousands) is collected for each industry. What were the null and alternative hypotheses tested? At the 95% confidence level could the researcher state that bank CEOs are paid more than utility CEOs? Why? How would the results be affected if we insist on a 99% confidence level? (2)

c. What is the difference between the two hypothesis tests that were done with the salary data? (1)

d. (Lee) A manufacturer is afraid that the company is producing slow egg timers. A sample of 12 timers is chosen and the time in seconds that was needed for the timers to run out was recorded. What hypotheses were tested? Can the manufacturer conclude that the timers are slow if a 95% confidence level is used? Why? How would the results be affected if we insist on a 99% confidence level? (2) [40 actually 44]

————— 3/19/2007 7:55:12 PM ————————————————————

Welcome to Minitab, press F1 for help.

MTB > print c1 c2

Data Display

Row Banks Utilities

1 755 620

2 712 395

3 845 653

4 985 1050

5 1300 1030

6 1143 528

7 733 610

8 1189 964

MTB > describe c1 c2

Descriptive Statistics: Banks, Utilities

Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3

Banks 8 0 957.8 81.3 230.0 712.0 738.5 915.0 1177.5

Utilities 8 0 731.3 87.9 248.6 395.0 548.5 636.5 1013.5

Variable Maximum

Banks 1300.0

Utilities 1050.0

MTB > TwoSample c1 c2;

SUBC> Alternative 1.

Two-Sample T-Test and CI: Banks, Utilities

Two-sample T for Banks vs Utilities

           N  Mean  StDev  SE Mean
Banks      8   958    230       81
Utilities  8   731    249       88

Difference = mu (Banks) - mu (Utilities)

Estimate for difference: 226.500

95% lower bound for difference: 14.437

T-Test of difference = 0 (vs >): T-Value = 1.89 P-Value = 0.041 DF = 13

MTB > TwoSample c1 c2;

SUBC> Pooled;

SUBC> Alternative 1.

Two-Sample T-Test and CI: Banks, Utilities

Two-sample T for Banks vs Utilities

           N  Mean  StDev  SE Mean
Banks      8   958    230       81
Utilities  8   731    249       88

Difference = mu (Banks) - mu (Utilities)

Estimate for difference: 226.500

95% lower bound for difference: 15.589

T-Test of difference = 0 (vs >): T-Value = 1.89 P-Value = 0.040 DF = 14

Both use Pooled StDev = 239.4934
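The two Minitab runs above differ only in the `Pooled` subcommand. As a hedged sketch, the statistics in both outputs can be reproduced from the printed salary data in Python; the variable names are illustrative, and the separate-variance degrees of freedom use the usual Satterthwaite formula, which is assumed here to be what the session's DF = 13 reflects after rounding down.

```python
import math
from statistics import mean, variance

# Salaries (in thousands) from the Minitab data display.
banks = [755, 712, 845, 985, 1300, 1143, 733, 1189]
utilities = [620, 395, 653, 1050, 1030, 528, 610, 964]

n1, n2 = len(banks), len(utilities)
diff = mean(banks) - mean(utilities)
v1, v2 = variance(banks), variance(utilities)

# Run 1: separate variances.
se_sep = math.sqrt(v1 / n1 + v2 / n2)
t_sep = diff / se_sep
df_sep = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

# Run 2: pooled variance (assumes equal population variances).
sp = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
se_pool = sp * math.sqrt(1 / n1 + 1 / n2)
t_pool = diff / se_pool
df_pool = n1 + n2 - 2

print(f"difference = {diff:.3f}")
print(f"separate variances: t = {t_sep:.2f}, df = {df_sep:.2f}")
print(f"pooled: sp = {sp:.4f}, t = {t_pool:.2f}, df = {df_pool}")
```

Because the two samples are the same size, the two standard errors coincide and the T-values agree; only the degrees of freedom, and therefore the p-values and lower bounds, differ.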

MTB > print c6

Data Display

Seconds

190 199 198 176 180 174 181 183 208 188 198 165

MTB > describe seconds

Descriptive Statistics: Seconds

Variable N N* Mean SE Mean StDev Minimum Q1 Median Q3

Seconds 12 0 186.67 3.60 12.47 165.00 177.00 185.50 198.00

Variable Maximum

Seconds 208.00

MTB > Onet c6;

SUBC> Test 180;

SUBC> Alternative 1.

One-Sample T: Seconds

Test of mu = 180 vs > 180

                                         95% Lower
Variable   N     Mean   StDev  SE Mean       Bound     T      P
Seconds   12  186.667  12.471    3.600     180.202  1.85  0.046
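The one-sample output can be checked in the same way. This sketch recomputes the test ratio for the egg timer data and compares it with one-sided t table values for 11 degrees of freedom; the constants `T_05` and `T_01` are table values supplied here as assumptions, and the names are illustrative.

```python
import math
from statistics import mean, stdev

# Running times in seconds from the Minitab data display.
seconds = [190, 199, 198, 176, 180, 174, 181, 183, 208, 188, 198, 165]
mu0 = 180                        # hypothesized mean from 'Test 180'

n = len(seconds)
x_bar = mean(seconds)
s = stdev(seconds)               # sample standard deviation (n - 1)
se = s / math.sqrt(n)
t_ratio = (x_bar - mu0) / se     # test ratio for H0: mu = 180 against mu > 180

T_05 = 1.796                     # assumed t table value, 11 df, 5% upper tail
T_01 = 2.718                     # assumed t table value, 11 df, 1% upper tail
reject_at_5 = t_ratio > T_05
reject_at_1 = t_ratio > T_01

print(f"mean = {x_bar:.3f}, s = {s:.3f}, SE = {se:.3f}, t = {t_ratio:.2f}")
print(f"reject at 5%: {reject_at_5}, reject at 1%: {reject_at_1}")
```

This matches the p-value of 0.046 in the output, which lies between .01 and .05.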

The methods were listed in the outline in the following table.

Comparison / Paired Samples / Independent Samples
Location - Normal distribution. Compare means. / Method D4 / Methods D1-D3
Location - Distribution not Normal. Compare medians. / Method D5b / Method D5a
Proportions / Method D6b / Method D6a
Variability - Normal distribution. Compare variances. / Method D7


ECO252 QBA2

SECOND EXAM

March 23, 2007

TAKE HOME SECTION


Name: ______

Student Number: ______

III. Neatness Counts! Show your work! Always state your hypotheses and conclusions clearly. (19+ points). In each section state clearly what number you are using to personalize data (Your Version number). There is a penalty for failing to include your student number on this page and not stating version number in each section. Please write on only one side of the paper.

1. A bicycle manufacturer wishes to test the proposition that bicycle buyers are older in mountain biking country than in flatter land. In the course of a few hours in Mountain City and Flatland City, two sets of customer data are collected: 11 ages in Mountain City and 9 in Flatland City. Personalize the data as follows. The manufacturer’s researcher brings his little brother along. The brother is 10 + x years old, where x is the second-to-last digit of your student number. The brother puts his age in as the last item in both columns, so the researcher now has one column of 12 ages and another of 10 ages. Example: Ima Badrisk has the number 375290, so the 12th number in the ‘Mtn’ column is 19, as is the 10th number in the ‘Fltlnd’ column.

Row  Mtn  Fltlnd
1    29   11
2    38   14
3    31   15
4    17   12
5    36   14
6    28   25
7    44   14
8     9   11
9    32    8
10   23
11   35

a. You are the data analyst and you are fairly clueless. So you compare the ages every way possible. First you compute means and standard deviations for both columns (Show your work!) (3)

b. With no good reason to do so, you compare the mean ages assuming a Normal distribution with equal variances. (4) You may use a test ratio, a critical value or a confidence interval. (2 points extra if you use all three and get the same result each time.)

c. Now you are not sure that was right and repeat the analysis while dropping the assumption of equal variances. (4 extra credit)

d. But you are not really sure that that was right either, so repeat the analysis by comparing medians. (3) [10]

e. So now you have three different sets of results and you have to decide which one to present to your boss. To decide whether you should have used the method in b) or in c) you compare variances. (2)

f. But since, perhaps, you should have compared medians instead, you use a test to see if the data in Mountain City were Normally distributed. (4)

g. So, on the basis of these tests, which method should you have used? Make a decision and present your results. (1) [17]
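A minimal sketch of the setup for parts a, b and e, assuming the personalization from the Ima Badrisk example (x = 9, so the brother is 19). The value of `x` is hypothetical and must be replaced by your own digit; the variable names are illustrative. Only the descriptive statistics, the pooled-variance test ratio and the variance ratio are computed.

```python
import math
from statistics import mean, stdev

x = 9                    # hypothetical second-to-last digit (example number 375290)
brother = 10 + x         # brother's age, appended to both columns

mtn = [29, 38, 31, 17, 36, 28, 44, 9, 32, 23, 35, brother]
fltlnd = [11, 14, 15, 12, 14, 25, 14, 11, 8, brother]

n1, n2 = len(mtn), len(fltlnd)
m1, m2 = mean(mtn), mean(fltlnd)
s1, s2 = stdev(mtn), stdev(fltlnd)       # sample standard deviations (n - 1)

# Part b: pooled variance and test ratio for H0: equal means.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t_pooled = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

# Part e: ratio of the larger sample variance to the smaller, for an F table.
f_ratio = max(s1, s2) ** 2 / min(s1, s2) ** 2

print(f"Mtn: n = {n1}, mean = {m1:.3f}, s = {s1:.3f}")
print(f"Fltlnd: n = {n2}, mean = {m2:.3f}, s = {s2:.3f}")
print(f"pooled t = {t_pooled:.3f} with {df_pooled} df, F ratio = {f_ratio:.3f}")
```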

2. A corporate president is beginning to worry that his customer representatives are dressing too informally. A sample of 11 representatives is selected and told not to wear a suit the first week and then told to wear a suit the following week. Customers are asked to rate the representatives according to how professionally they were treated, and from their questionnaires, each representative is given a rating.

The ratings appear below. Personalize the data as follows. The 10 in the ‘without’ column is an obvious error. ‘Correct’ it by adding the last digit of your student number to it, and make a corresponding correction in the difference column. If your student number ends in zero add 10. Example: Ima Badrisk has the number 375290, so the 11th number in the ‘Without’ column is 20 and the 11th number in the ‘Difference’ column is 2.

a. Test to see if the Reps received significantly higher ratings when wearing suits assuming that the samples come from the Normal distribution. (3)

b. Test to see if the Reps received significantly higher ratings when wearing suits without assuming a Normal distribution. (3)

c. So, given the source of the data, which of the two is the correct method to use? Why? (1) [24]

Row  Rep  With  Without  Difference
1    A    27    22        5
2    B    23    16        7
3    C    25    25        0
4    D    22    19        3
5    E    25    21        4
6    F    26    24        2
7    G    21    20        1
8    H    25    19        6
9    I    26    23        3
10   J    28    26        2
11   K    22    10       12

For your convenience, the following sums have been calculated for the first 10 numbers in each column.

With Sum 248 Sum of squares 6194

Without Sum 215 Sum of squares 4709

Difference Sum 33 Sum of squares 153
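The sums above make it easy to update the difference column once the 11th row is personalized. The sketch below assumes the Ima Badrisk example (last digit 0, so 10 is added) and the paired t approach of part a; `last_digit` is hypothetical, and `T_CRIT` is the one-sided 5% t table value for 10 degrees of freedom, supplied as an assumption.

```python
import math

last_digit = 0                                   # hypothetical (example number 375290)
add = last_digit if last_digit != 0 else 10      # a final digit of zero means add 10

without_11 = 10 + add                            # corrected 11th 'Without' rating
d_11 = 22 - without_11                           # corrected 11th difference (With - Without)

# Update the given sums for the first 10 differences with the 11th value.
n = 11
sum_d = 33 + d_11
sum_d2 = 153 + d_11**2

d_bar = sum_d / n
s2_d = (sum_d2 - n * d_bar**2) / (n - 1)         # computational formula for the variance
t_ratio = d_bar / math.sqrt(s2_d / n)            # test ratio for H0: mean difference <= 0

T_CRIT = 1.812                                   # assumed t table value, 10 df, 5% upper tail
reject = t_ratio > T_CRIT

print(f"Without(11) = {without_11}, d(11) = {d_11}")
print(f"mean d = {d_bar:.4f}, s_d^2 = {s2_d:.4f}, t = {t_ratio:.3f}, reject = {reject}")
```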

3. The table below is data that were assembled to see if there is a difference in numbers of children among students of various types of higher education institutions. Samples were taken in community colleges (CC), large universities (LU) and small colleges (SC). Personalize the data by adding the third to last digit of your student number to the 25 in the upper right-hand corner.