C22.0103
FINAL EXAM
Name:______

Write your answers to the first five questions on the attached sheets, in the spaces provided. Circle the choice which best answers questions 6-15. Do not write anything else on this page (besides your name and the circles). When you are finished, hand in the entire exam (both question sheets and answer sheets). Please do not remove any pages from the exam paper. There are 15 questions, each worth 5 points. Everyone receives 25 points for free. Good Luck!

1)  WRITTEN 11) (A) (B) (C) (D) (E)

2)  WRITTEN 12) (A) (B) (C) (D) (E)

3)  WRITTEN 13) (A) (B) (C) (D) (E)

4)  WRITTEN 14) (A) (B) (C) (D) (E)

5)  WRITTEN 15) (A) (B) (C) (D) (E)

6)  (A) (B) (C) (D) (E)

7)  (A) (B) (C) (D) (E)

8)  (A) (B) (C) (D) (E)

9)  (A) (B) (C) (D) (E)

10)  (A) (B) (C) (D) (E)


Please Leave This Page Blank
Answer for Question 1:


Answer for Question 2:


Answer for Question 3:


Answer for Question 4:


Answer for Question 5:



Questions 1) - 5) are based on data gathered by Martino Ghezzi from the 2003 Zagat guide to restaurants in Manhattan. The response variable, Price, is the estimated price of a dinner, including one drink and tip, in dollars. The explanatory variables are Food, Décor and Service, which give the ratings from the Zagat guide on food quality, décor and service for the restaurant. Higher ratings are better than lower ratings. The highest possible rating is 30. Figure 1 gives a fitted line plot for Price vs. Food. This is followed by the corresponding Minitab Simple Linear Regression output.


Regression Analysis: Price versus Food

The regression equation is
Price = - 5.46 + 2.17 Food

Predictor     Coef  SE Coef      T      P
Constant    -5.461    3.882  -1.41  0.161
Food        2.1670   0.1967  11.01  0.000

S = 12.5621   R-Sq = 28.9%   R-Sq(adj) = 28.7%

Analysis of Variance

Source           DF     SS     MS       F      P
Regression        1  19146  19146  121.33  0.000
Residual Error  298  47026    158
Total           299  66172
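
The summary statistics in regression output of this kind can be recomputed from the Analysis of Variance table. The following is a minimal Python sketch, with variable names chosen here purely for illustration, that recovers R-Sq, the F statistic and S from the sums of squares and degrees of freedom printed above. The results should agree with the printed output up to rounding.

```python
import math

# Values copied from the Analysis of Variance table for Price versus Food.
ss_regression = 19146
ss_error = 47026
ss_total = 66172
df_regression = 1
df_error = 298

r_sq = ss_regression / ss_total                # proportion of variation explained
ms_regression = ss_regression / df_regression  # regression mean square
ms_error = ss_error / df_error                 # residual mean square
f_stat = ms_regression / ms_error              # overall F statistic
s = math.sqrt(ms_error)                        # residual standard error (S)

print(f"R-Sq = {100 * r_sq:.1f}%")
print(f"F    = {f_stat:.2f}")
print(f"S    = {s:.4f}")
```

The same three identities apply to the multiple regression tables later in the exam, with the degrees of freedom changed accordingly.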

1)

A)  Does the fitted line plot in Figure 1 suggest any problems with the linear regression model? (1 Point).

B)  Does the Minitab output above indicate a strong and statistically significant linear relationship between Price and Food quality? Does the relationship (if any) appear to be in the direction that would have been predicted before examining the data? (2 Points).

C)  Is there evidence to suggest that the intercept is different from zero? (2 Points).

2) The table below gives Minitab output for the multiple regression of Price vs. Décor and Service.

A)  Do Décor and Service, taken together, seem to have far greater ability to explain Price than the Food quality rating taken by itself? Justify your answer using the Minitab output from this problem and the previous problem. (3 Points).

B)  Is there evidence to suggest that as the rating for Service increases while Décor is held fixed, the Price increases? (2 Points).

Regression Analysis: Price versus Decor, Service

The regression equation is
Price = - 21.9 + 1.03 Decor + 2.50 Service

Predictor      Coef  SE Coef       T      P
Constant    -21.909    1.983  -11.05  0.000
Decor        1.0334   0.1200    8.61  0.000
Service      2.4982   0.1666   15.00  0.000

S = 7.06198   R-Sq = 77.6%   R-Sq(adj) = 77.5%

Analysis of Variance

Source           DF     SS     MS       F      P
Regression        2  51360  25680  514.93  0.000
Residual Error  297  14812     50
Total           299  66172

3) The table below gives Minitab output for the multiple regression of Price vs. Food, Décor and Service.

A)  Use an appropriate p-value to decide whether there is a strong and statistically significant positive relationship between Food quality and Price for a given level of Décor and Service. (2 Points).

B)  Calculate a 95% confidence interval for the true coefficient of Food quality based on the multiple regression. (2 Points).

C)  Elaine’s restaurant had Food, Décor and Service ratings of 11, 12 and 13, respectively, and a Price of $46. Calculate the residual for Elaine’s. (1 Point).

Regression Analysis: Price versus Food, Decor, Service

The regression equation is
Price = - 21.2 - 0.107 Food + 1.01 Decor + 2.60 Service

Predictor      Coef  SE Coef      T      P
Constant    -21.195    2.271  -9.33  0.000
Food        -0.1073   0.1656  -0.65  0.518
Decor        1.0075   0.1266   7.96  0.000
Service      2.6032   0.2325  11.20  0.000

S = 7.06889   R-Sq = 77.6%   R-Sq(adj) = 77.4%

Analysis of Variance

Source           DF     SS     MS       F      P
Regression        3  51381  17127  342.75  0.000
Residual Error  296  14791     50
Total           299  66172
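
A residual is the observed response minus the fitted value from the regression equation. The sketch below shows that calculation in Python using the coefficients printed above. The function names and the example ratings (20, 18, 19, with a price of $45) are hypothetical and chosen only for illustration; they do not come from the data set.

```python
# Coefficients copied from the Minitab output for Price versus Food, Decor, Service.
COEF = {"Constant": -21.195, "Food": -0.1073, "Decor": 1.0075, "Service": 2.6032}


def fitted_price(food, decor, service):
    """Predicted Price from the three-variable regression equation."""
    return (COEF["Constant"]
            + COEF["Food"] * food
            + COEF["Decor"] * decor
            + COEF["Service"] * service)


def residual(observed_price, food, decor, service):
    """Observed Price minus fitted Price."""
    return observed_price - fitted_price(food, decor, service)


# Hypothetical restaurant (not from the Zagat data): ratings 20, 18, 19 and Price $45.
example_fit = fitted_price(20, 18, 19)
example_residual = residual(45, 20, 18, 19)
print(f"fitted = {example_fit:.2f}, residual = {example_residual:.2f}")
```

A positive residual means the restaurant charges more than the model predicts for its ratings; a negative residual means it charges less.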

4)

A)  According to the model used in Problem 3, what happens to the predicted Price as Food quality is increased, for given fixed values of Décor and Service? Does this result seem to make sense? (2 Points).

B)  Use the Analysis of Variance output from the tables in Problems 2 and 3 to show that the R² is higher when all three explanatory variables are used than when Décor and Service but not Food are used. Does this imply that Food is an important variable? (2 Points).

C)  The values are 9.64 for the model with all three variables, and 9.63 for the model with just Décor and Service. In view of this, and everything else we have seen, which seems to be the best model for describing Price? (1 Point).


5) Figure 2 below gives a normal probability plot for the residuals of Price based on Food, Décor and Service. The null hypothesis here is that the residuals are normally distributed. Do the plot and p-value support this null hypothesis? Explain.

6) True or false: If a regression coefficient for a given explanatory variable is statistically significant in a simple regression, then it must also be statistically significant in a multiple regression that includes additional variables.

A)  True B) False

7) Your store has put a certain brand of canned peaches on sale. The limit is two cans per customer. For a given customer, if the items are still available, there is a 75% chance that they will buy zero cans, a 20% chance that they will buy one can, and a 5% chance that they will buy two cans. If there are 100 cans in stock and 400 customers, what is the probability that the store will sell all of the cans? (Retain four digits of accuracy in your calculations).

A)  .4641 B) .5359 C) .0359 D) .9641 E) None of the Above
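
One possible way to approach a question of this type is a normal approximation to the total number of cans demanded. The Python sketch below assumes that customers decide independently, that the store sells out exactly when total demand reaches the stock, and that no continuity correction is applied. These are assumptions of the sketch, not statements from the question, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Purchase distribution for a single customer, taken from the question.
probs = {0: 0.75, 1: 0.20, 2: 0.05}
n_customers = 400
stock = 100

# Mean and variance of the number of cans one customer wants.
mean_one = sum(k * p for k, p in probs.items())
var_one = sum(k ** 2 * p for k, p in probs.items()) - mean_one ** 2

# Mean and standard deviation of total demand over all customers.
mean_total = n_customers * mean_one
sd_total = math.sqrt(n_customers * var_one)

# Assumption: total demand is approximately normal (Central Limit Theorem),
# and the store sells out when total demand is at least the stock.
z = (stock - mean_total) / sd_total
p_sell_out = 1 - NormalDist().cdf(z)

print(f"mean = {mean_total:.1f}, sd = {sd_total:.4f}, z = {z:.4f}")
print(f"P(sell out) is approximately {p_sell_out:.4f}")
```

Because the code does not round z to two decimals before looking up the normal probability, its result can differ in the third or fourth decimal from a hand calculation with a printed normal table.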

8) In simple linear regression if the correlation coefficient r is .6 then what proportion of variability in Y is explained by X?

A)  100% B) 84% C) 36% D) 60% E) None of the Above

9) Suppose we wish to test versus based on a sample of size 5. If the p-value is .01 then what is the value of the t-statistic?

A)  -2.326 B) -2.576 C) -3.747 D) -4.604 E) None of the Above

10) Consider a test of versus at level .01 using a t-test based on a sample of size n=10. Suppose that the t-statistic is positive and falls exactly on the border of the rejection region. Suppose also that the sample mean is 2. Then the value of the sample standard deviation is:

A)  .3545 B) 1.121 C) .9730 D) .8921 E) None of the Above

11)  Suppose we wish to test versus based on a large sample. If is true, then the probability that the p-value is less than .01 is:

A) .01 B) Greater than .01 C) Less than .01 D) It Cannot Be Determined

12)  It has been reported that the 1-Euro coin is more likely to yield Heads when flipped than it is to yield Tails. An experiment of 500 flips yielded 265 Heads. Based on hypothesis testing methodology, we can reject the null hypothesis of a 50% probability of Heads:

A) At level .1 but not .05 B) At level .05 but not .01 C) At level .01 but not .005 D) At level .005 but not .001 E) None of the Above
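
The test statistic for a hypothesized proportion can be sketched as follows. This Python example assumes the usual large-sample z-test with the standard error computed under the null hypothesis. It prints both the one-sided and the two-sided p-value, since the choice between them depends on how the alternative hypothesis is stated; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 500        # number of flips
heads = 265    # observed number of Heads
p0 = 0.5       # null hypothesis probability of Heads

p_hat = heads / n
se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
z = (p_hat - p0) / se0

p_one_sided = 1 - NormalDist().cdf(z)   # alternative: P(Heads) > 0.5
p_two_sided = 2 * p_one_sided           # alternative: P(Heads) != 0.5 (valid here since z > 0)

print(f"z = {z:.4f}")
print(f"one-sided p-value = {p_one_sided:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
```

The null hypothesis is rejected at a given level when the relevant p-value is below that level.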

13)  In linear regression based on a very large sample, suppose that the t-statistic for one of the regression coefficients is 2.1. Then the two-tailed p-value corresponding to the given coefficient is:

A) .9821 B) .9642 C) .0358 D) .0179 E) None of the Above.
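
With a very large sample, the t distribution is close to the standard normal, so a two-tailed p-value can be approximated from the normal distribution. The following is a minimal Python sketch under that assumption, with an illustrative variable name for the result.

```python
from statistics import NormalDist

t_stat = 2.1

# Assumption: the sample is large enough to treat the t-statistic as standard normal.
# Two-tailed p-value: probability of a value at least this extreme in either tail.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"two-tailed p-value = {p_two_tailed:.4f}")
```

A hand calculation from a four-decimal normal table can differ from this value in the last digit because of table rounding.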

14)  A random sample of size 11 from a normal population yields a sample mean of 2.5 and a sample standard deviation of 3.6. The 95% confidence interval for the population mean is:

A) (.5332 , 4.467) B) (.0816 , 4.918) C) (.1109 , 4.889) D) (.3725 , 4.627) E) None of the Above.

15)  How many 6-digit numbers are there that contain all of the digits 1,2,3,4,5,6? (Two examples are 651234 and 412356).

A) 1000000 B) 46656 C) 60 D) 720 E) None of the Above
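
A count of this kind can be checked by brute force. The Python sketch below compares the factorial formula for arrangements of six distinct digits with a direct enumeration, and confirms that the two examples given in the question appear among the arrangements. The variable names are illustrative.

```python
import math
from itertools import permutations

digits = "123456"

# Number of orderings of six distinct digits, by formula.
count_formula = math.factorial(len(digits))

# The same count by listing every ordering explicitly.
arrangements = {"".join(p) for p in permutations(digits)}
count_enumerated = len(arrangements)

print(count_formula, count_enumerated)
print("651234" in arrangements, "412356" in arrangements)
```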