COR1-GB.1305.03
FINAL EXAM

This is the question sheet. There are 10 questions, each worth 10 points. Please write all answers in the answer book, and justify your answers. Good Luck!

For Questions 1 to 5, we consider data on the top 10 grossing movies of 2013. The response variable (y) is the Total Gross, in Millions of US Dollars.

The explanatory variables (x1, x2, x3) are Rotten Tomatoes Critics’ Score (on the scale 0-100, the higher the better), Opening Gross (in Millions of US Dollars), and Production Budget (in Millions of US Dollars).

1) Figure 1 below provides a scatterplot of Total Gross vs. Rotten Tomatoes Critics’ Score, followed by the Minitab output from a simple regression of y on x1.

Regression Analysis: TotalGross versus Rotten Tomatoes Critics Score

Analysis of Variance

Source      DF  SS     MS     F-Value  P-Value
Regression  1   15430  15430  3.48     0.099
Error       8   35426  4428
Total       9   50856

Model Summary

S        R-sq    R-sq(adj)  R-sq(pred)
66.5447  30.34%  21.63%     0.00%

Coefficients

Term                           Coef   SE Coef  T-Value  P-Value
Constant                       152.3  90.6     1.68     0.131
Rotten Tomatoes Critics Score  2.25   1.21     1.87     0.099

Regression Equation

TotalGross = 152.3 + 2.25 RottenTomatoesCriticsScore

A) Give an interpretation of the slope of the fitted model, in practical terms. (2 points).

B) In the last row of the Coefficients table, what null hypothesis and what alternative hypothesis are being tested? (3 points).

C) Explain why a right-tailed test might be more appropriate in Part B). (2 points).

D) Is there evidence of a positive linear relationship between Total Gross and Rotten Tomatoes Critics’ Score at the 5% level of significance? (3 points).
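As a rough illustration of the arithmetic behind Parts C) and D): when the t-statistic is positive, the right-tailed p-value is half of the two-tailed p-value that Minitab reports. The sketch below assumes the rounded values printed in the Coefficients table; the variable names are illustrative and not part of the Minitab output.

```python
# Values read from the last row of the Coefficients table (rounded by Minitab).
t_value = 1.87
p_two_sided = 0.099
alpha = 0.05

# For a right-tailed test, halve the two-tailed p-value when t > 0;
# otherwise the right tail holds the remaining probability.
if t_value > 0:
    p_right = p_two_sided / 2
else:
    p_right = 1 - p_two_sided / 2

reject = p_right < alpha
print(f"right-tailed p-value = {p_right:.4f}, reject H0 at 5%: {reject}")
```

Because the reported p-value is rounded to three decimals, the comparison with 0.05 is close and should be read with that rounding in mind.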

2) Consider the same simple regression as in Question 1.

A) Compute the residual for the movie Gravity, which had a Rotten Tomatoes Critics’ Score of 97 and a Total Gross of 274.093. (5 points).

B) Based on your answer to Part A), is the data point for Gravity above or below the fitted line? (5 points).
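A minimal sketch of the residual calculation in Question 2, taking the residual as the observed value minus the fitted value and using the rounded coefficients from the Question 1 regression equation. The variable names are illustrative.

```python
# Fitted coefficients from the Question 1 regression equation.
intercept = 152.3
slope = 2.25

# Values for Gravity as given in Question 2.
critics_score = 97
total_gross = 274.093

fitted = intercept + slope * critics_score  # predicted Total Gross
residual = total_gross - fitted             # observed minus fitted

# A negative residual means the point lies below the fitted line.
position = "above" if residual > 0 else "below"
print(f"fitted = {fitted:.3f}, residual = {residual:.3f}, point is {position} the line")
```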

3) For the same simple regression as in Question 1, construct a 95% confidence interval for the true slope. (10 Points).
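One way to sketch the interval in Question 3 is as the estimated slope plus or minus a t critical value times its standard error, with n - 2 = 8 degrees of freedom. The critical value 2.306 is an assumption taken from a standard t table (upper 2.5% point, 8 degrees of freedom); it does not appear in the Minitab output.

```python
# Slope estimate and standard error from the Question 1 Coefficients table.
coef = 2.25
se_coef = 1.21
df_error = 8    # Error DF from the Analysis of Variance table

# Assumed t-table value: upper 2.5% point of the t distribution with 8 df.
t_crit = 2.306

margin = t_crit * se_coef
lower = coef - margin
upper = coef + margin
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```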

4) Next, we consider a simple regression of Total Gross on x2 (Opening Gross). Figure 2 below gives the fitted line plot, followed by the Minitab regression output.

Regression Analysis: TotalGross versus OpeningGross

Analysis of Variance

Source      DF  SS     MS     F-Value  P-Value
Regression  1   16845  16845  3.96     0.082
Error       8   34011  4251
Total       9   50856

Model Summary

S        R-sq    R-sq(adj)  R-sq(pred)
65.2030  33.12%  24.76%     5.08%

Coefficients

Term          Coef   SE Coef  T-Value  P-Value
Constant      207.9  58.5     3.56     0.007
OpeningGross  1.102  0.554    1.99     0.082

Regression Equation

TotalGross = 207.9 + 1.102 OpeningGross

A) Test the null hypothesis that the true coefficient of Opening Gross is 1 in this regression, using a two-tailed alternative hypothesis, at the 5% level of significance. (5 points).

B) In part A), what is the interpretation of the hypothesis that the true coefficient of Opening Gross is 1? (5 points).
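The T-Value printed by Minitab tests a coefficient of 0, so the test in Part A) needs its own statistic. A minimal sketch, assuming the usual form (estimate minus hypothesized value, divided by the standard error) and a t-table critical value of 2.306 for 8 degrees of freedom, which is not part of the output:

```python
# Estimate and standard error from the OpeningGross row of the Coefficients table.
coef = 1.102
se_coef = 0.554
hypothesized = 1.0   # value of the coefficient under the null hypothesis

# Assumed t-table value: upper 2.5% point of the t distribution with 8 df.
t_crit = 2.306

t_stat = (coef - hypothesized) / se_coef
reject = abs(t_stat) > t_crit   # two-tailed test at the 5% level
print(f"t = {t_stat:.3f}, reject H0 at 5%: {reject}")
```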

5) Next, we consider the multiple regression of y on x1, x2, x3. Here is the Minitab regression output.

Regression Analysis: TotalGross versus Rotten Tomatoes, OpeningGross, ProductionBudget

Analysis of Variance

Source      DF  SS     MS     F-Value  P-Value
Regression  3   38023  12674  5.93     0.032
Error       6   12833  2139
Total       9   50856

Model Summary

S        R-sq    R-sq(adj)  R-sq(pred)
46.2467  74.77%  62.15%     39.38%

Coefficients

Term                           Coef    SE Coef  T-Value  P-Value
Constant                       133     102      1.30     0.242
Rotten Tomatoes Critics Score  1.901   0.894    2.13     0.078
OpeningGross                   1.213   0.396    3.06     0.022
ProductionBudget               -0.439  0.304    -1.44    0.199

Regression Equation

TotalGross = 133 + 1.901 RottenTomatoesCriticsScore + 1.213 OpeningGross - 0.439 ProductionBudget

A) Compute the right-tailed p-value for the true coefficient of Production Budget, and interpret the results. (4 Points).

B) Does the fact that the R² for this regression is the highest of all three regression models considered so far imply that this model is the best of the three? Explain. (3 Points).

C) Use the AICC to select the best of the three regression models considered so far. (3 Points).
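A sketch of the comparison in Part C), under a stated assumption about the formula: it uses a common small-sample form, AICc = n log(SSE/n) + 2kn/(n - k - 1), where k counts the regression coefficients (including the intercept) plus the error variance. The course handout may define the AICC with a different additive constant, which would shift every value equally and leave the ranking unchanged. The model labels are illustrative.

```python
import math

n = 10  # number of movies

# Error sum of squares and number of explanatory variables for each model,
# read from the three Analysis of Variance tables.
models = {
    "y on x1": {"sse": 35426, "p": 1},
    "y on x2": {"sse": 34011, "p": 1},
    "y on x1, x2, x3": {"sse": 12833, "p": 3},
}

def aicc(sse, p, n):
    """Assumed AICc form: n*log(SSE/n) + 2*k*n/(n - k - 1), with k = p + 2."""
    k = p + 2  # slopes + intercept + error variance
    return n * math.log(sse / n) + 2 * k * n / (n - k - 1)

scores = {name: aicc(m["sse"], m["p"], n) for name, m in models.items()}
best = min(scores, key=scores.get)  # smaller AICc is preferred

for name, value in scores.items():
    print(f"{name}: AICc = {value:.2f}")
print("selected model:", best)
```

With only 10 observations, the penalty term grows quickly as variables are added, which is why the model with the smallest error sum of squares is not automatically selected.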

6) In the regressions of Questions 1) – 5), notice that the value of s was less for the multiple regression than for either of the simple regressions. Based on the formula for s in Handout 22, is it necessarily true that when variables are added to a regression the value of s must go down? Explain.

7) Suppose that a sample of size n = 50 has a sample mean of 2.7 and a sample standard deviation of s = .9. Compute the p-value for testing the null hypothesis versus the alternative hypothesis .

8) In testing versus , suppose that you obtain a p-value that is less than .5. Is it possible that Prove your answer.

9) If A, B are independent events with P(A)≠0, P(B)≠1, is it possible to have P(A∩B) = P(A)? Justify your answer.

10) You and your friend are each going to draw a random sample of size 2 (without replacement) from the same population. The population is of size three, and the values in the population are 1, 2 and 3. What is the probability that your sample mean will be greater than your friend's? Assume that your friend's sample is independent of yours.
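Because the population in Question 10 is so small, the probability can be checked by listing every case. The sketch below assumes simple random sampling, so each of the possible samples of size 2 is equally likely, and it uses exact fractions to avoid rounding. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3]

# All samples of size 2 drawn without replacement (order does not matter).
samples = list(combinations(population, 2))
means = [Fraction(sum(s), len(s)) for s in samples]

# Your sample and your friend's sample are independent, so every pair of
# samples is equally likely (assumption: simple random sampling).
pairs = list(product(means, repeat=2))
favorable = sum(1 for mine, friends in pairs if mine > friends)

prob = Fraction(favorable, len(pairs))
print("P(your mean > friend's mean) =", prob)
```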