COR1-GB.1305.03
FINAL EXAM
This is the question sheet. There are 10 questions, each worth 10 points. Please write all answers in the answer book, and justify your answers. Good Luck!
For Questions 1 to 5, we consider data on the top 10 grossing movies of 2013. The response variable (y) is the Total Gross, in Millions of US Dollars.
The explanatory variables (x1, x2, x3) are Rotten Tomatoes Critics’ Score (on a scale of 0-100, the higher the better), Opening Gross (in Millions of US Dollars), and Production Budget (in Millions of US Dollars).
1) Figure 1 below provides a scatterplot of Total Gross vs. Rotten Tomatoes Critics’ Score, followed by the Minitab output from a simple regression of y on x1.
Regression Analysis: TotalGross versus Rotten Tomatoes Critics Score
Analysis of Variance
Source      DF     SS     MS  F-Value  P-Value
Regression   1  15430  15430     3.48    0.099
Error        8  35426   4428
Total        9  50856
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
66.5447  30.34%     21.63%       0.00%
Coefficients
Term                            Coef  SE Coef  T-Value  P-Value
Constant                       152.3     90.6     1.68    0.131
Rotten Tomatoes Critics Score   2.25     1.21     1.87    0.099
Regression Equation
TotalGross = 152.3 + 2.25 RottenTomatoesCriticsScore
A) Give an interpretation of the slope of the fitted model, in practical terms. (2 points).
B) In the last row of the Coefficients table, what null hypothesis and what alternative hypothesis are being tested? (3 points).
C) Explain why a right-tailed test might be more appropriate in Part B). (2 points).
D) Is there evidence of a positive linear relationship between Total Gross and Rotten Tomatoes Critics’ Score at the 5% level of significance? (3 points).
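As an illustration of the arithmetic behind a right-tailed test, the sketch below converts the two-sided p-value that Minitab prints into a right-tailed one. It relies on the symmetry of the t distribution; the function name right_tailed_p is hypothetical, and because the inputs are the rounded values from the output above, the result is only approximate.

```python
def right_tailed_p(two_sided_p: float, t_value: float) -> float:
    """Convert a two-sided t-test p-value into a right-tailed p-value.

    Assumes a symmetric null distribution (as with the t distribution),
    so each tail holds half of the two-sided p-value.
    """
    half = two_sided_p / 2
    # A positive t statistic lies in the right tail; a negative one does not.
    return half if t_value > 0 else 1 - half


# Values taken from the Coefficients table for Rotten Tomatoes Critics Score.
p_right = right_tailed_p(two_sided_p=0.099, t_value=1.87)
print(f"right-tailed p-value (approx.): {p_right:.4f}")
```

The right-tailed p-value is then compared with the chosen significance level.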
2) Consider the same simple regression as in Question 1.
A) Compute the residual for the movie Gravity, which had a Rotten Tomatoes Critics’ Score of 97 and a Total Gross of 274.093 (in Millions of US Dollars). (5 points).
B) Based on your answer to Part A), is the data point for Gravity above or below the fitted line? (5 points).
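A residual is the observed response minus the fitted value. The following minimal sketch applies that definition to the rounded coefficients printed in the Regression Equation of Question 1; the helper name fitted_total_gross is hypothetical.

```python
def fitted_total_gross(critics_score: float) -> float:
    """Fitted Total Gross (millions of USD) from the simple regression on x1.

    Uses the rounded coefficients printed in the Minitab output.
    """
    return 152.3 + 2.25 * critics_score


observed = 274.093          # Gravity's Total Gross, millions of USD
fitted = fitted_total_gross(97)
residual = observed - fitted  # residual = observed - fitted

print(f"fitted = {fitted:.3f}, residual = {residual:.3f}")
# A negative residual means the point lies below the fitted line.
```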
3) For the same simple regression as in Question 1, construct a 95% confidence interval for the true slope. (10 Points).
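A confidence interval for a regression slope has the form estimate ± (t critical value) × (standard error). The sketch below assumes the error degrees of freedom are 8 (the Error DF in the ANOVA table) and hard-codes the tabulated critical value t(0.025, 8) = 2.306 rather than computing it; the variable names are illustrative.

```python
# Values read from the Coefficients table of Question 1 (rounded as printed).
slope = 2.25
se_slope = 1.21

# Two-sided 95% critical value of the t distribution with 8 degrees of
# freedom, taken from a standard t table (assumption: df = n - 2 = 8).
t_crit = 2.306

margin = t_crit * se_slope
lower, upper = slope - margin, slope + margin

print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```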
4) Next, we consider a simple regression of Total Gross on x2 (Opening Gross). Figure 2 below gives the fitted line plot, followed by the Minitab regression output.
Regression Analysis: TotalGross versus OpeningGross
Analysis of Variance
Source      DF     SS     MS  F-Value  P-Value
Regression   1  16845  16845     3.96    0.082
Error        8  34011   4251
Total        9  50856
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
65.2030  33.12%     24.76%       5.08%
Coefficients
Term           Coef  SE Coef  T-Value  P-Value
Constant      207.9     58.5     3.56    0.007
OpeningGross  1.102    0.554     1.99    0.082
Regression Equation
TotalGross = 207.9 + 1.102 OpeningGross
A) Test the null hypothesis that the true coefficient of Opening Gross is 1 in this regression, using a two-tailed alternative hypothesis, at the 5% level of significance. (5 points).
B) In Part A), what is the interpretation of the hypothesis that the true coefficient of Opening Gross is 1? (5 points).
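The T-Value printed by Minitab tests a coefficient against 0, so testing against a different hypothesized value requires recomputing the statistic as (estimate − hypothesized value) / standard error. The sketch below does this for a hypothesized slope of 1, again hard-coding the tabulated critical value t(0.025, 8) = 2.306 as an assumption; the variable names are illustrative.

```python
# Values read from the Coefficients table of Question 4 (rounded as printed).
estimate = 1.102
se = 0.554
hypothesized = 1.0

# t statistic for H0: beta = 1 (not the printed T-Value, which tests beta = 0).
t_stat = (estimate - hypothesized) / se

# Two-sided 5% critical value with 8 degrees of freedom, from a t table.
t_crit = 2.306
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, reject H0 at the 5% level: {reject_h0}")
```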
5) Next, we consider the multiple regression of y on x1, x2, x3. Here is the Minitab regression output.
Regression Analysis: TotalGross versus Rotten Tomatoes, OpeningGross, ProductionBudget
Analysis of Variance
Source      DF     SS     MS  F-Value  P-Value
Regression   3  38023  12674     5.93    0.032
Error        6  12833   2139
Total        9  50856
Model Summary
      S    R-sq  R-sq(adj)  R-sq(pred)
46.2467  74.77%     62.15%      39.38%
Coefficients
Term                             Coef  SE Coef  T-Value  P-Value
Constant                          133      102     1.30    0.242
Rotten Tomatoes Critics Score   1.901    0.894     2.13    0.078
OpeningGross                    1.213    0.396     3.06    0.022
ProductionBudget               -0.439    0.304    -1.44    0.199
Regression Equation
TotalGross = 133 + 1.901 RottenTomatoesCriticsScore + 1.213 OpeningGross - 0.439 ProductionBudget
A) Compute the right-tailed p-value for the true coefficient of Production Budget, and interpret the results. (4 Points).
B) Does the fact that the R² for this regression is the highest of all three regression models considered so far imply that this model is the best of the three? Explain. (3 Points).
C) Use the AICC to select the best of the three regression models considered so far. (3 Points).
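As a rough sketch of how an AICC comparison could be carried out, the code below uses one common form of the corrected AIC for linear regression, AICC = n·log(SSE/n) + n(n + m)/(n − m − 2), where m counts the regression coefficients including the intercept. This form is an assumption: the course handouts may write the criterion differently, although versions that differ only by an additive constant give the same ranking. The SSE values are the Error sums of squares from the three ANOVA tables; the function and dictionary names are hypothetical.

```python
import math


def aicc(sse: float, n: int, m: int) -> float:
    """Corrected AIC for a linear regression (assumed form, see lead-in).

    sse: error sum of squares
    n:   sample size
    m:   number of regression coefficients, including the intercept
    """
    return n * math.log(sse / n) + n * (n + m) / (n - m - 2)


n = 10  # top 10 grossing movies

# (SSE, number of coefficients including the intercept) for each model.
models = {
    "RottenTomatoes only": (35426, 2),
    "OpeningGross only": (34011, 2),
    "All three predictors": (12833, 4),
}

scores = {name: aicc(sse, n, m) for name, (sse, m) in models.items()}
best = min(scores, key=scores.get)  # smaller AICC is preferred

for name, value in scores.items():
    print(f"{name}: AICC = {value:.2f}")
print("Smallest AICC:", best)
```

With only n = 10 observations, the penalty term grows quickly as coefficients are added, which is why a model with a much smaller SSE is not automatically preferred.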
6) In the regressions of Questions 1) – 5), notice that the value of s was less for the multiple regression than for either of the simple regressions. Based on the formula for s in Handout 22, is it necessarily true that when variables are added to a regression the value of s must go down? Explain.
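For reference, the S value reported by Minitab can be reproduced as the square root of the mean squared error, s = sqrt(SSE / error DF). The sketch below assumes this is the formula the question refers to (Handout 22 itself is not reproduced here) and recomputes s for the three regressions from their ANOVA tables; the names are illustrative.

```python
import math


def regression_s(sse: float, error_df: int) -> float:
    """Standard error of the regression: sqrt(SSE / error degrees of freedom)."""
    return math.sqrt(sse / error_df)


# (SSE, error DF) taken from the three ANOVA tables.
fits = {
    "RottenTomatoes only": (35426, 8),
    "OpeningGross only": (34011, 8),
    "All three predictors": (12833, 6),
}

s_values = {name: regression_s(sse, df) for name, (sse, df) in fits.items()}
for name, s in s_values.items():
    print(f"{name}: s = {s:.4f}")
```

Adding a variable changes both the numerator (SSE) and the denominator (error DF) of this ratio.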
7) Suppose that a sample of size n = 50 has a sample mean of 2.7 and a sample standard deviation of s = .9. Compute the p-value for testing the null hypothesis versus the alternative hypothesis .
8) In testing versus , suppose that you obtain a p-value that is less than .5. Is it possible that Prove your answer.
9) If A, B are independent events with P(A)≠0, P(B)≠1, is it possible to have P(A∩B) = P(A)? Justify your answer.
10) You and your friend are each going to draw a random sample of size 2 (without replacement) from the same population. The population is of size three, and the values in the population are 1, 2 and 3. What is the probability that your sample mean will be greater than your friend's? Assume that your friend's sample is independent of yours.
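A small population like this one can be checked by brute-force enumeration. The sketch below assumes simple random sampling, so that each of the possible unordered samples of size 2 is equally likely, and that the two people's samples are independent, so every pair of samples is equally likely as well; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3]

# All unordered samples of size 2 drawn without replacement
# (assumed equally likely under simple random sampling).
samples = list(combinations(population, 2))
means = [Fraction(sum(s), len(s)) for s in samples]

# Independence: every (my sample, friend's sample) pair is equally likely.
pairs = list(product(means, repeat=2))
favorable = sum(1 for mine, friends in pairs if mine > friends)

probability = Fraction(favorable, len(pairs))
print("P(my mean > friend's mean) =", probability)
```

The same count also follows from symmetry: the probability of a strictly larger mean equals (1 − P(tie)) / 2.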