SOLUTIONS TO FINAL EXAM
VERSION 1

1) 

A)  Note that the estimated slope is 2.25. We estimate that each additional point in the Rotten Tomatoes Critics’ Score adds 2.25 Million Dollars to the expected Total Gross.

B)  The null hypothesis is H0: β = 0 and the alternative hypothesis is two-tailed, H1: β ≠ 0, where β is the slope of the true line.

C)  Since the Rotten Tomatoes Critics’ Score should ideally serve as a gauge of the movie’s ultimate popularity, we would have predicted, before examining the data, that the relationship between Rotten Tomatoes Critics’ Score and Total Gross should be a positive one, corresponding to the right-tailed alternative hypothesis H1: β > 0.

D)  Since the estimated slope is positive, we can compute the right-tailed p-value by dividing Minitab’s two-tailed p-value by two, obtaining p = .099/2 = .0495. Since this is less than .05, there is indeed evidence of a positive linear relationship between Total Gross and Rotten Tomatoes Critics’ Score at the 5% level of significance.

2)

A)  First, we compute the fitted value, ŷ, from the estimated regression line. Next, we compute the residual, e = y − ŷ (the observed value minus the fitted value).

B)  Since the residual for Gravity is negative, the data point is below the fitted line. Indeed, relative to the model’s prediction, this movie underperformed by almost 100 Million Dollars.

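3) The confidence interval is (estimated slope) ± t·SE, where SE = 1.21 is the estimated standard error for the estimated slope, and since we have n − 2 = 8 degrees of freedom, Table 6 yields t = 2.306 (upper-tail area .025). Thus, the confidence interval is 2.25 ± 2.306(1.21), that is, (−.54, 5.04).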
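
As a check on the arithmetic, the interval can be recomputed in a few lines of Python. This is only a sketch: the variable names are ours, and the critical value 2.306 is typed in from the table rather than computed.

```python
# Values taken from the solutions above; the variable names are illustrative.
slope_hat = 2.25   # estimated slope (Million Dollars per Critics' Score point)
se = 1.21          # estimated standard error of the slope
t_crit = 2.306     # table value for 8 degrees of freedom

margin = t_crit * se
lower, upper = slope_hat - margin, slope_hat + margin
print(f"({lower:.2f}, {upper:.2f})")
```

Because the interval contains 0, it is consistent with the two-tailed test in Problem 1 not rejecting at the 5% level (two-tailed p-value .099).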

4)

A) First, we compute the t-statistic by hand, as t = (1.102 − 1)/.554 = .184. Note that Minitab’s t-statistic is based on a different null hypothesis (β = 0), so it cannot be used here. The critical values for the two-tailed hypothesis test are −2.306 and 2.306. Since the absolute value of our observed t-statistic (.184) does not exceed 2.306, we do not reject the null hypothesis. We therefore do not have strong evidence that β is different from 1.
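
The hand computation can be mirrored in Python as follows. This is a minimal sketch with our own variable names; the critical value is again copied from the table.

```python
# Test of H0: slope = 1 against a two-tailed alternative.
beta_hat = 1.102   # estimated slope from the output
beta_null = 1.0    # slope value under the null hypothesis
se = 0.554         # estimated standard error of the slope
t_crit = 2.306     # two-tailed critical value, 8 degrees of freedom

t_stat = (beta_hat - beta_null) / se
reject = abs(t_stat) > t_crit
print(round(t_stat, 3), reject)
```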

B) The null hypothesis implies that each additional million dollars in Opening Gross is associated with an increase of 1 Million Dollars in the expected Total Gross. Since Opening Gross is a part of the Total Gross, the null hypothesis, which we have not rejected, says that the Opening Gross is useless as a predictor of the additional (post-opening) gross for the movie.

5)

A)  One might have hoped, before seeing the data, for a positive relationship between Production Budget and Total Gross. Thus, we are testing the null hypothesis H0: β = 0 versus the right-tailed alternative hypothesis H1: β > 0. Unfortunately, the estimated slope is negative, so it is not in the direction predicted by the alternative hypothesis. We are therefore going to get a large p-value. In fact, the p-value is the area to the right of −1.44 under the t-distribution with 8 degrees of freedom. From the Minitab output, we see that the area to the left of −1.44 is .199/2 = .0995, so the area to the right is 1 − .0995 = .9005. The interpretation is that we have virtually no evidence to support a positive relationship between Production Budget and Total Gross.
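
The rule used here and in Problem 1(D) for turning a two-tailed p-value into a right-tailed one can be written as a small function. This is a sketch; the helper name right_tailed_p is ours, and it relies only on the symmetry of the t-distribution about 0.

```python
def right_tailed_p(two_tailed_p, t_stat):
    """Right-tailed p-value from a two-tailed one (hypothetical helper).

    If the t-statistic is positive, the right tail is half the two-tailed
    p-value; otherwise it is one minus that half.
    """
    half = two_tailed_p / 2
    return half if t_stat > 0 else 1 - half

# Problem 5(A): t = -1.44, two-tailed p-value .199
p_budget = right_tailed_p(0.199, -1.44)

# Problem 1(D): positive slope (t = 2.25/1.21), two-tailed p-value .099
p_critics = right_tailed_p(0.099, 2.25 / 1.21)

print(round(p_budget, 4), round(p_critics, 4))
```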

B)  No. As we know, any time you add variables to a regression, the residual sum of squares goes down (or stays the same), so that R² goes up (or stays the same). So there is nothing surprising about the fact that the multiple regression model here has a larger R² than either of the simple regression models.

C)  We have AICC = log(SSE) + 2(k+2)/(n−k−3), where log is the natural logarithm. The simple regression models have k=1, and the multiple regression model has k=3. For all of these models, the sample size is n=10. So the AICC values for the simple regression on x1, the simple regression on x2, and the multiple regression on (x1, x2, x3) are log(35426)+2(3)/(10−1−3)=11.4752, log(34011)+2(3)/(10−1−3)=11.43444 and log(12833)+2(5)/(10−3−3)=11.95978, respectively. The selected model is the one with the smallest value of AICC, which is the simple regression on x2, Opening Gross.
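
The three AICC values can be reproduced with the sketch below. The form log(SSE) + 2(k+2)/(n−k−3) is read off from the arithmetic in the solution; treating the number inside the logarithm as the residual sum of squares is our assumption, and the function name aicc and the dictionary labels are ours.

```python
import math

def aicc(sse, n, k):
    """AICC in the form used above: natural log of SSE plus a penalty term."""
    return math.log(sse) + 2 * (k + 2) / (n - k - 3)

n = 10
# (assumed SSE, number of predictors k) for each candidate model
models = {
    "simple regression on x1": (35426, 1),
    "simple regression on x2": (34011, 1),
    "multiple regression on (x1, x2, x3)": (12833, 3),
}

scores = {name: aicc(sse, n, k) for name, (sse, k) in models.items()}
best = min(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name}: {value:.5f}")
print("selected:", best)
```

Note how the penalty term grows from 1 to 2.5 when k goes from 1 to 3 with n=10, which is why the multiple regression loses despite its much smaller value inside the logarithm.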

6) Note that s² = SSE/(n−k−1). When variables are added to a regression, we know that the value of SSE goes down (or stays the same), but the denominator, n−k−1, also goes down (since k goes up). So it’s not necessarily true that s² (and therefore s) goes down.
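
A small numerical illustration may help. The SSE values below are made up for the purpose of the example (they are not from the exam data); the point is only that SSE can fall while SSE/(n−k−1) rises.

```python
def s_squared(sse, n, k):
    """Residual variance estimate SSE / (n - k - 1)."""
    return sse / (n - k - 1)

n = 10
# Hypothetical numbers: adding two predictors lowers SSE only slightly.
s2_small = s_squared(sse=100.0, n=n, k=1)   # 100 / 8
s2_large = s_squared(sse=90.0, n=n, k=3)    # 90 / 6

print(s2_small, s2_large)  # 12.5 15.0
```

Here SSE drops from 100 to 90, yet s² rises from 12.5 to 15, because the two lost degrees of freedom outweigh the small reduction in SSE.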

7) First, we compute the t-statistic, t = 2.36. Since the sample size is large, we can assume that the distribution of the t-statistic under the null hypothesis is standard normal. Thus, from Table 5, the two-tailed p-value is 2Prob{Standard Normal > 2.36} = 2(.5−.4909) = .0182.
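
The table lookup can be checked without a table by using the error function from Python's standard library. This is a sketch; the helper name std_normal_cdf is ours.

```python
import math

def std_normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

t_stat = 2.36
p_two_tailed = 2 * (1 - std_normal_cdf(t_stat))
print(round(p_two_tailed, 4))
```

The result is approximately .018, agreeing with the table-based value .0182 up to the four-decimal rounding of the table entries.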

8) The p-value is the right tail-area under a t-distribution (with n−1 degrees of freedom). We take the area under the curve from the observed t-statistic to ∞. Since the t-distribution is symmetric about 0, the fact that this area is less than .5 implies that the observed t-statistic is greater than 0. Now, since t = (x̄ − μ0)/(s/√n), where μ0 is the null-hypothesis value of the mean, the fact that t > 0 implies that x̄ > μ0. Therefore, it is not possible that x̄ < μ0.

9) Since A and B are independent, we have P(A∩B) = P(A)P(B). Thus, the equation P(A∩B) = P(A) is equivalent to P(A)P(B) = P(A). Cancelling P(A) from both sides (which we can do since P(A) ≠ 0), we obtain P(B) = 1. But since we have assumed that P(B) ≠ 1, it is not possible to have P(A∩B) = P(A) under the given assumptions.

10) There are three possible samples: (1,2), (1,3) and (2,3). Thus, the sample mean is a discrete random variable taking values 1.5, 2 and 2.5 with probabilities 1/3, 1/3, 1/3. Since you and your friend act independently, there are 9 equally likely outcomes for (your sample mean, your friend’s sample mean), as follows: (1.5,1.5), (1.5,2), (1.5,2.5), (2,1.5), (2,2), (2,2.5), (2.5,1.5), (2.5,2), (2.5,2.5). In three of these cases, the first entry is greater than the second. So the probability that your sample mean will be greater than your friend’s is 3/9 = 1/3.
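
The counting argument can be verified by brute-force enumeration. The sketch below assumes, as in the solution, that each of the three samples of size two from {1, 2, 3} is equally likely and that the two people draw independently; the variable names are ours.

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3]

# Sample means of the three possible samples of size 2: 3/2, 2, 5/2
sample_means = [Fraction(sum(s), 2) for s in combinations(population, 2)]

# All 9 equally likely (your mean, friend's mean) pairs
pairs = list(product(sample_means, repeat=2))
favorable = sum(1 for mine, friends in pairs if mine > friends)

prob = Fraction(favorable, len(pairs))
print(prob)  # 1/3
```

By symmetry the probability that your friend's mean is larger is also 1/3, and the remaining 1/3 is the probability of a tie.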