STA 3024 Fall 2014 Exam 2 RIPOL NAME: ______
TEST FORM CODE: C (top right corner of scantron) UF ID #: ______ Section 4433 – 4th period
SPECIAL CODE: 22 (on bottom of scantron)
INSTRUCTIONS:
- FILL OUT your personal information above.
- BUBBLE IN SCANTRON: Name, UFID, section number, Test Form Code and Special Code.
- FORMAT: This exam contains 29 Multiple Choice questions. Each question is worth 3.5 points, for a total of 101.5 points (so there are 1.5 bonus points on the test).
- ANSWERS: Select the best answer among the alternatives given. You may write whatever you want on this test, but only the answers bubbled in the scantron sheet will be graded.
- YOU MUST SUBMIT THIS TEST to the instructors together with the scantron sheet when you are finished.
- DOUBLE CHECK your personal information, test code and special code. If you miss any of those, 1.5 points will be deducted from your score.
- SCORES on the exam will be posted in Sakai within a week – please see course web page for details.
- SIGN HONOR PLEDGE:
"On my honor, I have neither given nor received unauthorized aid on this examination."
Signature: ______
1. The adjusted R² is "adjusted" for:
a) Response variable.
b) Dummy variables.
c) Number of predictors.
d) Sample size.
e) All of the above.
2. Which of the following statements about the correlation coefficient are true?
I. Correlations of +0.87 and −0.87 indicate the same degree of clustering around the regression line.
II. The correlation coefficient and the slope of the regression line may have opposite signs.
III. A correlation of 1 indicates a perfect cause-and-effect relationship between the variables.
a) I only b) II only c) III only d) I and II e) I, II, and III
3. In a linear regression, why do we need to be concerned with the range of the predictor variables?
a) To identify baseline groups.
b) To establish multicollinearity.
c) To avoid extrapolation.
d) To add influential points.
e) All of the above
4. To apply multiple regression to situations when predictor variables are categorical, ______variables are used.
a) dependent b) correlated c) dummy d) interaction e) squared
5. Which of the following statements are true about the regression model where x2 is a dummy variable?
a) and are intercepts
b) and are intercepts
c) and are slopes
d) and are slopes
e) none of the above
Data was collected to predict the amount of student Loans owed (in dollars) based on Gender (coded as Female=0, Male=1), Class (Freshmen=1, Sophomore=2, Junior=3, Senior=4), number of credit Hours taken this semester, number of hours of Work per week, and amount of CreditCard Debt the student has. Graphs and partial output for several regression analyses appear on the last page of this exam. Use that page to answer the following questions.
6. The assumptions can be written as ε ~ iid N(0, σ) for which model?
a) all of them b) Output 5 c) Output 4 d) Output 3 e) Output 2
7. Is it reasonable to say the assumption of random samples is satisfied here?
a) No – the plot of Residuals vs Fit does not show a linear pattern.
b) No – there is not enough information given to determine this.
c) No – some of the predictor variables were not significant predictors of loans.
d) Yes – the histogram of the residuals is roughly bell-shaped.
e) Yes – the plot of Residuals vs Order shows a random pattern.
8. Is it reasonable to say the assumption of Normal distribution is satisfied here?
a) No – the plot of Residuals vs Fit does not show a linear pattern.
b) No – there is not enough information given to determine this.
c) No – some of the predictor variables were not significant predictors of loans.
d) Yes – the histogram of the residuals is roughly bell-shaped.
e) Yes – the plot of Residuals vs Order shows a random pattern.
9. All of the models presented have:
a) different number of observations and different number of predictor variables.
b) different number of observations and different number of response variables.
c) the same number of observations and the same number of predictor variables.
d) the same number of observations but different number of predictor variables.
e) the same number of predictor variables, observations and responses.
10. The method used to refine the model in Outputs 2-5 was:
a) Data Mining
b) Baseline Modeling
c) Best Subsets
d) Backwards Elimination
e) Forward Selection
11. The term Class*Work in the model in output 6:
a) takes into account the fact that loans are likely to increase over a college career.
b) allows us to test whether there is an interaction between the two variables.
c) implies that upperclassmen are more likely to have a job than underclassmen.
d) considers the possibility that students with loans need to get a job.
e) all of the above.
12. The model in output 6:
a) Does not need to be refined because the only t-test we interpret here is the interaction.
b) Should be refined by eliminating both Class*Work and Work because they have high p-values.
c) Should be refined by eliminating Class because it has a low p-value.
d) Should be refined by eliminating the Constant, since we never interpret it.
e) Should be refined by eliminating Class*Work because it has a high p-value.
13. For the model in output 6, the null hypothesis for the ANOVA test should be written as:
a) == b) === 0
c) = 0 d) = 0 e) = 0
14. Find R² for the model in output 3:
a) 64.3% b) 53.8% c) 50.5% d) 32.5% e) 49.5%
15. Which of the models exhibits multicollinearity?
a) Output 6 b) Output 5 c) Output 4 d) Output 3 e) Output 2
16. Use the model in output 6 to predict Loans for a Sophomore who works 5 hours a week.
a) $7264 b) $3485 c) $3982 d) $4291 e) $6482
17. Which single variable is best at predicting Loans?
a) Gender b) Class c) Hours d) Work e) CreditCard Debt
18. Which of the models is the best at predicting Loans?
a) Output 6 b) Output 5 c) Output 4 d) Output 3 e) Output 2
19. For the model in output 3, the margin of error for a 95% confidence interval for the coefficient of Class is:
a) (-0.15) (742.9) b) (-0.15) (3985.77)
c) (2.776) (742.9) d) (2.776) (3985.77)
e) none of the above
20. For the model in output 2, find the test statistic to determine if Class is a good predictor of Loans.
a) -0.37 b) -0.15 c) 4.32 d) 0.12 e) 3.44
21. Find the test statistic for the ANOVA test for the model in output 3.
a) 27.60 b) 20.27 c) 15.86 d) 42.06 e) 28.24
Match the model with the output:
___ 22. Output 6    a)
___ 23. Output 5    b)
___ 24. Output 4    c)
___ 25. Output 3    d)
___ 26. Output 2    e) none of the above
27. Interpret the intercept for the model in output 2.
a) Females owe, on average, $3257 in student loans, keeping all other variables constant.
b) Females owe, on average, $3257 less than males in student loans, keeping all other variables constant.
c) Males owe, on average, $3257 in student loans, keeping all other variables constant.
d) Males owe, on average, $3257 more than females in student loans, keeping all other variables constant.
e) We should not interpret the intercept for this model.
28. Interpret the coefficient of Gender for the model in output 2. Keeping all other variables constant,
a) females owe, on average, $18700 in student loans, which is a significant amount.
b) males owe, on average, $187 more than females in student loans, but this is not a significant difference.
c) males owe, on average, $187 less than females in student loans, but this is not a significant difference.
d) males owe, on average, $18700 less than females in student loans, which is a significant difference.
e) males owe, on average, $18700 in student loans, which is a significant amount.
29. Comparing the models in output 2 and output 3 we can say that:
a) Model 2 is better because it has higher MSE.
b) Model 2 is better because it has more predictor variables.
c) Model 3 is better because it has a higher adjusted R².
d) Model 3 is better because it has fewer predictor variables.
e) They are both equally good because their ANOVA p-values are the same.