Professor Kasey Buckles, Spring 2007
Economics 30331: Econometrics
MID-TERM EXAMINATION
Instructions
· Write your name on this test and on the blue book. Multiple choice answers should be written in the space provided on this exam; short answers should be given in the blue book.
· The exam is open book, and you may use your one page of additional notes. You may also use a calculator, which you are not allowed to share with other students.
· You have 75 minutes to take the exam. The exam is 75 points, and the points assigned to each question should give you an idea of how many minutes to spend on each question.
· There are 7 Multiple Choice questions and 5 Short Answers. Please check to make sure that you have the full test.
· This exam is administered under Notre Dame’s Honor Code.
Name: ______
(Signing your name indicates that you have read and understood the above instructions)
I. Multiple Choice (3 points each)
Select the best answer for each question, and write the letter of your answer to each question in the following space. No explanations are necessary.
1. ______
2. ______
3. ______
4. ______
5. ______
6. ______
7. ______
1. An econometrician would like to estimate the following model:
score = β0 + β1 black + β2 faminc + β3 schoolQ + u
where score = score on school achievement test
black = 1 if the student is black, 0 otherwise
faminc = student’s family income
schoolQ = quality of the student’s school
Suppose the econometrician is unable to observe schoolQ, and therefore leaves it out of the regression. If cov(black, schoolQ)<0, cov(faminc, schoolQ)>0, cov(black, faminc)<0, and cov(score, schoolQ)>0, then:
A. we expect our estimate of β1 to be negatively biased and of β2 to be positively biased.
B. we expect our estimate of β1 to be positively biased and of β2 to be negatively biased.
C. we expect our estimates of β1 and β2 to be negatively biased.
D. we expect our estimates of β1 and β2 to be positively biased.
E. we expect our estimates of β1 and β2 to be unbiased.
2. An econometrician estimates the following model of wages, where every person in the sample lives in one of the four regions (north, south, east, or west):
wage = β0 + β1 north + β2 south + β3 east + β4 west + u
Which of the following is false?
A. OLS will not be BLUE.
B. The parameter estimates will be unbiased.
C. Inference will not be valid.
D. The variance of the parameter estimates will be undefined.
Questions 3 & 4: Suppose a researcher estimates a wage model, and gets the following STATA results, where:
lwage = log wages
educ = years of education
exper = years of experience
female = 1 if individual is female, 0 otherwise.
      Source |       SS       df       MS              Number of obs =     526
-------------+------------------------------           F(  3,   522) =   94.75
       Model |    52.29391     3  17.4313033           Prob > F      =  0.0000
    Residual |  96.0358517   522  .183976727           R-squared     =
-------------+------------------------------           Adj R-squared =  0.3488
       Total |  148.329762   525   .28253288           Root MSE      =  .42893
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0912897   .0071232    12.82   0.000     .0772962    .1052833
       exper |   .0094139   .0014493     6.50   0.000     .0065667     .012261
      female |  -.3435967   .0376668    -9.12   0.000    -.4175939   -.2695995
       _cons |   .4808356   .1050163     4.58   0.000     .2745292    .6871421
------------------------------------------------------------------------------
3. What is the R-squared in this regression?
A. 0.3488
B. 0.3526
C. 0.4554
D. 0.5446
E. 0.6475
4. Suppose we added the variable kids to the original model, where kids is the number of children. If kids is an irrelevant variable, we would expect all of the following except:
A. Adjusted R-squared to decrease.
B. R-squared to increase or stay the same.
C. the variances of all estimated parameters to increase.
D. we would not be able to reject H0: the coefficient on kids is equal to zero.
E. None of the above: we would expect all of these to happen.
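For reference, the goodness-of-fit entries in a Stata regression header can be recovered from the sums of squares reported in the same table. The Python sketch below is an illustration only and not part of the exam. The function names are hypothetical, and the inputs are the Model and Total sums of squares, the number of observations, and the number of slope coefficients from the output for Questions 3 and 4.

```python
def r_squared(ss_model: float, ss_total: float) -> float:
    """Share of total variation explained by the model: SS_model / SS_total."""
    return ss_model / ss_total


def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k slope coefficients."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Values copied from the Stata output for Questions 3 and 4.
ss_model, ss_total = 52.29391, 148.329762
n_obs, n_slopes = 526, 3

r2 = r_squared(ss_model, ss_total)
adj_r2 = adjusted_r_squared(r2, n_obs, n_slopes)
print(f"R-squared = {r2:.4f}, adjusted R-squared = {adj_r2:.4f}")
```

Because the adjustment penalizes each added regressor through n - k - 1, adjusted R-squared can fall when a variable is added even though R-squared itself cannot.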
5. The normality assumption (#6) is required for which of the following (for any sample size):
A. Inference using OLS is valid.
B. The standard errors calculated using OLS are valid.
C. OLS is BLUE.
D. OLS parameter estimates are unbiased.
E. None of the above—normality is never required for any of these.
6. In a model with heteroskedasticity, but all other Classical Linear Model Assumptions satisfied, which of the following is not true:
A. OLS is unbiased.
B. OLS is consistent.
C. OLS is BLUE.
D. Inference will be invalid.
E. None of the above—all of these are true.
7. To estimate a model of NBA player salaries on # games played per season and # minutes played per season, we estimate:
. reg wage games minutes
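      Source |       SS       df       MS              Number of obs =     269
-------------+------------------------------           F(  2,   266) =   77.81
       Model |  98873273.6     2  49436636.8           Prob > F      =  0.0000
    Residual |   169005652   266  635359.594           R-squared     =  0.3691
-------------+------------------------------           Adj R-squared =  0.3644
       Total |   267878925   268  999548.229           Root MSE      =  797.09
------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       games |  -19.57968    4.19345    -4.67   0.000    -27.83626    -11.3231
     minutes |   .9560003   .0884907    10.80   0.000      .781769    1.130232
       _cons |   1102.523   189.3176     5.82   0.000     729.7711    1475.275
------------------------------------------------------------------------------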
What is likely to be the best explanation for why games has a negative coefficient?
A. Players who play more games earn less than players who play fewer games.
B. Players who play a lot of games get hurt a lot, and so their salary falls.
C. Playing more games, holding minutes constant, means that you play fewer minutes per game and are therefore not a very good player.
D. The sample is just a “bad draw” and the population coefficient is likely positive.
II. Short Answer
8. (4 points) Consider the model:
colgpa = β0 + β1 hsize + β2 hsize² + β3 hsperc + β4 sat + β5 female + β6 athlete + u
where colgpa is college GPA, hsize is high school size, hsperc is the student’s percentile rank in high school, sat is SAT score, and female and athlete are dummy variables indicating that the student is a female and an athlete, respectively.
How would you modify the model to allow the effect of being an athlete to differ by gender? Specify your model in a way that allows you to directly test the null hypothesis that there is no ceteris paribus difference between women athletes and women non-athletes. (That is, you should be able to perform a simple t-test). State this null hypothesis.
9. (8 points) In class, we derived the OLS estimator for simple regression two ways.
a. Name the two ways. For each of these two ways, explain how we derived OLS using this technique, in at most two sentences.
b. Which assumptions did we need to derive the OLS estimators?
10. (9 points) Using data on housing prices, the following equations were estimated:
[1] log(price) = 11.71 – 1.043 log(nox) + uhat
[2] log(price) = 9.23 – .718 log(nox) + .306 rooms + uhat
where nox is the amount of nitrous oxide in a community, and price and rooms describe the median price and average number of rooms of houses in the community.
a. For the second specification (with rooms), interpret the coefficients on log(nox) and rooms.
b. Why do you think the coefficient on log(nox) is more negative in the first equation?
c. True or false: -.718 is definitely closer to the true elasticity than -1.043. Explain your answer.
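As an aside on the functional forms in Question 10: in a log-log specification the slope is an elasticity, while in a log-level specification 100 times the slope only approximates the percentage change. The Python sketch below shows the standard exact conversion, 100·(exp(b) − 1). It is an illustration only; the function name is hypothetical, and the coefficients are taken from equation [2].

```python
import math


def exact_pct_change(coef: float, delta: float = 1.0) -> float:
    """Exact % change in y when x rises by `delta` in log(y) = ... + coef * x."""
    return 100.0 * (math.exp(coef * delta) - 1.0)


b_lognox = -0.718  # coefficient on log(nox) in equation [2]: an elasticity
b_rooms = 0.306    # coefficient on rooms in equation [2]: a semi-elasticity

approx_rooms = 100.0 * b_rooms           # usual approximation, in percent
exact_rooms = exact_pct_change(b_rooms)  # exact conversion, in percent

print(f"1% more nox -> about {b_lognox:.3f}% change in price, rooms held fixed")
print(f"one more room -> approx {approx_rooms:.1f}%, exact {exact_rooms:.1f}%")
```

The gap between the approximate and exact figures grows with the size of the coefficient, which is why the approximation is usually reserved for small changes.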
11. (15 points) You estimate the following model of housing prices:
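      Source |       SS       df       MS              Number of obs =      88
-------------+------------------------------           F(  5,    82) =   57.15
       Model |  6.22997343     5  1.24599469           Prob > F      =  0.0000
    Residual |  1.78764853    82  .021800592           R-squared     =  0.7770
-------------+------------------------------           Adj R-squared =  0.7634
       Total |  8.01762195    87  .092156574           Root MSE      =  .14765
------------------------------------------------------------------------------
      lprice |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lassess |   1.036186   .1510441
       bdrms |   .0254289   .0230354
      lsqrft |  -.0921454   .1382578
    llotsize |   .0083737   .0384409
    colonial |   .0447857   .0359293
       _cons |  -.0402521   .9687942
------------------------------------------------------------------------------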
a. Write down the population model being estimated.
b. Interpret the coefficient on lassess (β1hat). At the 5% level, test the null hypothesis that β1 = 1.
c. Interpret the coefficient on bdrms (β2hat). At the 5% level, test the null hypothesis that β2 = 0.
d. You want to test the null hypothesis that the coefficients on bdrms, lsqrft, llotsize, and colonial are jointly equal to zero. State the null and alternative hypotheses, and write down the restricted model. What kind of test statistic would you need to create? What are the numerator and denominator degrees of freedom for this test?
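The t statistics left blank in the output for Question 11 follow from the reported coefficients and standard errors. The Python sketch below is illustrative only and not part of the exam. `t_stat` is a hypothetical helper, and the critical value still has to be read from a t table with the residual degrees of freedom.

```python
def t_stat(coef: float, se: float, null_value: float = 0.0) -> float:
    """t statistic for H0: beta = null_value."""
    return (coef - null_value) / se


# Coefficients and standard errors from the Question 11 output.
t_lassess = t_stat(1.036186, 0.1510441, null_value=1.0)  # H0: beta1 = 1
t_bdrms = t_stat(0.0254289, 0.0230354)                   # H0: beta2 = 0

# Degrees of freedom for a joint F test of four exclusion restrictions.
n_obs, n_slopes, n_restrictions = 88, 5, 4
df_num = n_restrictions
df_den = n_obs - n_slopes - 1

print(f"t(lassess) = {t_lassess:.3f}, t(bdrms) = {t_bdrms:.3f}")
print(f"F test degrees of freedom: ({df_num}, {df_den})")
```

Note that the null value matters: the t statistic Stata would print for lassess tests β1 = 0, not β1 = 1, so it cannot be used directly for part b.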
12. (18 points) Now suppose an econometrician is interested in the factors that affect whether an individual is divorced or not. She has the following variables:
divorced = 1 if individual is currently divorced, 0 otherwise
age = individual’s age in years
educ = individual’s education in years
male = 1 if individual is male, 0 otherwise
agemale = age*male
. reg divorced age educ male agemale
      Source |       SS       df       MS              Number of obs =  101693
-------------+------------------------------           F(  4,101688) =  172.29
       Model |  57.0046607      4  14.2511652          Prob > F      =  0.0000
    Residual |  8411.28234 101688  .082716568          R-squared     =  0.0067
-------------+------------------------------           Adj R-squared =  0.0067
       Total |    8468.287 101692  .083273876          Root MSE      =   .2876
------------------------------------------------------------------------------
    divorced |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0010243   .0000668    15.33   0.000     .0008934    .0011552
        educ |   .0022388   .0003008     7.44   0.000     .0016493    .0028283
        male |  -.0265578   .0046785    -5.68   0.000    -.0357275   -.0173881
     agemale |   .0001686   .0000997     1.69   0.091    -.0000267     .000364
       _cons |   .0281631   .0050494     5.58   0.000     .0182664    .0380598
------------------------------------------------------------------------------
a. Is the coefficient on agemale statistically significant at the 5% level for a two-sided test?
b. Using the results from the estimated model above, what is the marginal effect of being male on the probability of being divorced? For males, what is the effect of an extra year of age on the probability of being divorced?
c. The econometrician’s original model was: divorced = β0 + β1 age + β2 educ + u. Suppose she wants to know if this model is different for men and women. Write down the restricted and unrestricted models, and state the null hypothesis.
d. The econometrician wants to perform a Chow test to test the null hypothesis from part c. Explain how she would perform the test at the 5% level, making sure to identify the numerator and denominator degrees of freedom you would need, and the necessary critical value.
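The F statistic behind a Chow-type test compares the sum of squared residuals (SSR) of the restricted and unrestricted models. The Python sketch below is an illustration under stated assumptions and not part of the exam. `chow_f` is a hypothetical helper, and the SSR values passed to it are made-up placeholders, since the exam does not report them for these two models. The degrees of freedom assume the unrestricted model adds male, male*age, and male*educ to the original equation.

```python
def chow_f(ssr_r: float, ssr_ur: float, q: int, df_den: int) -> float:
    """F = [(SSR_r - SSR_ur) / q] / [SSR_ur / df_den]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / df_den)


n_obs = 101693
k_unrestricted = 5  # age, educ, male, male*age, male*educ (assumed specification)
q = 3               # restrictions: coefficients on male, male*age, male*educ are zero
df_den = n_obs - k_unrestricted - 1

# Placeholder SSR values for illustration only, NOT taken from the exam.
example_f = chow_f(ssr_r=8450.0, ssr_ur=8400.0, q=q, df_den=df_den)
print(f"df = ({q}, {df_den}), illustrative F = {example_f:.2f}")
```

The computed F would then be compared with the 5% critical value of the F distribution with (q, df_den) degrees of freedom.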