Economics 375: Introduction to Econometrics

Homework #3

This homework is due on Tuesday, May 3rd.

One tool to aid in understanding econometrics is the Monte Carlo experiment. A Monte Carlo experiment allows a researcher to set up a known population regression function (something we’ve assumed we can never observe) and then act like a normal econometrician: forget the population regression function for the moment, and see how closely an OLS estimate of the regression comes to the true and known population regression function.

Our experiment will demonstrate that OLS is unbiased (something chapter 4 will convince you of). The idea behind Monte Carlo experiments is to use the computer to create a population regression function (which we usually think of as unobserved), then act as if we “forgot” the PRF and use OLS to estimate it. A Monte Carlo experiment thus allows a researcher to see whether OLS actually comes “close” to the PRF or not.

In Stata, this is easy. Start by opening Stata and creating a new variable titled x1. The easiest way to do this is to click on the “Data Editor” button on the toolbar (it looks like a spreadsheet with a pencil). In the top left cell type the number 1, in the next cell below type 2, then 3… through 20. After you are done typing, double-click on the header of the column (where you see the name of the variable) and type x1 in the dialog box.

To create a random normal variable, type the following in the command line: gen epsilon = rnormal(). This generates a new series of 20 observations titled "epsilon" in which each observation is a random draw from a normal distribution with mean zero and variance one. Here gen represents the generate command, and epsilon is the name of the new variable you are creating. The generate command is a commonly used command in Stata; it is worth reading the help menu on this command (type: help generate in the command line).

After creating epsilon, we are ready to create our dependent variable y. To do this, let’s create a population regression where we know the true slope and intercept of the regression. Since my favorite football player was Dave Krieg of the Seattle Seahawks (#17) and my favorite baseball player was Ryne Sandberg (#23), we will use these numbers to generate our dependent variable. In Stata use the gen command to create y where:

yi = 17 + 23x1i + epsiloni

Your command will look something like: gen y = 17 + 23*x1 + epsilon
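For readers who want to replicate the setup outside Stata, here is a minimal, purely illustrative Python sketch of the same data-generating process (the variable names mirror the Stata ones; the seed choice is my own, not part of the assignment):

```python
import random

random.seed(1)  # any seed works; without one, each run gives new draws

# x1 takes the values 1 through 20, as typed into the Data Editor
x1 = list(range(1, 21))

# epsilon: 20 independent draws from N(0, 1), mirroring gen epsilon = rnormal()
epsilon = [random.gauss(0, 1) for _ in x1]

# The population regression function: y = 17 + 23*x1 + epsilon
y = [17 + 23 * xi + e for xi, e in zip(x1, epsilon)]
```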

1. Using your created data, use Stata’s reg command to estimate the regression:

yi = B0 + B1x1i

a. Why didn’t you include epsilon in this regression?

Epsilon represents the unknown and unobservable error term. By definition, a researcher cannot observe it. Theoretically, the error term is part of every y (after all, some randomness is likely to be part of almost any observed variable). All a researcher observes are the values of x and y; if one knew the value of the error term, one could simply subtract it from y and know exactly where the population regression function lies.

b. What are your estimates of the true slope coefficients and intercept? Perform a hypothesis test that B1 = 23. What do you find?

My regression yields:

Dependent Variable: Y
Method: Least Squares
Date: 05/04/05   Time: 10:38
Sample: 1 20
Included observations: 20

Variable              Coefficient   Std. Error   t-Statistic    Prob.
C                       17.56934     0.337758      52.01752    0.0000
X1                      22.99502     0.028196      815.5564    0.0000

R-squared               0.999973    Mean dependent var       259.0171
Adjusted R-squared      0.999971    S.D. dependent var       136.0422
S.E. of regression      0.727094    Akaike info criterion    2.295118
Sum squared resid       9.515985    Schwarz criterion        2.394691
Log likelihood         -20.95118    F-statistic              665132.2
Durbin-Watson stat      1.760169    Prob(F-statistic)        0.000000

Because my error terms differ from yours, my answers will be different as well.

To test if the coefficient on x1 is statistically no different than 23 (the true coefficient in this case), we perform a hypothesis test:

H0 : B1 = 23

HA: B1 ≠ 23

t = (22.99502-23)/.028196 = -.176

tc,95%,18 = 2.101.

Because the absolute value of our t statistic is less than our critical value, we cannot reject the null hypothesis; we conclude that the true coefficient is no different than 23. This should be comforting since we actually know (only in Monte Carlo cases) that the true coefficient is 23. Hence OLS is doing an adequate job evaluating our data (in other words, I won’t erroneously conclude the slope is something other than 23).
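The arithmetic of this test is simple enough to verify with a few lines of Python (illustrative only; the numbers are the estimates reported above):

```python
# Two-sided t test of H0: B1 = 23, using the regression estimates above
b1_hat = 22.99502   # estimated slope
se_b1 = 0.028196    # its standard error
t = (b1_hat - 23) / se_b1   # about -0.177

t_crit = 2.101  # 95% two-sided critical value with 18 degrees of freedom
reject = abs(t) > t_crit    # False: fail to reject H0
```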

c. When you turn this homework in to me, I will ask the entire class to tell me their estimates of the true B0 and B1. I will then enter these estimates into a computer, order each from smallest to largest, and then make a histogram of each estimate. What will this histogram look like? Why?

I performed 10,000 different experiments exactly as described above and made a histogram of the 10,000 estimates of each coefficient.

It is pretty apparent that the estimates of B0 and B1 are normally distributed around the true population values of 17 and 23. The Gauss-Markov theorem indicates that these distributions have the smallest variance among linear unbiased estimators, as long as our classical assumptions are correct. Are they in this case?

Interestingly, when I performed this Monte Carlo experiment, I found the standard deviation of my 10,000 estimates of B0 to equal .4612 and of B1 to equal .0386. For a moment, consider the variances of the slope and intercept we derived in class:

var(B1-hat) = σ² / Σ(xi − x̄)²

var(B0-hat) = σ² × Σxi² / (n × Σ(xi − x̄)²)

In our data, Σ(xi − x̄)² = 665 and Σxi² = 2870. Since the variance of the regression error is equal to 1 (by virtue of how we set up epsilon), var(B1-hat) = 1/665 = .0015, and taking the square root gives the standard error of B1-hat of .0388, very close to the Monte Carlo estimate of .0386. Likewise, var(B0-hat) = 2870/(20 × 665) = .2158, whose square root, .465, is close to the Monte Carlo estimate of .4612.
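The same check can be sketched in stdlib Python (illustrative only, not Stata; I use 2,000 repetitions rather than 10,000 to keep the sketch fast):

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is reproducible

x = list(range(1, 21))
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)  # sum of squared deviations; 665 here

# Re-run the experiment many times, storing the OLS slope each time
slopes = []
for _ in range(2000):
    y = [17 + 23 * xi + random.gauss(0, 1) for xi in x]
    ybar = sum(y) / len(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    slopes.append(b1)

mean_b1 = sum(slopes) / len(slopes)
sd_b1 = math.sqrt(sum((b - mean_b1) ** 2 for b in slopes) / (len(slopes) - 1))
# Theory says sd(B1-hat) = sqrt(sigma^2 / sxx) = sqrt(1/665), about .0388
```

The simulated standard deviation of the slope estimates should land near the theoretical value sqrt(1/665).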

d. Use Stata to compute σ̂² = Σei² / (n − 2). The square root of this is termed the “standard error of the regression.” Does it equal what you would expect? Why or why not?

The std. error of the regression is the estimated standard deviation of the unobservable error term. Since we know the standard deviation of the actual error term is equal to 1, the std. error of the regression should be close to 1 (as a matter of fact, it should be a consistent estimator of the actual standard deviation of the error term).
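A stdlib-Python sketch of this computation (illustrative, not Stata; OLS is done by hand with the class formulas). Because the true error standard deviation is 1, ser should land in the neighborhood of 1:

```python
import math
import random

random.seed(3)  # fixed seed for reproducibility

x = list(range(1, 21))
y = [17 + 23 * xi + random.gauss(0, 1) for xi in x]

# OLS slope and intercept via the textbook formulas
xbar, ybar = sum(x) / len(x), sum(y) / len(y)
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar

# Residuals, then the standard error of the regression: sqrt(sum(e^2)/(n - 2))
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
ser = math.sqrt(sum(e ** 2 for e in resid) / (len(x) - 2))
```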

2. On the class webpage, I have posted a Stata file entitled “2002 Freshmen.” This data set comprises all complete observations of the 2002 entering class of WWU freshmen (graduating class of around 2006). The data definitions are:

aa: a variable equal to one if the incoming student previously earned an AA

actcomp: the student’s comprehensive ACT score

acteng: the student’s English ACT score

actmath: the student’s mathematics ACT score

ai: the admissions index assigned by WWU office of admissions

asian, black, white, Hispanic, other, native: a variable equal to one if the student is that ethnicity

f03 and f04: a variable equal to one if the student was enrolled in the fall of 2003 or the fall of 2004

gpa: the student’s GPA earned at WWU in fall 2002

summerstart: a variable equal to one if the student attended summerstart prior to enrolling in WWU

fig: a variable equal to one if the student enrolled in a FIG course

firstgen: a variable equal to one if the student is a first generation college student

housing: a variable equal to one if the student lived on campus their first year at WWU

hrstrans: the number of credits transferred to WWU at time of admission

hsgpa: the student’s high school GPA

male: a variable equal to one if the student is male

resident: a variable equal to one if the student is a Washington resident

runstart: a variable equal to one if the student is a running start student

satmath: the student’s mathematics SAT score

satverb: the student’s verbal SAT score

Some of these variables (the 0/1 or “dummy” variables) will be discussed in the future.

Admissions officers are usually interested in the relation between high school performance and college performance. Consider the population regression function:

gpai = b0 + b1hsgpai + ei

a. Prior to estimating this PRF, describe what characteristics you expect b0 and b1 to have.

I expect a positive coefficient for b1; students with greater high school GPAs will, on average, have higher college GPAs. Predicting the sign of b0 is more difficult because we have no observations where hsgpa = 0. However, on average, college GPAs are probably lower than the high school GPAs of the students a college admits. Thus, a hypothetical admitted student with a high school GPA of 0 would likely earn an even lower (negative) predicted college GPA.

b. Estimate the PRF. Interpret your estimate of b1. Perform a hypothesis test that b1 = 0.

My regression yields an estimated intercept of -.74311 (std. error .162) and an estimated slope on hsgpa of 1.0017 (std. error .045).

A one-unit increase in high school GPA increases college GPA, on average, by 1.0017 units.

H0: β1 = 0

HA: β1 ≠ 0

t = (1.0017 – 0)/.045 = 22.26

tc,95%,2161= 1.96

Reject H0 and conclude that high school GPA does impact college GPA.

c. When I was in high school, my teachers told me to expect, on average, to earn one grade lower in college than what I averaged in high school. Based on the results of your regression, would you agree with my teachers?

If my teachers were correct, then the population regression function would be Fall02GPAi = -1 + 1×HSGPAi + εi. Note that only under this population regression function would a student with any value of hsgpa end up with a college GPA exactly one unit lower.

At first glance, one might look at our regression estimates and quickly conclude that the intercept is not equal to -1, so my teachers were incorrect. However, our estimated intercept of -.74 is just an estimate; how likely an estimate of -.74 is when the true intercept is -1 is a question that can only be answered with a hypothesis test:

H0 : β0 = -1

HA : β0 ≠ -1

t = (-.74 - -1)/.162 = 1.60

tc,95%,2077 = 1.96

I fail to reject the null hypothesis and conclude that my regression is consistent with my teachers’ hypothesis.

The sophisticated reader will object to this approach because it tests only one of the two coefficients required for college GPA to always be one unit less than high school GPA. Indeed, one should also test β1 = 1. However, performing both tests simultaneously requires an F-test, not a t-test. This will be the subject of discussion immediately before the midterm.

d. My high school GPA was a 3.85. Given that, what would you predict my fall quarter WWU gpa to be? Construct a 95% confidence interval of this amount.

GPA-hat = -.74311 + 1.0017 × 3.85 = 3.11

The forecast variance is given by σf² = [860.66/(2081 − 1 − 1)] × [1 + 1/2081 + (3.85 − 3.52)²/2202.8] = .414

My 95% confidence interval is given by 3.11 ± 1.96 × (.414)^.5 = {1.85, 4.37}
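As a quick arithmetic check of the interval (plugging in the coefficients, residual sum of squares, and sums reported in this answer):

```python
import math

# Point forecast at hsgpa = 3.85
gpa_hat = -0.74311 + 1.0017 * 3.85  # about 3.11

# Forecast variance: sigma2_hat * (1 + 1/n + (x0 - xbar)^2 / sum((x - xbar)^2))
sigma2_f = (860.66 / (2081 - 1 - 1)) * (1 + 1 / 2081
                                        + (3.85 - 3.52) ** 2 / 2202.8)

# 95% confidence interval for the forecast
lower = gpa_hat - 1.96 * math.sqrt(sigma2_f)
upper = gpa_hat + 1.96 * math.sqrt(sigma2_f)
# (lower, upper) comes out near (1.85, 4.37)
```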

e. Clearly there are other factors that affect GPA. Consider the PRF:

fall02gpai = b0 + b1hsgpai + b2satmathi + b3satverbi + ei

What are your expectations for b1, b2, b3?

I expect to find positive coefficients on b1, b2, b3. Higher values of high school GPA and SAT scores should translate into better college performance.

f. Estimate the equation presented in e. How do you interpret your estimate of b1 differently than you did in part b?

I find a coefficient on hsgpa of .875 (remaining output omitted).

The coefficient on hsGPA indicates that, holding SAT Math and SAT Verbal constant, an increase in high school GPA by one unit raises college GPA by .875 units.

h. Which regression, b or e, do you prefer? Why?

I prefer the regression of part f because I believe the SAT explains some portion of college performance that is not captured by high school GPA alone. We will later discuss statistical methods that can be used to address this question.