Principles of Econometrics – class of October 14th, FEUNL

Notes by José Mário Lopes[1]

1. Heteroskedasticity in a cross-section framework (examples from chapter 8)

Heteroskedasticity occurs whenever Var(ui | x1, x2, …) is not constant across observations.

Last class, you saw how to robustify your standard errors when you suspect heteroskedasticity. In the multiple regression model

yi = b0 + b1xi1 + b2xi2 + … + bkxik + ui,

the heteroskedasticity-robust variance estimator for each slope coefficient bj is

(sum over i of rij² · ûi²) / SSRj²,

where rij denotes the ith residual from regressing xj on all the other independent variables, ûi is the ith OLS residual, and SSRj is the sum of squared residuals from that regression of xj on the other regressors (see section 8.2).

In EViews, this can be done by choosing the White standard errors in the equation's "Options" menu.

Robust standard errors and t statistics are appropriate as the sample size increases. We don't always use these robust standard errors because, in small samples, the robust t statistics can depart considerably from the t distribution.

Hence, it is important to know whether or not there is heteroskedasticity in our sample. Let's work through a few examples from the book. Take the example on the demand for cigarettes, from chapter 8. Open the corresponding workfile.

We wish to estimate the demand for cigarettes, measured by the number of cigarettes smoked per day, as a function of income, the price of a pack of cigarettes, education, age, squared age and the presence of a restaurant smoking ban in the state where the surveyed person lives.
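For reference, here is a minimal sketch of how the same regression could be run outside EViews, in Python with statsmodels; the file name and column names (cigs, income, cigpric, educ, age, restaurn) are assumptions about how the workfile data might be exported, not part of the original example.

# Hypothetical replication of the cigarette demand regression in Python.
# Assumes the EViews workfile has been exported to smoke.csv with columns
# cigs, income, cigpric, educ, age, restaurn (names are an assumption).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("smoke.csv")

# cigs on log(income), log(cigpric), educ, age, age^2 and the restaurant ban dummy
ols = smf.ols(
    "cigs ~ np.log(income) + np.log(cigpric) + educ + age + I(age**2) + restaurn",
    data=df,
).fit()
print(ols.summary())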

We get the following results:

- neither income nor cigarette price is significant, and their impacts would be small anyway (e.g., if income increases by 10%, cigs increases by (0.880/100)*10 = 0.088 cigarettes per day);

- education reduces smoking;

- smoking increases with age up until approximately 42.83 years (basically, maximize the number of cigarettes smoked with respect to age: differentiate the part of the fitted equation involving age and age squared, set the derivative to zero, and solve; the turning point equals the coefficient on age divided by twice the absolute value of the coefficient on age squared, which gives approximately 42.83). After that, it falls.

But now, a very important question: is there heteroskedasticity? If so, the usual standard errors and t statistics will be wrong and OLS will not be efficient. We will perform just a couple of tests to check for heteroskedasticity; see the other tests available in EViews.

First, let’s run the Breusch-Pagan test for heteroskedasticity:

1) Estimate the model by OLS and keep the squared OLS residuals, û².

2) Run an auxiliary regression of û² on the independent variables. Keep the R-squared from this regression.

3) Form either the F statistic, F = (R²/k) / ((1 − R²)/(n − k − 1)), which follows an F(k, n−k−1) distribution, or the LM statistic, LM = n·R², which follows a chi-square with k degrees of freedom (R² here is the one from the auxiliary regression). If the p-value is greater than 5%, we do not reject the null of homoskedasticity.

In EViews, this is very easy to do.

Behold how many options you have for running a heteroskedasticity test!

For a BP test, we get

Both the F test and the LM statistic (Obs*R-squared of the auxiliary regression) reject the null of homoskedasticity.

You should check that EViews is doing this correctly. How? Generate the residuals yourself and run the auxiliary regression as usual (New Object/Equation, etc.). You will get the same output as above.
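As an illustration of that manual check, here is a Python/statsmodels sketch that reproduces the Breusch-Pagan steps by hand and compares them with the built-in test; it continues from the hypothetical `ols` fit sketched earlier.

import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# 1) squared OLS residuals
u2 = ols.resid ** 2

# 2) auxiliary regression of u^2 on the regressors (ols.model.exog already
#    contains the constant and all the independent variables)
aux = sm.OLS(u2, ols.model.exog).fit()

# 3) LM = n * R^2 of the auxiliary regression; the F test is the overall F
#    statistic of that same regression
n = len(u2)
print("LM =", n * aux.rsquared, " F =", aux.fvalue, " p(F) =", aux.f_pvalue)

# Built-in version, for comparison
lm, lm_pval, f, f_pval = het_breuschpagan(ols.resid, ols.model.exog)
print("built-in:", lm, lm_pval, f, f_pval)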

The White test for heteroskedasticity allows for a richer variance structure: the squares and cross-products of the independent variables are also included on the right-hand side of the auxiliary regression. Alternatively, whenever you have too many independent variables, you can use the special form of the test, regressing the squared residuals on the fitted values of the dependent variable and the squared fitted values.

In our case, you get

Heteroskedasticity Test: White
F-statistic / 2.159258 / Prob. F(25,781) / 0.0009
Obs*R-squared / 52.17245 / Prob. Chi-Square(25) / 0.0011
Scaled explained SS / 110.0813 / Prob. Chi-Square(25) / 0.0000

This means that the null of homoskedasticity is rejected.
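A corresponding Python sketch of the White test (again assuming the `ols` fit from the earlier sketch) might look as follows; het_white adds the squares and cross-products automatically, and the last lines show the "special" form based on the fitted values.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

lm, lm_pval, f, f_pval = het_white(ols.resid, ols.model.exog)
print("Obs*R-squared =", lm, " p =", lm_pval)
print("F-statistic   =", f, " p =", f_pval)

# Special form for models with many regressors: regress u^2 on yhat and yhat^2
yhat = ols.fittedvalues
aux = sm.OLS(ols.resid ** 2,
             sm.add_constant(np.column_stack([yhat, yhat ** 2]))).fit()
print("special-form LM =", len(yhat) * aux.rsquared)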

From this point, we can correct the standard errors using the White robust standard errors.

Or, we can transform the model and run OLS on this transformed model. How?

Feasible Generalized Least Squares procedure:

- generate the squared OLS residuals û² from the original model;

- regress the log of û² on the independent variables (why the log? because exponentiating the fitted values guarantees positive variance estimates), and obtain the fitted values ĝ of this regression;

- exponentiate the fitted values to get ĥ = exp(ĝ);

- estimate the original equation by WLS, using weights 1/ĥ.

Since we have to estimate h, FGLS will not be unbiased but it is consistent and asymptotically more efficient than OLS.

If cigs_residsq stands for the estimated h, we have to divide every term in the model by square root(h), i.e., multiply each observation by 1/square root(h). Why? See the book (there is a univariate example there for savings; start from there).

We will get

Dependent Variable: CIGS/SQR(CIGS_RESIDSQF)
Method: Least Squares
Date: 10/12/09 Time: 17:06
Sample: 1 807
Included observations: 807
Coefficient / Std. Error / t-Statistic / Prob.
1/SQR(CIGS_RESIDSQF) / 5.635471 / 17.80314 / 0.316544 / 0.7517
LOG(INCOME)/SQR(CIGS_RESIDSQF) / 1.295239 / 0.437012 / 2.963855 / 0.0031
LOG(CIGPRIC)/SQR(CIGS_RESIDSQF) / -2.940314 / 4.460145 / -0.659242 / 0.5099
EDUC/SQR(CIGS_RESIDSQF) / -0.463446 / 0.120159 / -3.856953 / 0.0001
AGE/SQR(CIGS_RESIDSQF) / 0.481948 / 0.096808 / 4.978378 / 0.0000
AGE^2/SQR(CIGS_RESIDSQF) / -0.005627 / 0.000939 / -5.989706 / 0.0000
RESTAURN/SQR(CIGS_RESIDSQF) / -3.461064 / 0.795505 / -4.350776 / 0.0000
R-squared / 0.002751 / Mean dependent var / 0.966192
Adjusted R-squared / -0.004728 / S.D. dependent var / 1.574979
S.E. of regression / 1.578698 / Akaike info criterion / 3.759715
Sum squared resid / 1993.831 / Schwarz criterion / 3.800425
Log likelihood / -1510.045 / Hannan-Quinn criter. / 3.775347
Durbin-Watson stat / 2.049719

We could also use Menu “Options”/WLS and write down the appropriate weighting scheme.
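A minimal Python sketch of the same FGLS steps, continuing from the hypothetical `ols` fit above, is given below; it is only meant to illustrate the weighting logic.

import numpy as np
import statsmodels.api as sm

# log of the squared OLS residuals, regressed on the original regressors
log_u2 = np.log(ols.resid ** 2)
aux = sm.OLS(log_u2, ols.model.exog).fit()

# h_hat = exp(fitted values); WLS with weights 1/h_hat
h_hat = np.exp(aux.fittedvalues)
fgls = sm.WLS(ols.model.endog, ols.model.exog, weights=1.0 / h_hat).fit()
print(fgls.summary())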

2. A little bit on time series – just a few issues (examples from chapters 10 to 12)

2.1 Take the workfile about housing investment and prices.

There are a lot of interesting things you can do now. You can take a series and study its evolution over time. Take the housing price index, for instance.

You can actually see several graphs at the same time if you select a Group of variables.

Let’s estimate a simple model, now.

The log of the price seems to be significant. You may think this is OK, but it is not. Both variables are trending throughout the sample.

If you take a look at the residuals, you can see if what you’re doing makes sense or not.

They are not stationary (there are formal tests for this, namely unit root tests such as the Dickey-Fuller or Phillips-Perron tests, and you can always look at the correlogram of the residuals). This means we should rethink our specification: our previous regression was spurious.
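As a sketch, a Dickey-Fuller test on the residuals could be run in Python as below; the file, equation and variable names (hprice, linvpc, lprice) are hypothetical placeholders for the exported workfile data.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.tsa.stattools import adfuller

hprice = pd.read_csv("hprice.csv")                 # hypothetical export of the workfile
spurious = smf.ols("linvpc ~ lprice", data=hprice).fit()

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(spurious.resid)
print("ADF statistic:", adf_stat, " p-value:", pvalue)
# A large p-value means a unit root cannot be rejected: the residuals look nonstationary.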

We now add a linear trend to take account of the trending behaviour of LINVPC.

LPRICE is no longer significant. We conclude that there are other factors beyond the price, captured by the linear trend, that seem to be important.[2]

Notice that these other factors are not modelled just by adding a linear trend. Moreover, the fact that a linear trend appears to be informative should not prompt you to get carried away and start obsessively adding a huge train of trend terms (linear, quadratic, …).

What we just did (adding a linear trend) has a detrending interpretation: it is equivalent to regressing each variable on a constant and a linear trend, saving the residuals, and then regressing the residuals of the dependent-variable regression on the residuals of the independent-variable regressions (see book).
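A small simulated Python sketch of that equivalence (the Frisch-Waugh detrending result) is shown below; artificial data are used since the workfile itself is an EViews file.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
t = np.arange(n)
x = 0.02 * t + rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.03 * t + rng.normal(size=n)

# (a) regression of y on a constant, x and a linear trend
X = sm.add_constant(np.column_stack([x, t]))
b_with_trend = sm.OLS(y, X).fit().params[1]

# (b) detrend y and x separately, then regress residuals on residuals
T = sm.add_constant(t)
y_dt = sm.OLS(y, T).fit().resid
x_dt = sm.OLS(x, T).fit().resid
b_detrended = sm.OLS(y_dt, x_dt).fit().params[0]

print(b_with_trend, b_detrended)   # numerically identical slopes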

2.2 Important assumptions and problems in a Time Series framework:

The Gauss-Markov theorem requires both homoskedasticity and absence of serially correlated errors. Otherwise, the OLS estimator will not be BLUE and the usual standard errors and t-statistics will no longer be valid.

How do we test for the presence of serial correlation?

Let’s see a few possibilities available in EViews.

You can take a look at the Durbin-Watson statistic, which appears at the bottom of the results.

Dependent Variable: LINVPC
Method: Least Squares
Date: 11/26/03 Time: 08:47
Sample: 1 42
Included observations: 42
Coefficient / Std. Error / t-Statistic / Prob.
C / -0.913060 / 0.135613 / -6.732815 / 0.0000
LPRICE / -0.380961 / 0.678835 / -0.561198 / 0.5779
T / 0.009829 / 0.003512 / 2.798445 / 0.0079
R-squared / 0.340765 / Mean dependent var / -0.666155
Adjusted R-squared / 0.306959 / S.D. dependent var / 0.172543
S.E. of regression / 0.143641 / Akaike info criterion / -0.974252
Sum squared resid / 0.804675 / Schwarz criterion / -0.850133
Log likelihood / 23.45930 / F-statistic / 10.07976
Durbin-Watson stat / 1.048727 / Prob(F-statistic) / 0.000296

The Durbin-Watson test, valid under the classical assumptions, is based on the OLS residuals, and one can show that DW is approximately 2(1 − ρ), where ρ is the first-order correlation coefficient between the residual at t and the residual at t−1. If the Durbin-Watson statistic is near 2, the correlation coefficient will be near 0. Hence, we are looking for a value significantly below 2 (for a positive correlation coefficient) or significantly above 2 (for a negative correlation coefficient). Imagine you were testing whether ρ was close to zero (DW close to 2) against the alternative hypothesis that ρ was bigger than zero (DW smaller than 2). There are two critical values, dL and dU, tabled by Savin and White (1977), depending on the number of observations and the number of regressors. This means that, if DW falls between dL and dU, the test is inconclusive.

After you estimate your model, you have a Serial correlation menu under Residual Tests.

This is the Breusch-Godfrey test. The null is absence of autocorrelation. Here, we reject this null: there is evidence of autocorrelation. Basically, we keep the residuals of the regression and regress ût on ût−1, ût−2, … and the regressors. If the F statistic on the lagged residuals is significant, we reject the null and conclude that there is autocorrelation.[3]
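For illustration, the same test can be sketched in Python; here `res` stands for a hypothetical statsmodels fit of the trend regression (e.g. res = smf.ols("linvpc ~ lprice + t", data=hprice).fit(), names as before), and the lag choice is arbitrary.

from statsmodels.stats.diagnostic import acorr_breusch_godfrey
from statsmodels.stats.stattools import durbin_watson

lm, lm_pval, f, f_pval = acorr_breusch_godfrey(res, nlags=1)
print("BG: LM =", lm, " p =", lm_pval, " F =", f, " p =", f_pval)

# Durbin-Watson for comparison: DW close to 2 means the first-order
# residual correlation is close to 0.
print("DW =", durbin_watson(res.resid))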

Once you find out that there is first-order serial correlation, you can transform the model to take this into account:

1 – estimate the original model and keep the estimated residuals ût;

2 – run the regression of ût on ût−1 to compute the correlation coefficient ρ̂;

3 – for every variable xt (and for the dependent variable), compute the quasi-differenced variable xt − ρ̂xt−1;

4 – apply OLS to the equation with the quasi-differenced variables. The usual standard errors, t statistics and F statistics are asymptotically valid.[4] (A sketch of these steps follows below.)
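A minimal Python sketch of these quasi-differencing steps (Cochrane-Orcutt, omitting the first observation), continuing from the hypothetical fit `res` used above:

import numpy as np
import statsmodels.api as sm

u = np.asarray(res.resid)

# step 2: rho estimated from the regression of u_t on u_{t-1} (no constant)
rho = sm.OLS(u[1:], u[:-1]).fit().params[0]

# step 3: quasi-difference the dependent variable and every regressor,
# dropping the first observation
y = res.model.endog
X = res.model.exog
y_star = y[1:] - rho * y[:-1]
X_star = X[1:] - rho * X[:-1]

# step 4: OLS on the transformed equation
co = sm.OLS(y_star, X_star).fit()
print("estimated rho =", rho)
print(co.summary())

(statsmodels also offers sm.GLSAR, which iterates these steps automatically; the one-pass version above is only meant to mirror the list.)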

Alternatively, you can estimate the model as usual but correct the standard errors at the end. This may be better than simple FGLS.

Just pick the option “Newey-West”. You will be correcting for both heteroskedasticity and autocorrelation.
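In Python this corresponds to asking for HAC (Newey-West) standard errors when fitting, for example as below (continuing from the sketch above; the lag choice is arbitrary here).

nw = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 2})
print(nw.summary())   # same point estimates as OLS, HAC-robust standard errors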

Notice that you can always test for heteroskedasticity as in the cross-section case (just check the options under “Heteroskedasticity Tests”; they are the same ones as before). However, for these tests to be valid, the errors should not be autocorrelated; also, for the F statistic of the Breusch-Pagan test to be valid, the residuals of the auxiliary regression should themselves be serially uncorrelated and homoskedastic.

2.3 Famous time series processes

Open the programs that generate an AR(1) and an MA(1).

AR(1)

We can write it as yt = ρyt−1 + et, where the error et is white noise (constant variance and zero mean). If ρ = 1, you have what is known as a random walk. It is a typically nonstationary, highly persistent process. We say that this process has a stochastic trend, as opposed to a deterministic trend, which appears whenever a linear trend term directly drives the behaviour of the variable.

Compare the highly persistent random walk above to a stationary, less persistent AR(1) (with |ρ| well below 1).

If you add a constant to the random walk, you get the random walk with drift, yt = α + yt−1 + et.

Notice how the drift α defines a linear trend behaviour in the series!

MA(1)

Here, we model the residual part and make it richer than before: yt = et + θet−1, where et is white noise.

The graph comes as follows

It is clearly stationary (once again, this can be tested through the so-called unit root tests). Actually, a pure MA process is always stationary.

You can create other processes yourself, e.g., combining AR and MA parts to get ARMA models.
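A short simulation sketch of the processes discussed in this subsection (a stationary AR(1), a random walk, a random walk with drift, and an MA(1)); the ρ, drift and θ values below are arbitrary illustrative choices.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 200
e = rng.normal(size=n)          # white noise shocks

ar1 = np.zeros(n)               # y_t = 0.5*y_{t-1} + e_t   (rho = 0.5, illustrative)
rw = np.zeros(n)                # y_t = y_{t-1} + e_t       (rho = 1: random walk)
rwd = np.zeros(n)               # y_t = 0.2 + y_{t-1} + e_t (random walk with drift)
for t in range(1, n):
    ar1[t] = 0.5 * ar1[t - 1] + e[t]
    rw[t] = rw[t - 1] + e[t]
    rwd[t] = 0.2 + rwd[t - 1] + e[t]

ma1 = e[1:] + 0.8 * e[:-1]      # y_t = e_t + 0.8*e_{t-1}   (theta = 0.8, illustrative)

for series, name in [(ar1, "AR(1)"), (rw, "random walk"),
                     (rwd, "random walk with drift"), (ma1, "MA(1)")]:
    plt.plot(series, label=name)
plt.legend()
plt.show()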


[1] If you find any typo in these notes, please e-mail me so I can correct it.

[2] You should always test the residuals to see if they are well behaved. In this case, they are still nonstationary. In practical work, you should keep looking for a correct specification.

[3] The regressors appear because we are not assuming strict exogeneity of the regressors. If we had strict exogeneity, we would only need to regress the residuals on their lagged values; the regressors would not be needed. See the book on this.

[4] This is known as the Cochrane-Orcutt estimation, which omits the first observation. If you transform the first observation so that it can be included in the regression, this is called the Prais-Winsten estimation.