Page 1

Econ107 Applied Econometrics

Topic 2: Ordinary Least Squares

(Studenmund, Chapter 2)

I. What is Ordinary Least Squares (OLS)

There are a number of techniques for constructing the SRF (i.e., estimating the regression model, or estimating $\beta_0$ and $\beta_1$). The most common and extensively used procedure is OLS, also known as the least squares criterion.

First, recall that the residual from any SRF can be written:

$$e_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i$$

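Suppose we use the following criterion in choosing the SRF:

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right)^2$$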

where n is the sample size. We want to make the Residual Sum of Squares (RSS) as small as possible; the summation runs over the sample. This is our objective function, the least squares criterion, and minimizing it is the goal of the estimation.

Show this in the following diagram

II. OLS Estimators

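Solving this minimization problem, we get a general formula for the estimated slope coefficient:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$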

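where the 'bars' over X and Y indicate the sample means of X and Y respectively. From now on, lower case letters for these variables indicate deviations from the means (that is, $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$), so the slope can also be written as $\hat{\beta}_1 = \sum x_i y_i / \sum x_i^2$. And the estimated intercept is

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$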

Note that both solutions are 'unique'.
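As a minimal sketch of these closed-form estimators (slope equal to the sum of cross-products of deviations divided by the sum of squared X deviations, intercept equal to the mean of Y minus the slope times the mean of X), the following Python snippet computes both. The data and the helper name `ols_simple` are hypothetical and chosen only for illustration; NumPy is an assumed tool, not something the notes prescribe.

```python
import numpy as np

def ols_simple(X, Y):
    """Return (beta0_hat, beta1_hat) for a simple regression of Y on X."""
    x = X - X.mean()            # deviations of X from its sample mean
    y = Y - Y.mean()            # deviations of Y from its sample mean
    beta1_hat = (x * y).sum() / (x ** 2).sum()
    beta0_hat = Y.mean() - beta1_hat * X.mean()
    return beta0_hat, beta1_hat

# Hypothetical illustrative data (not from the lecture notes).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

beta0_hat, beta1_hat = ols_simple(X, Y)
print(beta0_hat, beta1_hat)

# Cross-check against NumPy's degree-1 polynomial fit,
# which returns coefficients as [slope, intercept].
slope_check, intercept_check = np.polyfit(X, Y, 1)
```

For this small sample the sum of cross-products is 6 and the sum of squared X deviations is 10, so the slope is 0.6 and the intercept is 4 - 0.6 × 3 = 2.2.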

III. Why Ordinary Least Squares (OLS)? Any Alternative?

Some algebraic properties of OLS:

1) the estimated regression line passes through the sample means of Y and X (see the formula for $\hat{\beta}_0$).

2) the mean of the fitted values is equal to the mean of the dependent variable (that is, $\bar{\hat{Y}} = \bar{Y}$).

3) we can re-write this regression in its 'deviation form', that is

$$y_i = \hat{\beta}_1 x_i + e_i$$

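4) the residuals are uncorrelated with the fitted values of $Y_i$:

$$\sum_{i=1}^{n} \hat{y}_i e_i = 0, \qquad \text{where } \hat{y}_i = \hat{Y}_i - \bar{Y} = \hat{\beta}_1 x_i$$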
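The algebraic properties listed above can be checked numerically. The sketch below fits a simple regression on hypothetical data (the same kind of made-up sample one might use in class, not data from the notes) and evaluates each property up to floating-point error. All variable names are illustrative.

```python
import numpy as np

# Hypothetical illustrative data (not from the lecture notes).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x = X - X.mean()
y = Y - Y.mean()
beta1_hat = (x * y).sum() / (x ** 2).sum()
beta0_hat = Y.mean() - beta1_hat * X.mean()

fitted = beta0_hat + beta1_hat * X      # fitted values of Y
e = Y - fitted                          # residuals

# 1) The line passes through the point of sample means.
gap_at_means = (beta0_hat + beta1_hat * X.mean()) - Y.mean()

# 2) The mean of the fitted values equals the mean of Y.
gap_in_means = fitted.mean() - Y.mean()

# 3) Deviation form: y_i = beta1_hat * x_i + e_i for every observation.
deviation_form_gap = np.abs(y - (beta1_hat * x + e)).max()

# 4) Residuals are uncorrelated with the fitted values (in deviation form).
cross_product = ((fitted - Y.mean()) * e).sum()

print(gap_at_means, gap_in_means, deviation_form_gap, cross_product)
```

Each printed quantity should be zero apart from rounding error.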

Statistical properties of OLS will be discussed in Chapter 4.

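An alternative to OLS is Least Absolute Deviations (LAD), which minimizes the sum of the absolute values of the residuals instead of their squares:

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_{i=1}^{n} \left| Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i \right|$$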

IX. Total, Explained, and Residual Sums of Squares

OLS only tells us how to obtain the estimated regression function (i.e., the SRF) from data. But how well does the line fit the data? To address this issue, we need three concepts.

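Let's begin with the following expression,

$$Y_i = \hat{Y}_i + e_i$$

which we can put in the 'deviation' form by subtracting $\bar{Y}$ from both sides:

$$y_i = \hat{y}_i + e_i, \qquad \text{where } y_i = Y_i - \bar{Y} \text{ and } \hat{y}_i = \hat{Y}_i - \bar{Y}$$

Squaring both sides, and summing over the sample we get:

$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \hat{y}_i^2 + \sum_{i=1}^{n} e_i^2 = \hat{\beta}_1^2 \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} e_i^2$$

where we make use of the fact that the fitted values of $Y_i$ and the residuals are uncorrelated (so the cross term $2\sum \hat{y}_i e_i$ vanishes), and of the 'deviation form' of the regression function ($\hat{y}_i = \hat{\beta}_1 x_i$).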

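This has the following interpretation. The Total Sum of Squares (TSS) is equal to the sum of the Explained Sum of Squares (ESS) and the Residual Sum of Squares (RSS):

$$TSS = ESS + RSS, \qquad TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2, \quad ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2, \quad RSS = \sum_{i=1}^{n} e_i^2$$

This is known as the "decomposition of variance".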

The total variation in the dependent variable can be attributed to the regression line (the explained forces) and the residuals (the unexplained forces). The smaller the RSS, the better the fit of the line to the data.
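The decomposition of the total sum of squares into explained and residual parts can be verified on a small sample. The following is a sketch on hypothetical data (not taken from the notes); the variable names are illustrative.

```python
import numpy as np

# Hypothetical illustrative data (not from the lecture notes).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x = X - X.mean()
y = Y - Y.mean()
beta1_hat = (x * y).sum() / (x ** 2).sum()
beta0_hat = Y.mean() - beta1_hat * X.mean()

fitted = beta0_hat + beta1_hat * X
e = Y - fitted

tss = ((Y - Y.mean()) ** 2).sum()        # total sum of squares
ess = ((fitted - Y.mean()) ** 2).sum()   # explained sum of squares
rss = (e ** 2).sum()                     # residual sum of squares

print(tss, ess, rss, tss - (ess + rss))
```

For this sample TSS is 6, ESS is 3.6 and RSS is 2.4, so the two parts add up to the total (apart from rounding error).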

X. A Measure of 'Goodness of Fit' ($R^2$)

Thus far we've concentrated on the estimation of the coefficients in our regression. We now consider how well the regression function 'fits' the data.

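The Coefficient of Determination ($R^2$) is the summary measure in a two-variable regression model that indicates the magnitude of this 'goodness of fit'.

$R^2$ is defined as:

$$R^2 = \frac{ESS}{TSS} = \frac{\sum (\hat{Y}_i - \bar{Y})^2}{\sum (Y_i - \bar{Y})^2}$$

or

$$R^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum e_i^2}{\sum (Y_i - \bar{Y})^2}$$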

This Coefficient of Determination measures “... the percentage of the total variation in $Y_i$ explained by the regression model”. It is always between 0 and 1. A larger $R^2$ means higher explanatory power for the explanatory variable.
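To make the definition concrete, here is a minimal sketch that computes the coefficient of determination both as ESS/TSS and as 1 - RSS/TSS on hypothetical data (not from the notes). The helper name `r_squared_simple` is an assumption for illustration. As a side check it also compares the result with the squared sample correlation between X and Y, which coincides with the coefficient of determination in the two-variable model.

```python
import numpy as np

def r_squared_simple(X, Y):
    """Return (ESS/TSS, 1 - RSS/TSS) for a simple regression of Y on X."""
    x = X - X.mean()
    y = Y - Y.mean()
    beta1_hat = (x * y).sum() / (x ** 2).sum()
    beta0_hat = Y.mean() - beta1_hat * X.mean()
    fitted = beta0_hat + beta1_hat * X
    tss = (y ** 2).sum()
    ess = ((fitted - Y.mean()) ** 2).sum()
    rss = ((Y - fitted) ** 2).sum()
    return ess / tss, 1.0 - rss / tss

# Hypothetical illustrative data (not from the lecture notes).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

r2_from_ess, r2_from_rss = r_squared_simple(X, Y)
corr_squared = np.corrcoef(X, Y)[0, 1] ** 2   # squared sample correlation

print(r2_from_ess, r2_from_rss, corr_squared)
```

On this sample all three numbers equal 0.6: the regression accounts for 60 percent of the variation in Y around its mean.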

XI. Multiple Linear Regression (MLR) Model

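Often Y is affected by many variables instead of just one. In this case we have to use an MLR with K explanatory variables:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_K X_{Ki} + \varepsilon_i$$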

The slope coefficients are often called partial regression coefficients, as they indicate the change in the dependent variable associated with a one-unit increase in the independent variable in question, holding constant the other independent variables in the equation.

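OLS applies in the same way to MLR. That is, we minimize the RSS:

$$\min_{\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_K} \; \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} - \cdots - \hat{\beta}_K X_{Ki}\right)^2$$

The values of the coefficients that minimize RSS are called the OLS estimates and are denoted by $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_K$.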

What about formulae for the OLS estimates? The OLS estimates still have closed-form expressions, and hence the solutions are unique. However, the expressions quickly become very complicated as K grows. It is much easier to use matrix algebra to obtain a simple expression for the OLS estimates. This will be covered in a more advanced econometrics course.
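In practice the multiple-regression estimates are computed numerically rather than by hand. The sketch below is one way to do this in Python with NumPy's least-squares routine; it is an illustration under stated assumptions, not the method the course prescribes. The data are simulated, and the "true" coefficients (1, 2, -3), the noise level and the seed are hypothetical choices made only for the example.

```python
import numpy as np

# Simulated data with hypothetical true coefficients (1, 2, -3).
rng = np.random.default_rng(42)
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 3.0 * X2 + rng.normal(scale=0.1, size=n)

# Design matrix: a column of ones for the intercept, then the regressors.
design = np.column_stack([np.ones(n), X1, X2])

# Least-squares solution: the coefficient vector that minimizes RSS.
beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)

residuals = Y - design @ beta_hat
rss = (residuals ** 2).sum()

# Cross-check: the same estimates solve the normal equations (X'X) b = X'Y.
beta_normal_eq = np.linalg.solve(design.T @ design, design.T @ Y)

print(beta_hat, rss)
```

The estimates should be close to the coefficients used to simulate the data, and the two solution routes should agree up to rounding error.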

XII. Coefficient of Determination

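The Coefficient of Determination ($R^2$) has the same interpretation as in the SLR model and can be defined in the same way, that is,

$$TSS = ESS + RSS$$

where

$$TSS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2, \qquad ESS = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2, \qquad RSS = \sum_{i=1}^{n} e_i^2$$

And $R^2$ is defined as:

$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$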

However, there is a potential problem here. Suppose you have an MLR with K explanatory variables, and you now decide to add one more. Suppose your goal is to get a larger value for $R^2$. No matter what that variable is, $R^2$ cannot decrease. That is, $R^2$ is a non-decreasing function of the number of regressors, since it does not penalize the loss of degrees of freedom.

For this reason, you cannot simply choose the regression model with the larger value of $R^2$. We need an alternative measure of the “goodness of fit”. We now introduce such a measure, the adjusted coefficient of determination, $\bar{R}^2$. Define the adjusted $R^2$ ($\bar{R}^2$) as:

$$\bar{R}^2 = 1 - \frac{RSS/(n-K-1)}{TSS/(n-1)}$$

where K is the number of slope coefficients in the model. n-K-1 is the degrees of freedom, that is, the excess of the number of observations over the number of estimated coefficients (including the intercept).

Why is $\bar{R}^2$ better? When K increases, RSS falls (or at least does not rise), but the denominator (n-K-1) also falls. If the drop in RSS is not large enough, RSS/(n-K-1) will actually increase, so that $\bar{R}^2$ will decrease. In other words, $\bar{R}^2$ penalizes the measure of fit for adding an explanatory variable if that variable does not contribute much toward explaining the variation in Y.
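The contrast between the two measures can be illustrated by adding a regressor that is unrelated to Y by construction. The following is a sketch on simulated data; the helper name `r2_and_adjusted`, the seed, the sample size and the coefficients are all hypothetical choices for the example. It uses the definition of the adjusted measure as 1 minus the ratio of RSS/(n-K-1) to TSS/(n-1).

```python
import numpy as np

def r2_and_adjusted(Y, regressors):
    """Return (R^2, adjusted R^2) for OLS of Y on the regressors plus an intercept."""
    n, K = regressors.shape                       # K = number of slope coefficients
    design = np.column_stack([np.ones(n), regressors])
    beta_hat, *_ = np.linalg.lstsq(design, Y, rcond=None)
    rss = ((Y - design @ beta_hat) ** 2).sum()
    tss = ((Y - Y.mean()) ** 2).sum()
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (rss / (n - K - 1)) / (tss / (n - 1))
    return r2, adj_r2

# Simulated data: Y depends on X1 only; 'junk' is pure noise.
rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + rng.normal(size=n)
junk = rng.normal(size=n)

r2_base, adj_base = r2_and_adjusted(Y, X1.reshape(-1, 1))
r2_more, adj_more = r2_and_adjusted(Y, np.column_stack([X1, junk]))

print("R^2:         ", r2_base, "->", r2_more)
print("adjusted R^2:", adj_base, "->", adj_more)
```

The unadjusted measure cannot fall when the noise variable is added. The adjusted measure may rise or fall depending on the draw, which is exactly the point: it only rises when the drop in RSS outweighs the lost degree of freedom.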

However, do not misuse $\bar{R}^2$. 1) You must also pay attention to economic theory and intuition. 2) Do not compare values of $\bar{R}^2$ across models whose dependent variables are different.

XIII. Example 2.2.3

IV. Continue the height regression Q2.11

V. Appendix (Optional):

1. Deriving OLS estimates in the SLR.

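Consider the first order conditions, that is, differentiate the objective function $\sum e_i^2 = \sum (Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i)^2$ with respect to the choice variables (i.e., $\hat{\beta}_0$ and $\hat{\beta}_1$) and set the derivatives equal to zero:

$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right) = 0$$

$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} \left(Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i\right) X_i = 0$$

Rearranging terms we get:

$$\sum_{i=1}^{n} Y_i = n \hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^{n} X_i$$

$$\sum_{i=1}^{n} X_i Y_i = \hat{\beta}_0 \sum_{i=1}^{n} X_i + \hat{\beta}_1 \sum_{i=1}^{n} X_i^2$$

where 'n' is the sample size. These simultaneous equations are known as the Normal Equations. Solving them, we get a general formula for the estimated slope coefficient:

$$\hat{\beta}_1 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - \left(\sum X_i\right)^2} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum x_i y_i}{\sum x_i^2}$$

where the 'bars' over X and Y indicate sample means. Substituting this back into the first normal equation (divided through by n) we have

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$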
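The two normal equations form a linear system in the two unknown coefficients: the first equates the sum of Y to n times the intercept plus the slope times the sum of X, and the second equates the sum of XY to the intercept times the sum of X plus the slope times the sum of squared X. As a sketch, the system can be solved directly and compared with the deviation-form formulas. The data are hypothetical and NumPy is an assumed tool.

```python
import numpy as np

# Hypothetical illustrative data (not from the lecture notes).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(X)

# Normal equations written as A @ [beta0_hat, beta1_hat] = b.
A = np.array([[n,       X.sum()],
              [X.sum(), (X ** 2).sum()]])
b = np.array([Y.sum(), (X * Y).sum()])

beta0_hat, beta1_hat = np.linalg.solve(A, b)

# Same estimates from the deviation-form formulas.
x = X - X.mean()
y = Y - Y.mean()
slope_dev = (x * y).sum() / (x ** 2).sum()
intercept_dev = Y.mean() - slope_dev * X.mean()

print(beta0_hat, beta1_hat)
```

Here the system reads 5b0 + 15b1 = 20 and 15b0 + 55b1 = 66, whose solution is b1 = 0.6 and b0 = 2.2, matching the deviation-form result.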

2. Proof of Algebraic Properties of OLS Estimates.

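1) Recall that a line y = a + bx goes through a point (x0, y0) if and only if y0 = a + b*x0. Now we need to prove that the estimated regression line

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$$

goes through $(\bar{X}, \bar{Y})$. This is obvious from the expression for $\hat{\beta}_0$: since $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$, we have $\bar{Y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{X}$.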

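2) This is also easy to prove because

$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i = (\bar{Y} - \hat{\beta}_1 \bar{X}) + \hat{\beta}_1 X_i = \bar{Y} + \hat{\beta}_1 (X_i - \bar{X})$$

If we sum both sides of the last equality over the sample and divide by the sample size n, the second term on the right-hand side will disappear, because $\sum (X_i - \bar{X}) = 0$. Hence

$$\bar{\hat{Y}} = \bar{Y}$$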

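3) First, the basic SRF can be written:

$$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + e_i$$

Substituting in the expression for $\hat{\beta}_0$ and rearranging terms we have

$$Y_i - \bar{Y} = \hat{\beta}_1 (X_i - \bar{X}) + e_i$$

or

$$y_i = \hat{\beta}_1 x_i + e_i$$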

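4) To prove that the residuals are uncorrelated with the fitted values, we use $\hat{y}_i = \hat{\beta}_1 x_i$, $e_i = y_i - \hat{\beta}_1 x_i$ and $\hat{\beta}_1 = \sum x_i y_i / \sum x_i^2$:

$$\sum \hat{y}_i e_i = \hat{\beta}_1 \sum x_i \left(y_i - \hat{\beta}_1 x_i\right) = \hat{\beta}_1 \sum x_i y_i - \hat{\beta}_1^2 \sum x_i^2 = \hat{\beta}_1^2 \sum x_i^2 - \hat{\beta}_1^2 \sum x_i^2 = 0$$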