
Chapter 7 Multivariate Linear Regression Models

7.1-7.3: Least Squares Estimation

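Data available: for each of $n$ observations, a response $y_j$ and the values $z_{j1}, z_{j2}, \ldots, z_{jr}$ of $r$ predictor variables, $j = 1, \ldots, n$.

The multiple linear regression model for the above data is

$$y_j = \beta_0 + \beta_1 z_{j1} + \beta_2 z_{j2} + \cdots + \beta_r z_{jr} + \varepsilon_j, \quad j = 1, \ldots, n,$$

where the error terms are assumed to have the following properties:

1. $E(\varepsilon_j) = 0$;

2. $\mathrm{Var}(\varepsilon_j) = \sigma^2$ (constant);

3. $\mathrm{Cov}(\varepsilon_j, \varepsilon_k) = 0$ for $j \neq k$.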

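The above data can be represented in matrix form. Let

$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
Z = \begin{bmatrix} 1 & z_{11} & \cdots & z_{1r} \\ 1 & z_{21} & \cdots & z_{2r} \\ \vdots & \vdots & & \vdots \\ 1 & z_{n1} & \cdots & z_{nr} \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_r \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}.$$

Then,

$$Y = Z\beta + \varepsilon,$$

where $Y$ and $\varepsilon$ are $n \times 1$, $Z$ is $n \times (r+1)$, $\beta$ is $(r+1) \times 1$, and the error terms become

1. $E(\varepsilon) = 0$;

2. $\mathrm{Cov}(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I$.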

Least squares method:

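The least squares method finds the estimate $b$ of $\beta$ that minimizes the sum of squares of the residuals,

$$S(b) = \sum_{j=1}^{n} (y_j - b_0 - b_1 z_{j1} - \cdots - b_r z_{jr})^2 = (Y - Zb)'(Y - Zb),$$

since the $j$th element of $Y - Zb$ is $y_j - b_0 - b_1 z_{j1} - \cdots - b_r z_{jr}$. Expanding gives

$$S(b) = Y'Y - Y'Zb - b'Z'Y + b'Z'Zb = Y'Y - 2b'Z'Y + b'Z'Zb,$$

since $Y'Zb = (Y'Zb)' = b'Z'Y$ is a real number.

Note: For two matrices $A$ and $B$ of compatible dimensions, $(AB)' = B'A'$ and $(A + B)' = A' + B'$.

Similar to the procedure for finding the minimum of a function in calculus, the least squares estimate $b$ can be found by solving the equation based on the first derivative of $S(b)$,

$$\frac{\partial S(b)}{\partial b} = -2Z'Y + 2Z'Zb = 0 \;\Longrightarrow\; Z'Zb = Z'Y \;\Longrightarrow\; b = (Z'Z)^{-1}Z'Y,$$

provided $Z$ has full rank $r+1$, so that $Z'Z$ is invertible.

The fitted values (in vector): $\hat{Y} = Zb$.

The residuals (in vector): $\hat{\varepsilon} = Y - \hat{Y} = Y - Zb$.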

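Note: (i) $\dfrac{\partial (a'b)}{\partial b} = \dfrac{\partial (b'a)}{\partial b} = a$, where $a = [a_0, a_1, \ldots, a_r]'$ and $b = [b_0, b_1, \ldots, b_r]'$.

(ii) $\dfrac{\partial (b'Ab)}{\partial b} = 2Ab$, where $A$ is any symmetric matrix.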

Note: Since

$$(Z'Z)' = Z'(Z')' = Z'Z,$$

$Z'Z$ is a symmetric matrix.

Also, since

$$[(Z'Z)^{-1}]' = [(Z'Z)']^{-1} = (Z'Z)^{-1},$$

$(Z'Z)^{-1}$ is a symmetric matrix.

Note: $Z'Zb = Z'Y$ is called the normal equation.

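Note:

$$Z'\hat{\varepsilon} = Z'(Y - Zb) = Z'Y - Z'Zb = 0.$$

Therefore, if there is an intercept, then the first column of $Z$ is $\mathbf{1} = [1, 1, \ldots, 1]'$. Then,

$$\mathbf{1}'\hat{\varepsilon} = \sum_{j=1}^{n} \hat{\varepsilon}_j = 0.$$

Note: For the linear regression model without the intercept, $\sum_{j=1}^{n} \hat{\varepsilon}_j$ might not be equal to 0.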
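The difference between the two cases can be checked numerically. The following is a minimal sketch on made-up toy data (the values of `z` and `y` are illustrative, not from the text), using the closed-form least squares estimates for a single covariate. With an intercept the residuals sum to zero; without one they are still orthogonal to the covariate column, but their sum need not vanish.

```python
# Illustrative toy data (an assumption, not from the text).
z = [1.0, 2.0, 3.0]
y = [2.0, 2.0, 4.0]
n = len(z)

# Model with an intercept: y = b0 + b1*z + error.
z_bar = sum(z) / n
y_bar = sum(y) / n
s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
s_zz = sum((zi - z_bar) ** 2 for zi in z)
b1 = s_zy / s_zz
b0 = y_bar - b1 * z_bar
resid_with_intercept = [yi - (b0 + b1 * zi) for zi, yi in zip(z, y)]

# Model without an intercept: y = b*z + error, so b = sum(z*y) / sum(z^2).
b_origin = sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)
resid_no_intercept = [yi - b_origin * zi for zi, yi in zip(z, y)]

# The normal equation still forces the residuals to be orthogonal to z.
z_dot_resid = sum(zi * ei for zi, ei in zip(z, resid_no_intercept))

print(sum(resid_with_intercept))  # zero up to rounding error
print(sum(resid_no_intercept))    # 2/7, not zero
print(z_dot_resid)                # zero up to rounding error
```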

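Note:

$$\hat{Y} = Zb = Z(Z'Z)^{-1}Z'Y = HY,$$

where $H = Z(Z'Z)^{-1}Z'$ is called the "hat" matrix (or projection matrix). Thus,

$$\hat{\varepsilon} = Y - \hat{Y} = (I - H)Y.$$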
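As a small numerical illustration of the hat matrix $H = Z(Z'Z)^{-1}Z'$, the sketch below builds $H$ for a design with an intercept and one covariate, where $(Z'Z)^{-1}$ has a closed form as the inverse of a 2 x 2 matrix. The data and the helper name `hat_matrix` are assumptions made for the example. It checks that $H$ is symmetric, that $HH = H$, that its trace equals the number of columns of $Z$, and that $HY$ gives the fitted values.

```python
def hat_matrix(z):
    """Hat matrix H = Z (Z'Z)^{-1} Z' for Z = [1, z] (intercept plus one covariate)."""
    n = len(z)
    s1 = sum(z)
    s2 = sum(v * v for v in z)
    det = n * s2 - s1 * s1  # determinant of the 2x2 matrix Z'Z
    # (Z'Z)^{-1} = (1/det) * [[s2, -s1], [-s1, n]]
    inv = [[s2 / det, -s1 / det], [-s1 / det, n / det]]
    rows = [[1.0, v] for v in z]
    return [[sum(ri[p] * inv[p][q] * rj[q] for p in range(2) for q in range(2))
             for rj in rows] for ri in rows]

# Illustrative toy data (an assumption, not from the text).
z = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
n = len(z)

H = hat_matrix(z)
fitted = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]     # HY
residuals = [yi - fi for yi, fi in zip(y, fitted)]                     # (I - H)Y
HH = [[sum(H[i][k] * H[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
trace = sum(H[i][i] for i in range(n))

print([round(f, 4) for f in fitted])
print(round(trace, 4))
```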

Example 1:

Heller Company manufactures lawn mowers and related lawn equipment. The managers believe the quantity of lawn mowers sold depends on the price of the mower and the price of a competitor’s mower. We have the following data:

Competitor’s Price / Heller’s Price / Quantity sold

120 / 100 / 102
140 / 110 / 100
190 / 90 / 120
130 / 150 / 77
155 / 210 / 46
175 / 150 / 93
125 / 250 / 26
145 / 270 / 69
180 / 300 / 65
150 / 250 / 85

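The regression model for the above data is

$$y_j = \beta_0 + \beta_1 z_{j1} + \beta_2 z_{j2} + \varepsilon_j, \quad j = 1, \ldots, 10,$$

where $z_{j1}$ is the competitor’s price, $z_{j2}$ is Heller’s price, and $y_j$ is the quantity sold.

The data in matrix form are

$$Y = \begin{bmatrix} 102 \\ 100 \\ 120 \\ 77 \\ 46 \\ 93 \\ 26 \\ 69 \\ 65 \\ 85 \end{bmatrix}, \quad
Z = \begin{bmatrix} 1 & 120 & 100 \\ 1 & 140 & 110 \\ 1 & 190 & 90 \\ 1 & 130 & 150 \\ 1 & 155 & 210 \\ 1 & 175 & 150 \\ 1 & 125 & 250 \\ 1 & 145 & 270 \\ 1 & 180 & 300 \\ 1 & 150 & 250 \end{bmatrix}.$$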

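The least squares estimate is

$$b = (Z'Z)^{-1}Z'Y \approx \begin{bmatrix} 66.518 \\ 0.414 \\ -0.270 \end{bmatrix}.$$

The fitted regression equation is

$$\hat{y} = 66.518 + 0.414 z_1 - 0.270 z_2.$$

The fitted equation implies that, with Heller’s price held fixed, an increase of 1 unit in the competitor’s price is associated with an increase of 0.414 unit in expected quantity sold, and, with the competitor’s price held fixed, an increase of 1 unit in Heller’s own price is associated with a decrease of 0.270 unit in expected quantity sold. Thus,

$$\hat{Y} = Zb = [89.21, 94.79, 120.88, 79.86, 74.02, 98.49, 50.81, 53.69, 60.09, 61.16]'$$

and

$$\hat{\varepsilon} = Y - \hat{Y} = [12.79, 5.21, -0.88, -2.86, -28.02, -5.49, -24.81, 15.31, 4.91, 23.84]'.$$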

Suppose now we want to predict the quantity sold in a city where Heller prices its mower at $160 and the competitor prices its mower at $170. The predicted quantity sold is

$$\hat{y} = 66.518 + 0.414(170) - 0.270(160) \approx 93.7.$$
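The computations in Example 1 can be reproduced by solving the normal equation $Z'Zb = Z'Y$ directly. The following is a minimal sketch in standard-library Python; the helper names (`transpose`, `matmul`, `solve`) are illustrative, and Gaussian elimination is used here only as one convenient way to solve the 3 x 3 system, not as the method the notes prescribe.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, a[i])) + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

# Data from Example 1.
competitor = [120, 140, 190, 130, 155, 175, 125, 145, 180, 150]
heller = [100, 110, 90, 150, 210, 150, 250, 270, 300, 250]
quantity = [102, 100, 120, 77, 46, 93, 26, 69, 65, 85]

Z = [[1.0, z1, z2] for z1, z2 in zip(competitor, heller)]  # intercept column first
Zt = transpose(Z)
ZtZ = matmul(Zt, Z)
ZtY = [sum(z * y for z, y in zip(col, quantity)) for col in Zt]

b = solve(ZtZ, ZtY)                                        # normal equation Z'Z b = Z'Y
fitted = [sum(bi * zi for bi, zi in zip(b, row)) for row in Z]
residuals = [y - f for y, f in zip(quantity, fitted)]
prediction = b[0] + b[1] * 170 + b[2] * 160               # competitor at 170, Heller at 160

print([round(v, 3) for v in b])
print([round(f, 2) for f in fitted])
print(round(prediction, 2))
```

Because the model contains an intercept, the computed residuals should sum to zero up to rounding error, which is a quick sanity check on the solution.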

Geometry of Least Squares:

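Under the model, $E(Y) = Z\beta$.

In linear algebra,

$$Z\beta = \beta_0 \mathbf{1} + \beta_1 z_{(1)} + \cdots + \beta_r z_{(r)}$$

is a linear combination of the column vectors $\mathbf{1}, z_{(1)}, \ldots, z_{(r)}$ of $Z$. That is,

$$Z\beta \in R(Z) = \text{the space spanned by the column vectors of } Z.$$

Then,

$$S(\beta) = (Y - Z\beta)'(Y - Z\beta) = \|Y - Z\beta\|^2 = \text{the squared distance between } Y \text{ and } Z\beta.$$

The least squares method finds the $b$ such that the distance between $Y$ and $Zb$ is smaller than the distance between $Y$ and any other linear combination of the column vectors of $Z$, for example, $Zb^*$. Intuitively, $Z\beta$ is the information provided by the covariates to interpret the response $Y$. Thus, $Zb$ is the information which interprets $Y$ most accurately.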

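Further,

$$\|Y - Zb^*\|^2 = \|(Y - Zb) + (Zb - Zb^*)\|^2 = \|Y - Zb\|^2 + \|Zb - Zb^*\|^2 + 2(Zb - Zb^*)'(Y - Zb).$$

If we choose the estimate $b$ of $\beta$ such that $Y - Zb$ is orthogonal to every vector in $R(Z)$, then $Z'(Y - Zb) = 0$. Thus,

$$(Zb - Zb^*)'(Y - Zb) = (b - b^*)'Z'(Y - Zb) = 0.$$

That is, if we choose $b$ satisfying $Z'(Y - Zb) = 0$, then

$$\|Y - Zb^*\|^2 = \|Y - Zb\|^2 + \|Zb - Zb^*\|^2,$$

and for any other estimate $b^*$ of $\beta$,

$$\|Y - Zb^*\|^2 \geq \|Y - Zb\|^2.$$

Thus, $b$ satisfying $Z'(Y - Zb) = 0$ is the least squares estimate. Therefore,

$$Z'(Y - Zb) = 0 \;\Longrightarrow\; Z'Zb = Z'Y \;\Longrightarrow\; b = (Z'Z)^{-1}Z'Y.$$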

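Since

$$\hat{Y} = Zb = Z(Z'Z)^{-1}Z'Y = HY,$$

$H = Z(Z'Z)^{-1}Z'$ is called the projection matrix or hat matrix. $H$ projects the response vector $Y$ onto the space spanned by the covariate vectors. The vector of residuals is

$$\hat{\varepsilon} = Y - \hat{Y} = (I - H)Y.$$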
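The geometric argument can be illustrated numerically: for the least squares $b$, the residual vector $Y - Zb$ is orthogonal to each column of $Z$, and for any other candidate $b^*$ the squared distance $\|Y - Zb^*\|^2$ splits into $\|Y - Zb\|^2 + \|Zb - Zb^*\|^2$. The sketch below uses made-up toy data with an intercept and one covariate; the data and the choice of `b_star` are assumptions for illustration only.

```python
def sq_norm(v):
    """Squared Euclidean length of a vector."""
    return sum(x * x for x in v)

# Illustrative toy data (an assumption, not from the text).
z = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]
n = len(z)

# Least squares estimate for Z = [1, z] (closed form for one covariate).
z_bar = sum(z) / n
y_bar = sum(y) / n
b1 = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
      / sum((zi - z_bar) ** 2 for zi in z))
b0 = y_bar - b1 * z_bar
fit_ls = [b0 + b1 * zi for zi in z]
resid_ls = [yi - fi for yi, fi in zip(y, fit_ls)]

# Orthogonality of the residuals to the columns of Z (the vector of ones and z).
dot_ones = sum(resid_ls)
dot_z = sum(zi * ei for zi, ei in zip(z, resid_ls))

# Any other coefficient vector b* gives a point Zb* in the column space of Z.
b_star = (0.5, 1.0)
fit_star = [b_star[0] + b_star[1] * zi for zi in z]

lhs = sq_norm([yi - fi for yi, fi in zip(y, fit_star)])
rhs = sq_norm(resid_ls) + sq_norm([f - g for f, g in zip(fit_ls, fit_star)])

print(dot_ones, dot_z)  # both zero up to rounding error
print(lhs, rhs)         # equal up to rounding error, and lhs >= ||Y - Zb||^2
```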

We have the following two important theorems.

Properties of the least squares estimate:

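1. $E(b) = \beta$.

2. The variance-covariance matrix of the least squares estimate $b$ is $\mathrm{Cov}(b) = \sigma^2 (Z'Z)^{-1}$.

[Derivation:]

$$E(b) = E[(Z'Z)^{-1}Z'Y] = (Z'Z)^{-1}Z'E(Y) = (Z'Z)^{-1}Z'Z\beta = \beta,$$

since

$$E(Y) = E(Z\beta + \varepsilon) = Z\beta + E(\varepsilon) = Z\beta.$$

Also,

$$\mathrm{Cov}(b) = \mathrm{Cov}[(Z'Z)^{-1}Z'Y] = (Z'Z)^{-1}Z'\,\mathrm{Cov}(Y)\,Z(Z'Z)^{-1} = \sigma^2 (Z'Z)^{-1}Z'Z(Z'Z)^{-1} = \sigma^2 (Z'Z)^{-1},$$

since

$$\mathrm{Cov}(Y) = \mathrm{Cov}(Z\beta + \varepsilon) = \mathrm{Cov}(\varepsilon) = \sigma^2 I.$$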
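As a worked special case, the covariance formula $\mathrm{Cov}(b) = \sigma^2 (Z'Z)^{-1}$ can be written out for simple linear regression ($r = 1$, one covariate $z$ plus an intercept). The shorthand $S_{zz} = \sum_j (z_j - \bar{z})^2$ is introduced here only for this illustration.

```latex
% Simple linear regression: Z = [1, z], with S_{zz} = \sum_j (z_j - \bar z)^2.
\[
Z'Z = \begin{bmatrix} n & \sum_j z_j \\ \sum_j z_j & \sum_j z_j^2 \end{bmatrix},
\qquad
\det(Z'Z) = n \sum_j z_j^2 - \Big(\sum_j z_j\Big)^2 = n S_{zz},
\]
\[
\mathrm{Cov}(b) = \sigma^2 (Z'Z)^{-1}
= \frac{\sigma^2}{n S_{zz}}
\begin{bmatrix} \sum_j z_j^2 & -\sum_j z_j \\ -\sum_j z_j & n \end{bmatrix},
\]
\[
\mathrm{Var}(b_0) = \frac{\sigma^2 \sum_j z_j^2}{n S_{zz}}, \qquad
\mathrm{Var}(b_1) = \frac{\sigma^2}{S_{zz}}, \qquad
\mathrm{Cov}(b_0, b_1) = -\frac{\sigma^2 \bar z}{S_{zz}}.
\]
```

The slope is therefore estimated more precisely when the covariate values are more spread out, since $S_{zz}$ appears in the denominator of $\mathrm{Var}(b_1)$.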

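Denote

$$s^2 = \frac{\hat{\varepsilon}'\hat{\varepsilon}}{n - r - 1} = \frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{n - r - 1},$$

the mean residual sum of squares (the residual sum of squares divided by $n - r - 1$). It plays the role of the sample variance estimate, where $\hat{y}_j = b_0 + b_1 z_{j1} + \cdots + b_r z_{jr}$ is the $j$th fitted value. $s^2$ can be used to estimate $\sigma^2$.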
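For Example 1, the mean residual sum of squares can be computed directly from the residuals listed there, with $n = 10$ observations and $r = 2$ predictors, so the divisor is $n - r - 1 = 7$. The sketch below uses the residuals as printed (rounded to two decimals), so the result is approximate.

```python
# Residuals from Example 1, as listed in the text (rounded to two decimals).
residuals = [12.79, 5.21, -0.88, -2.86, -28.02, -5.49, -24.81, 15.31, 4.91, 23.84]

n = len(residuals)  # number of observations
r = 2               # number of predictors (competitor's price, Heller's price)

rss = sum(e * e for e in residuals)  # residual sum of squares
s2 = rss / (n - r - 1)               # mean residual sum of squares
s = s2 ** 0.5                        # estimated standard deviation of the errors

print(round(rss, 2), round(s2, 2), round(s, 2))
```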

Properties of the mean residual sum of squares:

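1. $H' = H$ and $H^2 = H$.

2. $\mathrm{tr}(H) = r + 1$.

3. $(I - H)Z = 0$.

4. $E(s^2) = \sigma^2$.

[proof:]

1.

$$H' = [Z(Z'Z)^{-1}Z']' = Z[(Z'Z)^{-1}]'Z' = Z(Z'Z)^{-1}Z' = H$$

and

$$H^2 = Z(Z'Z)^{-1}Z'Z(Z'Z)^{-1}Z' = Z(Z'Z)^{-1}Z' = H.$$

2.

$$\mathrm{tr}(H) = \mathrm{tr}[Z(Z'Z)^{-1}Z'] = \mathrm{tr}[(Z'Z)^{-1}Z'Z] = \mathrm{tr}(I_{r+1}) = r + 1.$$

3.

$$HZ = Z(Z'Z)^{-1}Z'Z = Z, \quad \text{so} \quad (I - H)Z = Z - HZ = 0.$$

Similarly,

$$Z'(I - H) = [(I - H)Z]' = 0.$$

4.

$$\hat{\varepsilon} = (I - H)Y = (I - H)(Z\beta + \varepsilon) = (I - H)\varepsilon, \quad \text{so} \quad \hat{\varepsilon}'\hat{\varepsilon} = \varepsilon'(I - H)'(I - H)\varepsilon = \varepsilon'(I - H)\varepsilon.$$

Thus,

$$E(\hat{\varepsilon}'\hat{\varepsilon}) = E\{\mathrm{tr}[(I - H)\varepsilon\varepsilon']\} = \mathrm{tr}[(I - H)E(\varepsilon\varepsilon')] = \sigma^2\,\mathrm{tr}(I - H) = \sigma^2 (n - r - 1).$$

Therefore,

$$E(s^2) = \frac{E(\hat{\varepsilon}'\hat{\varepsilon})}{n - r - 1} = \sigma^2.$$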

Gauss’s Least Squares Theorem:

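Let $Y = Z\beta + \varepsilon$, where $E(\varepsilon) = 0$, $\mathrm{Cov}(\varepsilon) = \sigma^2 I$, and $Z$ has full rank $r+1$. For any $c$, the estimator

$$c'b = c_0 b_0 + c_1 b_1 + \cdots + c_r b_r$$

of $c'\beta$ has the smallest possible variance among all linear estimators of the form

$$a'Y = a_1 y_1 + a_2 y_2 + \cdots + a_n y_n$$

that are unbiased for $c'\beta$.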

[proof:]

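Let

$$a^* = Z(Z'Z)^{-1}c, \quad \text{so that} \quad c'b = c'(Z'Z)^{-1}Z'Y = a^{*\prime}Y.$$

Let $a'Y$ be any unbiased estimator of $c'\beta$ with $E(a'Y) = c'\beta$. Then,

$$E(a'Y) = a'E(Y) = a'Z\beta = c'\beta \quad \text{for all } \beta.$$

That is, $Z'a = c$. Thus,

$$\mathrm{Var}(a'Y) = \sigma^2 a'a = \sigma^2 (a - a^* + a^*)'(a - a^* + a^*) = \sigma^2 [(a - a^*)'(a - a^*) + a^{*\prime}a^*] \geq \sigma^2 a^{*\prime}a^* = \mathrm{Var}(c'b),$$

since

$$(a - a^*)'a^* = (a'Z - a^{*\prime}Z)(Z'Z)^{-1}c = (c' - c')(Z'Z)^{-1}c = 0.$$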
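The theorem can be illustrated in the simplest special case, an intercept-only model ($r = 0$, $Z = \mathbf{1}$, $c = 1$), where the least squares estimator of the mean is the sample mean with weights $a^* = Z(Z'Z)^{-1}c = (1/n)\mathbf{1}$. Any other linear unbiased estimator $a'Y$ must have weights summing to 1, and its variance $\sigma^2 a'a$ cannot be smaller. The sketch below uses an arbitrary alternative weight vector chosen for illustration.

```python
n = 4
a_star = [1.0 / n] * n          # least squares weights: the sample mean
a_other = [0.4, 0.3, 0.2, 0.1]  # another unbiased choice (weights sum to 1), chosen arbitrarily

def variance_factor(a):
    """Var(a'Y) / sigma^2 = a'a when Cov(Y) = sigma^2 I."""
    return sum(w * w for w in a)

diff = [ao - asr for ao, asr in zip(a_other, a_star)]
cross = sum(d * w for d, w in zip(diff, a_star))  # (a - a*)'a*, zero by the proof

best = variance_factor(a_star)
other = variance_factor(a_other)

print(best, other)                         # about 0.25 and 0.30
print(cross)                               # zero up to rounding error
print(other - best, variance_factor(diff)) # the excess variance equals (a - a*)'(a - a*)
```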

Useful Splus Commands:

estate=matrix(scan("E:\\T7-1.dat"),ncol=3,byrow=T)

estatelm=lm(estate[,3]~estate[,1]+estate[,2])

estatelm

summary(estatelm)

anova(estatelm)

Useful SAS Commands:

title 'Regression Analysis';

data estate;

infile 'E:\T7-1.dat';

input z1 z2 y;

proc reg data=estate;

model y = z1 z2;

run;
