Chapter 7 Multivariate Linear Regression Models

7.1-7.3: Least Squares Estimation

Data available: $(z_{j1}, z_{j2}, \ldots, z_{jr}, Y_j)$, $j = 1, 2, \ldots, n$, where $Y_j$ is the response and $z_{j1}, \ldots, z_{jr}$ are the values of the $r$ covariates for the $j$th trial.

The multiple linear regression model for the above data is

$$Y_j = \beta_0 + \beta_1 z_{j1} + \beta_2 z_{j2} + \cdots + \beta_r z_{jr} + \varepsilon_j, \quad j = 1, 2, \ldots, n,$$

where the error terms are assumed to have the following properties:

1. $E(\varepsilon_j) = 0$;

2. $\mathrm{Var}(\varepsilon_j) = \sigma^2$ (constant);

3. $\mathrm{Cov}(\varepsilon_j, \varepsilon_k) = 0$ for $j \neq k$.

The above data can be represented in matrix form. Let

$$\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad \mathbf{Z} = \begin{bmatrix} 1 & z_{11} & \cdots & z_{1r} \\ 1 & z_{21} & \cdots & z_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & z_{n1} & \cdots & z_{nr} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_r \end{bmatrix}, \quad \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}.$$

Then,

$$\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon},$$

where the error terms become

1. $E(\boldsymbol{\varepsilon}) = \mathbf{0}$;

2. $\mathrm{Cov}(\boldsymbol{\varepsilon}) = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') = \sigma^2 I_n$.

Least squares method:

The least squares method is to find the estimate $\mathbf{b}$ of $\boldsymbol{\beta}$ minimizing the sum of squares of residuals,

$$S(\mathbf{b}) = \sum_{j=1}^{n} (Y_j - b_0 - b_1 z_{j1} - \cdots - b_r z_{jr})^2 = (\mathbf{Y} - \mathbf{Z}\mathbf{b})'(\mathbf{Y} - \mathbf{Z}\mathbf{b}),$$

since $Y_j - b_0 - b_1 z_{j1} - \cdots - b_r z_{jr}$ is the $j$th element of $\mathbf{Y} - \mathbf{Z}\mathbf{b}$. Expanding gives

$$S(\mathbf{b}) = \mathbf{Y}'\mathbf{Y} - 2\mathbf{b}'\mathbf{Z}'\mathbf{Y} + \mathbf{b}'\mathbf{Z}'\mathbf{Z}\mathbf{b},$$

since $\mathbf{b}'\mathbf{Z}'\mathbf{Y} = (\mathbf{b}'\mathbf{Z}'\mathbf{Y})' = \mathbf{Y}'\mathbf{Z}\mathbf{b}$, a real number.

Note: for two matrices $A$ and $B$, $(AB)' = B'A'$ and $(A + B)' = A' + B'$.

Similar to the procedure in finding the minimum of a function in calculus, the least squares estimate $\mathbf{b}$ can be found by solving the equation based on the first derivative of $S(\mathbf{b})$,

$$\frac{\partial S(\mathbf{b})}{\partial \mathbf{b}} = -2\mathbf{Z}'\mathbf{Y} + 2\mathbf{Z}'\mathbf{Z}\mathbf{b} = \mathbf{0} \;\Longrightarrow\; \mathbf{Z}'\mathbf{Z}\mathbf{b} = \mathbf{Z}'\mathbf{Y} \;\Longrightarrow\; \mathbf{b} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y},$$

provided $\mathbf{Z}'\mathbf{Z}$ is nonsingular.

The fitted values (in vector): $\hat{\mathbf{Y}} = \mathbf{Z}\mathbf{b} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}$.

The residuals (in vector): $\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}} = [I - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']\mathbf{Y}$.

Note: (i) $\dfrac{\partial (\mathbf{c}'\mathbf{b})}{\partial \mathbf{b}} = \mathbf{c}$, where $\mathbf{c}$ and $\mathbf{b}$ are $(r+1) \times 1$ vectors.

(ii) $\dfrac{\partial (\mathbf{b}'A\mathbf{b})}{\partial \mathbf{b}} = 2A\mathbf{b}$, where $A$ is any symmetric matrix.

Note: Since

$$[\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']' = \mathbf{Z}[(\mathbf{Z}'\mathbf{Z})^{-1}]'\mathbf{Z}' = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}',$$

$\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ is a symmetric matrix. Also,

$$[I - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']' = I - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}',$$

$I - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ is a symmetric matrix.

Note: $\mathbf{Z}'\mathbf{Z}\mathbf{b} = \mathbf{Z}'\mathbf{Y}$ is called the normal equation.
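For instance, with a single covariate ($r = 1$), writing the normal equation out elementwise gives the familiar pair of equations from simple linear regression:

$$\begin{bmatrix} n & \sum_{j} z_{j1} \\ \sum_{j} z_{j1} & \sum_{j} z_{j1}^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum_{j} Y_j \\ \sum_{j} z_{j1} Y_j \end{bmatrix}.$$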

Note:

$$\mathbf{Z}'\hat{\boldsymbol{\varepsilon}} = \mathbf{Z}'(\mathbf{Y} - \mathbf{Z}\mathbf{b}) = \mathbf{Z}'\mathbf{Y} - \mathbf{Z}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = \mathbf{Z}'\mathbf{Y} - \mathbf{Z}'\mathbf{Y} = \mathbf{0}.$$

Therefore, if there is an intercept, then the first column of $\mathbf{Z}$ is $\mathbf{1}_n = [1, 1, \ldots, 1]'$. Then,

$$\mathbf{1}_n'\hat{\boldsymbol{\varepsilon}} = \sum_{j=1}^{n} \hat{\varepsilon}_j = 0.$$

Note: for the linear regression model without the intercept, $\sum_{j=1}^{n} \hat{\varepsilon}_j$ might not be equal to 0.

Note:

$$\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = (I - H)\mathbf{Y},$$

where $H = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ is called the "hat" matrix (or projection matrix). Thus,

$$\hat{\mathbf{Y}} = H\mathbf{Y}.$$
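As a quick numerical illustration of the formulas above, here is a minimal S-Plus sketch; the data vectors are made-up toy values, purely for illustration:

y <- c(3, 5, 4, 6, 7)                    # toy response values, purely illustrative
z1 <- c(1, 2, 3, 4, 5)                   # toy covariate values, purely illustrative
z <- cbind(1, z1)                        # design matrix Z with intercept column
b <- solve(t(z) %*% z) %*% t(z) %*% y    # least squares estimate b = (Z'Z)^{-1} Z'Y
h <- z %*% solve(t(z) %*% z) %*% t(z)    # hat matrix H = Z (Z'Z)^{-1} Z'
yhat <- h %*% y                          # fitted values Y-hat = H Y
ehat <- y - yhat                         # residuals (I - H) Y
t(z) %*% ehat                            # essentially 0, as the normal equation requires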

Example 1:

Heller Company manufactures lawn mowers and related lawn equipment. The managers believe the quantity of lawn mowers sold depends on the price of the mower and the price of a competitor’s mower. We have the following data:

Competitor's Price ($z_1$)   Heller's Price ($z_2$)   Quantity Sold ($Y$)
120                          100                      102
140                          110                      100
190                           90                      120
130                          150                       77
155                          210                       46
175                          150                       93
125                          250                       26
145                          270                       69
180                          300                       65
150                          250                       85

The regression model for the above data is

$$Y_j = \beta_0 + \beta_1 z_{j1} + \beta_2 z_{j2} + \varepsilon_j, \quad j = 1, 2, \ldots, 10,$$

where $z_{j1}$ is the competitor's price and $z_{j2}$ is Heller's price.

The data in matrix form are

$$\mathbf{Y} = \begin{bmatrix} 102 \\ 100 \\ 120 \\ 77 \\ 46 \\ 93 \\ 26 \\ 69 \\ 65 \\ 85 \end{bmatrix}, \quad \mathbf{Z} = \begin{bmatrix} 1 & 120 & 100 \\ 1 & 140 & 110 \\ 1 & 190 & 90 \\ 1 & 130 & 150 \\ 1 & 155 & 210 \\ 1 & 175 & 150 \\ 1 & 125 & 250 \\ 1 & 145 & 270 \\ 1 & 180 & 300 \\ 1 & 150 & 250 \end{bmatrix}.$$

The least squares estimate is

$$\mathbf{b} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = \begin{bmatrix} 66.52 \\ 0.414 \\ -0.269 \end{bmatrix}.$$

The fitted regression equation is

$$\hat{y} = 66.52 + 0.414 z_1 - 0.269 z_2.$$

The fitted equation implies an increase in the competitor's price of 1 unit is associated with an increase of 0.414 unit in expected quantity sold and an increase in its own price of 1 unit is associated with a decrease of 0.269 unit in expected quantity sold. Thus,

$$\hat{\mathbf{Y}}' = [89.21,\ 94.79,\ 120.88,\ 79.86,\ 74.02,\ 98.49,\ 50.81,\ 53.69,\ 60.09,\ 61.16]$$

and

$$\hat{\boldsymbol{\varepsilon}}' = [12.79,\ 5.21,\ -0.88,\ -2.86,\ -28.02,\ -5.49,\ -24.81,\ 15.31,\ 4.91,\ 23.84].$$

Suppose now we want to predict the quantity sold in a city where Heller prices its mower at $160 and the competitor prices its mower at $170. The predicted quantity sold is

$$\hat{y} = 66.52 + 0.414(170) - 0.269(160) = 93.86,$$

i.e., about 94 mowers.
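The numbers in this example can be reproduced with a short S-Plus sketch, assuming the data are typed in as below:

z1 <- c(120, 140, 190, 130, 155, 175, 125, 145, 180, 150)   # competitor's price
z2 <- c(100, 110, 90, 150, 210, 150, 250, 270, 300, 250)    # Heller's price
y <- c(102, 100, 120, 77, 46, 93, 26, 69, 65, 85)           # quantity sold
z <- cbind(1, z1, z2)                                       # design matrix with intercept column
b <- solve(t(z) %*% z, t(z) %*% y)                          # least squares estimate
yhat <- z %*% b                                             # fitted values
ehat <- y - yhat                                            # residuals
sum(ehat)                                                   # 0, since the model has an intercept
c(1, 170, 160) %*% b                                        # prediction at z1 = 170, z2 = 160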

Geometry of Least Squares:

$$\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}.$$

In linear algebra, $\mathbf{Z}\mathbf{b}$ is a linear combination of the column vectors of $\mathbf{Z}$. That is, writing $\mathbf{Z} = [\mathbf{z}^{(0)}, \mathbf{z}^{(1)}, \ldots, \mathbf{z}^{(r)}]$,

$$\mathbf{Z}\mathbf{b} = b_0 \mathbf{z}^{(0)} + b_1 \mathbf{z}^{(1)} + \cdots + b_r \mathbf{z}^{(r)} \in \mathrm{span}\{\mathbf{z}^{(0)}, \mathbf{z}^{(1)}, \ldots, \mathbf{z}^{(r)}\}.$$

Then, the least squares method is to find the appropriate $\mathbf{b}$ such that the distance between $\mathbf{Y}$ and $\mathbf{Z}\mathbf{b}$ is smaller than the one between $\mathbf{Y}$ and any other linear combination of the column vectors of $\mathbf{Z}$, for example, $\mathbf{Z}\mathbf{a}$. Intuitively, $\mathbf{Z}\mathbf{b}$ is the information provided by the covariates to interpret the response $\mathbf{Y}$. Thus, $\mathbf{Z}\mathbf{b}$ is the information which interprets $\mathbf{Y}$ most accurately.

Further,

$$\|\mathbf{Y} - \mathbf{Z}\mathbf{a}\|^2 = \|\mathbf{Y} - \mathbf{Z}\mathbf{b}\|^2 + \|\mathbf{Z}\mathbf{b} - \mathbf{Z}\mathbf{a}\|^2 + 2(\mathbf{Y} - \mathbf{Z}\mathbf{b})'(\mathbf{Z}\mathbf{b} - \mathbf{Z}\mathbf{a}).$$

If we choose the estimate $\mathbf{b}$ of $\boldsymbol{\beta}$ such that $\mathbf{Y} - \mathbf{Z}\mathbf{b}$ is orthogonal to every vector in $\mathrm{span}\{\mathbf{z}^{(0)}, \ldots, \mathbf{z}^{(r)}\}$, then $(\mathbf{Y} - \mathbf{Z}\mathbf{b})'(\mathbf{Z}\mathbf{b} - \mathbf{Z}\mathbf{a}) = 0$. Thus,

$$\|\mathbf{Y} - \mathbf{Z}\mathbf{a}\|^2 = \|\mathbf{Y} - \mathbf{Z}\mathbf{b}\|^2 + \|\mathbf{Z}\mathbf{b} - \mathbf{Z}\mathbf{a}\|^2 \geq \|\mathbf{Y} - \mathbf{Z}\mathbf{b}\|^2.$$

That is, if we choose $\mathbf{b}$ satisfying $\mathbf{Z}'(\mathbf{Y} - \mathbf{Z}\mathbf{b}) = \mathbf{0}$, then $\mathbf{Y} - \mathbf{Z}\mathbf{b}$ is orthogonal to every column of $\mathbf{Z}$, and for any other estimate $\mathbf{a}$ of $\boldsymbol{\beta}$,

$$\|\mathbf{Y} - \mathbf{Z}\mathbf{a}\| \geq \|\mathbf{Y} - \mathbf{Z}\mathbf{b}\|.$$

Thus, $\mathbf{b}$ satisfying $\mathbf{Z}'(\mathbf{Y} - \mathbf{Z}\mathbf{b}) = \mathbf{0}$, i.e., the normal equation $\mathbf{Z}'\mathbf{Z}\mathbf{b} = \mathbf{Z}'\mathbf{Y}$, is the least squares estimate. Therefore,

$$\mathbf{b} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}.$$

Since

$$\hat{\mathbf{Y}} = \mathbf{Z}\mathbf{b} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = H\mathbf{Y},$$

$H = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ is called the projection matrix or hat matrix. $H$ projects the response vector $\mathbf{Y}$ on the space spanned by the covariate vectors. The vector of residuals is

$$\hat{\boldsymbol{\varepsilon}} = \mathbf{Y} - \hat{\mathbf{Y}} = (I - H)\mathbf{Y}.$$
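The orthogonality argument can be checked numerically; a small sketch reusing z, y, b, and ehat from the Heller example above, where any other coefficient vector gives a larger residual sum of squares:

t(z) %*% ehat              # essentially 0: the residual vector is orthogonal to each column of Z
a <- b + c(1, 0, 0)        # some other estimate, e.g. the intercept shifted by 1
sum((y - z %*% a)^2)       # residual sum of squares for a ...
sum((y - z %*% b)^2)       # ... is larger than the least squares one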

We have the following two important theorems.

Properties of the least squares estimate:

1. $E(\mathbf{b}) = \boldsymbol{\beta}$, i.e., $\mathbf{b}$ is unbiased for $\boldsymbol{\beta}$.

2. The variance-covariance matrix of the least squares estimate $\mathbf{b}$ is

$$\mathrm{Cov}(\mathbf{b}) = \sigma^2 (\mathbf{Z}'\mathbf{Z})^{-1}.$$

[Derivation:]

$$E(\mathbf{b}) = E[(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y}] = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' E(\mathbf{Y}) = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z}\boldsymbol{\beta} = \boldsymbol{\beta},$$

since

$$E(\mathbf{Y}) = E(\mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) = \mathbf{Z}\boldsymbol{\beta} + E(\boldsymbol{\varepsilon}) = \mathbf{Z}\boldsymbol{\beta}.$$

Also,

$$\mathrm{Cov}(\mathbf{b}) = E[(\mathbf{b} - \boldsymbol{\beta})(\mathbf{b} - \boldsymbol{\beta})'] = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}')\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1} = \sigma^2 (\mathbf{Z}'\mathbf{Z})^{-1},$$

since

$$\mathbf{b} - \boldsymbol{\beta} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'(\mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) - \boldsymbol{\beta} = (\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\boldsymbol{\varepsilon} \quad \text{and} \quad E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}') = \sigma^2 I_n.$$

Denote

$$s^2 = \frac{\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}}{n - r - 1} = \frac{(\mathbf{Y} - \mathbf{Z}\mathbf{b})'(\mathbf{Y} - \mathbf{Z}\mathbf{b})}{n - r - 1},$$

the mean residual sum of squares (the residual sum of squares divided by $n - r - 1$). $s^2$ is the sample variance estimate: as shown below, $E(s^2) = \sigma^2$, where $\sigma^2 = \mathrm{Var}(\varepsilon_j)$, so $s^2$ can be used to estimate $\sigma^2$.
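For the Heller example, $s^2$ and the estimated variance-covariance matrix of $\mathbf{b}$ can be computed from the earlier sketch (here $n = 10$ and $r = 2$):

n <- length(y)                     # 10 observations
r <- ncol(z) - 1                   # 2 covariates (the first column of z is the intercept)
s2 <- sum(ehat^2) / (n - r - 1)    # mean residual sum of squares, estimates sigma^2
s2 * solve(t(z) %*% z)             # estimated variance-covariance matrix of b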

Properties of the mean residual sum of squares:

1. $H' = H$ and $H^2 = H$, i.e., the hat matrix is symmetric and idempotent (and so is $I - H$).

2. $(I - H)\mathbf{Z} = \mathbf{0}$, so that $\hat{\boldsymbol{\varepsilon}} = (I - H)\boldsymbol{\varepsilon}$.

3. $\mathrm{Cov}(\hat{\mathbf{Y}}) = \sigma^2 H$ and $\mathrm{Cov}(\hat{\boldsymbol{\varepsilon}}) = \sigma^2 (I - H)$.

4. $E(s^2) = \sigma^2$.

[proof:]

1.

$$H' = [\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}']' = \mathbf{Z}[(\mathbf{Z}'\mathbf{Z})^{-1}]'\mathbf{Z}' = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' = H$$

and

$$H^2 = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}' = H.$$

2.

$$(I - H)\mathbf{Z} = \mathbf{Z} - \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z} = \mathbf{Z} - \mathbf{Z} = \mathbf{0},$$

so $\hat{\boldsymbol{\varepsilon}} = (I - H)\mathbf{Y} = (I - H)(\mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}) = (I - H)\boldsymbol{\varepsilon}$.

3.

$$\mathrm{Cov}(\hat{\mathbf{Y}}) = H\,\mathrm{Cov}(\mathbf{Y})\,H' = \sigma^2 H H' = \sigma^2 H.$$

Similarly,

$$\mathrm{Cov}(\hat{\boldsymbol{\varepsilon}}) = (I - H)\,\mathrm{Cov}(\mathbf{Y})\,(I - H)' = \sigma^2 (I - H).$$

4.

$$E(\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}) = E[\boldsymbol{\varepsilon}'(I - H)\boldsymbol{\varepsilon}] = E\{\mathrm{tr}[(I - H)\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}']\} = \mathrm{tr}[(I - H)\,\sigma^2 I] = \sigma^2 [n - \mathrm{tr}(H)].$$

Thus, since $\mathrm{tr}(H) = \mathrm{tr}[\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'] = \mathrm{tr}[(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Z}] = \mathrm{tr}(I_{r+1}) = r + 1$,

$$E(\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}}) = (n - r - 1)\sigma^2.$$

Therefore,

$$E(s^2) = \frac{E(\hat{\boldsymbol{\varepsilon}}'\hat{\boldsymbol{\varepsilon}})}{n - r - 1} = \sigma^2.$$
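These properties can also be verified numerically for the Heller design matrix; a sketch reusing z from above:

h <- z %*% solve(t(z) %*% z) %*% t(z)   # hat matrix for the Heller data
max(abs(h - t(h)))                      # 0: H is symmetric
max(abs(h %*% h - h))                   # essentially 0: H is idempotent
sum(diag(h))                            # trace of H = r + 1 = 3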

Gauss's Least Squares Theorem:

Let $\mathbf{Y} = \mathbf{Z}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where $E(\boldsymbol{\varepsilon}) = \mathbf{0}$, $\mathrm{Cov}(\boldsymbol{\varepsilon}) = \sigma^2 I$, and $\mathbf{Z}$ has full rank $r + 1$. For any $\mathbf{c}$, the estimator

$$\mathbf{c}'\mathbf{b} = c_0 b_0 + c_1 b_1 + \cdots + c_r b_r$$

of $\mathbf{c}'\boldsymbol{\beta}$ has the smallest possible variance among all linear estimators of the form

$$\mathbf{a}'\mathbf{Y} = a_1 Y_1 + a_2 Y_2 + \cdots + a_n Y_n$$

that are unbiased for $\mathbf{c}'\boldsymbol{\beta}$.

[proof:]

Let

$$\mathbf{a}^* = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{c}, \quad \text{so that} \quad \mathbf{a}^{*\prime}\mathbf{Y} = \mathbf{c}'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{Y} = \mathbf{c}'\mathbf{b}.$$

Let $\mathbf{a}'\mathbf{Y}$ be any unbiased estimator of $\mathbf{c}'\boldsymbol{\beta}$ with $E(\mathbf{a}'\mathbf{Y}) = \mathbf{c}'\boldsymbol{\beta}$ for all $\boldsymbol{\beta}$. Then,

$$E(\mathbf{a}'\mathbf{Y}) = \mathbf{a}'\mathbf{Z}\boldsymbol{\beta} = \mathbf{c}'\boldsymbol{\beta} \quad \text{for all } \boldsymbol{\beta}.$$

That is, $\mathbf{Z}'\mathbf{a} = \mathbf{c}$. Thus,

$$\mathrm{Var}(\mathbf{a}'\mathbf{Y}) = \sigma^2 \mathbf{a}'\mathbf{a} = \sigma^2 [\mathbf{a}^* + (\mathbf{a} - \mathbf{a}^*)]'[\mathbf{a}^* + (\mathbf{a} - \mathbf{a}^*)] = \sigma^2 [\mathbf{a}^{*\prime}\mathbf{a}^* + (\mathbf{a} - \mathbf{a}^*)'(\mathbf{a} - \mathbf{a}^*)] \geq \sigma^2 \mathbf{a}^{*\prime}\mathbf{a}^* = \mathrm{Var}(\mathbf{c}'\mathbf{b}),$$

since

$$\mathbf{a}^{*\prime}(\mathbf{a} - \mathbf{a}^*) = \mathbf{c}'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'(\mathbf{a} - \mathbf{a}^*) = \mathbf{c}'(\mathbf{Z}'\mathbf{Z})^{-1}(\mathbf{Z}'\mathbf{a} - \mathbf{Z}'\mathbf{a}^*) = \mathbf{c}'(\mathbf{Z}'\mathbf{Z})^{-1}(\mathbf{c} - \mathbf{c}) = 0.$$
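In practice the theorem is used through $\mathrm{Var}(\mathbf{c}'\mathbf{b}) = \sigma^2 \mathbf{c}'(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{c}$. For example, the estimated standard error of $b_1$ in the Heller example can be computed with a sketch reusing z and s2 from above, where cvec is a hypothetical choice of $\mathbf{c}$ picking out $b_1$:

cvec <- c(0, 1, 0)                                  # c selecting the coefficient of z1
sqrt(s2 * t(cvec) %*% solve(t(z) %*% z) %*% cvec)   # estimated standard error of b1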

Useful Splus Commands:

estate=matrix(scan("E:\\T7-1.dat"),ncol=3,byrow=T)   # read the data into an n x 3 matrix

estatelm=lm(estate[,3]~estate[,1]+estate[,2])        # regress the 3rd column on the first two

estatelm                                             # print the estimated coefficients

summary(estatelm)                                    # coefficient table, s, R-squared

anova(estatelm)                                      # analysis of variance table

Useful SAS Commands:

title 'Regression Analysis';

data estate;

infile 'E:\T7-1.dat';

input z1 z2 y;

proc reg data=estate;

model y = z1 z2;

run;
