Statistical Inference:

Simple Hypothesis Testing (F-test):

Model 0: $y_i = \varepsilon_i,\ i = 1, \dots, n$.

Model 1: $y_i = \beta_0 + \varepsilon_i,\ i = 1, \dots, n$.

Model p: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + \varepsilon_i,\ i = 1, \dots, n$.

We denote the following regression sums of squares:

$SS(\beta_0, \beta_1, \dots, \beta_{p-1}) = \hat{y}'\hat{y} = \sum_{i=1}^n \hat{y}_i^2$ (model p),

$SS(\beta_0) = n\bar{y}^2$ (model 1),

and

$SS(\beta_1, \dots, \beta_{p-1} \mid \beta_0) = SS(\beta_0, \beta_1, \dots, \beta_{p-1}) - SS(\beta_0) = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2$.

Also, we denote the following residual sums of squares:

$\sum_{i=1}^n y_i^2$ (model 0), $\sum_{i=1}^n (y_i - \bar{y})^2$ (model 1), $\sum_{i=1}^n (y_i - \hat{y}_i)^2$ (model p), and $0$ (data).

We have the following fundamental equation:

$\sum_{i=1}^n y_i^2$ (the distance between the data and model 0) =

$\sum_{i=1}^n (y_i - \hat{y}_i)^2$ (the distance between the data and model p) +

$\sum_{i=1}^n (\hat{y}_i - \bar{y})^2$ (the distance between model p and model 1) +

$n\bar{y}^2$ (the distance between model 1 and model 0).
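The fundamental equation can be verified numerically. The following is a minimal sketch (assuming NumPy is available; the simulated data and all variable names are illustrative, not part of the notes):

```python
import numpy as np

# Simulate a small data set and fit model p by least squares
# (n observations, p parameters including the intercept).
rng = np.random.default_rng(0)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]   # least squares estimate
y_hat = X @ b                              # fitted values under model p
y_bar = y.mean()

total = y @ y                              # distance between data and model 0
rss_p = np.sum((y - y_hat) ** 2)           # distance between data and model p
reg = np.sum((y_hat - y_bar) ** 2)         # distance between model p and model 1
mean_ss = n * y_bar ** 2                   # distance between model 1 and model 0

# y'y = RSS(model p) + SS(beta_1,...,beta_{p-1} | beta_0) + n*ybar^2
assert np.isclose(total, rss_p + reg + mean_ss)
```

The assertion holds for any data set, since the decomposition is an algebraic identity whenever the design matrix includes the intercept column.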

The ANOVA table associated with the fundamental equation is

Source / df / SS / MS
$\beta_0$ / 1 / $n\bar{y}^2$ / $n\bar{y}^2$
$\beta_1, \dots, \beta_{p-1} \mid \beta_0$ / p-1 / $\sum_{i=1}^n (\hat{y}_i - \bar{y})^2$ / $\sum_{i=1}^n (\hat{y}_i - \bar{y})^2/(p-1)$
Residual / n-p / $\sum_{i=1}^n (y_i - \hat{y}_i)^2$ / $s^2 = \sum_{i=1}^n (y_i - \hat{y}_i)^2/(n-p)$
Total / n / $\sum_{i=1}^n y_i^2$ /

Note: under the normality assumption on the random errors,

$\sum_{i=1}^n (y_i - \hat{y}_i)^2 / \sigma^2 \sim \chi^2_{n-p}$,

$\sum_{i=1}^n (\hat{y}_i - \bar{y})^2 / \sigma^2 \sim \chi^2_{p-1}$ when $\beta_1 = \cdots = \beta_{p-1} = 0$,

and the regression and residual sums of squares are independent.

Simple Hypothesis Testing:

(i) $H_0: \beta_0 = \beta_1 = \cdots = \beta_{p-1} = 0$ vs. $H_1$: not all of $\beta_0, \dots, \beta_{p-1}$ are 0.

To test the above hypothesis, the following F statistic can be used:

$F = \frac{SS(\beta_0, \beta_1, \dots, \beta_{p-1})/p}{s^2}$.

Intuitively, $SS(\beta_0, \beta_1, \dots, \beta_{p-1})/p$ measures the difference between model p and model 0, while $s^2$ is the estimate of the variance of the random error. Thus, a large F value implies that the difference between model p and model 0 is large when the random variation, reflected by the mean residual sum of squares, is taken into account. That is, at least some of $\beta_0, \beta_1, \dots, \beta_{p-1}$ are so significant that the difference between model p and model 0 (no parameter) is apparent. Therefore, the F value can provide important information about whether $H_0$ holds. The next question is how large a value of F should be considered large. By distribution theory and the three assumptions about the random errors $\varepsilon_i$, $F \sim F_{p,\, n-p}$ as $H_0$ is true, where $F_{p,\, n-p}$ is the F distribution with degrees of freedom p and n-p, respectively.

(ii) $H_0: \beta_1 = \beta_2 = \cdots = \beta_{p-1} = 0$ vs. $H_1$: not all of $\beta_1, \dots, \beta_{p-1}$ are 0.

To test the above hypothesis, the following F statistic can be used:

$F = \frac{SS(\beta_1, \dots, \beta_{p-1} \mid \beta_0)/(p-1)}{s^2}$.

A large F value implies that the difference between model p and model 1 is large when the random variation, reflected by the mean residual sum of squares, is taken into account. That is, at least some of $\beta_1, \dots, \beta_{p-1}$ are so significant that the difference between model p and model 1 (only one parameter $\beta_0$) is apparent. $F \sim F_{p-1,\, n-p}$ as $H_0$ is true, where $F_{p-1,\, n-p}$ is the F distribution with degrees of freedom p-1 and n-p, respectively.
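Both F statistics are straightforward to compute from a least squares fit. The following sketch computes them on simulated data (assuming NumPy; all names and the simulated data are illustrative):

```python
import numpy as np

# Simulate data from a model with p = 4 parameters (including intercept).
rng = np.random.default_rng(1)
n, p = 40, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([0.5, 1.5, 0.0, -2.0]) + rng.normal(size=n)

b = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ b
s2 = np.sum((y - y_hat) ** 2) / (n - p)      # mean residual sum of squares

ss_full = y_hat @ y_hat                      # SS(beta_0,...,beta_{p-1})
ss_b0 = n * y.mean() ** 2                    # SS(beta_0)

F_i = (ss_full / p) / s2                     # test (i):  compare with F(p, n-p)
F_ii = ((ss_full - ss_b0) / (p - 1)) / s2    # test (ii): compare with F(p-1, n-p)
```

Note that `ss_full - ss_b0` equals $\sum_{i=1}^n (\hat{y}_i - \bar{y})^2$, the regression sum of squares given the intercept, because the residuals of a model containing the intercept sum to zero.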

Testing for Several Parameters Being 0:

$H_0: \beta_q = \beta_{q+1} = \cdots = \beta_{p-1} = 0$ vs. $H_1$: not all of $\beta_q, \dots, \beta_{p-1}$ are 0.

To test the above hypothesis, we need to derive some basic quantities for model q (q < p),

model q: $y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{q-1} x_{i,q-1} + \varepsilon_i$.

Let $X_q$ be the $n \times q$ matrix consisting of the first q columns of $X$.

Then, the least squares estimate for model q is

$b_q = (X_q' X_q)^{-1} X_q' y$.

Then, the fitted value for model q is

$\hat{y}_q = X_q b_q$,

and the distance between model q and model 0 is

$SS(\beta_0, \beta_1, \dots, \beta_{q-1}) = \hat{y}_q' \hat{y}_q$.

Thus, the distance between model p and model q is

$SS(\beta_q, \dots, \beta_{p-1} \mid \beta_0, \dots, \beta_{q-1}) = SS(\beta_0, \dots, \beta_{p-1}) - SS(\beta_0, \dots, \beta_{q-1}) = \hat{y}'\hat{y} - \hat{y}_q' \hat{y}_q$.

Also,

$\sum_{i=1}^n y_i^2 = \sum_{i=1}^n (y_i - \hat{y}_i)^2 + SS(\beta_q, \dots, \beta_{p-1} \mid \beta_0, \dots, \beta_{q-1}) + SS(\beta_0, \dots, \beta_{q-1})$.

To test the above hypothesis, the following F statistic can be used:

$F = \frac{SS(\beta_q, \dots, \beta_{p-1} \mid \beta_0, \dots, \beta_{q-1})/(p-q)}{s^2}$,

where

$s^2 = \sum_{i=1}^n (y_i - \hat{y}_i)^2/(n-p)$.

A large F value implies that the difference between model p and model q is large when the random variation, reflected by the mean residual sum of squares, is taken into account. That is, at least some of $\beta_q, \dots, \beta_{p-1}$ are so significant that the difference between model p and model q is apparent. $F \sim F_{p-q,\, n-p}$ as $H_0$ is true, where $F_{p-q,\, n-p}$ is the F distribution with degrees of freedom p-q and n-p, respectively.
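The full-versus-reduced comparison above can be sketched in code as follows (assuming NumPy; the simulation, the helper `fit`, and all names are illustrative):

```python
import numpy as np

# Simulate data in which the last p - q coefficients are truly zero,
# so H0: beta_q = ... = beta_{p-1} = 0 holds.
rng = np.random.default_rng(2)
n, p, q = 50, 5, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0, 0.0, 0.0]) + rng.normal(size=n)

def fit(M, y):
    """Least squares fit; returns the fitted values M @ b."""
    b = np.linalg.lstsq(M, y, rcond=None)[0]
    return M @ b

y_hat = fit(X, y)            # model p (full model)
y_hat_q = fit(X[:, :q], y)   # model q: first q columns of X

s2 = np.sum((y - y_hat) ** 2) / (n - p)
extra_ss = y_hat @ y_hat - y_hat_q @ y_hat_q  # SS(beta_q,... | beta_0,...,beta_{q-1})
F = (extra_ss / (p - q)) / s2                 # compare with F(p-q, n-p)
```

Since model q is nested in model p, `extra_ss` is always nonnegative and also equals the drop in residual sum of squares from model q to model p.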

Note: when q = 1, this test reduces to the test in (ii).

Also, we define the following sequential sums of squares,

$SS(\beta_1 \mid \beta_0),\ SS(\beta_2 \mid \beta_0, \beta_1),\ \dots,\ SS(\beta_{p-1} \mid \beta_0, \dots, \beta_{p-2})$,

where $SS(\beta_j \mid \beta_0, \dots, \beta_{j-1}) = SS(\beta_0, \dots, \beta_j) - SS(\beta_0, \dots, \beta_{j-1})$.

Thus,

$SS(\beta_0, \beta_1, \dots, \beta_{p-1}) = SS(\beta_0) + SS(\beta_1 \mid \beta_0) + \cdots + SS(\beta_{p-1} \mid \beta_0, \dots, \beta_{p-2})$

and

$SS(\beta_q, \dots, \beta_{p-1} \mid \beta_0, \dots, \beta_{q-1}) = SS(\beta_q \mid \beta_0, \dots, \beta_{q-1}) + \cdots + SS(\beta_{p-1} \mid \beta_0, \dots, \beta_{p-2})$.

Therefore, the numerator sum of squares of the F statistic above can be obtained by summing the corresponding sequential sums of squares.
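The telescoping property of the sequential sums of squares can be checked numerically. A minimal sketch (assuming NumPy; the helper `reg_ss` and the simulated data are illustrative):

```python
import numpy as np

# Simulate data from a model with p = 4 parameters (including intercept).
rng = np.random.default_rng(3)
n, p = 60, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, -1.0, 0.5, 2.0]) + rng.normal(size=n)

def reg_ss(M, y):
    """Regression SS y_hat' y_hat for the model with design matrix M."""
    b = np.linalg.lstsq(M, y, rcond=None)[0]
    y_hat = M @ b
    return y_hat @ y_hat

# ss[j-1] = SS(beta_0, ..., beta_{j-1}) for the model with the first j columns.
ss = [reg_ss(X[:, :j], y) for j in range(1, p + 1)]

# SS(beta_0), then the sequential SS(beta_j | beta_0, ..., beta_{j-1}).
seq = [ss[0]] + [ss[j] - ss[j - 1] for j in range(1, p)]

# The sequential sums of squares add up to SS(beta_0, ..., beta_{p-1}).
assert np.isclose(sum(seq), ss[-1])
```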

Example:

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \beta_5 x_{i5} + \varepsilon_i,\ i = 1, \dots, n$ (model 6).

Describe how to test $H_0: \beta_4 = \beta_5 = 0$ vs. $H_1$: not both $\beta_4$ and $\beta_5$ are 0.

[solution:]

As $H_0$ is true, the reduced model is

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i$ (model 4).

Let

$X_4$ be the $n \times 4$ matrix consisting of the first 4 columns of $X$, and $b_4 = (X_4' X_4)^{-1} X_4' y$.

Then,

$\hat{y}_4 = X_4 b_4$,

$SS(\beta_0, \beta_1, \beta_2, \beta_3) = \hat{y}_4' \hat{y}_4$,

and

$SS(\beta_4, \beta_5 \mid \beta_0, \beta_1, \beta_2, \beta_3) = \hat{y}' \hat{y} - \hat{y}_4' \hat{y}_4$.

Thus,

$F = \frac{SS(\beta_4, \beta_5 \mid \beta_0, \beta_1, \beta_2, \beta_3)/2}{s^2} \sim F_{2,\, n-6}$ as $H_0$ is true, where $s^2 = \sum_{i=1}^n (y_i - \hat{y}_i)^2/(n-6)$.

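The worked example can be carried out in code. The following sketch (assuming NumPy; the simulated data and names are illustrative) fits model 6 and model 4 and forms the F statistic:

```python
import numpy as np

# Simulate data from model 6 in which beta_4 = beta_5 = 0 actually holds.
rng = np.random.default_rng(4)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 5))])   # model 6 design
beta = np.array([1.0, 0.5, -0.5, 2.0, 0.0, 0.0])             # beta_4 = beta_5 = 0
y = X @ beta + rng.normal(size=n)

def fitted(M, y):
    """Fitted values from a least squares fit of y on the columns of M."""
    return M @ np.linalg.lstsq(M, y, rcond=None)[0]

y_hat = fitted(X, y)          # model 6 (full model)
y_hat4 = fitted(X[:, :4], y)  # model 4 (reduced model)

s2 = np.sum((y - y_hat) ** 2) / (n - 6)
extra_ss = y_hat @ y_hat - y_hat4 @ y_hat4  # SS(beta_4, beta_5 | beta_0,...,beta_3)
F = (extra_ss / 2) / s2                     # compare with F(2, n-6)
```

Rejecting $H_0$ when F exceeds the upper critical value of the $F_{2,\, n-6}$ distribution completes the test.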