252regrex1 9/20/99

MINITAB EXAMPLE

Simple Regression

Explanation: The data set has already been prepared and stored as famdat.mtw. Column 1 (C1) is labeled ‘Y’, C2 is labeled ‘X’, C3 is labeled ‘RESID’ and C4 is labeled ‘PRED’. C1 and C2 contain the data and C3 and C4 are blank. The data set is retrieved, printed out, and plotted. Then the command for simple regression is given. In the command, the words ‘on 1’ indicate that there is only one independent variable, and the column names ‘resid’ and ‘pred’ at the end tell Minitab where to store its results.

The equation of the regression line ($\hat{Y} = b_0 + b_1 X$, where $b_0 = 0.833$ and $b_1 = 0.667$) is printed out. This is followed by a short table that repeats in the ‘Coef’ column the coefficients $b_0$ and $b_1$. The quantities in the ‘Stdev’ column are the two standard deviations $s_{b_0}$ and $s_{b_1}$. In the ‘t-ratio’ column are the two ratios $t = b_0 / s_{b_0}$ and $t = b_1 / s_{b_1}$, which are used to test $H_0: \beta_0 = 0$ and $H_0: \beta_1 = 0$. Finally, in the ‘p’ column are the p-values for the two null hypotheses.

If we assume that $\alpha = 0.05$, since the first p-value is 0.117, which is above the significance level, we do not reject the null hypothesis for $\beta_0$, and thus say that the intercept is not significant. Similarly, since the second p-value, 0.023, is below the significance level, we reject the null hypothesis for $\beta_1$ and say that the slope is significant.
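The coefficient table can be checked by hand. The following is a minimal Python sketch, not part of the Minitab session, that recomputes the ‘Coef’, ‘Stdev’ and ‘t-ratio’ columns from the ten data points using the usual least-squares formulas. The variable names are illustrative. Instead of computing p-values, it compares each t-ratio with 2.306, the two-sided 5% critical value of t with 8 degrees of freedom taken from a standard t table; the 5% level is an assumption.

```python
import math

# Data from the worksheet (columns 'Y' and 'X').
y = [0, 2, 1, 3, 1, 3, 4, 2, 1, 2]
x = [0, 1, 2, 1, 0, 3, 4, 2, 2, 1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)                        # sum of squares of X
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # cross products

# Least-squares slope and intercept ('Coef' column).
b1 = s_xy / ss_x
b0 = y_bar - b1 * x_bar

# Standard error of the regression, from the error sum of squares.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))

# Standard deviations of the coefficients ('Stdev' column).
s_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / ss_x)
s_b1 = s_e / math.sqrt(ss_x)

# t-ratios for H0: beta0 = 0 and H0: beta1 = 0.
t_b0 = b0 / s_b0
t_b1 = b1 / s_b1

# Assumed 5% two-sided test: critical value of t with n - 2 = 8 df (t table).
T_CRIT = 2.306

print(f"Constant  {b0:.4f}  {s_b0:.4f}  {t_b0:.2f}  reject H0: {abs(t_b0) > T_CRIT}")
print(f"X         {b1:.4f}  {s_b1:.4f}  {t_b1:.2f}  reject H0: {abs(t_b1) > T_CRIT}")
```

Comparing |t| with the critical value leads to the same decisions as comparing the p-values with the significance level: only the slope is significant.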

‘s’ is the standard error $s_e = \sqrt{MSE}$, which is used in computing t-tests and confidence intervals. ‘R-sq’ is, of course, $R^2$.
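As a rough check on this line of the output, the sketch below recomputes s, R-sq and R-sq(adj) from the data in Python. It assumes the standard definitions (s is the square root of the error mean square, R-sq is the regression sum of squares over the total sum of squares, and the adjusted version compares mean squares rather than sums of squares). The variable names are illustrative.

```python
import math

# Data from the worksheet (columns 'Y' and 'X').
y = [0, 2, 1, 3, 1, 3, 4, 2, 1, 2]
x = [0, 1, 2, 1, 0, 3, 4, 2, 2, 1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
b0 = y_bar - b1 * x_bar

# Total and error sums of squares; the regression sum of squares is the difference.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ssr = sst - sse

s_e = math.sqrt(sse / (n - 2))                         # 's' in the output
r_sq = ssr / sst                                       # 'R-sq'
r_sq_adj = 1 - (sse / (n - 2)) / (sst / (n - 1))       # 'R-sq(adj)'

print(f"s = {s_e:.4f}  R-sq = {100 * r_sq:.1f}%  R-sq(adj) = {100 * r_sq_adj:.1f}%")
```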

An analysis of variance table now appears. Note that the value of $F$ that is shown is the square of the t-ratio used to test the hypothesis that the slope is zero, and that the p-values are the same. This is because the hypothesis that the slope is zero and the hypothesis tested by the analysis of variance are equivalent in simple regression.
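The relation between the F statistic and the slope's t-ratio can be confirmed numerically. The Python sketch below is an illustration with made-up variable names, not Minitab code. It builds F as the regression mean square over the error mean square and compares it with the square of the t-ratio for the slope, both computed from the unrounded data.

```python
import math

# Data from the worksheet (columns 'Y' and 'X').
y = [0, 2, 1, 3, 1, 3, 4, 2, 1, 2]
x = [0, 1, 2, 1, 0, 3, 4, 2, 2, 1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
b0 = y_bar - b1 * x_bar

# Analysis of variance quantities.
ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)   # regression SS, 1 df
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # error SS, n - 2 df
msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

# t-ratio for the slope, as in the coefficient table.
t_slope = b1 / (math.sqrt(mse) / math.sqrt(ss_x))

print(f"F = {f_stat:.2f}, t squared = {t_slope ** 2:.2f}")
```

Squaring the rounded t-ratio printed in the output (2.81) gives a slightly different figure than the printed F; the two agree exactly only before rounding.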

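A printout of the original data follows with two new columns added. The column labeled ‘RESID’ contains standardized residuals (each residual divided by its estimated standard deviation) rather than the raw residuals $Y - \hat{Y}$, and can be ignored. The column labeled ‘PRED’ is the value of Y ($\hat{Y}$) predicted by the equation $\hat{Y} = 0.8333 + 0.6667X$, using the values of ‘X’.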
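The two stored columns can be reproduced from the data. The Python sketch below computes the predicted values and, on the assumption that ‘RESID’ holds standardized residuals of the usual form $e_i / (s_e\sqrt{1 - h_i})$ with leverage $h_i = 1/n + (x_i - \bar{x})^2 / \sum (x - \bar{x})^2$, the residual column as well. The variable names are illustrative.

```python
import math

# Data from the worksheet (columns 'Y' and 'X').
y = [0, 2, 1, 3, 1, 3, 4, 2, 1, 2]
x = [0, 1, 2, 1, 0, 3, 4, 2, 2, 1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_x
b0 = y_bar - b1 * x_bar

# 'PRED': fitted values from the regression equation.
pred = [b0 + b1 * xi for xi in x]

# Raw residuals and the standard error of the regression.
resid = [yi - pi for yi, pi in zip(y, pred)]
s_e = math.sqrt(sum(e ** 2 for e in resid) / (n - 2))

# Leverage of each observation, then the standardized residuals
# (assumed definition; see the paragraph above the code).
leverage = [1 / n + (xi - x_bar) ** 2 / ss_x for xi in x]
std_resid = [e / (s_e * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]

for row, (yi, xi, r, p) in enumerate(zip(y, x, std_resid, pred), start=1):
    print(f"{row:3d} {yi:3d} {xi:3d} {r:10.5f} {p:9.5f}")
```

Under this assumption the computed values agree with the ‘RESID’ and ‘PRED’ columns in the printout to the digits shown.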

The next graph shows the values of ‘PRED’ plotted against ‘X’. These points lie on the regression line. The final graph shows both the actual and predicted values plotted on the same axes, so that the predicted points can be connected to show the regression line alongside the actual points. The actual commands used here, which produce a color plot, are explained in the MINITAB Handbook under “High Resolution Graphics.”

Minitab Output:

Worksheet size: 100000 cells

MTB > Retrieve 'C:\MINITAB\FAMDAT.MTW'.

Retrieving worksheet from file: C:\MINITAB\FAMDAT.MTW

Worksheet was saved on 4/ 1/1998

MTB > print 'y''x'

Data Display

Row   Y   X
  1   0   0
  2   2   1
  3   1   2
  4   3   1
  5   1   0
  6   3   3
  7   4   4
  8   2   2
  9   1   2
 10   2   1

MTB > plot 'y'*'x'

MTB > regress 'y' on 1 'x' 'resid''pred'

Regression Analysis

The regression equation is

Y = 0.833 + 0.667 X

Predictor      Coef     Stdev   t-ratio       p
Constant     0.8333    0.4751      1.75   0.117
X            0.6667    0.2375      2.81   0.023

s = 0.9014 R-sq = 49.6% R-sq(adj) = 43.3%

Analysis of Variance

SOURCE       DF        SS        MS      F       p
Regression    1    6.4000    6.4000   7.88   0.023
Error         8    6.5000    0.8125
Total         9   12.9000

MTB > print 'y''x''resid''pred'

Data Display

Row   Y   X      RESID      PRED
  1   0   0   -1.08786   0.83333
  2   2   1    0.59300   1.50000
  3   1   2   -1.37281   2.16667
  4   3   1    1.77900   1.50000
  5   1   0    0.21757   0.83333
  6   3   3    0.21155   2.83333
  7   4   4    0.78446   3.50000
  8   2   2   -0.19612   2.16667
  9   1   2   -1.37281   2.16667
 10   2   1    0.59300   1.50000

MTB > plot 'pred'*'x'

MTB > plot 'y'*'x' 'pred'*'x';

SUBC> symbol;

SUBC> type 3 1;

SUBC> color 8 9;

SUBC> overlay.

MTB > stop