14.5.5 Correlation and Regression with Minitab

During this tutorial you will learn how to use Minitab to investigate the association between two continuous variables and how to describe it graphically.

In this practical session you will analyse some bivariate data.

1 To enter the data:

File / Open Worksheet / Merlin1 (saved in the ANOVA worksheet, 14.5.1)

To see what this file contains, open the Information window: Window / Info. This window gives details of the data stored in the file merlin4.mtw, which you are going to analyse.

We shall give each of the cases a salary, in £’000, and then see if it is associated with their age.

In the first empty column, name the variable Salary and type the following values, in £’000, into that one column. (Work down these columns one after the other.)

38.1 / 38.9 / 23.2 / 22.9 / 19.8 / 19.7 / 15.6 / 31.7 / 17.3 / 37.8
18.7 / 42.8 / 19.6 / 47.5 / 31.3 / 8.5 / 28.5 / 14.1 / 33.5 / 32.9
42.3 / 60.1 / 15.5 / 15.8 / 59.3 / 15.9 / 37.3 / 20.3 / 13.7 / 9.8
25.9 / 60.7 / 20.7 / 35.9 / 33.8 / 39.3 / 32.9 / 19.8 / 6.4 / 15.2
53.6 / 75.2 / 40.2 / 25.3 / 24.5 / 14.5 / 23.9 / 28.2 / 35.3 / 8.7
37.6 / 10.9 / 28.5 / 63.2 / 32.0 / 10.2 / 8.6 / 25.3 / 12.5 / 38.4

The variables of interest in this practical session are the continuous variables SALARY and AGE. We shall investigate the relationship between the Salaries earned by the employees of Merlin and their ages.

2 Save the revised worksheet as Merlin5.mtw

3 Produce a Scatterplot to see whether there appears to be a relationship.

Graph / Scatterplot / Simple: select SALARY for Y and AGE for X.
  • Examine the plot. Does it suggest a linear relationship? ......

4 Calculate the correlation coefficient.

Stat / Basic Statistics / Correlation: select SALARY and AGE as the variables.

  • What is the value of the correlation coefficient? ......
  • What is the p-value for the test that the true correlation is zero? ......
  • Is this significant at 5%? ......

If the p-value is less than 0.05, the correlation coefficient is significant at the 5% level of significance.
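To see what lies behind the figure Minitab reports, the correlation coefficient can be sketched by hand in Python. This is only an illustration, not Minitab's own code: the function names are invented here, and the five ages and salaries are made-up values, not the Merlin data. The t statistic shown is the usual one for testing a zero correlation; the p-value is obtained by referring it to a t distribution with n - 2 degrees of freedom.

```python
import math

def pearson_r(x, y):
    """Return the sample Pearson correlation coefficient of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing zero correlation (n - 2 degrees of freedom).

    Not defined when r is exactly +1 or -1.
    """
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical illustration values, NOT the Merlin worksheet data.
age = [25, 30, 35, 40, 45]
salary = [18.0, 22.0, 27.0, 30.0, 38.0]

r = pearson_r(age, salary)
t = t_statistic(r, len(age))
print(f"r = {r:.3f}, t = {t:.2f} on {len(age) - 2} degrees of freedom")
```

A large absolute value of t corresponds to a small p-value, which is the quantity compared with 0.05 above.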

5 Find the regression equation:

Stat / Regression / Regression: select SALARY for Response, AGE for Predictors.

  • Write down the regression equation ......
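The regression equation has the form SALARY = a + b × AGE, where a and b are the least-squares intercept and slope. A minimal Python sketch of that calculation follows. The function name fit_line and the data values are assumptions made for illustration only; they are not the Merlin data and this is not how Minitab is implemented.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical illustration values, NOT the Merlin worksheet data.
age = [25, 30, 35, 40, 45]
salary = [18.0, 22.0, 27.0, 30.0, 38.0]

intercept, slope = fit_line(age, salary)
print(f"SALARY = {intercept:.2f} + {slope:.3f} AGE")
```

The slope is the estimated change in salary (in £’000) for each additional year of age.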

The last output was for the default setting. Minitab can calculate and store the fitted values and the standardised residuals for each observation.

Edit / Edit Last Dialog: the last dialogue box reopens.

Under Storage select Residuals, Standardised residuals and Fits.

Click OK and check that three new columns have been added to your worksheet.
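The three stored columns can be understood through the following Python sketch. It assumes that a standardised residual is the residual divided by its estimated standard deviation, s × sqrt(1 - h), where h is the leverage of the point; check Minitab's own help if you need the exact definition it uses. The function name and the data are hypothetical, not the Merlin values.

```python
import math

def fits_and_residuals(x, y):
    """Return (fits, residuals, standardised residuals) for a simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Fitted value on the line for each x, and observed minus fitted.
    fits = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fits)]

    # Residual standard deviation s, on n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

    # Leverage of each point (assumed definition for simple regression).
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    standardised = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]
    return fits, residuals, standardised

# Hypothetical illustration values, NOT the Merlin worksheet data.
age = [25, 30, 35, 40, 45]
salary = [18.0, 22.0, 27.0, 30.0, 38.0]

fits, residuals, standardised = fits_and_residuals(age, salary)
for a, f, e, z in zip(age, fits, residuals, standardised):
    print(f"age {a}: fit {f:.2f}, residual {e:.2f}, standardised {z:.2f}")
```

Standardised residuals with an absolute value much above 2 are commonly taken as a sign of an unusual observation.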

6 Save this altered version of your file at this stage under a new name, Merlin6:

File / Save Worksheet as Merlin6

7 Investigate the fitted values:

Graph / Scatterplot: select FITS1 as the Y-variable and AGE as the X-variable.

This produces a straight line as all the fitted values lie on the regression line.

To show this line on the scatterplot of the original data, use a scatterplot with a fitted regression line:

Graph / Scatterplot / With Regression: plot SALARY against AGE as before.

8 To print this graph, which will not appear in your Session file:

File / Print Graph

9 To predict a Y-value for a given X-value, for example for a 40 year old:

Stat / Regression / Regression

Select SALARY and AGE as before and then select Options.

Type 40 in the box labelled Prediction intervals for new observations.

The output gives the predicted value for y when x is 40, together with a confidence interval and a prediction interval for this predicted y-value.

  • What is the predicted salary of a 40 year old employee?......
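The difference between the two intervals can be seen in a short Python sketch. The confidence interval is for the mean salary of all 40 year olds, whereas the prediction interval is for the salary of one individual 40 year old and is therefore wider. The function name predict is invented for illustration and the data are made up, not the Merlin values. The multiplier 3.182 is the tabulated two-sided 5% t value for 3 degrees of freedom, which suits this five-point example only.

```python
import math

def predict(x, y, x0, t_crit):
    """Return (prediction, confidence interval, prediction interval) at x0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Point prediction from the fitted line.
    y_hat = intercept + slope * x0

    # Residual standard deviation s, on n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # Standard errors: mean response at x0, and a single new observation at x0.
    h0 = 1 / n + (x0 - mean_x) ** 2 / sxx
    se_fit = s * math.sqrt(h0)
    se_pred = s * math.sqrt(1 + h0)

    ci = (y_hat - t_crit * se_fit, y_hat + t_crit * se_fit)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    return y_hat, ci, pi

# Hypothetical illustration values, NOT the Merlin worksheet data.
age = [25, 30, 35, 40, 45]
salary = [18.0, 22.0, 27.0, 30.0, 38.0]

# 3.182 is the 5% two-sided t value for n - 2 = 3 degrees of freedom.
y_hat, ci, pi = predict(age, salary, 40, 3.182)
print(f"predicted salary: {y_hat:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

For the Merlin data the degrees of freedom, and so the t multiplier, will be different; Minitab looks up the appropriate value itself.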

10 To produce the regression line with its confidence interval and prediction interval:

Stat / Regression / Fitted Line Plot: select SALARY and AGE as before. Under Options, choose to display the confidence and prediction bands. Print this window as before.

11 Print your session and/or graphs if required.
