Performing “Canned” Regressions in Excel

Here are instructions on how to perform a regression using the Excel Add-in Data Analysis Toolpak. It uses example 3.2 from Chapter 3, page 61. Work through this example and print out your results. Your results should be identical to the last screen shot on this handout.

1) First, launch the Excel program. Once you are in Excel, open up your data set by going to the File Menu and choosing Open. Note: do not try to open Excel by double-clicking on your data file. This may not launch Excel if the data are a text file or some other non-Excel format. Alternatively, if you have a small data set, like below, you can enter the data by hand.

You should have your data set arranged into columns. It is best to have variable names in the first row and data starting in the second row. Each column corresponds to a different variable.

2) Go to the Tools Menu, and look for Data Analysis…. If it is there, choose it and proceed to the next step. If it is NOT there, from the Tools Menu, Choose Add-Ins. Check the first item Analysis ToolPak. Hit OK (don’t uncheck anything that might already be checked; just make sure that Analysis ToolPak is chosen.) Excel will run through some sort of installation process. Hopefully it won’t ask for your original MS Excel disk/CD – it didn’t when I installed it. (I have been told that all campus computers have the Data Analysis ToolPak.)

3) In the Tools Menu, choose Data Analysis. Scroll down this dialog box to Regression and choose it. You should see the following on your screen. Note that I have inputted the data on X and Y in the upper left corner. This is Exercise 3.2 from Chapter 3.

4) You need to tell Excel what to use for data on the dependent variable (Y) and what to use for data on the independent variable (X). Enter the cell ranges that contain the appropriate data, including the first cell that has the variable name. For the data in the example above, the Y data are in cells $B$1:$B$6. The X data are in cells $A$1:$A$6. We also want to check the following options in the Regression dialog box:

Labels (note: this tells Excel that the contests of cell A1 is a label, not data)

New Worksheet Ply (note: this tells Excel to put results in a new sheet)

residuals

Line Fit Plots

Hit OK.

6) Excel then performs the regression and puts the results in a new spreadsheet contained in your workbook. Look at the bottom for the Sheet 1, Sheet 2 and Sheet 3 tabs. Your original data should be in Sheet 1 and the regression results in one of the other sheets.

7) Before you print out the sheet containing your results, re-size it by selecting all columns and choosing Format, Column, Autofit Selection

I resized the scatterplot. I also right-clicked on Predicted Y values, and choose Format Data Series.. From the Patterns Tab, where it says Line, choose Automatic. It will then connect the dots giving you a graph of the estimated regression line.

Excel will print out a residual for every observation because we choose this option in the Regression dialog box. If you have a large data set (more than 30 observations) don’t print all of the residuals out (never hand in pages and pages of data values! – it is a waste of paper.)

Print out:

Before printing out enter your Name in Cell B1. (Failure to do this will result in points lost).

In order to print out only the relevant information for the above regression:

1) highlight from cell A1 to cell J32, got to File Menu, Print Area, Set Print Area, Choose Set Print Area.

2) Go to File, Page Setup, Choose Landscape and Fit to 1 page wide by 1 page tall.