Stat 301 B – Lab 7
Goals: In this lab, we will see how to:
- fit multiple regression models with interaction or quadratic terms
- construct the model comparison F test for two nested models
I include information about some potentially useful JMP Fit Model options. These are optional.
We will use the grandfather clocks data set (gfclocks.txt) to illustrate all of these.
Interaction or quadratic terms in a multiple regression model:
There are two ways to fit a model with an interaction or quadratic term:
- create a variable with the product or square of the appropriate X variable(s) by hand
- have JMP create that product or square “on the fly”.
Both give the same results, so long as “on the fly” is done with some care. Here’s how to do each:
1. Creating a new X variable: This should be familiar by now. Select the data set to make it the active window. Right click on a blank column and select Column Info. Type in an appropriate column name, then select Column Properties / Edit Formula. Use the buttons to write the desired formula, AGE × NUMBIDS. Then click the OK buttons. You will get a new column with the product. You should see that the values are identical to those in the pre-existing AGE-BID column. Use that product variable in a multiple regression model in the obvious way: just add it to the Construct Model Effects box in the Analyze / Fit Model dialog.
Quadratic variables (e.g., AGE²) are constructed as AGE*AGE or with the power button (x^y), which gives you squares (y = 2) by default.
2. Creating an interaction or quadratic term “on the fly”: The Fit Model dialog allows you to use “created” variables in a regression model without creating a new named variable. This is done with the Cross button in the Analyze / Fit Model dialog. Set up the linear effects in the model as before, by putting PRICE in the Y variable box and Adding AGE and NUMBIDS to the Construct Model Effects box. To include an interaction term, select both AGE and NUMBIDS in the Select Columns box, then left-click Cross. An AGE*NUMBIDS term will be added to the Construct Model Effects box, which will look like:
To construct a quadratic term, select AGE in the Select Columns box and AGE in the Construct Model Effects box, then click Cross. The Model Specification dialog with the two terms selected and the result included as a model term looks like:
The AGE*AGE term is the square of AGE. If you also wanted NUMBIDS², you would repeat this with NUMBIDS.
IMPORTANT NOTE: To recreate class results (and to match the results you get when you create the variables by hand), you need to do one more thing anytime you create variables “on the fly”: Click the red triangle by Model Specification and unclick Center Polynomials.
We will talk in lecture later about centering polynomials. For now, it is a distraction.
If the parameter estimates box includes terms like: (AGE-144.938)*(AGE-144.938), you didn’t turn off polynomial centering. Your results will not match mine. Recreate and rerun the model with Center Polynomials unclicked.
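If you are curious why centered output looks different but is not "wrong", here is a minimal Python sketch (not JMP, and not needed for the lab). The data are simulated, not gfclocks.txt, and all variable names are my own. It assumes that centering means subtracting each variable's mean before forming the product or square, which is what the (AGE-144.938) terms suggest.

```python
import numpy as np

# Simulated data for illustration only (NOT the gfclocks.txt values).
rng = np.random.default_rng(0)
n = 32
age = rng.uniform(100, 200, n)
numbids = rng.integers(5, 16, n).astype(float)
price = (-500 + 10 * age + 80 * numbids + 1.2 * age * numbids
         + rng.normal(0, 100, n))


def fit(X, y):
    """Least squares fit; returns the coefficients and the Error SS."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


ones = np.ones(n)

# Uncentered terms: what you get from hand-made columns,
# or "on the fly" with Center Polynomials turned off.
X_raw = np.column_stack([ones, age, numbids, age * numbids, age * age])

# Centered terms (assumption: each variable is centered at its mean).
# The linear terms stay as they are; only the product and square change.
a_c = age - age.mean()
b_c = numbids - numbids.mean()
X_cen = np.column_stack([ones, age, numbids, a_c * b_c, a_c * a_c])

beta_raw, sse_raw = fit(X_raw, price)
beta_cen, sse_cen = fit(X_cen, price)

print("Error SS, uncentered:", sse_raw)
print("Error SS, centered:  ", sse_cen)
print("coefficients, uncentered:", beta_raw)
print("coefficients, centered:  ", beta_cen)
```

In this sketch the two fits have the same Error SS and the same coefficients for the product and squared terms. Only the intercept and the linear coefficients change, which is why centered output will not match uncentered output line by line.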
Model comparison F test for two nested models:
JMP gives you the “all variables” vs “only the intercept” model comparison automatically (the Analysis of Variance table in the JMP output). Other model comparisons can be obtained in either of two ways.
- Fit each model separately (two runs of Analyze / Fit Model). Look at the Analysis of Variance result for each model and extract the Error DF and the Error Sum of Squares. Calculate the F statistic by hand. The p-value has to be obtained from printed tables.
- JMP can test arbitrarily complex null hypotheses about parameters. This includes the null hypotheses that correspond to model comparisons. My example will test whether both quadratic terms are needed in the model:
Price = β0 + β1 Age + β2 Numbids + β3 Age×Numbids + β4 Age² + β5 Numbids²
i.e., the null hypothesis that β4=0 and β5=0
Fit the full model (the 6 parameter model), then click the red triangle by Response PRICE, select Estimates / Custom Test. You should get a dialog box looking like:
This allows you to specify each component of the null hypothesis (here, β4 = 0 and β5 = 0). To see how to specify this to JMP, remember that the null hypothesis we want to test is exactly the same as: 1×β4 = 0 and 1×β5 = 0. There are two pieces to this null hypothesis. The first concerns β4; the second concerns β5. We have to specify each piece.
Each piece is specified to JMP by entering the coefficient (1) adjacent to the appropriate parameter and the resulting value (0) as the result. Click on the number next to each parameter and enter the desired value. We only have to change the AGE*AGE coefficient because the default result (the value by the =) is 0. The first piece, β4=0, is the following in the Custom Test dialog:
Since we have two pieces to the null hypothesis, click Add Column to get a second column in which we can enter the second piece. When done, the Custom Test dialog should look like:
Click Done to run the test. The output is on the next page.
The two columns are information about each piece separately. In this case (since the coefficients are 1 for each piece), that is the same information available from the parameter estimates box. The output we want is in the box at the bottom of the window. Sum of Squares is the change in SS Error between the two models; Numerator DF is the change in DF Error between the two models. JMP goes directly to the F statistic and gives you the p-value. Here, my conclusion would be no evidence of a quadratic relationship for either age or numbids. Practically, I would then omit those two variables from my model.
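The same F statistic can be computed by hand from the Error lines of the two Analysis of Variance tables. A minimal Python sketch follows. The function name and the numbers in the example call are placeholders of my own, not values from the gfclocks output.

```python
def nested_f(sse_reduced, df_reduced, sse_full, df_full):
    """Model comparison F statistic for two nested models.

    sse_* are the Error Sums of Squares and df_* the Error DF,
    each read from that model's Analysis of Variance table.
    """
    num_df = df_reduced - df_full      # change in DF Error
    if num_df <= 0:
        raise ValueError("the reduced model must have more Error DF")
    num_ss = sse_reduced - sse_full    # change in SS Error
    f_stat = (num_ss / num_df) / (sse_full / df_full)
    return f_stat, num_df, df_full


# Placeholder numbers, chosen only to make the arithmetic easy to follow.
f_stat, num_df, den_df = nested_f(
    sse_reduced=1200.0, df_reduced=28, sse_full=1000.0, df_full=26
)
print(f_stat, num_df, den_df)
```

With these placeholder inputs the numerator is (1200 - 1000) / 2 = 100 and the denominator is 1000 / 26, so F = 2.6 on 2 and 26 df. If SciPy happens to be installed, scipy.stats.f.sf(f_stat, num_df, den_df) gives the p-value in place of a printed table.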
Note of caution: JMP does exactly what you tell it to do. It can’t read your mind. If you put a 1 in the wrong place, you will get the wrong results because JMP is testing the wrong hypothesis. In particular, it is easy to put two 1’s in the same column, instead of in two different columns. If you put two 1’s in the same column, you are asking JMP to test the hypothesis:
1×β4 + 1×β5 = 0. Very different!
If you aren’t sure that you’ve specified your test correctly, I suggest two checks:
1) Does the test have the correct d.f.? If there are two pieces in your null hypothesis, the test should have numerator df = 2. Alternatively, the numerator df should equal the number of equal signs in your null hypothesis.
2) Does the test have the correct SS? Check by fitting the two models you want (so the models being compared are clear), then hand computing the change in SS. If that’s correct, the rest of the output is almost certainly correct.
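Both checks can be tried outside JMP. The Python sketch below uses simulated data (not gfclocks.txt) and the textbook general linear hypothesis formula, SS = (Lb)' [L (X'X)^-1 L']^-1 (Lb), where each column of the Custom Test dialog becomes one row of L. I am assuming this matches what JMP computes. The function and variable names are my own.

```python
import numpy as np

# Simulated data for illustration only (NOT the gfclocks.txt values).
rng = np.random.default_rng(1)
n = 32
age = rng.uniform(100, 200, n)
numbids = rng.integers(5, 16, n).astype(float)
price = (-500 + 10 * age + 80 * numbids + 1.2 * age * numbids
         + rng.normal(0, 100, n))

ones = np.ones(n)
X_reduced = np.column_stack([ones, age, numbids, age * numbids])
X_full = np.column_stack([X_reduced, age ** 2, numbids ** 2])


def error_ss(X, y):
    """Error SS from a least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def custom_test(X, y, L):
    """Test H0: L @ beta = 0. Returns (SS, numerator DF, F)."""
    L = np.atleast_2d(np.asarray(L, dtype=float))
    X_pinv = np.linalg.pinv(X)
    b = X_pinv @ y                      # parameter estimates
    xtx_inv = X_pinv @ X_pinv.T         # equals (X'X)^-1 for full-rank X
    resid = y - X @ b
    mse = float(resid @ resid) / (X.shape[0] - X.shape[1])
    est = L @ b
    ss = float(est @ np.linalg.solve(L @ xtx_inv @ L.T, est))
    q = L.shape[0]                      # one DF per piece of the hypothesis
    return ss, q, ss / q / mse


# Correct: two pieces, beta4 = 0 and beta5 = 0 (two columns in JMP).
ss_two, df_two, f_two = custom_test(
    X_full, price, [[0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1]]
)

# Mistake: both 1's in one column, which tests beta4 + beta5 = 0.
ss_one, df_one, f_one = custom_test(X_full, price, [[0, 0, 0, 0, 1, 1]])

change_in_ss = error_ss(X_reduced, price) - error_ss(X_full, price)
print("two pieces: SS =", ss_two, "DF =", df_two, "F =", f_two)
print("one piece:  SS =", ss_one, "DF =", df_one, "F =", f_one)
print("change in SS Error from two separate fits:", change_in_ss)
```

The two-piece test has numerator DF = 2 and its SS agrees with the hand-computed change in SS Error. The one-column version has DF = 1 and a different SS, so either check would catch the mistake.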
Some potentially useful JMP Fit Model shortcuts:
- By default, the Model Specification dialog box closes when you run a model. You can keep it around by clicking the “Keep dialog open” box below the Run button. This is very helpful when you plan to fit various models to the same data set.
- If you decide to transform a variable (e.g., after looking at the residual vs. predicted value plot), you can do that “on the fly” in the Model Specification dialog. Select the variable to be transformed (Y or one or more of the X variables), click the red triangle by “Transform” and select the transformation.
- If you have a lot of variables in the model and you want to create all (or some) pairs of interaction variables, JMP can do this for you. Select the desired X variables in the “Select Columns” box, then left click on Macros. Choose Factorial to Degree from the drop-down menu. The default degree is 2 (box below Macros). This puts all variables and all products of variables into the Model Effects box. If you want all variables and some interactions, select all the variables and Add them to the Construct Model Effects box. Then select the variables that participate in interactions and click Macros / Factorial to Degree. The desired interactions are added to the model box. (Technically, the selected variables are added a second time, then removed because they are redundant.)
Note: This does not create quadratic terms, just the products.
- If you want JMP to create a bunch of quadratic terms for you, select the desired variables (from the Select Columns box), then click Macros / Polynomial to Degree. JMP adds squared terms for each selected variable to the Model Effects box.
Note: This does not create products, just the quadratic terms.
- If you want both quadratics and products, i.e., a full quadratic model, select the desired variables and click Macros / Response Surface. All products and quadratics are added to the Model Effects box. The desired variables change to NAME &RS, which turns on additional results options that are usually wanted for optimization (remember the mention of response surface models in lecture). Ask me if you want to know more (completely optional!).
- If you want to delete a term from the Model Effects box, select it, then click Remove.
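To keep the three macros straight, here is a small Python sketch of the terms each one produces at degree 2. The function names are my own, the order of terms in JMP's Model Effects box may differ, and I am assuming that Polynomial to Degree lists the variables themselves along with their squares.

```python
from itertools import combinations


def factorial_to_degree_2(names):
    """The variables plus all pairwise products (no squares)."""
    return list(names) + [f"{a}*{b}" for a, b in combinations(names, 2)]


def polynomial_to_degree_2(names):
    """The variables plus their squares (no products)."""
    return list(names) + [f"{a}*{a}" for a in names]


def response_surface(names):
    """Full quadratic model: variables, pairwise products, and squares."""
    return factorial_to_degree_2(names) + [f"{a}*{a}" for a in names]


x_vars = ["AGE", "NUMBIDS"]
print(factorial_to_degree_2(x_vars))
print(polynomial_to_degree_2(x_vars))
print(response_surface(x_vars))
```

For AGE and NUMBIDS, the last list is the full quadratic model used in the Custom Test example: AGE, NUMBIDS, AGE*NUMBIDS, AGE*AGE, NUMBIDS*NUMBIDS.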