Multinomial Logistic Regression

As with binomial logistic regression, this technique is employed to predict a categorical variable from a collection of continuous and/or categorical predictors. Unlike with binomial logistic regression, there are more than two levels of the predicted categorical variable.

In the summer of 2014 my colleagues and I received feedback on a manuscript we had submitted to a scholarly journal. The categorical variable being predicted was the status of engineering students here at ECU – they were classified as still being in the program, having left the program in good status, or having left the program in poor status. One of my coauthors had used a discriminant function analysis, but one of the reviewers suggested using a multinomial logistic regression instead, to avoid the restrictive assumptions associated with a discriminant function analysis. So, I taught myself how to do a multinomial logistic regression, with some help from a colleague in biostatistics. Since the data were in SPSS format, I employed SPSS.

Below I present the multinomial logistic analysis recommended by one of our reviewers. Although I have done it in a sequential fashion, for pedagogical purposes, we reported a simultaneous analysis (all the variables thrown in at once, that is, the last step shown below). All of the predictor variables were continuous. To make it easier to compare predictors’ relative importance, I standardized them all to mean 0, standard deviation 1.
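Standardizing replaces each raw score with a z score, z = (score - mean) / standard deviation. The following is a minimal sketch in Python, not the SPSS procedure actually used. The helper name `standardize` and the five raw scores are made up for illustration, and the use of the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev

def standardize(scores):
    """Return z scores: (x - mean) / sample standard deviation."""
    m = mean(scores)
    s = stdev(scores)  # sample SD (n - 1 denominator); an assumption
    return [(x - m) / s for x in scores]

# Made-up math SAT scores, for illustration only (not the study data).
msat = [410, 520, 560, 600, 780]
zmsat = standardize(msat)
print([round(z, 2) for z in zmsat])

# Using the mean and SD reported for MSAT in the Descriptive Statistics table,
# a raw math SAT score of 690 corresponds to this z score:
z_690 = (690 - 565.47) / 62.174
print(round(z_690, 2))
```

After standardizing, a B weight describes the change in log odds per one standard deviation of the predictor, which is what makes the weights comparable across predictors measured on different scales.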

MSAT is score on the math SAT. VSAT is score on the verbal SAT. HSGPA is high school GPA. ALEKS is score on a mathematics assessment test designed to test a college student’s readiness to take courses that require mastery of mathematics. LOC is locus of control, with high scores representing an external locus of control. The NEO predictors are scores on a Big Five personality test: Openness, Conscientiousness, Extroversion, Agreeableness, and Neuroticism.

Descriptive Statistics
Variable / N / Minimum / Maximum / Mean / Std. Deviation
MSAT / 256 / 410 / 780 / 565.47 / 62.174
VSAT / 256 / 350 / 670 / 492.93 / 59.728
HSGPA / 256 / 2.22 / 4.00 / 3.1167 / .34986
ALEKS / 256 / 17 / 97 / 53.74 / 18.985
LOC / 256 / 0 / 36 / 13.79 / 5.950
NEOOpen / 256 / 11 / 50 / 26.83 / 5.663
NEOC / 256 / 14 / 49 / 31.57 / 6.649
NEOE / 256 / 10 / 46 / 30.68 / 5.946
NEOA / 256 / 12 / 43 / 28.73 / 5.460
NEON / 256 / 6 / 53 / 25.31 / 11.284
Valid N (listwise) / 256

First I entered the Big Five predictors as a set. In SPSS, the menu path is Analyze, Regression, Multinomial Logistic.

Case Processing Summary
N / Marginal Percentage
groups / Poor / 68 / 26.6%
Good / 85 / 33.2%
Stay / 103 / 40.2%
Valid / 256 / 100.0%
Missing / 0
Total / 256
Subpopulation / 256a
a. The dependent variable has only one value observed in 256 (100.0%) subpopulations.
Model Fitting Information
Model / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood / Chi-Square / df / Sig.
Intercept Only / 555.273
Final / 522.381 / 32.892 / 10 / .000

Using these predictors significantly improved the model (compared to a model based only on the differences in group sample sizes).
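The chi-square in the Model Fitting Information table is the drop in -2 log likelihood from the intercept-only model to the final model. The sketch below reproduces it and its p value in Python. The function name `chi2_sf_even_df` is hypothetical, and it relies on the closed form of the chi-square upper tail that holds only when the degrees of freedom are even (true here, since every predictor contributes 2 df).

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail p value of a chi-square statistic when df is even.

    For df = 2k the tail probability is
    exp(-x/2) * sum over i = 0..k-1 of (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )

neg2ll_intercept_only = 555.273
neg2ll_final = 522.381

lr_chi_square = neg2ll_intercept_only - neg2ll_final
df = 5 * 2  # 5 predictors, each with (3 groups - 1) = 2 B weights
p = chi2_sf_even_df(lr_chi_square, df)

print(f"chi-square = {lr_chi_square:.3f}, df = {df}, p = {p:.6f}")
```

The p value is well below .001, which SPSS prints as .000.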

Pseudo R-Square
Cox and Snell / .121
Nagelkerke / .136
McFadden / .059

These are R-squared-like statistics, but they cannot really be interpreted as proportions of variance. I avoid them, but one of our reviewers wanted them.
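For readers who want to see where the three values come from, here is a sketch in Python using the standard textbook formulas for the Cox and Snell, Nagelkerke, and McFadden statistics. It assumes the printed -2 log likelihoods are full log likelihoods, which appears to hold here because each case forms its own subpopulation. The variable names are illustrative.

```python
import math

n = 256                  # number of cases
neg2ll_null = 555.273    # intercept-only model
neg2ll_model = 522.381   # model with the five NEO predictors

lr = neg2ll_null - neg2ll_model

# Cox and Snell: 1 - (L_null / L_model) ** (2 / n)
cox_snell = 1 - math.exp(-lr / n)

# Nagelkerke: Cox and Snell divided by its maximum possible value
max_cox_snell = 1 - math.exp(-neg2ll_null / n)
nagelkerke = cox_snell / max_cox_snell

# McFadden: 1 - (log L_model / log L_null)
mcfadden = 1 - neg2ll_model / neg2ll_null

print(round(cox_snell, 3), round(nagelkerke, 3), round(mcfadden, 3))
```

All three are functions of the same two -2 log likelihoods, so they rise and fall together across the steps of the analysis.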

Likelihood Ratio Tests
Effect / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood of Reduced Model / Chi-Square / df / Sig.
Intercept / 533.145 / 10.764 / 2 / .005
ZNEOOpen / 523.656 / 1.274 / 2 / .529
ZNEOC / 537.587 / 15.206 / 2 / .000
ZNEOE / 523.370 / .989 / 2 / .610
ZNEOA / 523.208 / .826 / 2 / .662
ZNEON / 527.838 / 5.457 / 2 / .065
The chi-square statistic is the difference in -2 log-likelihoods between the final model and a reduced model. The reduced model is formed by omitting an effect from the final model. The null hypothesis is that all parameters of that effect are 0.

Removing conscientiousness from the model would significantly lower the fit between model and data. Neuroticism falls just short of significance here (but see the parameter estimates below).

Each predictor has k-1 B weights, each one comparing the reference group with one of the other groups. Here I designated the poor group as the reference group.
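With k = 3 groups there are two sets of B weights, and the weights for the reference category (Poor, per the table footnote) are fixed at 0. The sketch below shows how those weights turn into three predicted probabilities. It uses only the intercepts and conscientiousness weights from the first Parameter Estimates table, which amounts to holding the other four predictors at their means (z = 0). The function name `predicted_probabilities` is hypothetical.

```python
import math

# Step-1 weights; each set contrasts a group with the Poor reference group.
weights = {
    "Good": {"intercept": 0.404, "ZNEOC": 0.658},
    "Stay": {"intercept": 0.561, "ZNEOC": 0.741},
}

def predicted_probabilities(z_neoc):
    """Return P(Poor), P(Good), P(Stay) for a conscientiousness z score."""
    # The reference group's weights are all 0, so its term is exp(0) = 1.
    terms = {"Poor": 1.0}
    for group, w in weights.items():
        terms[group] = math.exp(w["intercept"] + w["ZNEOC"] * z_neoc)
    total = sum(terms.values())
    return {group: t / total for group, t in terms.items()}

at_mean = predicted_probabilities(0.0)
one_sd_above = predicted_probabilities(1.0)

for label, probs in (("z = 0", at_mean), ("z = +1", one_sd_above)):
    print(label, {g: round(p, 3) for g, p in probs.items()})
```

At the mean of every predictor the probabilities are roughly .24, .35, and .41 for poor, good, and stay. Moving up one standard deviation in conscientiousness multiplies the stay-to-poor odds by exp(.741), which is the Exp(B) reported in the table.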

Parameter Estimates
groupsa / B / Std. Error / Wald / df / Sig. / Exp(B)
Good / Intercept / .404 / .184 / 4.846 / 1 / .028
ZNEOOpen / -.135 / .187 / .519 / 1 / .471 / .874
ZNEOC / .658 / .213 / 9.562 / 1 / .002 / 1.932
ZNEOE / .078 / .200 / .154 / 1 / .695 / 1.081
ZNEOA / -.030 / .189 / .025 / 1 / .873 / .970
ZNEON / .233 / .214 / 1.185 / 1 / .276 / 1.262
Stay / Intercept / .561 / .179 / 9.791 / 1 / .002
ZNEOOpen / -.208 / .185 / 1.270 / 1 / .260 / .812
ZNEOC / .741 / .211 / 12.372 / 1 / .000 / 2.099
ZNEOE / -.092 / .196 / .221 / 1 / .638 / .912
ZNEOA / .121 / .189 / .410 / 1 / .522 / 1.129
ZNEON / .467 / .211 / 4.893 / 1 / .027 / 1.595

For each one standard deviation increase in conscientiousness, the odds of being in the stay group rather than the poor group more than doubled.

For each one standard deviation increase in conscientiousness, the odds of being in the good group rather than the poor group nearly doubled.

For each one standard deviation increase in neuroticism, the odds of being in the stay group rather than the poor group were multiplied by 1.60.
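The Wald, Sig., and Exp(B) columns can all be recovered from B and its standard error. A small sketch in Python follows. The helper name `wald_summary` is hypothetical, and because the printed B and Std. Error values are rounded to three decimals, the recomputed Wald statistics differ slightly from the SPSS values.

```python
import math

def wald_summary(b, se):
    """Return (Wald chi-square, p value on 1 df, Exp(B)) for one B weight."""
    wald = (b / se) ** 2
    # For 1 df, the chi-square upper tail equals erfc(sqrt(wald / 2)).
    p = math.erfc(math.sqrt(wald / 2.0))
    return wald, p, math.exp(b)

# Step-1 weights for the stay-versus-poor contrast.
wald_c, p_c, or_c = wald_summary(0.741, 0.211)  # conscientiousness
wald_n, p_n, or_n = wald_summary(0.467, 0.211)  # neuroticism

print(f"ZNEOC: Wald = {wald_c:.2f}, p = {p_c:.4f}, Exp(B) = {or_c:.2f}")
print(f"ZNEON: Wald = {wald_n:.2f}, p = {p_n:.4f}, Exp(B) = {or_n:.2f}")
```

Exp(B) is the factor by which the odds change for each one standard deviation increase in the predictor, so a value above 2 is what "more than doubled" refers to.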

Locus of control was added in the next step. Its addition did not significantly improve the model.

Model Fitting Information
Model / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood / Chi-Square / df / Sig.
Intercept Only / 555.273
Final / 520.187 / 35.086 / 12 / .000
Pseudo R-Square
Cox and Snell / .128
Nagelkerke / .145
McFadden / .063
Likelihood Ratio Tests
Effect / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood of Reduced Model / Chi-Square / df / Sig.
Intercept / 531.087 / 10.901 / 2 / .004
ZNEOOpen / 521.362 / 1.175 / 2 / .556
ZNEOC / 536.245 / 16.058 / 2 / .000
ZNEOE / 521.040 / .853 / 2 / .653
ZNEOA / 521.134 / .947 / 2 / .623
ZNEON / 524.591 / 4.405 / 2 / .111
ZLOC / 522.381 / 2.194 / 2 / .334
Parameter Estimates
groupsa / B / Std. Error / Wald / df / Sig. / Exp(B)
Good / Intercept / .403 / .185 / 4.759 / 1 / .029
ZNEOOpen / -.128 / .188 / .459 / 1 / .498 / .880
ZNEOC / .706 / .218 / 10.528 / 1 / .001 / 2.026
ZNEOE / .062 / .200 / .097 / 1 / .755 / 1.064
ZNEOA / -.065 / .193 / .112 / 1 / .738 / .938
ZNEON / .091 / .236 / .148 / 1 / .700 / 1.095
ZLOC / .282 / .198 / 2.035 / 1 / .154 / 1.326
Stay / Intercept / .567 / .180 / 9.946 / 1 / .002
ZNEOOpen / -.201 / .186 / 1.172 / 1 / .279 / .818
ZNEOC / .759 / .214 / 12.605 / 1 / .000 / 2.136
ZNEOE / -.096 / .196 / .240 / 1 / .624 / .909
ZNEOA / .105 / .192 / .299 / 1 / .585 / 1.111
ZNEON / .410 / .230 / 3.160 / 1 / .075 / 1.506
ZLOC / .114 / .192 / .355 / 1 / .551 / 1.121
a. The reference category is: Poor.

On the third step, ALEKS was added to the model.

Model Fitting Information
Model / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood / Chi-Square / df / Sig.
Intercept Only / 555.273
Final / 502.495 / 52.777 / 14 / .000
Pseudo R-Square
Cox and Snell / .186
Nagelkerke / .210
McFadden / .095
Likelihood Ratio Tests
Effect / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood of Reduced Model / Chi-Square / df / Sig.
Intercept / 514.751 / 12.255 / 2 / .002
ZNEOOpen / 502.969 / .473 / 2 / .789
ZNEOC / 517.760 / 15.265 / 2 / .000
ZNEOE / 503.311 / .816 / 2 / .665
ZNEOA / 503.760 / 1.265 / 2 / .531
ZNEON / 505.689 / 3.193 / 2 / .203
ZLOC / 504.877 / 2.382 / 2 / .304
ZALEKS / 520.187 / 17.691 / 2 / .000
Parameter Estimates
groupsa / B / Std. Error / Wald / df / Sig. / Exp(B)
Good / Intercept / .502 / .197 / 6.501 / 1 / .011
ZNEOOpen / -.104 / .191 / .294 / 1 / .587 / .901
ZNEOC / .743 / .222 / 11.184 / 1 / .001 / 2.103
ZNEOE / .081 / .203 / .162 / 1 / .687 / 1.085
ZNEOA / -.075 / .194 / .150 / 1 / .698 / .928
ZNEON / .084 / .239 / .122 / 1 / .727 / 1.087
ZLOC / .290 / .198 / 2.136 / 1 / .144 / 1.337
ZALEKS / .341 / .187 / 3.338 / 1 / .068 / 1.406
Stay / Intercept / .630 / .194 / 10.536 / 1 / .001
ZNEOOpen / -.129 / .193 / .451 / 1 / .502 / .879
ZNEOC / .740 / .220 / 11.271 / 1 / .001 / 2.096
ZNEOE / -.076 / .202 / .140 / 1 / .708 / .927
ZNEOA / .124 / .197 / .395 / 1 / .530 / 1.132
ZNEON / .362 / .238 / 2.314 / 1 / .128 / 1.436
ZLOC / .107 / .198 / .289 / 1 / .591 / 1.112
ZALEKS / .733 / .186 / 15.499 / 1 / .000 / 2.081

Adding ALEKS significantly improved the model. Each increase of one standard deviation in ALEKS was associated with more than a doubling of the odds of being in the stay group rather than the poor group. The effect of ALEKS on the odds of being in the good group rather than the poor group fell just short of statistical significance.
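One way to see the difference between the two ALEKS effects is to put an approximate 95% confidence interval around each odds ratio. The interval is not part of the SPSS output shown here. The sketch below assumes the usual Wald interval, exp(B ± 1.96 × SE), and the helper name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(b, se, z=1.96):
    """Return (Exp(B), lower, upper) for an approximate 95% Wald interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Step-3 ALEKS weights, each contrasted with the Poor reference group.
aleks_stay = odds_ratio_ci(0.733, 0.186)  # stay versus poor
aleks_good = odds_ratio_ci(0.341, 0.187)  # good versus poor

for label, (odds_ratio, lower, upper) in (
    ("stay vs poor", aleks_stay),
    ("good vs poor", aleks_good),
):
    print(f"{label}: Exp(B) = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

The stay-versus-poor interval lies entirely above 1, while the good-versus-poor interval just includes 1, which matches the pattern of the Sig. values (.000 and .068).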

In Step 4 the SAT variables were added.

Model Fitting Information
Model / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood / Chi-Square / df / Sig.
Intercept Only / 555.273
Final / 493.748 / 61.525 / 18 / .000

The chi-square for this step is 502.495 – 493.748 = 8.747 on 18-14 = 4 degrees of freedom. That yields a p value of .068.
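The same subtraction gives the test for each step of the sequential analysis up to this point. The sketch below tabulates them in Python from the Model Fitting Information tables. The function `chi2_sf_even_df` is a hypothetical helper that uses the closed-form chi-square tail for even degrees of freedom, and the step labels are illustrative.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail p value of a chi-square statistic when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )

# (label, -2 log likelihood of the final model, model df) for Steps 1 to 4.
steps = [
    ("Big Five", 522.381, 10),
    ("+ LOC", 520.187, 12),
    ("+ ALEKS", 502.495, 14),
    ("+ MSAT, VSAT", 493.748, 18),
]

step_tests = {}
for (_, prev_ll, prev_df), (label, ll, df) in zip(steps, steps[1:]):
    chi_square = prev_ll - ll   # drop in -2 log likelihood for this step
    step_df = df - prev_df      # number of B weights added in this step
    step_tests[label] = (chi_square, step_df, chi2_sf_even_df(chi_square, step_df))

for label, (chi_square, step_df, p) in step_tests.items():
    print(f"{label}: chi-square = {chi_square:.3f}, df = {step_df}, p = {p:.3f}")
```

When a step adds a single predictor, its chi-square equals that predictor's entry in the Likelihood Ratio Tests table for the same step (2.194 for ZLOC, 17.691 for ZALEKS).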

Pseudo R-Square
Nagelkerke / .241
Likelihood Ratio Tests
Effect / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood of Reduced Model / Chi-Square / df / Sig.
Intercept / 505.474 / 11.726 / 2 / .003
ZNEOOpen / 494.425 / .677 / 2 / .713
ZNEOC / 509.567 / 15.819 / 2 / .000
ZNEOE / 494.480 / .732 / 2 / .693
ZNEOA / 494.550 / .802 / 2 / .670
ZNEON / 496.586 / 2.838 / 2 / .242
ZLOC / 496.634 / 2.886 / 2 / .236
ZALEKS / 504.006 / 10.258 / 2 / .006
ZMSAT / 500.976 / 7.228 / 2 / .027
ZVSAT / 496.824 / 3.076 / 2 / .215

Removing math SAT from the model would significantly reduce the fit of the model to the data, but the effects of math SAT on the two contrasts (good versus poor and stay versus poor) fall short of statistical significance. In another analysis I found that math SAT was significantly associated with the difference between the stay and the good groups, with the odds of being in the stay group rather than the good group increasing multiplicatively by 1.63 for each standard deviation increase in math SAT.
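The stay-versus-good contrast is not printed when poor is the reference group, but its B weight is implied by the two that are printed: it is the stay-versus-poor weight minus the good-versus-poor weight. The sketch below applies this in Python to the math SAT weights from this step and from the final model reported later in this handout. The text does not say which model the 1.63 came from, so the comparison is only suggestive.

```python
import math

# Math SAT weights, each contrasting a group with the Poor reference group.
msat_weights = {
    "step 4": {"good_vs_poor": -0.241, "stay_vs_poor": 0.266},
    "final model": {"good_vs_poor": -0.238, "stay_vs_poor": 0.249},
}

stay_vs_good = {}
for model, b in msat_weights.items():
    # Changing the reference group only re-expresses the same model, so the
    # stay-versus-good weight is the difference of the two printed weights.
    b_diff = b["stay_vs_poor"] - b["good_vs_poor"]
    stay_vs_good[model] = (b_diff, math.exp(b_diff))

for model, (b_diff, odds_ratio) in stay_vs_good.items():
    print(f"{model}: B = {b_diff:.3f}, Exp(B) = {odds_ratio:.2f}")
```

The final-model weights give an odds ratio of about 1.63. Testing this contrast requires its standard error, which depends on the covariance between the two weights and cannot be recovered from the printed table, so in practice one would rerun the analysis with the good group as the reference category.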

Parameter Estimates
groupsa / B / Std. Error / Wald / df / Sig. / Exp(B)
Good / Intercept / .494 / .201 / 6.045 / 1 / .014
ZNEOOpen / -.124 / .194 / .408 / 1 / .523 / .883
ZNEOC / .752 / .225 / 11.178 / 1 / .001 / 2.121
ZNEOE / .058 / .205 / .080 / 1 / .777 / 1.060
ZNEOA / -.048 / .196 / .060 / 1 / .807 / .953
ZNEON / .084 / .244 / .118 / 1 / .731 / 1.088
ZLOC / .327 / .202 / 2.620 / 1 / .106 / 1.387
ZALEKS / .406 / .205 / 3.927 / 1 / .048 / 1.500
ZMSAT / -.241 / .213 / 1.278 / 1 / .258 / .786
ZVSAT / .315 / .206 / 2.335 / 1 / .126 / 1.370
Stay / Intercept / .629 / .197 / 10.151 / 1 / .001
ZNEOOpen / -.157 / .195 / .649 / 1 / .421 / .855
ZNEOC / .777 / .225 / 11.965 / 1 / .001 / 2.176
ZNEOE / -.091 / .205 / .199 / 1 / .655 / .913
ZNEOA / .112 / .198 / .319 / 1 / .572 / 1.119
ZNEON / .348 / .240 / 2.101 / 1 / .147 / 1.416
ZLOC / .126 / .201 / .389 / 1 / .533 / 1.134
ZALEKS / .630 / .202 / 9.735 / 1 / .002 / 1.878
ZMSAT / .266 / .214 / 1.545 / 1 / .214 / 1.305
ZVSAT / .071 / .206 / .120 / 1 / .729 / 1.074

In the last step, high school GPA was added to the model.

Model Fitting Information
Model / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood / Chi-Square / df / Sig.
Intercept Only / 555.273
Final / 473.253 / 82.020 / 20 / .000
Pseudo R-Square
Cox and Snell / .274
Nagelkerke / .310
McFadden / .148
Likelihood Ratio Tests
Effect / Model Fitting Criteria / Likelihood Ratio Tests
-2 Log Likelihood of Reduced Model / Chi-Square / df / Sig.
Intercept / 488.053 / 14.800 / 2 / .001
ZNEOOpen / 473.641 / .388 / 2 / .824
ZNEOC / 488.933 / 15.680 / 2 / .000
ZNEOE / 473.844 / .591 / 2 / .744
ZNEOA / 473.951 / .698 / 2 / .705
ZNEON / 475.236 / 1.983 / 2 / .371
ZLOC / 475.350 / 2.096 / 2 / .351
ZALEKS / 482.546 / 9.292 / 2 / .010
ZMSAT / 480.010 / 6.757 / 2 / .034
ZVSAT / 475.947 / 2.694 / 2 / .260
ZHSGPA / 493.748 / 20.495 / 2 / .000
Parameter Estimates
groupsa / B / Std. Error / Wald / df / Sig. / Exp(B)
Good / Intercept / .625 / .214 / 8.526 / 1 / .004
ZNEOOpen / -.102 / .202 / .251 / 1 / .616 / .903
ZNEOC / .763 / .228 / 11.140 / 1 / .001 / 2.144
ZNEOE / .118 / .215 / .301 / 1 / .583 / 1.125
ZNEOA / -.114 / .202 / .319 / 1 / .573 / .892
ZNEON / .056 / .253 / .049 / 1 / .825 / 1.058
ZLOC / .276 / .209 / 1.754 / 1 / .185 / 1.318
ZALEKS / .404 / .208 / 3.762 / 1 / .052 / 1.498
ZMSAT / -.238 / .222 / 1.150 / 1 / .284 / .788
ZVSAT / .288 / .215 / 1.796 / 1 / .180 / 1.334
ZHSGPA / .667 / .197 / 11.480 / 1 / .001 / 1.949
Stay / Intercept / .734 / .212 / 12.034 / 1 / .001
ZNEOOpen / -.125 / .204 / .373 / 1 / .541 / .883
ZNEOC / .807 / .230 / 12.298 / 1 / .000 / 2.241
ZNEOE / -.008 / .216 / .001 / 1 / .972 / .992
ZNEOA / .030 / .207 / .022 / 1 / .883 / 1.031
ZNEON / .289 / .252 / 1.314 / 1 / .252 / 1.335
ZLOC / .087 / .211 / .172 / 1 / .678 / 1.091
ZALEKS / .619 / .208 / 8.833 / 1 / .003 / 1.858
ZMSAT / .249 / .226 / 1.215 / 1 / .270 / 1.283
ZVSAT / .049 / .217 / .051 / 1 / .820 / 1.051
ZHSGPA / .838 / .202 / 17.161 / 1 / .000 / 2.312

Conscientiousness, ALEKS, math SAT, and high school GPA contributed significantly to the model.
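The likelihood ratio tests for the final model can be screened at the .05 level with a few lines of Python. This is only a sketch: it relies on the fact that every effect here has 2 degrees of freedom, for which the chi-square upper tail is simply exp(-x / 2), and the variable names are illustrative.

```python
import math

# Likelihood ratio chi-squares (each on 2 df) from the final model.
lr_chi_squares = {
    "ZNEOOpen": 0.388,
    "ZNEOC": 15.680,
    "ZNEOE": 0.591,
    "ZNEOA": 0.698,
    "ZNEON": 1.983,
    "ZLOC": 2.096,
    "ZALEKS": 9.292,
    "ZMSAT": 6.757,
    "ZVSAT": 2.694,
    "ZHSGPA": 20.495,
}

# With 2 df the chi-square upper-tail probability is exp(-x / 2).
p_values = {name: math.exp(-x / 2.0) for name, x in lr_chi_squares.items()}
significant = [name for name, p in p_values.items() if p < 0.05]

for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
print("p < .05:", significant)
```

The recomputed p values agree with the Sig. column of the table to three decimals.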