*** GLM: horseshoe crab data, Table 4.3 of Agresti's CDA;

*** mixed factors, model selection;

proc format;
  value colorgp   1='light medium'
                  2='medium'
                  3='dark medium'
                  4='dark';
  value spinecdtn 1='both good'
                  2='one worn'
                  3='both worn';
run;

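data crab;
  input color spine width satell weight;
  weight = weight/1000;   * rescale weight by 1/1000;
  color  = color - 1;     * shift color codes down by one (colorgp labels 1 to 4);
  ** response y=1 if the crab has at least one satellite;
  if satell > 0 then y = 1;
  else y = 0;
  format color colorgp.
         spine spinecdtn.;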

datalines;

3 3 28.3 8 3050

3 2 24.5 0 2000

;

run;
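
*** A sketch, not part of the original program: a quick check of the coding
    of y and color before any model is fitted. With the full 173 records the
    counts for y should agree with the Response Profile in the output
    (111 with y=1 and 62 with y=0). The table options are illustrative;

proc freq data=crab;
  tables y;                          * one-way counts of the binary response;
  tables color*y / nocol nopercent;  * response by color, row percents only;
run;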

*** fit logistic regression for color and width;

proc genmod data=crab descending;

class color;

model y = color width color*width/ dist=bin link=logit;

run;

*** assume no interaction between color and width;

proc genmod data=crab descending;

class color;

model y = color width/ dist=bin link=logit;

run;
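
*** A sketch, not part of the original program: likelihood-ratio test of the
    color*width interaction, computed from the deviances reported in the
    GENMOD output of this handout (183.0806 on 165 df with the interaction
    and 187.4570 on 168 df without it). The data set name lrt_interaction
    and its variable names are hypothetical, chosen for illustration;

data lrt_interaction;
  g2 = 187.4570 - 183.0806;   * deviance difference, equal to 4.3764;
  df = 168 - 165;             * three interaction parameters are dropped;
  p  = 1 - probchi(g2, df);   * upper-tail chi-square probability;
run;

proc print data=lrt_interaction noobs;
run;

*** g2 is below 7.81, the 0.05 critical value of chi-square with 3 df, so
    the data give little evidence against the no-interaction model;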

*** treat color as a quantitative predictor;

proc genmod data=crab descending;

model y = color width/dist=bin link=logit;

run;
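
*** A sketch, not part of the original program: likelihood-ratio comparison
    of the model with quantitative color (deviance 189.1212 on 170 df in the
    output of this handout) against the model with color as a class factor
    (deviance 187.4570 on 168 df). The data set name lrt_linear_color and
    its variable names are hypothetical;

data lrt_linear_color;
  g2 = 189.1212 - 187.4570;   * deviance difference, equal to 1.6642;
  df = 170 - 168;             * a class factor uses two more parameters;
  p  = 1 - probchi(g2, df);   * upper-tail chi-square probability;
run;

proc print data=lrt_linear_color noobs;
run;

*** g2 is below 5.99, the 0.05 critical value of chi-square with 2 df, so
    the simpler model with a linear color score is not rejected;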

*** model selection: you can use forward or stepwise instead;

proc logistic data=crab descending;

class color(param=ref ref='dark') spine;

model y =color|spine|width / selection=backward;

run;
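
*** A sketch of the stepwise alternative mentioned in the comment before the
    backward run. This is an assumption about what one might submit, and no
    output for it appears in this handout. SLENTRY= and SLSTAY= set the
    significance levels for entering and staying in the model, and 0.05 is
    an illustrative choice;

proc logistic data=crab descending;
  class color(param=ref ref='dark') spine;
  model y = color|spine|width / selection=stepwise slentry=0.05 slstay=0.05;
run;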

data crab2;

set crab;

if color=4 then dark=1;

else dark=0;

run;

*** model: width and dark color;

proc genmod data=crab2 descending;

model y= dark width/dist=bin link=logit p r;

output out=diag pred=fitted stdreschi=std_res;

run;
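
*** A sketch, an assumed follow-up that is not in the original program: list
    the observations in the DIAG data set whose standardized Pearson
    residual exceeds 2 in absolute value. The cutoff of 2 is a common rule
    of thumb and is not specified anywhere in this handout;

proc print data=diag;
  where abs(std_res) > 2;            * keep only the large residuals;
  var color width y fitted std_res;  * fitted and std_res come from OUTPUT;
run;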

The GENMOD Procedure

Model Information

Data Set WORK.CRAB

Distribution Binomial

Link Function Logit

Dependent Variable y

Number of Observations Read 173

Number of Observations Used 173

Number of Events 111

Number of Trials 173

Class Level Information

Class Levels Values

color 4 dark dark medium light medium medium

Response Profile

Ordered             Total
  Value    y    Frequency
      1    1          111
      2    0           62

PROC GENMOD is modeling the probability that y='1'.

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 165 183.0806 1.1096

Scaled Deviance 165 183.0806 1.1096

Pearson Chi-Square 165 163.2536 0.9894

Scaled Pearson X2 165 163.2536 0.9894

Log Likelihood -91.5403

Algorithm converged.

The SAS System 14:31 Sunday, February 5, 2006 53


Analysis Of Parameter Estimates

                                        Standard  Wald 95% Confidence      Chi-
Parameter                 DF  Estimate     Error        Limits           Square  Pr > ChiSq
Intercept                  1  -10.0400    3.5583   -17.0142    -3.0657     7.96      0.0048
color dark                 1    4.1861    7.5809   -10.6722    19.0445     0.30      0.5808
color dark medium          1  -11.4781    7.6980   -26.5658     3.6097     2.22      0.1359
color light medium         1    8.2874   12.0036   -15.2393    31.8140     0.48      0.4899
color medium               0    0.0000    0.0000     0.0000     0.0000        .           .
width                      1    0.4189    0.1367     0.1509     0.6869     9.38      0.0022
width*color dark           1   -0.2184    0.2952    -0.7971     0.3602     0.55      0.4594
width*color dark medium    1    0.4395    0.3018    -0.1521     1.0311     2.12      0.1454
width*color light medium   1   -0.3129    0.4479    -1.1908     0.5651     0.49      0.4849
width*color medium         0    0.0000    0.0000     0.0000     0.0000        .           .
Scale                      0    1.0000    0.0000     1.0000     1.0000

NOTE: The scale parameter was held fixed.

The GENMOD Procedure

Model Information

Data Set WORK.CRAB

Distribution Binomial

Link Function Logit

Dependent Variable y

Number of Observations Read 173

Number of Observations Used 173

Number of Events 111

Number of Trials 173

Class Level Information

Class Levels Values

color 4 dark dark medium light medium medium

Response Profile

Ordered             Total
  Value    y    Frequency
      1    1          111
      2    0           62

PROC GENMOD is modeling the probability that y='1'.

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 168 187.4570 1.1158

Scaled Deviance 168 187.4570 1.1158

Pearson Chi-Square 168 168.6590 1.0039

Scaled Pearson X2 168 168.6590 1.0039

Log Likelihood -93.7285

Algorithm converged.

Analysis Of Parameter Estimates

                                        Standard  Wald 95% Confidence      Chi-
Parameter                 DF  Estimate     Error        Limits           Square  Pr > ChiSq
Intercept                  1  -11.3128    2.7451   -16.6931    -5.9324    16.98      <.0001
color dark                 1   -1.4023    0.5484    -2.4773    -0.3274     6.54      0.0106
color dark medium          1   -0.2962    0.4176    -1.1147     0.5223     0.50      0.4781
color light medium         1   -0.0724    0.7399    -1.5226     1.3778     0.01      0.9220
color medium               0    0.0000    0.0000     0.0000     0.0000        .           .
width                      1    0.4680    0.1055     0.2611     0.6748    19.66      <.0001
Scale                      0    1.0000    0.0000     1.0000     1.0000

NOTE: The scale parameter was held fixed.


The GENMOD Procedure

Model Information

Data Set WORK.CRAB

Distribution Binomial

Link Function Logit

Dependent Variable y

Number of Observations Read 173

Number of Observations Used 173

Number of Events 111

Number of Trials 173

Response Profile

Ordered             Total
  Value    y    Frequency
      1    1          111
      2    0           62

PROC GENMOD is modeling the probability that y='1'.

Criteria For Assessing Goodness Of Fit

Criterion DF Value Value/DF

Deviance 170 189.1212 1.1125

Scaled Deviance 170 189.1212 1.1125

Pearson Chi-Square 170 170.0759 1.0004

Scaled Pearson X2 170 170.0759 1.0004

Log Likelihood -94.5606

Algorithm converged.

Analysis Of Parameter Estimates

                                        Standard  Wald 95% Confidence      Chi-
Parameter                 DF  Estimate     Error        Limits           Square  Pr > ChiSq
Intercept                  1  -10.0708    2.8069   -15.5722    -4.5695    12.87      0.0003
color                      1   -0.5090    0.2237    -0.9475    -0.0706     5.18      0.0229
width                      1    0.4583    0.1040     0.2544     0.6622    19.41      <.0001
Scale                      0    1.0000    0.0000     1.0000     1.0000

NOTE: The scale parameter was held fixed.

The SAS System 13:46 Wednesday, November 4, 2009 186

The LOGISTIC Procedure

Model Information

Data Set WORK.CRAB

Response Variable y

Number of Response Levels 2

Model binary logit

Optimization Technique Fisher's scoring

Number of Observations Read 173

Number of Observations Used 173

Response Profile

Ordered             Total
  Value    y    Frequency
      1    1          111
      2    0           62

Probability modeled is y=1.

Backward Elimination Procedure

Class Level Information

Class    Value            Design Variables
color    dark              1    0    0
         dark medium       0    1    0
         light medium      0    0    1
         medium           -1   -1   -1
spine    both good         1    0
         both worn         0    1
         one worn         -1   -1

Step 0. The following effects were entered:

Intercept color spine color*spine width width*color width*spine width*color*spine

Model Convergence Status

Quasi-complete separation of data points detected.

WARNING: The maximum likelihood estimate may not exist.

WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based
         on the last maximum likelihood iteration. Validity of the model fit is questionable.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        212.446
SC             230.912        278.665
-2 Log L       225.759        170.446

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 55.3129 20 <.0001

Score 47.1111 20 0.0006

Wald 29.6574 20 0.0756

Step 1. Effect width*color*spine is removed:

Model Convergence Status

Quasi-complete separation of data points detected.

WARNING: The maximum likelihood estimate may not exist.

WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based

on the last maximum likelihood iteration. Validity of the model fit is questionable.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        209.674
SC             230.912        266.433
-2 Log L       225.759        173.674



WARNING: The validity of the model fit is questionable.

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 52.0846 17 <.0001

Score 46.2789 17 0.0002

Wald 31.0713 17 0.0196

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

1.3313 3 0.7217

Step 2. Effect color*spine is removed:

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        205.559
SC             230.912        243.398
-2 Log L       225.759        181.559

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 44.1997 11 <.0001

Score 40.4197 11 <.0001

Wald 31.9610 11 0.0008

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

9.0092 9 0.4364



WARNING: The validity of the model fit is questionable.

Step 3. Effect width*spine is removed:

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        201.637
SC             230.912        233.170
-2 Log L       225.759        181.637

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 44.1215 9 <.0001

Score 40.3715 9 <.0001

Wald 31.9987 9 0.0002

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

8.8373 11 0.6369

Step 4. Effect spine is removed:

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.



WARNING: The validity of the model fit is questionable.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        199.081
SC             230.912        224.307
-2 Log L       225.759        183.081

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 42.6779 7 <.0001

Score 38.5321 7 <.0001

Wald 30.4875 7 <.0001

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

9.6978 13 0.7184

Step 5. Effect width*color is removed:

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        197.457
SC             230.912        213.223
-2 Log L       225.759        187.457



WARNING: The validity of the model fit is questionable.

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 38.3015 4 <.0001

Score 34.3384 4 <.0001

Wald 27.6788 4 <.0001

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

12.8933 16 0.6805

Step 6. Effect color is removed:

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                            Intercept
             Intercept            and
Criterion         Only     Covariates
AIC            227.759        198.453
SC             230.912        204.759
-2 Log L       225.759        194.453

Testing Global Null Hypothesis: BETA=0

Test Chi-Square DF Pr > ChiSq

Likelihood Ratio 31.3059 1 <.0001

Score 27.8752 1 <.0001

Wald 23.8872 1 <.0001

Residual Chi-Square Test

Chi-Square DF Pr > ChiSq

20.6100 19 0.3587



WARNING: The validity of the model fit is questionable.

NOTE: No (additional) effects met the 0.05 significance level for removal from the model.

Summary of Backward Elimination

        Effect                       Number          Wald
Step    Removed               DF         In    Chi-Square    Pr > ChiSq
   1    width*color*spine      3          6        0.0944        0.9925
   2    color*spine            6          5        0.0920        1.0000
   3    width*spine            2          4        0.0761        0.9627
   4    spine                  2          3        1.4323        0.4886
   5    width*color            3          2        3.8871        0.2739
   6    color                  3          1        6.6246        0.0849

Type 3 Analysis of Effects

                      Wald
Effect    DF    Chi-Square    Pr > ChiSq
width      1       23.8872        <.0001

Analysis of Maximum Likelihood Estimates

                               Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq
Intercept     1    -12.3508      2.6287       22.0749        <.0001
width         1      0.4972      0.1017       23.8872        <.0001

Odds Ratio Estimates

             Point          95% Wald
Effect    Estimate      Confidence Limits
width        1.644       1.347       2.007

Output for model 8

Analysis Of Parameter Estimates

                                        Standard  Wald 95% Confidence      Chi-
Parameter                 DF  Estimate     Error        Limits           Square  Pr > ChiSq
Intercept                  1  -11.6790    2.6925   -16.9563    -6.4017    18.81      <.0001
dark                       1   -1.3005    0.5259    -2.3312    -0.2698     6.12      0.0134
width                      1    0.4782    0.1041     0.2741     0.6823    21.08      <.0001
Scale                      0    1.0000    0.0000     1.0000     1.0000