Chapter 4 Review Problem

The ABX Company is interested in conducting a study of the factors that affect absenteeism among its production employees. After reviewing the literature on absenteeism and interviewing several production supervisors and a number of employees, the researcher in charge of the project defined the following variables:

Variable / Description
Absenteeism / The number of distinct occasions that the worker was absent during 2003. Each occasion consists of one or more consecutive days of absence.
Job Complexity / An index ranging from 0 to 100; a higher value indicates more job complexity
Base Pay / Base hourly pay rate in dollars
Seniority / Number of complete years with the company on December 31, 2003
Age / Employee’s age on December 31, 2003
Dependents / Determined by employee response to the question: “How many individuals other than yourself depend on you for most of their financial support?”

We consider a multiple regression model for predicting absenteeism based on job complexity, base pay, seniority, age and dependents.

Multivariate

Correlations

Variable / Absenteeism / Job Complexity / Base Pay / Seniority / Age / Dependents
Absenteeism / 1.0000 / -0.3617 / -0.2254 / -0.3356 / -0.3100 / -0.0510
Job Complexity / -0.3617 / 1.0000 / 0.5020 / 0.3735 / 0.2768 / -0.0792
Base Pay / -0.2254 / 0.5020 / 1.0000 / 0.4940 / 0.3259 / 0.0590
Seniority / -0.3356 / 0.3735 / 0.4940 / 1.0000 / 0.7530 / 0.1518
Age / -0.3100 / 0.2768 / 0.3259 / 0.7530 / 1.0000 / 0.1478
Dependents / -0.0510 / -0.0792 / 0.0590 / 0.1518 / 0.1478 / 1.0000

Response Absenteeism

Whole Model

Actual by Predicted Plot (plot not reproduced)

Summary of Fit

RSquare / 0.187444
RSquare Adj / 0.130222
Root Mean Square Error / 1.379833
Mean of Response / 2.090909
Observations (or Sum Wgts) / 77

Analysis of Variance

Source / DF / Sum of Squares / Mean Square / F Ratio / Prob > F
Model / 5 / 31.18388 / 6.23678 / 3.2757 / 0.0102
Error / 71 / 135.17976 / 1.90394
C. Total / 76 / 166.36364
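
The Summary of Fit numbers can be recovered from the Analysis of Variance table. The Python sketch below shows that arithmetic, using the sums of squares and degrees of freedom copied from the output above. The variable names are illustrative and are not part of the JMP output.

```python
import math

# Values copied from the Analysis of Variance table of the full model.
ss_model, df_model = 31.18388, 5
ss_error, df_error = 135.17976, 71
ss_total, df_total = 166.36364, 76

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_error = ss_error / df_error

# RSquare: share of total variation explained by the model.
r_square = ss_model / ss_total
# RSquare Adj: compares the error variance with the total variance.
r_square_adj = 1 - ms_error / (ss_total / df_total)
# Root Mean Square Error: square root of the error mean square.
rmse = math.sqrt(ms_error)
# F Ratio: model mean square over error mean square.
f_ratio = ms_model / ms_error

print(f"RSquare      {r_square:.6f}")
print(f"RSquare Adj  {r_square_adj:.6f}")
print(f"RMSE         {rmse:.6f}")
print(f"F Ratio      {f_ratio:.4f}")
```

The printed values should agree with the reported output up to rounding.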

Parameter Estimates

Term / Estimate / Std Error / t Ratio / Prob>|t| / VIF
Intercept / 3.6266094 / 1.387114 / 2.61 / 0.0109 / .
Job Complexity / -0.015955 / 0.006883 / -2.32 / 0.0233 / 1.4100426
Base Pay / 0.0448386 / 0.166481 / 0.27 / 0.7885 / 1.5803544
Seniority / -0.0381 / 0.04824 / -0.79 / 0.4323 / 2.7925213
Age / -0.025584 / 0.032503 / -0.79 / 0.4338 / 2.3335252
Dependents / -0.040499 / 0.123385 / -0.33 / 0.7437 / 1.051738
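
The Parameter Estimates table defines a fitted equation and gives a standard error for each slope. As a rough sketch of how the table is read, the Python below evaluates the fitted equation at the predictor values that appear in part (b) and forms a 95% interval for the Job Complexity slope of the kind part (d) asks for. The critical value 1.994 is an assumption: it approximates the 0.975 quantile of a t distribution with 71 degrees of freedom, taken from a t table. All variable names are illustrative.

```python
# Coefficients copied from the Parameter Estimates table of the full model.
coef = {
    "Intercept": 3.6266094,
    "Job Complexity": -0.015955,
    "Base Pay": 0.0448386,
    "Seniority": -0.0381,
    "Age": -0.025584,
    "Dependents": -0.040499,
}

# Predictor values taken from part (b).
x = {"Job Complexity": 30, "Base Pay": 7, "Seniority": 5, "Age": 35, "Dependents": 1}

# Fitted value: intercept plus each slope times its predictor value.
predicted = coef["Intercept"] + sum(coef[name] * value for name, value in x.items())

# 95% interval for the Job Complexity slope: estimate +/- t * standard error.
se_job = 0.006883  # Std Error from the table
t_crit = 1.994     # ASSUMPTION: approximate 0.975 t quantile, 71 error df
ci_low = coef["Job Complexity"] - t_crit * se_job
ci_high = coef["Job Complexity"] + t_crit * se_job

print(f"Predicted absenteeism: {predicted:.3f}")
print(f"95% CI for Job Complexity slope: ({ci_low:.4f}, {ci_high:.4f})")
```

The interval uses the error degrees of freedom (71) from the Analysis of Variance table, not the number of observations.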

(a) If one were to use only one of the five explanatory variables to predict absenteeism, which would provide the best predictions?

(b) Based on the multiple regression model, predict Absenteeism for an employee with Job Complexity 30, Base Pay $7 per hour, Seniority 5 years, Age 35 and 1 Dependent.

(c) Will the multiple regression model typically be able to forecast an employee’s absenteeism to within 1 occasion? Justify your answer.

(d) Find a 95% confidence interval for the change in mean absenteeism that is associated with a one point increase in job complexity, holding fixed base pay, seniority, age and dependents.

(e) Is there strong evidence that base pay is useful for predicting absenteeism once job complexity, seniority, age and dependents have been taken into account? Justify your answer using a test.

(f) Is there any indication that multicollinearity might be a serious problem for this multiple regression? Explain.

(g) Is there strong evidence that seniority and/or age are useful for predicting absenteeism once job complexity, base pay and dependents have been taken into account? Justify your answer using a test. The output below showing the multiple regression of absenteeism on job complexity, base pay and dependents is useful for answering this question.

Response Absenteeism

Summary of Fit

RSquare / 0.13899
RSquare Adj / 0.103606
Root Mean Square Error / 1.400787
Mean of Response / 2.090909
Observations (or Sum Wgts) / 77

Analysis of Variance

Source / DF / Sum of Squares / Mean Square / F Ratio / Prob > F
Model / 3 / 23.12281 / 7.70760 / 3.9280 / 0.0117
Error / 73 / 143.24083 / 1.96220
C. Total / 76 / 166.36364

Parameter Estimates

Term / Estimate / Std Error / t Ratio / Prob>|t|
Intercept / 3.4168959 / 0.926489 / 3.69 / 0.0004
Job Complexity / -0.018603 / 0.006859 / -2.71 / 0.0083
Base Pay / -0.060165 / 0.156479 / -0.38 / 0.7017
Dependents / -0.084708 / 0.123335 / -0.69 / 0.4944
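
One common way to test whether a group of variables adds predictive value is a partial F test that compares the error sums of squares of the full and reduced models. The sketch below applies that approach to part (g). It is one possible working, not necessarily the intended solution. The p-value uses a closed form for the upper tail of an F distribution that holds only when the numerator degrees of freedom equal 2, which is the case here. The variable names are illustrative.

```python
# Error sums of squares and error df from the two Analysis of Variance tables.
sse_full, df_full = 135.17976, 71        # all five predictors
sse_reduced, df_reduced = 143.24083, 73  # Seniority and Age dropped

# Number of coefficients being tested (Seniority and Age).
q = df_reduced - df_full
assert q == 2  # the closed-form p-value below relies on this

# Partial F statistic: increase in error SS per dropped term,
# relative to the error mean square of the full model.
f_stat = ((sse_reduced - sse_full) / q) / (sse_full / df_full)

# Upper-tail probability of F(2, df_full); valid only for numerator df = 2.
p_value = (df_full / (df_full + q * f_stat)) ** (df_full / 2)

print(f"Partial F = {f_stat:.3f} on ({q}, {df_full}) df, p = {p_value:.3f}")
```

A large p-value here would mean the data give no strong evidence that Seniority and Age improve predictions beyond the other three variables.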

(h) The following output is from a simple regression of absenteeism on base pay. Notice that the coefficient on base pay is negative in the simple regression but positive in the multiple regression. Briefly explain the interpretation of this.

Bivariate Fit of Absenteeism By Base Pay

Linear Fit

Absenteeism = 3.911742 - 0.2790188 Base Pay

Summary of Fit

RSquare / 0.050803
RSquare Adj / 0.038147
Root Mean Square Error / 1.451031
Mean of Response / 2.090909
Observations (or Sum Wgts) / 77

Analysis of Variance

Source / DF / Sum of Squares / Mean Square / F Ratio / Prob > F
Model / 1 / 8.45173 / 8.45173 / 4.0141 / 0.0487
Error / 75 / 157.91190 / 2.10549
C. Total / 76 / 166.36364

Parameter Estimates

Term / Estimate / Std Error / t Ratio / Prob>|t|
Intercept / 3.911742 / 0.923733 / 4.23 / <.0001
Base Pay / -0.279019 / 0.139264 / -2.00 / 0.0487
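
Two standard identities tie this simple regression output to the rest of the problem: with a single predictor, the squared t ratio equals the F ratio, and the square root of RSquare, given the sign of the slope, equals the correlation between the two variables. The Python sketch below checks both using numbers copied from the output. The variable names are illustrative.

```python
import math

# Values copied from the simple regression of Absenteeism on Base Pay.
slope, std_error = -0.279019, 0.139264
r_square = 0.050803

# t ratio for the slope; its square matches the F ratio when there is one predictor.
t_ratio = slope / std_error
f_from_t = t_ratio ** 2

# Correlation recovered from RSquare, carrying the sign of the slope.
r = math.copysign(math.sqrt(r_square), slope)

print(f"t ratio = {t_ratio:.2f}, t squared = {f_from_t:.4f}")
print(f"correlation = {r:.4f}")
```

The recovered correlation should match the Absenteeism / Base Pay entry in the correlation table, up to rounding.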