QMETH 520: Managerial Applications of Regression Analysis

QMETH 520: Managerial Applications of Regression Models

Winter, 06: TTH 3:30 - 5:20 PM in BLM 311

(http://faculty.washington.edu/htamura/qm520)

Instructor:

Hiro Tamura: 362 MacKenzie Hall: (206) 543-4399; htamura@u

Office Hours: TTH 2:00 - 3:20 PM

Course objective:

A fundamental task of a manager is to make predictions. In many circumstances, the manager relies on her/his intuitive judgment. For a complex problem with important consequences, however, a manager is compelled to gather data for prediction. She/he can then interpret cues and make predictions using her/his expertise, much as a physician diagnoses the health of a patient from a variety of test results. Could the manager (an expert) improve on her/his intuition by using statistical methods? Research by psychologists has shown that in many situations a simple mechanical (statistical) combination of cues outperforms predictions made by experts. It is difficult for an individual to combine information from different sources. Regression and other statistical models fill this need by providing a mechanism that combines cues for prediction.

For forecasting using regression, the manager first determines a variable to predict for each study unit in the target population. This variable is called the dependent variable. Next, the manager selects a set of variables that in her/his judgment could provide cues for prediction. These variables are called independent variables or explanatory variables. She/he then formulates a statistical model for combining the cues to predict the dependent variable. This statistical model is called a population regression model, and it has unknown parameters that need to be estimated from sample data. The regression with parameter values estimated from the data is called the sample regression equation.
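As a rough illustration of the steps above, the sketch below estimates a sample regression equation by least squares for the simplest case of one independent variable. The data values, the variable names, and the helper `fit_simple_regression` are hypothetical and are not taken from the course materials; the course itself uses SPSS for computation.

```python
# Hypothetical sample: one independent variable (x) and one dependent
# variable (y) for 6 study units. The numbers are invented for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]


def fit_simple_regression(x, y):
    """Return least-squares estimates (b0, b1) for y = b0 + b1*x and R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares around the sample means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    # Unexplained (SSE) and total (SST) variation in y
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, 1.0 - sse / sst


b0, b1, r_squared = fit_simple_regression(x, y)
print(f"sample regression equation: y_hat = {b0:.3f} + {b1:.3f} * x")
print(f"R-squared: {r_squared:.4f}")
```

Here `b0` and `b1` play the role of the estimated parameters, and R-squared is one simple indication of how well the sample regression explains the data.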

The manager must examine the performance of the sample regression equation before using it. The main points to check are:

(1) Does the sample regression explain the data reasonably well?

(2) Does every coefficient provide a reasonable interpretation of the effect of the independent variable?

(3) Is the population regression model a good approximation of the true underlying data generating process?

(4) Do the data contain any surprises that might be adversely affecting the sample regression?

(5) Can the regression be simplified by selecting a subset of the original independent variables without significantly lowering its performance?

Software:

We will use SPSS for computation. SPSS is one of the most popular software choices among practitioners and is well suited to the analysis of cross-sectional data. SPSS can read Excel worksheets, so Excel is useful for entering and editing data before analysis in SPSS.

Texts & References

(T1) Dielman, T. E. (2005) Applied Regression Analysis: A Second Course in Business and Economic Statistics. Fourth Edition

(R) Articles – downloadable from the course web.

1. Mandel, B. J. (1969) The Regression Control Chart. Journal of Quality Technology 1: 1

2. Comiskey, E. E. (1966) Cost Control by Regression Analysis, The Accounting Review 41: 2

3. Tamura, H. (1979) Using Statistics to Find a Reasonable Cost for Medicaid Payments. Journal of Contemporary Business 8: 2

4. Tamura, H., Lauer, L.W. and Sanborn, F. A. (1985) Estimating ‘Reasonable Cost’ of Medicaid Patient Care Using a Patient Mix Index. Health Services Research 20: 1

5. Mochel, D. and Tamura, H. Credit Scoring Systems. Credit Union Executive, Winter 81: 5

6. Karpoff, J. M. (2001) Public versus Private Initiative in Arctic Exploration: The Effects of Incentives and Organizational Structure. Journal of Political Economy 109: 1

(To be continued)

Grading:

Component           Number   Weight
Homework/Quiz         6        30%
Exam (Take Home)      1        35%
Project               1        35%
Total                         100%

Guest Speakers for Special Topics

Mike Bowcut, Director, Database Marketing & Analysis, REI

Vandra Huber, Professor, Human Resource Management

University of Washington Business School

Schedule (subject to minor revisions):

I. Orientation

1. 1/3/Tu A. Course overview, explanation of project

Statistics for management

B. Regression analysis – overview

i. Standard regression vs. logistic regression

ii. Three performance measures of a regression

C. Applications for Management – Examples

2. 1/5/Th A. Introduction to SPSS

Study Case: Introduction to SPSS

i. Importing an Excel worksheet

ii. Define missing values; labeling variables and string variable values.

iii. Graphs menu

iv. Analyze menu

v. Regression outputs

B. Review of Simple Regression Analysis

Text Ch. 2 and 3, Siegel Ch. 11 and 12 (where necessary)

Study Case: Review Questions

i. Interpreting scatterplot and correlation coefficient – online demonstration

ii. Computing and interpreting simple regression outputs

II. Basics

3. 1/10/Tu A. Population Regression Model

Text Ch. 3; 4.1, 2

Study Case: Marketing a New Shampoo Formula – Data Analysis

i. Understanding each component of the population regression model

$E(Y \mid X)$ = conditional mean of Y given X

$\beta_k$ = population regression coefficient of $X_k$

$\varepsilon$ = disturbances

ii. DGP: data generating process for regression

iii. Using log transformation for variables

iv. SPSS menu for transforming variables

B. Examination of data using scatterplot matrix

pattern of association among variables (y vs. x) (x vs. x), outliers

C. Basic Regression Outputs

Text Ch. 3.4; 4.3

i. Interpreting the equation section of the output

standardized regression coefficient,

ii. Interpreting the ANOVA section of the output

SSR, SSE, SST, and $R^2$; MSR, MSE, MST, and adjusted $R^2$

unconditional population regression model (hidden)

4. 1/12/Th A. Basic Hypothesis Testing in Regression

Text Ch. 3.4; 4.3, 4.4

i. F-distribution, Excel FINV, and FDIST

ii. t-distribution, Excel TINV, and TDIST

iii. Distribution of t-stat and F-stat – online demonstration

iv. Testing significance of the sample regression

F-stat = MSR/MSE

v. Testing significance of an independent variable

standard error of coefficient, $s_{b_k}$; t-stat $= b_k / s_{b_k}$

two sided test vs. one sided test

B. Effect of Adding / Omitting an Independent Variable

Study Case: Marketing a New Shampoo Formula

– Comparison of Simple vs. Multiple Regression

How is each of the three performance measures affected?

5. 1/17/Tu Prediction of $Y \mid x_m$ for a new study unit

A. Point vs. Interval Prediction

Text Ch. 3.5, 4.5, Appendix D.

Study Case: Forecasting Labor Hours For Moving

Study Case: Marketing a New Shampoo Formula - Prediction

i. Point vs. Interval prediction

ii. Interval prediction ignoring estimation errors

iii. Interval prediction accounting for estimation errors

standard error of the regression $s_e$

standard error of the estimate $s_m$ for simple regression (text p. 104)

standard error of the estimate $s_m$ for multiple regression

standard error of the prediction $s_p$ (text p. 106)

key formula: (text p. 106)
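For orientation, a standard relation among these three quantities in simple regression is sketched below. It assumes independent disturbances with constant variance and is written in generic textbook notation; the exact form printed on text p. 106 may differ.

```latex
% Standard error of the estimated conditional mean at x_m (simple regression)
s_m = s_e \sqrt{\frac{1}{n} + \frac{(x_m - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}

% Standard error of the prediction: disturbance variance plus estimation variance
s_p^2 = s_e^2 + s_m^2

% Interval prediction for a new study unit, accounting for estimation errors
\hat{y}_m \pm t_{\alpha/2,\, n-2}\; s_p
```

Ignoring estimation errors corresponds to dropping the $s_m^2$ term, so that the interval is based on $s_e$ alone.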

B. Measures of prediction performance

Deleted Residual, PRESS, R2_PRED, Sample splitting

III. Diagnostics of the Sample Regression Equation

6. 1/19/Th A. Diagnostic Tests – Assessing the Assumptions

Text, Ch. 6.1-6.6

Study Case: Forecasting Labor Hours For Moving – Residual Analysis

key assumptions, RVSF plot, normal plot,

testing for heteroscedasticity

B. Diagnostic Tests – Multicollinearity

Text, Ch. 4.6

Study Case: Promotion Planning

Study Case: Variables in Hospital System

multicollinearity, inflated standard errors, VIF

7. 1/24/Tu Diagnostic Tests – Influential Observations

Text, Ch. 6.7

Study Case: Regression Control

leverage, t-resid, Cook’s D, DFITS

8. 1/26/Th A. PDCA Cycle for Modeling

B. Review / Q&A

IV. Modeling Techniques

9. 1/31/Tu Categorical Independent Variables

Text, Ch. 7.1, 7.2; Ch. 4.4.2

Study Case: Salary Discrimination

Study Case: Factors Driving Sales

Study Case: Product Display

symbolic representation, indicator (dummy) variables,

additive vs. interaction terms.

SS for full model, for reduced model; conditional SS

10. 2/2/Th Trends and Seasonality

Text, Ch. 3.6, 7.3

Study Case: Ice cream Production

Study Case: MLB Salary

Study Case: Growth of Heart Transplant Surgery

Study Case: Seasonal pattern of University Book Store Sales

Study Case: Detecting Spurious Trend

Standard trends, seasonal dummy variables, DW test, AR(1) disturbance

11. 2/7/Tu Selection of Independent Variables

Text, Ch. 8; NKNW Ch. 8.3.

Study Case: Modeling Hospital Operations

backward elimination, stepwise regression; all possible regressions, $C_p$

V. Qualitative Dependent Variable

12. 2/9/Th Review of Contingency Table Analysis

Siegel, Ch. 17

Study Case: Pet Ownership and Patient Survival

$\chi^2$ distribution, Excel CHIDIST, CHIINV

two way contingency table, test of independence

2 by 2 table, relative risk, odds ratio

13. 2/14/Tu Basics of Logistic Regression

Text Ch. 10.3

A. Model Interpretation

Study Case: Logistic Regression – Model Interpretation

generalized linear models, binomial distribution, logit link

B. Estimation and Significance Testing

Study Case: Getting a Flu Shot

maximum likelihood estimation

14. 2/16/Th Logistic Regression - Grouped Data

Text Ch. 10.3

Study Case: Preference Testing

weighted least squares

15. 2/21/Tu A. Multinomial Logit Regression

Study Case: Awareness for Health Care

multinomial distribution

B. Ordinal Response Variable

Study Case: Awareness for Health Care

Ordered probit model, ordered logit model

Invited Speakers

SP1. 2/23/Th Mike Bowcut, Director, Database Marketing & Analysis

REI

SP2. 2/28/Tu Vandra Huber, Professor, Human Resource Management

University of Washington Business School

Special Topics

16. 3/2/Th Analysis of the Correlations Among Independent Variables

Daling, J. R. and Tamura, H. (1970) Use of Orthogonal Factors for Selection of Variables in a Regression Equation - An Illustration. Applied Statistics (Journal of the Royal Statistical Society, Series C) 19

Study Case: Selection of Factors for UW Admission

factor analysis of the correlation matrix

Review / Q&A

17. 3/7/Tu

18. 3/9/Th Final Exam Due. Course Evaluation