QMETH 520: Managerial Applications of Regression Models
Winter, 06: TTH 3:30 - 5:20 PM in BLM 311
(http://faculty.washington.edu/htamura/qm520)
Instructor:
Hiro Tamura: 362 MacKenzie Hall: (206) 543-4399; htamura@u
Office Hours: TTH 2:00 - 3:20 PM
Course objective:
A fundamental task of a manager is to make predictions. In many circumstances, the manager relies on her/his intuitive judgment. For a complex problem with important consequences, however, a manager is compelled to gather data for prediction. She/he can proceed to interpret cues and make predictions using her/his expertise, much as a physician diagnoses the health of a patient from a variety of test results. Could the manager (an expert) improve on her/his intuition by using statistical methods? Work by outstanding psychologists has shown that in many situations a simple mechanical (statistical) combination of cues outperforms predictions made by experts. It is difficult for an individual to combine information from different sources. Regression and other statistical models fill this need by providing a mechanism that combines cues for prediction.
For forecasting using regression, the manager first determines a variable to predict for each study unit in the target population. This variable is called the dependent variable. Next, the manager selects a set of variables that in her/his judgment could provide cues for prediction. These variables are called independent variables or explanatory variables. She/he then formulates a statistical model for combining the cues to predict the dependent variable. This statistical model is called a population regression model, and it has unknown parameters that need to be estimated from sample data. The regression with parameter values estimated from the data is called the sample regression equation.
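As an illustration only (the course itself uses SPSS), the step from data to a sample regression equation can be sketched in a few lines of Python for the simplest case of one independent variable. The data values and the names `fit_simple_regression` and `predict` are hypothetical and are not taken from the course materials.

```python
# Hypothetical sample: one independent variable x and a dependent variable y
# observed for five study units. The numbers are made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_simple_regression(x, y):
    """Return least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


b0, b1 = fit_simple_regression(x, y)


def predict(x_new):
    """Point prediction from the sample regression equation."""
    return b0 + b1 * x_new


print(f"sample regression equation: y_hat = {b0:.2f} + {b1:.2f} * x")
print(f"point prediction at x = 6: {predict(6.0):.2f}")
```

Here `b0` and `b1` are the estimates of the unknown population parameters, and `predict` gives only a point prediction; it says nothing yet about the uncertainty of that prediction.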
The manager must examine the performance of the sample regression equation before using it. Here are the main points to check:
(1) Does the sample regression explain the data reasonably well?
(2) Does every coefficient provide a reasonable interpretation of the effect of the independent variable?
(3) Is the population regression model a good approximation of the true underlying data generating process?
(4) Do the data contain any surprises that might be adversely affecting the sample regression?
(5) Can the regression be simplified by selecting a subset of the original independent variables without significantly lowering its performance?
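Checks (1) and (5) can be illustrated with a rough Python sketch that compares R-squared and adjusted R-squared for a full model and a reduced model. This is a sketch under stated assumptions, not the course's procedure: the data are invented, the helper names `solve`, `ols`, and `r_squared` are hypothetical, and the other checks (coefficient interpretation, model adequacy, unusual observations) need diagnostics that are not shown here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, fitted values)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, row)) for row in design]
    return coefs, fitted


def r_squared(y, fitted, n_predictors):
    """Return (R-squared, adjusted R-squared) for a fitted regression."""
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    r2 = 1.0 - sse / sst
    adj = 1.0 - (sse / (n - n_predictors - 1)) / (sst / (n - 1))
    return r2, adj


# Hypothetical data: y depends strongly on x1; x2 is included as a doubtful cue.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
y = [5.2, 7.9, 11.1, 13.8, 17.2, 20.1, 22.8, 26.1]

_, fitted_full = ols([x1, x2], y)
_, fitted_reduced = ols([x1], y)
r2_full, adj_full = r_squared(y, fitted_full, 2)
r2_reduced, adj_reduced = r_squared(y, fitted_reduced, 1)

print(f"full model:    R2 = {r2_full:.4f}, adjusted R2 = {adj_full:.4f}")
print(f"reduced model: R2 = {r2_reduced:.4f}, adjusted R2 = {adj_reduced:.4f}")
```

R-squared can never decrease when a variable is added, so the comparison of interest is whether the reduced model gives up a meaningful amount of fit. Adjusted R-squared penalizes the extra variable and is one common, though not the only, basis for that judgment.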
Software:
We will use SPSS for computation. SPSS is one of the most popular software choices among practitioners, and well suited for analysis of cross section data. Excel worksheets can be read by SPSS, so Excel is useful for reading and editing data before analysis using SPSS.
Texts & References
(T1) Dielman, T. E. (2005), Applied Regression Analysis: A Second Course in Business and Economic Statistics. Fourth Edition
(R) Articles – downloadable from the course web.
1. Mandel, B. J. (1969) The Regression Control Chart. Journal of Quality Technology 1: 1
2. Comiskey, E. E. (1966) Cost Control by Regression Analysis, The Accounting Review 41: 2
3. Tamura, H. (1979) Using Statistics to Find a Reasonable Cost for Medicaid Payments. Journal of Contemporary Business 8: 2
4. Tamura, H., Lauer, L.W. and Sanborn, F. A. (1985) Estimating ‘Reasonable Cost’ of Medicaid Patient Care Using a Patient Mix Index. Health Services Research 20: 1
5. Mochel, D. and Tamura, H. credit scoring systems, Credit Union Executive, The Winter 81: 5
6. Karpoff, J. M. (2001) Public versus Private Initiative in Arctic Exploration: The Effects of Incentives and Organizational Structure. Journal of Political Economy 109: 1
(To continue)
Grading:
Homework/Quiz       6     30%
Exam (Take Home)    1     35%
Project             1     35%
Total                    100%
Guest Speakers for Special Topics
Mike Bowcut, Director, Database Marketing & Analysis, REI
Vandra Huber, Professor, Human Resource Management
University of Washington Business School
Schedule (subject to minor revisions):
I. Orientation
1. 1/3/Tu A. Course overview, explanation of project
Statistics for management
B. Regression analysis – overview
i. Standard regression vs. logistic regression
ii. Three performance measures of a regression
C. Applications for Management – Examples
2. 1/5/Th A. Introduction to SPSS
Study Case: Introduction to SPSS
i. Importing an Excel worksheet
ii. Define missing values; labeling variables and string variable values.
iii. Graphs menu
iv. Analyze menu
v. Regression outputs
B. Review of Simple Regression Analysis
Text Ch. 2 and 3, Siegel Ch. 11 and 12 (where necessary)
Study Case: Review Questions
i. Interpreting scatterplot and correlation coefficient – online demonstration
ii. Computing and interpreting simple regression outputs
II. Basics
3. 1/10/Tu A. Population Regression Model
Text Ch. 3; 4.1, 2
Study Case: Marketing a New Shampoo Formula – Data Analysis
i. Understanding each component of the population regression model
$\mu_{Y|X}$ = conditional mean of Y given X
$\beta_k$ = population regression coefficient of $X_k$
$\varepsilon$ = disturbances
ii. DGP: data generating process for regression
iii. Using log transformation for variables
iv. SPSS menu for transforming variables
B. Examination of data using scatterplot matrix
pattern of association among variables (y vs. x) (x vs. x), outliers
C. Basic Regression Outputs
Text Ch. 3.4; 4.3
i. Interpreting the equation section of the output
bk
standardized regression coefficient,
ii. Interpreting the ANOVA section of the output
SSR, SSE, SST, and $R^2$; MSR, MSE, MST, and adjusted $R^2$
unconditional population regression model (hidden)
4. 1/12/Th A. Basic Hypothesis Testing in Regression
Text Ch. 3.4; 4.3, 4.4
i. F-distribution, Excel FINV, and FDIST
ii. t-distribution, Excel TINV, and TDIST
iii. Distribution of t-stat and F-stat – online demonstration
iv. Testing significance of the sample regression
F-stat=MSR/MSE
v. Testing significance of an independent variable
standard error of coefficient, $s_{b_k}$; t-stat = $b_k / s_{b_k}$
two sided test vs. one sided test
B. Effect of Adding / Omitting an Independent Variable
Study Case: Marketing a New Shampoo Formula
– Comparison of Simple vs. Multiple Regression
how is each of the three performance measures affected?
5. 1/17/Tu Prediction of Y| xm for a new study unit
A. Point vs. Interval Prediction
Text Ch. 3.5, 4.5, Appendix D.
Study Case: Forecasting Labor Hours For Moving
Study Case: Marketing a New Shampoo Formula - Prediction
i. Point vs. Interval prediction
ii. Interval prediction ignoring estimation errors
iii. Interval prediction accounting for estimation errors
standard error of the regression se
standard error of the estimate sm for simple regression (text p. 104)
standard error of the estimate sm for multiple regression
standard error of the prediction sp (text p. 106)
key formula: (text p. 106)
B. Measures of prediction performance
Deleted Residual, PRESS, R2_PRED, Sample splitting
III. Diagnostics of the Sample Regression Equation
6. 1/19/Th A. Diagnostics Tests – Assessing the Assumptions
Text, Ch. 6.1-6.6
Study Case: Forecasting Labor Hours For Moving – Residual Analysis
key assumptions, RVSF plot, normal plot,
testing for heteroscedasticity
B. Diagnostics Tests – Multicollinearity
Text, Ch. 4.6
Study Case: Promotion Planning
Study Case: Variables in Hospital System
multicollinearity, inflated standard errors, VIF
7. 1/24/Tu Diagnostics Tests – Influential Observations
Text, Ch. 6.7
Study Case: Regression Control
leverage, t-resid, Cook’s D, DFITS
8. 1/26/Th A. PDCA Cycle for Modeling
B. Review / Q&A
IV. Modeling Techniques
9. 1/31/Tu Categorical Independent Variables
Text, Ch. 7.1, 7.2; Ch. 4.4.2
Study Case: Salary Discrimination
Study Case: Factors Driving Sales
Study Case: Product Display
symbolic representation, indicator (dummy) variables,
additive vs. interaction terms.
SS for full model, for reduced model; conditional SS
10. 2/2/Th Trends and Seasonality
Text, Ch. 3.6, 7.3
Study Case: Ice cream Production
Study Case: MLB Salary
Study Case: Growth of Heart Transplant Surgery
Study Case: Seasonal pattern of University Book Store Sales
Study Case: Detecting Spurious Trend
Standard trends, seasonal dummy variables, DW test, AR(1) disturbance
11. 2/7/Tu Selection of Independent Variables
Text, Ch. 8; NKNW Ch. 8.3.
Study Case: Modeling Hospital Operations
backward elimination, stepwise regression; all possible regressions, Cp.
V. Qualitative Dependent Variable
12. 2/9/Th Review of Contingency Table Analysis
Siegel, Ch. 17
Study Case: Pet Ownership and Patient Survival
chi-square distribution, Excel CHIDIST, CHIINV
two way contingency table, test of independence
2 by 2 table, relative risk, odds ratio
13. 2/14/Tu Basics of Logistic Regression
Text Ch. 10.3
A. Model Interpretation
Study Case: Logistic Regression – Model Interpretation
generalized linear models, binomial distribution, logit link
B. Estimation and Significance Testing
Study Case: Getting a Flu Shot
maximum likelihood estimation
14. 2/16/Th Logistic Regression - Grouped Data
Text Ch. 10.3
Study Case: Preference Testing
weighted least squares
15. 2/21/Tu A. Multinomial Logit Regression
Study Case: Awareness for Health Care
multinomial distribution
B. Ordinal Response Variable
Study Case: Awareness for Health Care
Ordered probit model, ordered logit model
Invited Speakers
SP1. 2/23/Th Mike Bowcut, Director, Database Marketing & Analysis
REI
SP2. 2/28/Tu Vandra Huber, Professor, Human Resource Management
University of Washington Business School
Special Topics
16. 3/2/Th Analysis of the Correlations Among Independent Variables
Janet R. Daling and Tamura, H. (1970) "Use of Orthogonal Factors for Selection of Variables in a Regression Equation - An Illustration."
Applied Statistics (The Journal of the Royal Statistical Society-Series C) 19
Study Case: Selection of Factors for UW Admission
factor analysis of the correlation matrix
17. 3/7/Tu Review / Q&A
18. 3/9/Th Final Exam Due. Course Evaluation