The Value of Data Analysis
Cost analysis leads to these common business questions:
- Does changing product mix affect profits for my company?
- Does advertising help my company’s sales?
- What efficiency improvements should I make?
Managerial Constraints: Limited time and limited money
Managerial Choices: gut feel or data analysis; best to do both
Ross recruiters:
- Students have good theoretical knowledge
- Need more practical engagement with using data to solve business problems
Michigan degrees offered
- BA/MA/PhD in Actuarial Math, Applied Math, and Financial Math
- BA/MA/PhD in Statistics
- BA/MA/PhD in Industrial Engineering and Computer Science
- BA/MA/MAcc/MBA/PhD at Ross
In Search of Excellence by Peters and Waterman (1982)
- Plenty of data and cheap to analyze
- Careful analysis can help sharpen human intuition about the business
- Data analysis without business understanding can be ruinous
Our Goal
- Learn to be a driver, not the mechanic
- Use data to understand business and enhance communication
Thinking about Statistical Analysis from an Accounting Perspective
Let’s start with a medical case. Two patients report platelet counts:
Patient A    Patient B
3            4
Platelet counts of:
Group A (5 patients without drug)    Group B (4 patients with drug)
Andy: 5                              Beth: 1
Amer: 6                              Ben: 2
Joe: 5                               Matt: 9
Chris: 5                             Bill: 10
Ali: 6                               Joe: 15
Which group --- A or B --- has higher platelet counts?
How does the plot look?
How are the plots different?
The miracle of the Bell curve and the magical p-value
What is a bell curve?
p-value: roughly, the chance of seeing a gap this large between two groups if they really shared the same bell curve
(p-value in the above platelet example is 0.39 or 39%)
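The group comparison can be checked by computing each group's mean and spread. The sketch below uses only the Python standard library and the counts as listed in the table (five values under each group). The Welch t-statistic is an assumption here, since the handout does not say which test produced its p-value, so the sketch stops at the t-statistic rather than trying to reproduce 0.39.

```python
from math import sqrt
from statistics import mean, variance

# Platelet counts as listed in the handout's table
group_a = [5, 6, 5, 5, 6]     # without drug
group_b = [1, 2, 9, 10, 15]   # with drug

mean_a, mean_b = mean(group_a), mean(group_b)
var_a, var_b = variance(group_a), variance(group_b)  # sample variances

# Welch t-statistic (assumed test): gap between the means
# divided by the standard error of that gap
std_err = sqrt(var_a / len(group_a) + var_b / len(group_b))
t_stat = (mean_b - mean_a) / std_err

print(f"Group A: mean={mean_a:.1f}, sd={sqrt(var_a):.2f}")
print(f"Group B: mean={mean_b:.1f}, sd={sqrt(var_b):.2f}")
print(f"t-statistic={t_stat:.2f}")
```

Group B has the higher mean but a much wider spread, so the gap between the means is small relative to its standard error.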
What is correlation?
Coordinated dancing and boxing and clapping:
When all body parts move in sync, the correlations are highly positive or highly negative (i.e., if you have weight on the left foot, you know the right foot is free).
When all body parts are out of sync, the correlations are zero (i.e., the position of the left foot tells you nothing about the position of the right foot).
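The dancing analogy can be made concrete with a small calculation. The sketch below computes the Pearson correlation by hand; the foot-weight numbers are hypothetical and chosen only to show the two extremes.

```python
from math import sqrt
from statistics import mean

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: share of body weight on each foot over eight beats
left_foot = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
right_in_sync = [1.0 - w for w in left_foot]   # free whenever the left is loaded
right_out_of_sync = [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

r_in_sync = correlation(left_foot, right_in_sync)          # strongly negative
r_out_of_sync = correlation(left_foot, right_out_of_sync)  # no relation

print(f"in sync: {r_in_sync:.2f}, out of sync: {r_out_of_sync:.2f}")
```

In the in-sync case, knowing the left foot tells you the right foot exactly, so the correlation sits at the negative extreme. In the out-of-sync case it is zero.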
Height and Protein Intake among 20-year-old males
Male    Height    Avg. daily lifetime intake
A       1.87      23
B       1.55      30
C       1.69      10
…       …         …
Q: Is higher protein intake associated with higher height?
Plot the graph…
Suppose the slope is 3. Plot the plausible underlying bell curve when it is significantly different from zero, and when it is not. (short and fat vs tall and skinny.)[1]
Question: What does it mean when the bell curve is fat? What is the possible set of values that are likely?
Question: What is the y-axis and the x-axis on a bell curve?
Mean is the point on the x-axis where the bell curve is centered
Standard deviation is a measure of the fatness of the bell curve
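To see how fatness changes the conclusion, the sketch below centers a bell curve on the slope of 3 and tries two widths. The standard errors (1 and 3) are hypothetical, and the p-value uses a normal approximation, which is an assumption rather than the handout's stated method.

```python
from statistics import NormalDist

def describe_slope(slope, std_err):
    """Return the ~95% likely range and a two-sided p-value against zero."""
    curve = NormalDist(mu=slope, sigma=std_err)
    # Range that holds about 95% of the plausible slope values
    low, high = curve.inv_cdf(0.025), curve.inv_cdf(0.975)
    # Chance of a slope this far from zero if the true slope were zero
    p_value = 2 * (1 - NormalDist().cdf(abs(slope) / std_err))
    return low, high, p_value

skinny = describe_slope(3.0, 1.0)  # tall and skinny (hypothetical width)
fat = describe_slope(3.0, 3.0)     # short and fat (hypothetical width)

for label, (low, high, p) in [("skinny", skinny), ("fat", fat)]:
    print(f"{label}: likely range {low:.2f} to {high:.2f}, p-value {p:.3f}")
```

With the skinny curve the likely range stays well above zero. With the fat curve the range includes zero, so a slope of 3 cannot be told apart from no relation at all.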
Example from Finance: High risk “means” high return. How does the plot between risk and return look?
Question: Whither bell curve?
Question: Whither probability trees?
Question: Who is Howard Marks?
Correlation vs. Causation
Alternative explanations for the observed relation between height and protein intake:
Correlation between two variables can be “contaminated” by a third variable, say wealth. Remove the contamination:
- Examine height-protein plots separately for rich and poor people. (Hi-low method)
- Regress height on protein intake and wealth. The “coefficient” on the protein intake now tells you how height and protein intake are correlated once wealth effects have been “neutralized”.
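The effect of adding wealth to the regression can be sketched with made-up numbers. The data below are hypothetical index values in arbitrary units, built so that wealth drives both protein intake and height; they are not the handout's data. The coefficients come from the textbook closed-form least-squares formulas for one and two regressors.

```python
from statistics import mean

# Hypothetical index values (arbitrary units), NOT the handout's data
wealth  = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
protein = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
height  = [1.5, 2.5, 2.5, 3.5, 5.5, 6.5, 6.5, 7.5]

def cross(xs, ys):
    """Sum of products of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys))

# Simple regression: height on protein only
slope_simple = cross(protein, height) / cross(protein, protein)

# Two-regressor least squares: coefficient on protein with wealth held fixed
s_pp, s_ww = cross(protein, protein), cross(wealth, wealth)
s_pw = cross(protein, wealth)
s_ph, s_wh = cross(protein, height), cross(wealth, height)
slope_controlled = (s_ph * s_ww - s_pw * s_wh) / (s_pp * s_ww - s_pw ** 2)

print(f"protein slope, no control:   {slope_simple:.2f}")
print(f"protein slope, wealth fixed: {slope_controlled:.2f}")
```

In this made-up data the protein slope is strongly positive on its own and falls to roughly zero once wealth is held fixed, which is the pattern you would expect if wealth were the real driver.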
The miracle of the p-value occurs once again. You can compute the chance that the “partial” correlation is the same as zero. If the chance is less than 5 or 10 percent (0.05 or 0.10), you can assert that it is different from zero.
A Case
- Designing the Regression
Q: Does an MBA make a better fund manager?
Q: Business Analysis: Why is this an important question to ask?
Solution: Gut feel or data analysis?
Collect data on a bunch of funds and run the regression [in Excel]
Important: First check the min and max of each variable to make sure you are working with reasonable data without massive outliers
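The handout runs this check in Excel; the same idea in Python might look like the sketch below. The fund records and column names are hypothetical, with one deliberate data-entry error to show what the check catches.

```python
# Hypothetical fund data; column names and values are made up for illustration
funds = [
    {"excess_return": 1.2, "sat": 1350, "age": 41},
    {"excess_return": -0.8, "sat": 1480, "age": 36},
    {"excess_return": 0.4, "sat": 14100, "age": 52},  # typo: extra digit in SAT
    {"excess_return": 2.1, "sat": 1290, "age": 29},
]

def min_max(rows):
    """Return {column: (min, max)} for a list of row dictionaries."""
    columns = rows[0].keys()
    return {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in columns}

ranges = min_max(funds)
for column, (low, high) in ranges.items():
    print(f"{column}: min={low}, max={high}")
```

A maximum SAT of 14100 stands out immediately, before it has a chance to distort the regression.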
Regress Excess PercentReturn on MBA
How to write the regression results as an equation?
Expected Excess PercentReturn = - 1.049 + 1.058*MBA
What does the MBA coefficient mean both economically and numerically?
Regress Excess PercentReturn on MBA and SAT
How to write the regression results as an equation?
Expected Excess PercentReturn = - 6.8 + 0.005*SAT + 0.95*MBA
What does the MBA coefficient mean both economically and numerically?
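One way to read the MBA coefficient numerically is to compare two otherwise identical managers. The sketch below assumes MBA is a 0/1 indicator (1 = holds an MBA) and uses a hypothetical SAT score of 1400 for the second equation; both equations are copied from the handout.

```python
def return_mba_only(mba):
    """First regression in the handout: excess return on MBA alone."""
    return -1.049 + 1.058 * mba

def return_mba_sat(mba, sat):
    """Second regression in the handout: excess return on MBA and SAT."""
    return -6.8 + 0.005 * sat + 0.95 * mba

# Assumption: MBA is a 0/1 indicator
gap_mba_only = return_mba_only(1) - return_mba_only(0)

# Hypothetical SAT score of 1400, the same for both managers
gap_mba_sat = return_mba_sat(1, 1400) - return_mba_sat(0, 1400)

print(f"MBA-only model gap: {gap_mba_only:.3f}")
print(f"MBA+SAT model gap:  {gap_mba_sat:.3f}")
```

In each model the gap equals the MBA coefficient, and in the second model it is the same at any SAT score because SAT is held equal across the two managers.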
- Interpreting the Results
Expected Excess PercentReturn = - 3.5 + 0.005*SAT + 0.581*MBA - 1.83*Growth - 0.07*Age - 0.02*Tenure + 0.365*Size
Meaning of the Regression itself: If a growth fund of size 5 hires a 30-year-old manager who has an MBA and an SAT score of 1500 (note tenure = 0), it should expect a return of:
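The arithmetic for this prediction can be checked with a short function. The sketch below assumes MBA and Growth are 0/1 indicators and that Size is entered in the regression's own units; the coefficients are those of the full regression above.

```python
def expected_excess_return(sat, mba, growth, age, tenure, size):
    """Full regression from the handout (Excess PercentReturn)."""
    return (-3.5 + 0.005 * sat + 0.581 * mba - 1.83 * growth
            - 0.07 * age - 0.02 * tenure + 0.365 * size)

# Assumptions: mba=1 and growth=1 are 0/1 indicators; size=5 as stated
full_prediction = expected_excess_return(
    sat=1500, mba=1, growth=1, age=30, tenure=0, size=5
)
print(f"{full_prediction:.3f}")
```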
Meaning of:
MBA coefficient:
SAT coefficient:
Each coefficient comes with a p-value. Let’s keep only those with p-value less than 10%. Meaning:
Expected Excess PercentReturn = - 3.5 + 0.005*SAT - 1.83*Growth - 0.07*Age + 0.365*Size
Meaning of the Regression itself: If a growth fund of size 5 hires a 30-year-old manager who has an MBA and an SAT score of 1500, it should expect a return of:
New meaning of the MBA coefficient [the values it can take]:
- What business questions can the regression answer?
- Should we hire an MBA for our hedge fund?
- Given two equal candidates, one older and one younger, what should we do?
- Should we hire a candidate who’s been to a good undergrad school? Why?
- How much should we trust our gut-feel during hiring?
- When should we ignore the regression completely?
Value of Regressions
How do you answer questions below (gut feel or data-driven):
- Did investing more in quality improvement efforts improve quality? [TI]
- Did external failures of our product at customer sites hurt sales? [TI]
- Does improving customer satisfaction lead to more profits? [Bank]
- How do quant traders front-run institutional traders?
Closing Thoughts
- Can you trust the data blindly?
  - Wrong regression specification
  - Correlation is not causality
  - Data incorrect or too skewed
- What does a human really do?
  - Skill in Design, not Analysis
- My experience with Guardian Glass
  - quality engineers with accountants
[1] Think of 3 itself as 3 attached to an “infinitely super-skinny and infinitely super-tall” bell curve centered at 3.