The Value of Data Analysis

Cost analysis leads to these common business questions:

  1. Does changing product mix affect profits for my company?
  2. Does advertising help my company’s sales?
  3. What efficiency improvements should I make?

Managerial Constraints: Limited time and limited money

Managerial Choices: gut feel or data analysis; best to do both

Ross recruiters:

  1. Students have good theoretical knowledge
  2. Need more practical engagement with using data to solve business problems

Michigan degrees offered

  1. BA/MA/PhD in Actuarial Math, Applied Math, and Financial Math
  2. BA/MA/PhD in Statistics
  3. BA/MA/PhD in Industrial Engineering and Computer Science
  4. BA/MA/MAcc/MBA/PhD at Ross

In Search of Excellence by Peters and Waterman (1982)

  1. Plenty of data and cheap to analyze
  2. Careful analysis can help sharpen human intuition about the business
  3. Data analysis without business understanding can be ruinous

Our Goal

  1. Learn to be a driver, not the mechanic
  2. Use data to understand business and enhance communication

Thinking about Statistical Analysis from an Accounting Perspective

Let’s start with a medical case. Two patients report platelet counts:

Patient A    Patient B
3            4

Platelet counts of:

Group A (5 patients without drug)    Group B (5 patients with drug)
Andy: 5                              Beth: 1
Amer: 6                              Ben: 2
Joe: 5                               Matt: 9
Chris: 5                             Bill: 10
Ali: 6                               Joe: 15

Which group --- A or B --- has higher platelet counts?

How does the plot look?

How are the plots different?
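
Before plotting, it can help to put numbers on the two groups. The sketch below, which uses only Python's standard library, computes the mean and standard deviation of each group from the counts listed above. The variable names are illustrative and not part of the lecture.

```python
from statistics import mean, stdev

# Platelet counts as listed in the table above.
group_a = [5, 6, 5, 5, 6]     # without drug
group_b = [1, 2, 9, 10, 15]   # with drug

for name, counts in [("Group A", group_a), ("Group B", group_b)]:
    print(f"{name}: mean = {mean(counts):.2f}, std dev = {stdev(counts):.2f}")
```

Group B has the higher average (7.4 versus 5.4), but its counts are far more spread out, which is what the two plots should make visible.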

The miracle of the Bell curve and the magical p-value

What is a bell curve?

p-value: the chance of seeing a gap this large between the two groups if both really came from the same bell curve

(p-value in the above platelet example is 0.39 or 39%)
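
One way to see where a p-value comes from is to compute one by brute force. The sketch below runs a permutation test on the platelet counts: it relabels the ten patients in every possible way and asks how often chance alone produces a gap between group means at least as large as the observed one. This is a swapped-in technique chosen because it needs no bell-curve formulas. It is not necessarily the test behind the 0.39 quoted above, so the number it prints need not match.

```python
from itertools import combinations

group_a = [5, 6, 5, 5, 6]     # without drug
group_b = [1, 2, 9, 10, 15]   # with drug

pooled = group_a + group_b
n_a = len(group_a)
total = sum(pooled)

def mean_gap(sum_first, n_first):
    """Absolute difference between the two group means for a given split."""
    n_second = len(pooled) - n_first
    return abs(sum_first / n_first - (total - sum_first) / n_second)

observed_gap = mean_gap(sum(group_a), n_a)

# Relabel the ten patients in every possible way and count how often the
# gap between group means is at least as large as the one observed.
splits = list(combinations(range(len(pooled)), n_a))
at_least_as_large = sum(
    1 for idx in splits
    if mean_gap(sum(pooled[i] for i in idx), n_a) >= observed_gap - 1e-12
)
p_value = at_least_as_large / len(splits)
print(f"observed gap = {observed_gap:.2f}, permutation p-value = {p_value:.3f}")
```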

What is correlation?

Coordinated dancing and boxing and clapping:

When all body parts move in sync, the correlations are highly positive or highly negative (i.e., if you have weight on the left foot, you know the right foot is free).

When all body parts are out of sync, the correlations are zero (i.e., the position of the left foot tells you nothing about the position of the right foot).
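
The dancing picture can be turned into numbers. The sketch below computes the standard (Pearson) correlation for two made-up sequences of foot positions: one where the right foot always does the opposite of the left, and one where it moves independently. The data and names are hypothetical and chosen only to illustrate the two cases.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: 1 = foot carries the weight, 0 = foot is free.
left_foot = [1, 0, 1, 0]
right_in_sync = [0, 1, 0, 1]      # always the opposite of the left foot
right_out_of_sync = [1, 1, 0, 0]  # unrelated to the left foot

print(correlation(left_foot, right_in_sync))      # -1.0
print(correlation(left_foot, right_out_of_sync))  # 0.0
```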

Height and Protein Intake among 20-year-old males

Male    Height    Avg. daily lifetime intake
A       1.87      23
B       1.55      30
C       1.69      10
…       …         …

Q: Is higher protein intake associated with higher height?

Plot the graph…

Suppose the slope is 3. Plot the plausible underlying bell curve when it is significantly different from zero, and when it is not. (short and fat vs tall and skinny.)[1]

Question: What does it mean when the bell curve is fat? What is the possible set of values that are likely?

Question: What is the y-axis and the x-axis on a bell curve?

Mean is the point on the x-axis where the bell curve is centered

Standard deviation is a measure of the fatness of the bell curve
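
To make "skinny" and "fat" concrete, the sketch below centers two bell curves on the slope of 3 and gives them different standard deviations. The values 1 and 3 are assumptions picked for illustration, not estimates from any data. For each curve it reports the peak height and how much of the curve lies at or below zero.

```python
from statistics import NormalDist

slope = 3.0  # the slope from the example above

# Hypothetical standard deviations, chosen only for illustration.
skinny = NormalDist(mu=slope, sigma=1.0)
fat = NormalDist(mu=slope, sigma=3.0)

for name, curve in [("skinny", skinny), ("fat", fat)]:
    peak = curve.pdf(slope)       # height of the curve at its center
    below_zero = curve.cdf(0.0)   # share of the curve lying at or below zero
    print(f"{name}: peak height = {peak:.3f}, chance of a value <= 0 = {below_zero:.3f}")
```

With the skinny curve almost none of the plausible values reach zero, so the slope looks clearly different from zero. With the fat curve a noticeable share does, so zero remains a plausible value.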

Example from Finance: High risk “means” high return. How does the plot between risk and return look?

Question: Whither bell curve?

Question: Whither probability trees?

Question: Who is Howard Marks?

Correlation vs. Causation

Alternative explanations for the observed relation between height and protein intake:

Correlation between two variables can be “contaminated” by a third variable, say wealth. Remove the contamination:

  1. Examine height-protein plots separately for rich and poor people. (Hi-low method)
  2. Regress height on protein intake and wealth. The “coefficient” on the protein intake now tells you how height and protein intake are correlated once wealth effects have been “neutralized”.
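
The idea of "neutralizing" wealth can be sketched in a few lines. The example below uses made-up data in which height depends only on wealth, while protein intake rises with wealth plus some unrelated variation. It first regresses height on protein alone, then strips the wealth effect out of both variables and relates what remains. All numbers and names are hypothetical. The two-step residual approach is used here because it needs no regression library; it yields the same protein coefficient as regressing height on protein and wealth together.

```python
def slope(xs, ys):
    """Slope from a one-variable least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x

def residuals(xs, ys):
    """What is left of ys after removing the part explained by xs."""
    b = slope(xs, ys)
    a = sum(ys) / len(ys) - b * sum(xs) / len(xs)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical data in made-up units: height depends only on wealth here,
# while protein intake rises with wealth plus some unrelated variation.
wealth = [1, 2, 3, 4, 5, 6]
protein = [35, 35, 50, 60, 65, 85]
height = [155, 160, 165, 170, 175, 180]

raw_slope = slope(protein, height)

# Strip the wealth effect out of both variables, then relate what remains.
partial_slope = slope(residuals(wealth, protein), residuals(wealth, height))

print(f"height on protein alone:         {raw_slope:.3f}")
print(f"height on protein, given wealth: {partial_slope:.3f}")
```

In this constructed example the raw slope is clearly positive, yet the slope after removing wealth is essentially zero: the apparent height-protein link was entirely "contamination" from wealth.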

The miracle of the p-value occurs once again. You can compute the chance of seeing a “partial” correlation this large if the true value were zero. If the chance is less than 5 or 10 percent (0.05 or 0.10), you can assert that it is different from zero.

A Case

  1. Designing the Regression

Q: Does an MBA make a better fund manager?

Q: Business Analysis: Why is this an important question to ask?

Solution: Gut feel or data analysis?

Collect data on a bunch of funds and run the regression [in Excel]

Important: First check the min and max of each variable to make sure you are working with reasonable data without massive outliers
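
The same sanity check can be scripted instead of done by eye in Excel. The sketch below scans a tiny, made-up set of fund records and reports the minimum and maximum of each column. The column names and values are hypothetical, with one implausible return planted to show what an outlier looks like.

```python
# Hypothetical fund records; column names and values are made up.
funds = [
    {"excess_return": 1.2, "sat": 1350, "age": 41},
    {"excess_return": -0.8, "sat": 1480, "age": 35},
    {"excess_return": 250.0, "sat": 1290, "age": 52},  # likely a data-entry error
]

ranges = {
    column: (min(row[column] for row in funds), max(row[column] for row in funds))
    for column in funds[0]
}
for column, (low, high) in ranges.items():
    print(f"{column}: min = {low}, max = {high}")
```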

Regress Excess PercentReturn on MBA

How to write the regression results as an equation?

Expected Excess PercentReturn = -1.049 + 1.058*MBA

What does the MBA coefficient mean both economically and numerically?

Regress Excess PercentReturn on MBA and SAT

How to write the regression results as an equation?

Expected Excess PercentReturn = -6.8 + 0.005*SAT + 0.95*MBA

What does the MBA coefficient mean both economically and numerically?

  2. Interpreting the Results

Expected Excess PercentReturn = -3.5 + 0.005*SAT + 0.581*MBA - 1.83*Growth - 0.07*Age - 0.02*Tenure + 0.365*Size

Meaning of the Regression itself: If a growth fund of size 5 hires a 30-year-old manager who has an MBA and a SAT score of 1500 (note tenure = 0), it should expect a return of:
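
The arithmetic behind this prediction can be checked with a short sketch that plugs the manager's characteristics into the full equation. It assumes MBA and Growth are 1/0 indicators (1 = has an MBA, 1 = growth fund); the notes do not state this coding explicitly.

```python
# Coefficients copied from the full regression equation above.
coefficients = {
    "SAT": 0.005, "MBA": 0.581, "Growth": -1.83,
    "Age": -0.07, "Tenure": -0.02, "Size": 0.365,
}
intercept = -3.5

# The manager described in the text; MBA and Growth are assumed 1/0 indicators.
manager = {"SAT": 1500, "MBA": 1, "Growth": 1, "Age": 30, "Tenure": 0, "Size": 5}

expected_return = intercept + sum(coefficients[k] * manager[k] for k in coefficients)
print(f"Expected excess percent return: {expected_return:.3f}")
```

Under that coding the equation gives an expected excess return of about 2.48 percent.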

Meaning of:

MBA coefficient:

SAT coefficient:

Each coefficient comes with a p-value. Let’s keep only those with p-value less than 10%. Meaning:

Expected Excess PercentReturn = -3.5 + 0.005*SAT - 1.83*Growth - 0.07*Age + 0.365*Size

Meaning of the Regression itself: If a growth fund of size 5 hires a 30-year-old manager who has an MBA and a SAT score of 1500, it should expect a return of:
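
A short sketch of the same calculation with the reduced equation, evaluated once for a manager with an MBA and once without. As before, it assumes Growth is a 1/0 indicator; the MBA field is carried along only to show that the reduced equation ignores it.

```python
# Reduced equation from above: only terms with p-value below 10% remain.
coefficients = {"SAT": 0.005, "Growth": -1.83, "Age": -0.07, "Size": 0.365}
intercept = -3.5

def expected_return(manager):
    """Evaluate the reduced equation; inputs it does not use are ignored."""
    return intercept + sum(coefficients[k] * manager[k] for k in coefficients)

with_mba = {"SAT": 1500, "MBA": 1, "Growth": 1, "Age": 30, "Size": 5}
without_mba = dict(with_mba, MBA=0)

print(f"with MBA:    {expected_return(with_mba):.3f}")
print(f"without MBA: {expected_return(without_mba):.3f}")
```

Both cases come out at about 1.90 percent, because the MBA term no longer appears in the equation.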

New meaning of the MBA coefficient [the values it can take]:

  3. What business questions can the regression answer?

     1. Should we hire an MBA for our hedge fund?
     2. Given two equal candidates, one older and one younger, what should we do?
     3. Should we hire a candidate who’s been to a good undergrad school? Why?
     4. How much should we trust our gut feel during hiring?
     5. When should we ignore the regression completely?

Value of Regressions

How do you answer the questions below (gut feel or data-driven)?

  1. Did investing more in quality improvement efforts improve quality? [TI]
  2. Did external failures of our product at customer sites hurt sales? [TI]
  3. Does improving customer satisfaction lead to more profits? [Bank]
  4. How do quant traders front-run institutional traders?

Closing Thoughts

- Can you trust the data blindly?

  • Wrong regression specification
  • Correlation is not causality
  • Data incorrect or too skewed

- What does a human really do?

  • Skill in Design, not Analysis

- My experience with Guardian Glass

  • quality engineers with accountants


[1] Think of 3 itself as 3 attached to an “infinitely super-skinny and infinitely super-tall” bell curve centered at 3.