Chapter 6, Section 1

A test of hypothesis

Suppose some expert is running around spouting statistics and you disagree with the numbers spouted. Well, there’s an official way to disagree with an expert and it’s called a test of hypothesis.

You can dispute any statistic, of course, but we’ll focus on disputing the mean…

You disagree by working through a 6-step process.

Step 1 is to identify the expert’s position and call it the “null hypothesis”

H0: μ = k

This is what the expert claims.

Step 2 is to formulate your own position and call it the “alternative hypothesis”

Ha: μ < k, μ > k, or μ ≠ k

You have to pick exactly one of these three comparisons.

Step 3 is to take a random sample and calculate your test statistic…in this case, as we’ll be focused on sample means (which have an approximately normal distribution centered at the true population mean, with a standard deviation equal to the population standard deviation divided by the square root of n), we’ll use a z score or t score as our test statistic.

Step 4 is to decide on a rejection region…to decide on a boundary for the test statistic.

We’ll be deciding exactly how far away from the expert’s number for the mean we have to be in order to claim that the expert is wrong. Often this is decided for you, but sometimes you can have a say in it. In other words, we’ll be deciding how unusual a sample result has to be before we stop believing the expert. Let’s look at a picture:

Step 5 is to list all assumptions you’ve made about the population being sampled.

Step 6 is to make a decision about the test…either reject the null hypothesis or not.
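Steps 3, 4 and 6 are mechanical enough to sketch in a few lines of Python. This is only a rough sketch, assuming a z score is the right test statistic; the function name z_test_for_mean and the numbers in the usage lines are made up for illustration and don't come from any particular study.

```python
from statistics import NormalDist


def z_test_for_mean(sample_mean, claimed_mean, std_dev, n, alpha, alternative):
    """Return (z, boundary, reject) for a test about a mean.

    alternative is the comparison picked in Step 2:
    "less", "greater", or "not equal".
    """
    # Step 3: the test statistic (how many standard errors from the claim)
    standard_error = std_dev / n ** 0.5
    z = (sample_mean - claimed_mean) / standard_error

    # Step 4: the boundary of the rejection region for the chosen alpha
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        boundary = std_normal.inv_cdf(1 - alpha)
        reject = z > boundary
    elif alternative == "less":
        boundary = std_normal.inv_cdf(alpha)
        reject = z < boundary
    elif alternative == "not equal":
        # split alpha between the two tails
        boundary = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > boundary
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'not equal'")

    # Step 6: reject the null hypothesis or not
    return z, boundary, reject


# Hypothetical numbers: expert claims a mean of 10, we think it's higher.
z, boundary, reject = z_test_for_mean(
    sample_mean=10.5, claimed_mean=10, std_dev=2, n=100,
    alpha=0.05, alternative="greater",
)
print(round(z, 2), round(boundary, 2), reject)
```

Steps 1, 2 and 5 stay with you: the code can't pick the hypotheses or list the assumptions for you.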

Let’s look at one hypothesis test and explore what’s happening:

An expert claims that a 6-year-old can learn a new 4-letter word in 2 hours, on average. You disagree and feel that the child needs a longer time period to learn a new 4-letter word.

So H0: μ = 2

Ha: μ > 2

We’ll take a random sample of 6-year-olds and time how long each one takes to learn a new 4-letter word. Suppose you sample 49 kids. You’re going to use your sample standard deviation in your test statistic (you’ll need to state this in your Step 5)…sometimes you use the expert’s standard deviation, sometimes one from the literature.

Suppose the sample mean is 2.2 hours with a sample standard deviation of 0.5 hour (30 minutes).

What’s the z score for this?

z = (2.2 - 2) / (0.5 / √49) = 0.2 / 0.0714 ≈ 2.8…well, the sample mean we got is pretty far away from the expert’s mean.
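If you’d like to check that arithmetic, here’s a small Python sketch using the numbers from the example. The variable names are just labels chosen for this illustration.

```python
from math import sqrt

claimed_mean = 2.0   # hours, the expert's claim (null hypothesis)
sample_mean = 2.2    # hours, from our sample
sample_sd = 0.5      # hours, sample standard deviation
n = 49               # number of kids sampled

# The standard deviation of a sample mean shrinks by the square root of n
standard_error = sample_sd / sqrt(n)

# z score: how many standard errors our sample mean sits above the claim
z = (sample_mean - claimed_mean) / standard_error
print(round(z, 2))  # 2.8
```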

Step 4 is often decided for you by industry standards…but let’s suppose we have some say in it. Let’s suppose we want an alpha of 1%…which, for our “greater than” alternative, corresponds to a z score of 2.33…this means that only 1% of the values of a standard normal distribution are higher than our boundary.

Did we find an average that is more than 2.33 standard deviations (of the sample mean) above the expert’s mean? Sure did: 2.8 is bigger than 2.33.
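The boundary for any alpha can be looked up in a z table, or computed. Here’s a sketch using Python’s standard library, assuming an upper-tail (“greater than”) test like ours; the alphas of 10% and 5% are included only for comparison and aren’t part of the example.

```python
from statistics import NormalDist

z = 2.8  # the test statistic from our sample of 49 kids

std_normal = NormalDist()  # mean 0, standard deviation 1
boundaries = {}
for alpha in (0.10, 0.05, 0.01):
    # the z score with only `alpha` of the distribution above it
    boundaries[alpha] = std_normal.inv_cdf(1 - alpha)
    print(alpha, round(boundaries[alpha], 2), z > boundaries[alpha])
```

The smaller the alpha, the farther out the boundary sits, so the harder it is to reject the expert’s claim.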

So we’ve found that an honest sample nets an average so far above the mean the expert claims that it’s likelier to come from a different distribution than from the one posited by the expert…it’s more likely to come from a distribution centered higher than 2. Here’s a picture of what we think is probably true:

Here’s where we’ve got to be cautious. We can claim that the expert is wrong and it does take longer…but we’ve got to admit to an alpha of 1%…if the expert is actually right, there’s still a 1% chance of drawing a sample that lands beyond our boundary. In that case our sample did come from the expert’s distribution…we just got a few outliers in our sample and it came out high.
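One way to get a feel for what alpha means is to pretend the expert is right and see how often an honest sample would fool us anyway. The simulation below is only a sketch under loud assumptions: learning times are taken to be normally distributed with mean 2 hours and a known standard deviation of 0.5 hour, and the seed and number of trials are arbitrary choices.

```python
import random
from statistics import NormalDist

random.seed(0)        # arbitrary seed so the run is repeatable

CLAIMED_MEAN = 2.0    # hours; in this simulation the expert is RIGHT
POP_SD = 0.5          # hours; assumed known for this sketch
N = 49                # kids per sample
ALPHA = 0.01
TRIALS = 5000         # number of pretend studies

boundary = NormalDist().inv_cdf(1 - ALPHA)   # about 2.33
standard_error = POP_SD / N ** 0.5

false_alarms = 0
for _ in range(TRIALS):
    # draw one honest sample from the expert's distribution
    sample = [random.gauss(CLAIMED_MEAN, POP_SD) for _ in range(N)]
    sample_mean = sum(sample) / N
    z = (sample_mean - CLAIMED_MEAN) / standard_error
    if z > boundary:
        false_alarms += 1   # we would wrongly reject the expert's claim

false_alarm_rate = false_alarms / TRIALS
print(false_alarm_rate)
```

The printed rate should land somewhere near 0.01: about 1 pretend study in 100 calls the expert wrong even though the expert is right.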

This business of claiming someone is wrong is quite delicate. You see, since we don’t know the real true answer…and neither does the expert…we’re depending on data.

There’s a null hypothesis and an alternative hypothesis:

Decision / null hypothesis is true / alternative hypothesis is true
Accept null hypothesis / correct / Type 2 error
Reject null hypothesis / Type 1 error / correct

Our process keeps the probability of a Type 1 error (alpha) small…which, unfortunately, for a given sample size, increases the probability of a Type 2 error (beta).
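To see the trade-off in numbers, you have to suppose some particular true mean, because beta depends on how wrong the expert actually is. The sketch below assumes, purely for illustration, that the true mean is 2.2 hours and that the standard deviation is 0.5 hour with 49 kids; the function name beta_for_alpha is made up.

```python
from statistics import NormalDist


def beta_for_alpha(alpha, claimed_mean=2.0, true_mean=2.2, std_dev=0.5, n=49):
    """Chance of NOT rejecting the claim when the mean is really true_mean."""
    standard_error = std_dev / n ** 0.5
    # smallest sample mean that would land in the rejection region
    cutoff = claimed_mean + NormalDist().inv_cdf(1 - alpha) * standard_error
    # probability a sample mean falls below the cutoff under the true mean
    return NormalDist(true_mean, standard_error).cdf(cutoff)


for alpha in (0.05, 0.01):
    print(alpha, round(beta_for_alpha(alpha), 2))
```

Under these assumptions, tightening alpha from 5% to 1% pushes beta from about 0.12 up to about 0.32.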

It is necessary to specify your alpha at some point during the hypothesis test…and to realize that you’re just not ever going to get at The Truth.

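Our criminal justice system works with the same problem. The null hypothesis is that the defendant is innocent:

Verdict / defendant is innocent / defendant is guilty
jury finds innocent / correct / Type 2 error
jury finds guilty / Type 1 error / correct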

Exactly the same alpha and beta are in play…and there’s always some probability that the outcome is just plain wrong.