The Central Limit Theorem
If we take a measurement, and call it “x”, then x may be subject to variability.
If we repeat the measurement multiple times, say $n$ times, obtaining $x_1, x_2, \ldots, x_n$, then we could average the measurements and report $\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$.
What we will learn today is that
- The variability of the average is less than the variability of a single measurement x
- The average is more “normal” than a single measurement.
These two facts are the result of the so-called Central Limit Theorem.
Here is a summary of facts:
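Let $X_1, X_2, \ldots, X_n$ be a collection of independent random variables, all with the same distribution of mean $\mu$ and standard deviation $\sigma$. Thus $X_1, X_2, \ldots, X_n$ can be thought of as $n$ independent samples from the same distribution.
Define the sample mean (or average) by $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$.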
Fact 1:
$\bar{X}$ has mean $\mu$
$\bar{X}$ has standard deviation $\sigma/\sqrt{n}$
This says the mean of the average is the same as the mean of an individual measurement, but that the standard deviation of the average is smaller by a factor of $\sqrt{n}$.
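Fact 1 can be checked by simulation. The sketch below is only an illustration: the choice of a uniform distribution on (0, 1), the sample size n = 25, and the number of trials are assumptions made here, not part of the notes.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 25           # measurements per average (illustrative choice)
trials = 20_000  # number of averages to simulate (illustrative choice)

# A uniform(0, 1) measurement has mean 1/2 and standard deviation sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1 / 12)

# Each entry is the average of n independent uniform(0, 1) measurements.
sample_means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(trials)
]

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.pstdev(sample_means)
predicted_sd = sigma / math.sqrt(n)  # Fact 1: sigma / sqrt(n)

print(f"mean of the averages: {observed_mean:.4f} (Fact 1 predicts {mu})")
print(f"sd of the averages:   {observed_sd:.4f} (Fact 1 predicts {predicted_sd:.4f})")
```

The simulated standard deviation of the averages should land close to sigma divided by the square root of n, which is about one fifth of the standard deviation of a single measurement when n = 25.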
Fact 2:
As n gets larger and larger, the variability gets so small that $\bar{X}$ is virtually indistinguishable from the mean $\mu$.
This result is called the strong law of large numbers.
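A small simulation makes Fact 2 concrete. The example of rolling a fair six-sided die (mean 3.5) and the particular sample sizes are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

true_mean = 3.5  # mean of one roll of a fair six-sided die


def average_of_rolls(n):
    """Return the average of n simulated fair die rolls."""
    return statistics.fmean(random.randint(1, 6) for _ in range(n))


# Averages for increasingly large sample sizes.
results = {n: average_of_rolls(n) for n in (10, 1_000, 100_000)}

for n, avg in results.items():
    gap = abs(avg - true_mean)
    print(f"n = {n:>7}: average = {avg:.4f}, distance from 3.5 = {gap:.4f}")
```

For the largest sample size the average typically agrees with 3.5 to about two decimal places.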
Fact 3:
Sums and differences of “normals” are again “normal”
In addition: If each individual $X_i$ comes from a normal distribution, then $\bar{X}$ has a normal distribution whether the sample size n is small or large.
Fact 4: (Central Limit Theorem)
Let $X_1, X_2, \ldots, X_n$ be a collection of independent random variables, all with the same distribution of mean $\mu$ and standard deviation $\sigma$.
If $n$ is large, then $\bar{X}$ has an approximately normal distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
The point is that this is true no matter what the common distribution of the sample is: if we average enough of them, then the distribution of the average is approximately normal.
A rule of thumb is to use the normal distribution if n > 30.
What this theorem says: Distributions can be quite complex. Given a single random variable X with mean $\mu$ and standard deviation $\sigma$, we may have very little information about what values X could take. However, if we take an independent sample $X_1, X_2, \ldots, X_n$, all of which have this same (perhaps complicated) distribution, then taking averages “washes out” most of the complexity. The result is a smooth normal curve for which we have an easy system for computing probabilities!
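The “washing out” can be seen in a simulation. The sketch below averages samples from an exponential distribution, which is strongly skewed, and compares how often the average falls within 1 or 2 standard deviations of the mean against the normal prediction. The exponential distribution, n = 40, and the trial count are assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

n = 40           # sample size, above the n > 30 rule of thumb
trials = 20_000  # number of sample averages to simulate

# An exponential distribution with rate 1 has mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0
sd_of_mean = sigma / math.sqrt(n)

sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]


def fraction_within(k):
    """Fraction of simulated averages within k standard deviations of mu."""
    hits = sum(abs(m - mu) <= k * sd_of_mean for m in sample_means)
    return hits / trials


standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
for k in (1, 2):
    predicted = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} sd: simulated {fraction_within(k):.3f}, normal predicts {predicted:.3f}")
```

Even though a single exponential measurement looks nothing like a bell curve, the simulated fractions for the averages should sit close to the normal values of roughly 0.68 and 0.95.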
Example: Student scores on an exam are normally distributed with mean 500 and standard deviation 100.
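a) What is the probability a randomly selected student scores at least a 520?
A: The z-score of 520 is $(520 - 500)/100 = 0.2$, so $P(X \ge 520) = P(Z \ge 0.2) = 1 - 0.5793 = 0.4207$ from the normal tables.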
b) A class contains 25 students. What is the probability that the class average is at least 520?
A: In this case we have $\bar{X}$, which is based on n = 25 samples. By Fact 1, the mean is 500, but the standard deviation is $100/\sqrt{25} = 20$. Thus $P(\bar{X} \ge 520) = P(Z \ge 1) = 1 - 0.8413 = 0.1587$.
Note: we did not need to use the central limit theorem here because the scores themselves were already normally distributed.
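The table lookups in this example can be reproduced in code. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 500, 100  # exam score mean and standard deviation
n = 25                # class size

single_score = NormalDist(mu, sigma)             # one student's score
class_average = NormalDist(mu, sigma / sqrt(n))  # Fact 1: sd is 100/5 = 20

# P(at least 520) is 1 minus the cumulative probability at 520.
p_single = 1 - single_score.cdf(520)
p_average = 1 - class_average.cdf(520)

print(f"P(single score >= 520)  = {p_single:.4f}")
print(f"P(class average >= 520) = {p_average:.4f}")
```

The two values should match the table answers of about 0.4207 and 0.1587.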
Example: Suppose that one’s body temperature is normally distributed with mean 98.6 degrees Fahrenheit and standard deviation 0.3 degrees.
a) If a randomly selected person is measured once, what is the probability the temperature exceeds 98.9?
b) If a randomly selected person is measured 4 times independently (on different days), what is the probability the average of the four measurements exceeds 98.9?
Answers:
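a) The z-score of 98.9 is $(98.9 - 98.6)/0.3 = 1$. So $P(X > 98.9) = P(Z > 1) = 0.1587$.
b) For four measurements, the average has standard deviation $0.3/\sqrt{4} = 0.15$. In this case, the z-score of 98.9 is $(98.9 - 98.6)/0.15 = 2$. So $P(\bar{X} > 98.9) = P(Z > 2) = 0.0228$.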
This illustrates the fact that averages are less likely to deviate from the mean than individual measurements are.
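To see how quickly the probability shrinks as more measurements are averaged, the same calculation can be repeated for several sample sizes. The helper name `prob_average_exceeds` and the extra sample size n = 16 are assumptions added here for illustration; n = 1 and n = 4 correspond to parts a) and b).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 98.6, 0.3  # body temperature mean and standard deviation
threshold = 98.9


def prob_average_exceeds(n):
    """P(average of n independent normal measurements > threshold)."""
    average = NormalDist(mu, sigma / sqrt(n))  # Fact 1 and Fact 3
    return 1 - average.cdf(threshold)


probabilities = {n: prob_average_exceeds(n) for n in (1, 4, 16)}

for n, p in probabilities.items():
    print(f"n = {n:>2}: P(average > {threshold}) = {p:.5f}")
```

Because the temperatures are assumed normal, Fact 3 makes these values exact for every n, not just for large n.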
Example: People do not all weigh the same. Let’s assume that for a large group of people, the mean weight is 150 pounds and the standard deviation is 30 pounds. Assume that 100 people are selected at random. What is the probability that this group has combined weight more than 14500 pounds?
Answer: Notice that the distribution of weights is not assumed to be normal. But since we have 100 (which is more than 30) people, we can use the central limit theorem. Recall that the central limit theorem tells us that the average of the sample is approximately normally distributed. On the other hand, we are interested in the combined weight. The point is that the statement about the combined weight being more than 14500 pounds can be rephrased as a statement about the sample average:
The combined weight is more than 14500 pounds if and only if the average weight is more than $14500/100 = 145$ pounds.
Thus P(combined weight > 14500) = $P(\bar{X} > 145)$.
The point is that since the sample size is so large, we can invoke the Central Limit Theorem and assume that $\bar{X}$ is approximately normally distributed with mean 150 and standard deviation $30/\sqrt{100} = 3$. So to compute $P(\bar{X} > 145)$ we find the z-score of 145, which is $(145 - 150)/3 \approx -1.67$. Thus
P(combined weight > 14500) = $P(Z > -1.67) = 0.9525$.
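The rephrasing from combined weight to average weight can be checked numerically. The sketch below computes the probability both ways under the normal approximation. The sum-based version uses the fact that a sum of n independent values has mean n times the individual mean and standard deviation equal to the individual standard deviation times the square root of n; that formula is not stated in these notes and is brought in here as an assumption.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 150, 30       # mean and standard deviation of one person's weight
n = 100                   # number of people selected
total_threshold = 14_500  # combined weight of interest, in pounds

# Route 1: the sample average is approximately normal(150, 30/sqrt(100) = 3).
average = NormalDist(mu, sigma / sqrt(n))
p_via_average = 1 - average.cdf(total_threshold / n)

# Route 2 (assumption, not from the notes): the sum is approximately
# normal with mean n*mu = 15000 and standard deviation sigma*sqrt(n) = 300.
total = NormalDist(n * mu, sigma * sqrt(n))
p_via_total = 1 - total.cdf(total_threshold)

print(f"via the average: {p_via_average:.4f}")
print(f"via the total:   {p_via_total:.4f}")
```

Both routes give the same z-score, so the two probabilities agree. The unrounded value is about 0.952; a table lookup with the z-score rounded to -1.67 gives 0.9525.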
------
c) What is the probability that the combined weight of 10 people exceeds 1450 pounds?
The point here is that the sample size is too small to justify the normal approximation. Usually we require a sample size of more than 30 to use the Central Limit Theorem. Therefore, unless we know something extra about the distribution of weights (for example, that it is normal, so that Fact 3 applies), we cannot compute this probability.