Unit 16 Continuous Probability Distributions

Normal Probability Distribution

The Normal Probability Distribution is the most commonly encountered distribution in nature. Whenever you measure some characteristic of a population, be it running times, weight, volume, etc., you will most likely get data that is normally distributed. If you do not use a measuring device to generate the data, then most likely the data will not be normally distributed. Some examples of data sets that are not normally distributed are polls, daffodils in a lawn, leaves on a tree, or people’s income, that is, things that you count.

The probability distribution of a normally distributed population will have the classic bell shape.

Pictured here is the Standard Normal Distribution. It is an abstract distribution with mean equal to zero and standard deviation equal to 1.0. If we wish to compare two different populations, we can convert a data value from each population to a standard score, also called a z-score, and then compare the two scores. For example, the height and weight of men are normally distributed with means of 69 inches and 168 lbs respectively, and standard deviations of 2.8 inches and 27 lbs respectively. The average-looking male would weigh 168 lbs and be 69 inches tall. What about someone who is 76 inches tall and weighs 180 pounds? To convert his height to a standard score, we use the following formula, where x is the data value, μ is the population mean, and σ is the population standard deviation:

$$ z = \frac{x - \mu}{\sigma} $$

His height converted to a standard score would then be

$$ z = \frac{76 - 69}{2.8} = 2.5 $$

The standard score corresponding to his weight would be

$$ z = \frac{180 - 168}{27} \approx 0.44 $$

Looking at these numbers, we would say that he’s much taller than he is heavy. In fact, this person would appear thin to us even though his weight is above average.
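The height and weight comparison above can be checked with a few lines of Python. This is only a minimal sketch: the function name `z_score` and its parameter names are made up for illustration, and the means and standard deviations are the ones quoted in the text.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / std_dev

# Population values quoted in the text for men.
height_z = z_score(76, mean=69, std_dev=2.8)   # inches
weight_z = z_score(180, mean=168, std_dev=27)  # pounds

print(f"height z-score: {height_z:.2f}")  # 2.50
print(f"weight z-score: {weight_z:.2f}")  # 0.44
```

Because both values are now in the same units (standard deviations from the mean), they can be compared directly even though inches and pounds cannot.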

Question #1.
Find the z-score of someone whose IQ is 120 if

It’s important to note that the histogram of sample data from a normally distributed population won’t be perfectly bell shaped, but it won’t be skewed either. Compare the following two histograms,

The histogram on the left is not perfectly bell shaped, but it is more or less centered around its mean of 234, which is also close to the mode of 235, and it starts out low, rises to a peak near the center, and then drops off. The histogram on the right is clearly skewed to the right. The data in the tail has pulled the mean of 194 to the right of the mode at 190. In a perfect world the two distributions would look as follows,

Area Equals Probability

Recall the histogram for rolling a pair of dice. We’ll use the expected count as the height of each bar, and assume that we roll the dice 36 times.

For example, the height of the bar for bin 7 is six, because if we roll the dice 36 times we would expect to get six sevens. We can create a relative frequency histogram by dividing the height of each bar by the total number of data points, in this case 36. Thus the height of the bar for bin 7 would be 1/6 and the height of the bar for bin 6 would be 5/36, etc. If you were to do this for every bar and then add the heights of all the bars, recall that the sum would be 1.0.

If we roll a pair of dice, we are sure to get a number between 2 and 12, and so $P(2 \le X \le 12) = 1$. What’s the probability of rolling a six? There are 5 ways that can happen out of 36 possibilities, so $P(6) = 5/36$. If we let the width of each bar in the histogram equal 1, then the height of each bar equals the area of each bar. The area of the bar over bin 6 in the relative frequency histogram is 5/36, the same as the probability of rolling a 6. We’ve seen this before; area of the histogram is equivalent to probability.
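The dice histogram described above can be rebuilt by listing all 36 outcomes and counting the totals. The sketch below uses exact fractions so the bar heights match the ones in the text; the variable names `counts` and `relative` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes give each total.
counts = {}
for die1, die2 in product(range(1, 7), repeat=2):
    total = die1 + die2
    counts[total] = counts.get(total, 0) + 1

# Relative frequency: each bar height divided by 36.
relative = {total: Fraction(n, 36) for total, n in counts.items()}

print(counts[7])               # 6, the expected number of sevens in 36 rolls
print(relative[6])             # 5/36
print(sum(relative.values()))  # 1
```

With a bar width of 1, each value in `relative` is both the height and the area of its bar, so the areas add up to 1.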

The same reasoning applies to continuous probability distributions. The total area under the normal distribution curve equals 1.0. However, there are some differences between discrete data sets and continuous data sets.

When we are dealing with continuous data sets, we can’t have single value bins. For one thing, there is no such thing as someone being, for example, exactly 69 inches tall. Also, if our populations are huge, then we could use many bins in our histogram. So many, that when you looked at it from a distance, the outline of the histogram would begin to look like a smooth curve. We use histograms for our samples, and we use smooth curves to represent populations. Look at the similarity between the two:

When we take a relatively small sample and plot its histogram, we are not going to get a perfect bell curve, but if it’s close, we are going to assume that the sample is from a normally distributed population, especially if the data was generated by taking measurements.
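The idea that a great many narrow bars trace out a smooth curve whose total area is 1.0 can be checked numerically. The sketch below is not part of the original unit: it assumes the standard formula for the height of the Standard Normal curve, and it adds up the areas of 10,000 thin rectangles between z = -8 and z = 8 (the area outside that range is negligible). The function names are made up for illustration.

```python
import math

def standard_normal_pdf(z):
    """Height of the Standard Normal curve (mean 0, standard deviation 1) at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def area_under_curve(a, b, strips=10_000):
    """Approximate the area between a and b by summing thin rectangles,
    each as tall as the curve at the middle of its base."""
    width = (b - a) / strips
    heights = (standard_normal_pdf(a + (i + 0.5) * width) for i in range(strips))
    return sum(heights) * width

total_area = area_under_curve(-8, 8)
print(round(total_area, 6))  # 1.0
```

Each rectangle plays the role of one very narrow histogram bar, which is why the sum of their areas behaves like the sum of the bar heights in the dice example.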

Area under the smooth curve is equivalent to probability just as it was for the histogram. So finding probabilities becomes a problem of finding areas. The mean of a normally distributed population is always the dead center of the curve. For example, IQ is normally distributed. The mean is 100 and the standard deviation is 15. The center of the bell curve would be 100 and 115 would be one standard deviation unit to the right and 85 would be one standard deviation to the left.
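To see area working as probability for the IQ distribution, the sketch below finds the area to the left of a score using the error function `math.erf` from Python's standard library. The helper `normal_cdf` is a name chosen here for illustration, not something from the course materials; the mean of 100 and standard deviation of 15 come from the text.

```python
import math

def normal_cdf(x, mean, std_dev):
    """Area under the normal curve to the left of x."""
    z = (x - mean) / std_dev
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

IQ_MEAN, IQ_STD_DEV = 100, 15

# The mean is the dead center, so half the area lies to its left.
below_mean = normal_cdf(100, IQ_MEAN, IQ_STD_DEV)

# Area between one standard deviation below and one above the mean.
within_one_sd = normal_cdf(115, IQ_MEAN, IQ_STD_DEV) - normal_cdf(85, IQ_MEAN, IQ_STD_DEV)

print(f"P(IQ < 100) = {below_mean:.4f}")          # 0.5000
print(f"P(85 < IQ < 115) = {within_one_sd:.4f}")  # 0.6827
```

Subtracting two left-hand areas gives the area between the two scores, which is the probability that a randomly chosen person has an IQ in that range.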

Question #2.
For the IQ distribution, how many standard deviations from the mean would a score of 145 be?

This is the end of Unit 16. In class, you will get more practice with these concepts by working exercises in MyMathLab.

Copyright ©RHarrow 2013