Lab Activity 4: Statistics, Parameters, and Sampling Distributions

Part I – Statistics and Parameters

In each situation, explain whether the value given in bold print is a statistic or a parameter:

1.  A polling organization samples 1000 adults nationwide and finds that 72% of those sampled favor tougher penalties for persons convicted of drunk driving.

Statistic -- “of those sampled”

2.  In their year 2000 census, the United States Census Bureau found that the median age of all American citizens was about 35 years.

Parameter – “All” Americans

3.  For a sample of 20 men and 25 women, there is a 14 centimeter difference in the mean heights of the men and women.

Statistic -- value for sample

4.  A writer wants to know how many typing mistakes there are in her manuscript, so she hires a proofreader who tells her that in all books of the same length as hers, the average is 15 errors.

Parameter -- value for books of that length

Explain whether p̂ or p is the correct statistical notation for each proportion described:

5.  The proportion that smokes in a randomly selected sample of n = 300 students in the 11th and 12th grades.

p̂ because it’s the value from the sample

6.  The proportion that smokes among all students in the 11th or 12th grade in the United States.

p because it’s the value for the population

7.  The proportion that is left-handed in a sample of n = 250 individuals.

p̂ because it’s the value from the sample

Part II – Sampling Distribution of Sample Proportion

A newspaper conducts a poll to determine the proportion of adults who favor a certain candidate. They ask a random sample of 400 people whether or not they favor that candidate (Assume no bias!). Suppose the true proportion of adults who favor the candidate is 64%.

1.  The newspaper records the sample proportion who favor the candidate. What is the sampling distribution of the sample proportion? Draw a picture! (Do it by hand…having in mind the “empirical rule” on p. 44.)

The sampling distribution of the sample proportion p̂ can be approximated by a normal distribution with mean = p = 0.64 and standard deviation = √(p(1-p)/n) = √(0.64*(1-0.64)/400) = 0.024. The approximation applies because n*p = 400 * 0.64 = 256 and n*(1-p) = 400 * 0.36 = 144 are both > 10.

So, when drawing the curve for the sampling distribution of the sample proportion, have in mind that 99.7% of the area under the curve should fall between 0.64 ± 3*(0.024) = (0.568, 0.712). The tails should approach (be close to) the horizontal axis at these points.
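The arithmetic above can be checked in a few lines of code. The following is a minimal sketch in Python using only the standard library; the variable names are illustrative and not part of the lab.

```python
import math

n = 400   # sample size from the poll
p = 0.64  # true population proportion

# Condition for the normal approximation: both expected counts exceed 10.
expected_yes = n * p
expected_no = n * (1 - p)
normal_ok = expected_yes > 10 and expected_no > 10

# Mean and standard deviation of the sampling distribution of p-hat.
mean_p_hat = p
sd_p_hat = math.sqrt(p * (1 - p) / n)

# Empirical rule: about 99.7% of sample proportions fall within 3 SDs of the mean.
low = mean_p_hat - 3 * sd_p_hat
high = mean_p_hat + 3 * sd_p_hat

print(f"normal approximation OK: {normal_ok}")
print(f"sd of p-hat: {sd_p_hat:.3f}")
print(f"99.7% range: ({low:.3f}, {high:.3f})")
```

The printed range gives the two points where the tails of the hand-drawn curve should come close to the horizontal axis.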

2.  What is the probability that the newspaper would have recorded a sample proportion greater than 68%?

z-score = (0.68 -0.64)/ 0.024 = 1.67

P(p̂ > 0.68) = P(Z > 1.67) = 1 – P(Z < 1.67) = 1 - 0.9525 = 0.0475
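For readers without a z-table at hand, the same probability can be computed in software. This is a sketch that assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; it shows both the table-style calculation (z rounded to two decimals) and the calculation without rounding.

```python
from statistics import NormalDist

p, sd_p_hat = 0.64, 0.024

# Table-style calculation: round z to two decimals, as when using a z-table.
z = round((0.68 - p) / sd_p_hat, 2)
prob_table = 1 - NormalDist().cdf(z)

# Same probability without rounding z.
prob_unrounded = 1 - NormalDist(mu=p, sigma=sd_p_hat).cdf(0.68)

print(f"z = {z}")
print(f"P(p-hat > 0.68), z rounded:   {prob_table:.4f}")
print(f"P(p-hat > 0.68), z unrounded: {prob_unrounded:.4f}")
```

The two results differ slightly in the fourth decimal place because the table method rounds z before looking up the area.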

3.  What is the probability that less than 55% of the newspaper respondents would support this candidate?

z-score = (0.55 -0.64)/ 0.024 = - 3.75

P(p̂ < 0.55) = P(Z < -3.75) ≈ 0.0001

If you try to look this up, the table only goes down to -3.49, so we know the probability must be less than P(Z < -3.49); that is, P(p̂ < 0.55) < 0.0002. This makes sense if we look at the picture of the p̂ distribution above. Notice that there is almost no area to the left of 0.55, meaning almost NO CHANCE of observing a sample proportion smaller than 0.55.
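When a z-score falls beyond the range of the printed table, software can supply the tail area directly. The sketch below, again assuming Python's statistics.NormalDist, compares the area at the table's limit with the area at the z-score from this problem.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

tail_at_table_limit = std_normal.cdf(-3.49)  # smallest z on the table described above
tail_at_z = std_normal.cdf(-3.75)            # z-score for p-hat = 0.55

print(f"P(Z < -3.49) = {tail_at_table_limit:.6f}")
print(f"P(Z < -3.75) = {tail_at_z:.6f}")
```

Both areas are tiny, and the second is smaller than the first, which is consistent with the bound obtained from the table.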

Part III – Sampling Distribution of Sample Mean

Put the CD that came with your textbook into the CD-ROM drive; the menu for the CD should appear on the screen automatically. Click on “Turn on Your Computer Applets”. Select the “SampleMeans” applet. This simulation will help us understand the difficult concept of the sampling distribution for a sample mean.

1.  As you can see, we are looking at a population that is normally distributed with μ = 8 and σ = 5. Let’s say that we want to take a sample of size 10 from this population (n = 10). What does the sampling distribution of x̄ look like in this case? You can get an idea by using this simulation. Enter “10” in the box at the top labeled “# Observations per sample”. Now click on the button labeled “100” under “# Samples”. This will generate 100 samples of size 10 (n=10) from the population, calculate the sample mean for each, and create a red histogram that shows the distribution of those 100 sample means.

Click on the “100” button again. Roughly what shape is the histogram for these 200 sample means?

What are the center and spread? Does the spread make sense?

The histogram is roughly bell-shaped. (It is centered at 8 and stretches from 5 to 11.)

2.  From the CD menu, click “Turn on Your Computer Applets” again and choose “SampleMeans” once more. This should open a new browser window with another applet. KEEP the previous one as well for comparison purposes.

Suppose we now want to take a sample of size 100. What would the sampling distribution of x̄ look like in this case? This time, enter “100” as the number of observations per sample and then generate 100 samples.

Again, the histogram is roughly bell-shaped. (It is still centered at 8 but now stretches only from 7 to 9.)

As you increased the sample size, what changed about the sampling distribution for x̄? Has the overall shape changed? The mean? The spread (variability)?

The only change was in the spread (or variability).
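If the textbook CD is not available, the comparison can be imitated with a short simulation. The sketch below is not the applet itself: it is a rough Python stand-in that draws samples from a normal population with μ = 8 and σ = 5, and the function name, seed, and number of samples are illustrative assumptions.

```python
import random
import statistics

def simulate_sample_means(n, num_samples, mu=8.0, sigma=5.0, seed=0):
    """Draw num_samples samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same samples
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

means_n10 = simulate_sample_means(n=10, num_samples=200)
means_n100 = simulate_sample_means(n=100, num_samples=200)

for label, means in (("n = 10", means_n10), ("n = 100", means_n100)):
    print(
        f"{label}: center = {statistics.fmean(means):.2f}, "
        f"spread (sd) = {statistics.stdev(means):.2f}, "
        f"range = ({min(means):.1f}, {max(means):.1f})"
    )
```

Both sets of sample means should be centered near 8, with the n = 100 means packed into a much narrower range than the n = 10 means.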

3.  Instead of just looking at a histogram, let’s actually make some calculations in order to describe these sampling distributions. Using the normal curve approximation rule, describe the sampling distribution of x̄ for samples of size 10 from the population above. What is the shape? mean? standard deviation?

The sampling distribution would be approximately normal (bell-shaped) with mean µ = 8 and standard deviation = 5 / sqrt(10) = 1.58.

4.  Now describe the sampling distribution of x̄ for samples of size 100. What is the shape? mean? standard deviation? Compare these results to your answer in #3. Do you notice the same similarities/differences here as you did using the simulation?

The sampling distribution would be approximately normal (bell-shaped) with mean µ = 8 and standard deviation = 5 / sqrt(100) = 0.5. This matches earlier observations that the shape and mean of the distribution remained constant but the spread decreased.
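The two standard deviations can be compared directly in code. This is a small sketch with a hypothetical helper name, sd_of_sample_mean, that applies the rule σ / sqrt(n) to both sample sizes.

```python
import math

sigma = 5.0  # population standard deviation

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sampling distribution of x-bar: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sd_n10 = sd_of_sample_mean(sigma, 10)
sd_n100 = sd_of_sample_mean(sigma, 100)
ratio = sd_n10 / sd_n100  # how many times wider the n = 10 distribution is

print(f"n = 10:  sd of x-bar = {sd_n10:.2f}")
print(f"n = 100: sd of x-bar = {sd_n100:.2f}")
print(f"ratio = {ratio:.2f}")
```

Multiplying the sample size by 10 divides the spread by sqrt(10), about 3.16, not by 10.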

Suppose I take a sample of size 100 from the population above. What is the probability that I will observe a sample mean greater than 9?

P(x̄ > 9) = P(Z > (9-8)/0.5) = P(Z > 2) = 1 – P(Z < 2) = 1 - 0.9772 = 0.0228
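One way to build confidence in this answer is to compare the normal calculation with a simulation. The sketch below assumes Python's standard library; the seed and the number of simulated samples are arbitrary choices made for illustration.

```python
import random
import statistics
from statistics import NormalDist

mu, sigma, n = 8.0, 5.0, 100
sd_x_bar = sigma / n ** 0.5

# Normal calculation, matching the z-score approach.
prob_normal = 1 - NormalDist(mu=mu, sigma=sd_x_bar).cdf(9)

# Simulation check: fraction of simulated sample means that exceed 9.
rng = random.Random(1)  # fixed seed (arbitrary) so the run is repeatable
num_samples = 5000      # number of simulated samples (arbitrary choice)
count_above = sum(
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n)) > 9
    for _ in range(num_samples)
)
prob_simulated = count_above / num_samples

print(f"normal calculation: {prob_normal:.4f}")
print(f"simulated fraction: {prob_simulated:.4f}")
```

The simulated fraction will not match the normal calculation exactly, but it should land close to it, and it gets closer as num_samples grows.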

5.  Remember that the population does not have to be normally distributed in order for the normal approximation rule to apply, but our sample size does have to be relatively large (n ≥ 30). To illustrate this, open the “TVMeans” applet.

Notice that the population is now quite skewed to the right. However, use the simulation to generate 100 samples of size 5 (n=5) from this population and look at the distribution of the sample means. What do you notice about the shape?

The histogram is skewed to the right, but not as drastically as the population distribution.

6.  Now clear the applet and generate 100 samples of size 30 (n=30) from this population. What is the shape of the sampling distribution now?

Now that we increased the sample size to n=30, the histogram is roughly bell-shaped.
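The effect of sample size on shape can also be explored outside the applet. The sketch below does not use the applet's data. As a loudly labeled assumption, it substitutes a gamma distribution for the right-skewed population, with shape and scale chosen to match the population mean and standard deviation reported for this applet (8.352 and 7.723 hours). It uses 4000 samples per setting rather than 100 so that the skewness estimates are stable.

```python
import random
import statistics

# ASSUMPTION: a gamma distribution stands in for the applet's right-skewed
# population; shape and scale are chosen to give mean 8.352 and sd 7.723.
POP_MEAN, POP_SD = 8.352, 7.723
SHAPE = (POP_MEAN / POP_SD) ** 2
SCALE = POP_SD ** 2 / POP_MEAN

def skewness(values):
    """Sample skewness: third central moment divided by the variance to the power 1.5."""
    m = statistics.fmean(values)
    m2 = statistics.fmean((v - m) ** 2 for v in values)
    m3 = statistics.fmean((v - m) ** 3 for v in values)
    return m3 / m2 ** 1.5

def sample_means(n, num_samples, rng):
    """Return the means of num_samples samples of size n from the stand-in population."""
    return [
        statistics.fmean(rng.gammavariate(SHAPE, SCALE) for _ in range(n))
        for _ in range(num_samples)
    ]

rng = random.Random(2)  # fixed seed (arbitrary) so the run is repeatable
skew_n5 = skewness(sample_means(5, 4000, rng))
skew_n30 = skewness(sample_means(30, 4000, rng))

print(f"skewness of sample means, n = 5:  {skew_n5:.2f}")
print(f"skewness of sample means, n = 30: {skew_n30:.2f}")
```

A skewness of 0 corresponds to a symmetric histogram. The value for n = 30 should be noticeably closer to 0 than the value for n = 5, which mirrors what the histograms show.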

Calculate the mean and standard deviation for the sampling distribution of x̄ in this case.

The mean of the sampling distribution is µ = 8.352 and the standard deviation is σ/√n = 7.723 / sqrt(30) = 1.41.

If I took a sample of size n=30 from this population and calculated the sample mean of hours watching TV in a week, what is the probability the sample mean would be between 8 and 9 hours?

P(8 < x̄ < 9) = P(x̄ < 9) – P(x̄ < 8) = P(Z < 0.46) – P(Z < -0.25) = 0.6772 - 0.4013 = 0.2759
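The "between" probability is the difference of two cumulative areas, which is easy to reproduce in code. This sketch assumes Python's statistics.NormalDist and uses the population mean and standard deviation given above.

```python
from statistics import NormalDist

pop_mean, pop_sd, n = 8.352, 7.723, 30
sd_x_bar = pop_sd / n ** 0.5

# z-scores for the two endpoints, as used with the table.
z_low = (8 - pop_mean) / sd_x_bar
z_high = (9 - pop_mean) / sd_x_bar

# Area between the endpoints under the approximate normal curve for x-bar.
x_bar_dist = NormalDist(mu=pop_mean, sigma=sd_x_bar)
prob_between = x_bar_dist.cdf(9) - x_bar_dist.cdf(8)

print(f"sd of x-bar = {sd_x_bar:.2f}")
print(f"z-scores: {z_low:.2f} and {z_high:.2f}")
print(f"P(8 < x-bar < 9) = {prob_between:.4f}")
```

Because the code does not round the z-scores, its result should agree with the table-based answer to about three decimal places rather than exactly.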
