Chapter 10: Sampling Distributions

Some Vocabulary

  • Parameter – any number that describes a population, unknown in statistical practice because we cannot examine the entire population
  • Statistic – number that can be computed from sample data without making use of any unknown parameters – often used to estimate the value of a population parameter

Example: µ is the population mean and x̄ is the sample mean

10.1 Your local newspaper contains a large number of advertisements for unfurnished one-bedroom apartments. You choose 10 at random and calculate that their mean monthly rent is $540 and that the standard deviation of their rents is $80.

10.2 Voter registration records show that 68% of all voters in Indianapolis are registered as Republicans. To test a random-digit dialing device, you use the device to call 150 randomly chosen residential telephones in Indianapolis. Of the registered voters contacted, 73% are registered Republicans.

10.3 A carload lot of ball bearings has a mean diameter of 2.5003 cm. This is within the specifications for acceptance of the lot by the purchaser. By chance, an inspector chooses 100 bearings from the lot that have a mean diameter of 2.5009 cm. Because this is outside the specified limits, the lot is mistakenly rejected.

Statistical Estimation and The Law of Large Numbers

Draw observations at random from any population with finite mean µ. As the number of observations drawn increases, the mean of the observed values gets closer and closer to the mean of the population.
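The law of large numbers can be watched in a small simulation. The sketch below is only an illustration, not part of the original notes: it assumes a fair six-sided die as the population (so µ = 3.5), and the function name mean_of_die_rolls and the fixed seed are made up for the example.

```python
import random

POPULATION_MEAN = 3.5  # mean of a fair die: (1 + 2 + ... + 6) / 6


def mean_of_die_rolls(n_rolls, seed=0):
    """Return the sample mean of n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls


# Larger samples should tend to land closer to the population mean.
for n in (10, 100, 10_000, 100_000):
    x_bar = mean_of_die_rolls(n)
    print(f"n = {n:>7}: sample mean = {x_bar:.4f}, "
          f"distance from mu = {abs(x_bar - POPULATION_MEAN):.4f}")
```

The distance from µ need not shrink at every single step, but for large n it is very likely to be small.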

10.5 The idea of insurance is that we all face risks that are unlikely but carry high cost. Think of a fire destroying your home. Insurance spreads the risk: we all pay a small amount, and the insurance policy pays a large amount to those few whose homes burn down. An insurance company looks at the records for millions of homeowners and sees that the mean loss from fire in a year is $250 per person. The company plans to sell fire insurance for $250 plus enough to cover its costs and profit. Explain clearly why it would be unwise to sell only 12 policies. Then explain why selling thousands of such policies is a safe business.

What is the Fire Threat?
Number of Residential Fires in 1999 / 371,000
Number of Fire Deaths in the Home in 1999 / 2,895
Cost of Residential Fires in 1997 / $4,565,000,000
Source: Fire loss in the United States during 1999, National Fire Protection Association (NFPA)

Sampling Distributions

-The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Sampling Distribution of x̄

-Suppose that x̄ is the mean of an SRS of size n drawn from a large population with mean µ and standard deviation σ. Then the mean of the sampling distribution of x̄ is µ and its standard deviation is σ/√n.

-If individual observations have the N(µ, σ) distribution, then the sample mean x̄ of n independent observations has the N(µ, σ/√n) distribution.

Example: Suppose the heights of American women are distributed N(64, 2.7).

  1. What is the probability that if 1 woman is selected at random, her height will be more than 66 inches?
  2. What is the probability that if 9 women are selected at random, their average height will be more than 66 inches?
  3. What is the probability that if 25 women are selected at random, their average height will be more than 66 inches?
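One way to check answers to the three questions is to standardize with σ/√n and read off the upper tail of the standard Normal distribution. The sketch below is an illustration under the stated N(64, 2.7) assumption. The helper names normal_upper_tail and prob_mean_exceeds are invented for this example, and the tail area is computed from the standard library's math.erfc.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard Normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def prob_mean_exceeds(threshold, mu, sigma, n):
    """P(x-bar > threshold) when x-bar is N(mu, sigma / sqrt(n))."""
    standard_error = sigma / math.sqrt(n)  # sd of the sample mean
    z = (threshold - mu) / standard_error  # standardized threshold
    return normal_upper_tail(z)


# Heights are assumed N(64, 2.7); only the sample size changes.
for n in (1, 9, 25):
    p = prob_mean_exceeds(66, mu=64, sigma=2.7, n=n)
    print(f"n = {n:>2}: sd of mean = {2.7 / math.sqrt(n):.2f}, "
          f"P(mean > 66) = {p:.4f}")
```

The probabilities come out to roughly 0.23, 0.013, and 0.0001. The threshold stays at 66 inches, but the sample mean varies less as n grows, so a mean that high becomes much less likely.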

Central Limit Theorem

Draw an SRS of size n from any population with mean µ and finite standard deviation σ. When n is large, the sampling distribution of the sample mean x̄ is approximately Normal: N(µ, σ/√n).

For sufficiently large samples, the sampling distribution of x̄ will be approximately Normal.

  • Typically, a sample size of 25 or 30 is sufficiently large
  • The amazing and counter-intuitive thing about the central limit theorem is that no matter what the shape of the original distribution, the sampling distribution of the mean approaches a normal distribution
  • The necessary sample size depends on the normality or skewness of the distribution of the population
  • The larger the sample size, the better the Normal approximation
  • Averages are less variable than individual observations
  • The Central Limit Theorem holds when the distribution of the population is either unknown or non-Normal
  • If the population is Normal, then the sampling distribution of x̄ will be Normal regardless of the sample size
  • If we can claim that the sampling distribution is Normal, we can then make statements concerning the probability of obtaining certain values for the sample mean
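The points above can be explored with a simulation. The sketch below is an illustration only: it assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), a sample size of 30, and a made-up function name, simulate_sample_means. It then compares the simulated sample means with what the central limit theorem predicts.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, seed=1):
    """Draw `reps` samples of size n from an exponential population
    (mean 1, sd 1) and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


n = 30
means = simulate_sample_means(n, reps=5000)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means within one sd of the center; about 68% for a Normal curve.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f} (theory: 1.000)")
print(f"sd of sample means:   {spread:.3f} (theory: {1 / math.sqrt(n):.3f})")
print(f"share within 1 sd:    {within_one_sd:.3f} (Normal: about 0.68)")
```

Changing n to a small value such as 2 shows the skew of the population carrying over into the sample means, which is why the necessary sample size depends on the shape of the population.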

In a Nutshell

1. The mean of the sampling distribution of means is equal to the mean of the population from which the samples were drawn.

2. The variance of the sampling distribution of means is equal to the variance of the population from which the samples were drawn divided by the size of the samples.

3. If the original population is distributed normally, the sampling distribution of means will also be normal. If the original population is not normally distributed, the sampling distribution of means will increasingly approximate a normal distribution as sample size increases.

Example: The number of flaws per square yard in a type of carpet material varies with a mean of 1.6 flaws per square yard and standard deviation of 1.2 flaws per square yard. The population distribution cannot be Normal, because a count takes only whole-number values. An inspector samples 200 square yards of the material, records the number of flaws found in each square yard, and calculates x̄, the mean number of flaws per square yard inspected. Use the central limit theorem to find the approximate probability that the mean number of flaws exceeds 2 per square yard.
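A sketch of the calculation, assuming the central limit theorem approximation x̄ ~ N(1.6, 1.2/√200) is adequate for n = 200. The variable names are chosen for this illustration, and the Normal tail area is computed with math.erfc from the standard library.

```python
import math

mu = 1.6     # population mean, flaws per square yard
sigma = 1.2  # population standard deviation
n = 200      # square yards inspected

# Standard deviation of the sample mean under the CLT approximation.
standard_error = sigma / math.sqrt(n)

# Standardize the threshold of 2 flaws per square yard.
z = (2 - mu) / standard_error

# Upper-tail area of the standard Normal distribution: P(Z > z).
p = 0.5 * math.erfc(z / math.sqrt(2))

print(f"sd of sample mean = {standard_error:.4f}")
print(f"z = {z:.2f}")
print(f"P(mean > 2) is approximately {p:.2e}")
```

The threshold lies more than 4.7 standard deviations above µ, so the probability is on the order of one in a million: a sample mean above 2 flaws per square yard would be a very surprising result.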