Mathematics IV

Frameworks

Student Edition

Unit 1

How Confident Are You?

1st Edition

June, 2010

Georgia Department of Education

Table of Contents

Introduction

Colors of Reese’s Pieces Candies Learning Task

Pennies Learning Task

Gettysburg Address Learning Task

Confidence Intervals Learning Tasks

Introduction:

In Mathematics III, students began examining sampling distributions and sampling variability, laying the groundwork for the Central Limit Theorem, a major topic in this unit. They then studied a number of discrete probability distributions and one particular continuous probability distribution, the normal distribution. This unit continues to link the two. The Central Limit Theorem allows us to use a normal distribution to approximate the sampling distribution of sample means and sample proportions, even when the original population distribution is not normal. The second major topic is constructing confidence intervals and understanding margin of error. These ideas are prevalent in the media and are therefore important for all students to understand.

Much of the content in this unit is beyond what was traditionally taught in Georgia and may be beyond the mathematical and statistical training received by many teachers. Some general references that may be helpful to teachers are the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report, which can be found online; NCTM’s Navigating through Data Analysis in Grades 9 – 12; and “Statistics in the High School Mathematics Curriculum: Building Sound Reasoning under Uncertain Conditions” by Richard Scheaffer and Josh Tabor in the August 2008 Mathematics Teacher. The school’s AP Statistics teacher or local RESA math specialist may also be able to provide assistance.

It is assumed throughout the unit that students already know how to use normal distribution tables or how to calculate normal probabilities on their calculators. These skills were taught in Mathematics III and should be maintained in the present unit. Other previous topics, e.g. sampling distributions, will be reviewed throughout the unit.

Enduring Understandings:

  • As the sample size increases, the value of a statistic approaches the true value of the population parameter. The standard deviation of the sample means is $\sigma/\sqrt{n}$, and the standard deviation of the sample proportions is $\sqrt{p(1-p)/n}$. These formulas for the standard deviation are valid as long as the population is at least 10 times the size of the sample.
  • The Central Limit Theorem allows us to use normal probability calculations, given certain conditions are met, for sample means and proportions even if the original distributions are non-normal.
  • Confidence intervals provide a range of values that estimate the population parameter.
  • The margin of error, often cited in media articles, tells how accurate we believe our estimate of the parameter to be. To decrease the margin of error, we could increase the sample size or decrease how confident we need the result to be.

Key Standards Addressed:

MM4D1. Using simulation, students will develop the idea of the central limit theorem.

MM4D2. Using student-generated data from random samples of at least 30 members, students will determine the margin of error and confidence interval for a specified level of confidence.

MM4D3. Students will use confidence intervals and margin of error to make inferences from data about a population. Technology is used to evaluate confidence intervals, but students will be aware of the ideas involved.

Related Standards Addressed:

MM4P1. Students will solve problems (using appropriate technology).

a.Build new mathematical knowledge through problem solving.

b.Solve problems that arise in mathematics and in other contexts.

c.Apply and adapt a variety of appropriate strategies to solve problems.

d.Monitor and reflect on the process of mathematical problem solving.

MM4P2. Students will reason and evaluate mathematical arguments.

a.Recognize reasoning and proof as fundamental aspects of mathematics.

b.Make and investigate mathematical conjectures.

c.Develop and evaluate mathematical arguments and proofs.

d.Select and use various types of reasoning and methods of proof.

MM4P3. Students will communicate mathematically.

a.Organize and consolidate their mathematical thinking through communication.

b.Communicate their mathematical thinking coherently and clearly to peers, teachers, and others.

c.Analyze and evaluate the mathematical thinking and strategies of others.

d.Use the language of mathematics to express mathematical ideas precisely.

MM4P4. Students will make connections among mathematical ideas and to other disciplines.

a.Recognize and use connections among mathematical ideas.

b.Understand how mathematical ideas interconnect and build on one another to produce a coherent whole.

c.Recognize and apply mathematics in contexts outside of mathematics.

MM4P5. Students will represent mathematics in multiple ways.

a.Create and use representations to organize, record, and communicate mathematical ideas.

b.Select, apply, and translate among mathematical representations to solve problems.

c.Use representations to model and interpret physical, social, and mathematical phenomena.

Unit Overview:

The unit begins with students reading and discussing an internet article on presidential approval ratings, employing percents and margins of error. This motivates the unit and will be revisited in the Confidence Intervals Tasks.

The Reese’s Pieces Candies task reviews ideas about sampling distributions of sample proportions and develops this knowledge into the Central Limit Theorem (CLT) for Proportions. In addition to collecting data with actual candy, students will use simulation to fully develop the CLT. The task concludes with a problem intended to synthesize the ideas from the simulation.

The next two tasks, Pennies and Gettysburg Address, are nearly identical. Both tasks review sampling distributions of sample means from normal populations, a topic from Acc. Math II. Then, either with pennies or words from the Gettysburg Address, students will draw samples of various sizes, create whole class plots, and discuss the results. This activity leads to the CLT for Means. The tasks conclude with a problem intended to synthesize the ideas from the activity.

The final series of learning tasks builds on students’ understanding of the empirical rule to develop confidence intervals and margin of error. The article from the beginning of the unit will serve as a catalyst and basis for the investigation of margin of error and confidence intervals for proportions. Students will then simulate additional samples in an effort to understand what it means to say that one is 95% confident. The last part of this series addresses confidence intervals for sample means, including the use of data collection and simulation.

The culminating task requires students to synthesize the statistical knowledge gained throughout high school. They must design, implement, and analyze the results of a survey or experiment, paying particular attention to the use of statistical inference.

Vocabulary and Formulas:

Central Limit Theorem:

  • Choose a simple random sample of size n from any population with mean $\mu$ and standard deviation $\sigma$. When n is large (at least 30), the sampling distribution of the sample mean $\bar{x}$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
  • Choose a simple random sample of size n from a large population with population parameter p having some characteristic of interest. Then the sampling distribution of the sample proportion $\hat{p}$ is approximately normal with mean p and standard deviation $\sqrt{p(1-p)/n}$. This approximation becomes more and more accurate as the sample size n increases, and it is generally considered valid if the population is much larger than the sample and if $np \ge 10$ and $n(1-p) \ge 10$.
  • The CLT allows us to use normal calculations to determine probabilities about sample proportions and sample means obtained from populations that are not normally distributed.
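
The last point can be checked with a short simulation. The sketch below is only an illustration and is not part of the original task: it assumes an exponential population (chosen because it is strongly right-skewed), and the function name `simulate_sample_means`, the sample sizes, and the number of repetitions are all made up for the example.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=2000, rate=1.0, seed=0):
    """Draw num_samples samples of size n from an exponential population
    and return the list of their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(num_samples)
    ]


if __name__ == "__main__":
    # An exponential population with rate 1 has mean 1 and standard
    # deviation 1, and it is clearly not normal.
    mu, sigma = 1.0, 1.0
    for n in (5, 30, 100):
        means = simulate_sample_means(n)
        print(
            f"n={n:3d}  mean of sample means={statistics.fmean(means):.3f}  "
            f"SD of sample means={statistics.stdev(means):.3f}  "
            f"sigma/sqrt(n)={sigma / n ** 0.5:.3f}"
        )
```

For each sample size, the simulated standard deviation of the sample means can be compared with the value of sigma/sqrt(n) in the last column. A dotplot or histogram of `means` for each n would show how the shape changes as n grows.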

Confidence Interval is an interval for a parameter, calculated from the data, usually in the form estimate ± margin of error. The confidence level gives the probability that the interval will capture the true parameter value in repeated samples.
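
One way to see what "in repeated samples" means is to build many intervals from simulated samples and count how often they capture the true value. The sketch below assumes a made-up population proportion of 0.4 and samples of size 100; the function names `proportion_interval` and `coverage` are hypothetical, and the interval uses the normal approximation for a sample proportion.

```python
import random
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (low, high) for p-hat plus or minus z* times sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def coverage(true_p, n, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each simulated observation is a "success" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_interval(successes, n, confidence)
        if low <= true_p <= high:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # true_p = 0.4 and n = 100 are made-up values for illustration.
    print(f"Share of 95% intervals capturing p: {coverage(0.4, 100):.3f}")
```

The printed share should be compared with the stated confidence level. Any single interval either contains the true proportion or it does not; the confidence level describes the long-run behavior of the method.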

Margin of Error is the value in the confidence interval that says how accurate we believe our estimate of the parameter to be. The margin of error is the product of the critical z-score and the standard deviation (or standard error) of the estimate. The margin of error can be decreased by increasing the sample size or decreasing the confidence level.
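
As a rough illustration of this definition, the sketch below computes a margin of error for a sample proportion. The sample size (1,003) and the approval figure (41 percent) come from the poll excerpt in the first learning task; the 95% confidence level is an assumption, since the article does not state one, and the function name `margin_of_error` is made up for the example.

```python
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error for a sample proportion: z* times sqrt(p-hat(1 - p-hat)/n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * (p_hat * (1 - p_hat) / n) ** 0.5


if __name__ == "__main__":
    # Poll values: 1,003 adults, 41 percent approval.
    # The 95% confidence level is assumed, not taken from the article.
    print(f"n=1003, 95%: {margin_of_error(0.41, 1003):.3f}")
    # Larger sample at the same confidence level.
    print(f"n=4012, 95%: {margin_of_error(0.41, 4012):.3f}")
    # Same sample at a lower confidence level.
    print(f"n=1003, 90%: {margin_of_error(0.41, 1003, confidence=0.90):.3f}")
```

Under these assumptions the first value comes out near 0.03, which is in line with the "plus or minus 3 percentage points" reported in the article. The other two lines show the two ways of shrinking the margin named in the definition.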

Parameter is a number that describes the population. A parameter is a fixed number, but in practice we do not know its value because we cannot examine the entire population.

Sample Mean is a statistic measuring the average of the observations in the sample. It is written as $\bar{x}$. The mean of the population, a parameter, is written as $\mu$.

Sample Proportion is a statistic indicating the proportion of successes in a particular sample. It is written as $\hat{p}$. The population proportion, a parameter, is written as p.

Sampling Distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Sampling Variability refers to the fact that the value of a statistic varies in repeated random sampling.

Statistic is a number that describes a sample. The value of the statistic is known when we have taken a sample, but it can change from sample to sample. We often use a statistic to estimate an unknown parameter.

Colors of Reese’s Pieces Candies[1] Learning Task:

1. Why do I need to learn more about statistics?

  1. Read the following article. What do the numbers in the article represent?
  2. How reliable do you think the ratings are?
  3. How do you think pollsters determine approval ratings such as these?
  4. What do you think a margin of error is? Why is that important?

During this unit, you will learn the answers to these questions and how those answers are important.

Excerpt from

Poll: Iraq speeches, election don't help Bush

Tuesday, December 20, 2005; Posted: 12:56 a.m. EST (05:56 GMT)

CNN -- President Bush's approval ratings do not appear to have changed significantly, despite a number of recent speeches he's given to shore up public support for the war in Iraq and its historic elections on Thursday.

A CNN/USA Today Gallup poll conducted over the weekend found his approval rating stood at 41 percent, while more than half, or 56 percent, disapprove of how the president is handling his job. A majority, or 52 percent, say it was a mistake to send troops to Iraq, and 61 percent say they disapprove of how he is handling Iraq specifically. The margin of error was plus or minus 3 percentage points.

...

The poll was nearly split, 49 percent to 47 percent, between those who thought the U.S. will either "definitely" or "probably" win, and those who said the U.S. will lose. That said, 69 percent of those polled expressed optimism that the U.S. can win the war. The margin of error for how respondents assessed the war was plus or minus 4.5 percentage points.

...

Although half those polled said that a stable government in Iraq was likely within a year, 62 percent said Iraqi forces were unlikely to ensure security without U.S. assistance. And 63 percent said Iraq was unlikely to prevent terrorists from using Iraq as a base. The margin of error on questions pertaining to troop duration in Iraq, as well as the country's future, was plus or minus 3 percentage points.

The poll interviewed 1,003 adult Americans and found that the public has also grown more skeptical about Bush's key arguments in favor of the war. Compared with two years ago, when 57 percent considered Iraq a part of the war on terrorism, 43 percent think so now. In the weekend poll, 55 percent said they view the war in Iraq as separate from the war on terror. The margin of error on this line of questioning was plus or minus 3 percentage points.

On the domestic front, 56 percent of those polled say they disapprove of how Bush is handling the economy; by contrast, 41 percent approve. The margin of error was plus or minus 3 percentage points.

The president may find support for his call to renew the Patriot Act. Forty-four percent said they felt the Patriot Act is about right, and 18 percent said it doesn't go far enough. A third of respondents say they believe the Patriot Act has gone too far in restricting people's civil liberties to investigate suspected terrorism.

Nearly two-thirds said they are not willing to sacrifice civil liberties to prevent terrorism, as compared to 49 percent saying so in 2002. The margin of error was plus or minus 4.5 percentage points for those questions.

2. Reviewing some basics:

a. Think about a single bag of Reese’s Pieces. Does this single bag represent a sample of Reese’s Pieces or the population of Reese’s Pieces?

b. We use the term statistic to refer to measures based on samples and the term parameter to refer to measures of the entire population. If there are 62 Reese’s Pieces in your bag, is 62 a statistic or a parameter? If Hershey claims that 25% of all Reese’s Pieces are brown, is 25% a statistic or a parameter?

We also use different symbols to represent statistics and parameters. The following table will be very useful as we continue through this unit.

Measure / Parameter / Statistic
Proportion / $p$ / $\hat{p}$ “p-hat”
Mean / $\mu$ “mu” / $\bar{x}$ “x-bar”
Standard deviation / $\sigma$ “sigma” / $s$
Number / $N$ / $n$

3. How many orange candies should I expect in a bag of Reese’s Pieces?

a. From your bag of Reese’s Pieces, take a random sample of 10 candies. Record the count and proportion of each color in your sample.

Color / Orange / Yellow / Brown
Count / ____ / ____ / ____
Proportion / ____ / ____ / ____

b. Do you know the value of the proportion of orange candies manufactured by Hershey?

c. Do you know the value of the proportion of orange candies among the 10 that you selected?

d. Do you think that every student in the class obtained the same proportion of orange candies in his or her sample? Why or why not?

e. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 10 candies) obtained by the class members.

f. What is the average of the sample proportions obtained by your class?

g. Put the Reese’s Pieces back in the bag and take a random sample of 25 candies. Record the count and proportion of each color in your sample.

Color / Orange / Yellow / Brown
Count / ____ / ____ / ____
Proportion / ____ / ____ / ____

h. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 25 candies) obtained by the class members. Is there more or less variability than when you sampled 10 candies? Is this what you expected? Explain.

i. What is the average of the sample proportions (from the samples of 25) obtained by your class? Do you think this is closer or farther from the true proportion of oranges than the value you found in f? Explain.

j. This time, take a random sample of 40 candies. Record the count and proportion of each color in your sample.

Color / Orange / Yellow / Brown
Count / ____ / ____ / ____
Proportion / ____ / ____ / ____

k. Combine your results with the rest of the class and produce a dotplot for the distribution of sample proportions of orange candies (out of a sample of 40 candies) obtained by the class members. Is there more or less variability than the previous two samples? Is this what you expected? Explain.

l. What is the average of the sample proportions (from the samples of 40) obtained by your class? Do you think this is closer or farther from the true proportion of oranges than the values you found in f and i? Explain.
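
A class that wants more repetitions than the bags allow could mimic parts a through l with a simulation. The sketch below is a hedged illustration only: `ASSUMED_P` is a made-up proportion of orange candies (not a figure from Hershey), the function name `class_sample_proportions` and the class size of 30 are hypothetical, and each candy is treated as an independent draw from a very large population rather than from one finite bag.

```python
import random
import statistics


def class_sample_proportions(sample_size, true_p, num_students=30, seed=2):
    """Simulate each student's sample proportion of orange candies."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_students):
        # Count how many of the sampled candies come up orange.
        orange = sum(rng.random() < true_p for _ in range(sample_size))
        proportions.append(orange / sample_size)
    return proportions


if __name__ == "__main__":
    # ASSUMED_P is a made-up value so the simulation has something to
    # sample from; it is not the actual proportion of orange candies.
    ASSUMED_P = 0.5
    for size in (10, 25, 40):
        props = class_sample_proportions(size, ASSUMED_P)
        print(
            f"sample size {size:2d}: average p-hat = {statistics.fmean(props):.3f}, "
            f"SD of p-hat = {statistics.stdev(props):.3f}"
        )
```

The list returned for each sample size plays the role of the class dotplot, so its average and spread can be compared with what the class observed in parts f, i, and l.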

4. Sampling Distribution of $\hat{p}$

We have been looking at a number of different sampling distributions of $\hat{p}$, and we have seen that there is great variability in the distributions. We would like to know that $\hat{p}$ is a good estimate for the true proportion of orange Reese’s Pieces. However, there are guidelines for when we can use the statistic to estimate the parameter. This is what we will investigate in the next section.

First, however, we need to understand the center, shape, and spread of the sampling distribution of $\hat{p}$.

We know that if we are counting the number of Reese’s Pieces that are orange and comparing with those that are not orange, then the counts of oranges follow a binomial distribution (given that the population is much larger than our sample size).

  1. Recall from Math III the formulas for the mean and standard deviation of a binomial distribution.
  2. Given that $\hat{p} = X/n$, where X is the count of oranges and n is the total in the sample, how might we find $\mu_{\hat{p}}$ and $\sigma_{\hat{p}}$? Find a formula for each.
  3. This leads to the statement of the characteristics of the sampling distribution of a sample proportion.

The Sampling Distribution of a Sample Proportion:

Choose a simple random sample of size n from a large population with population parameter p having some characteristic of interest. Let $\hat{p}$ be the proportion of the sample having that characteristic. Then:

  • The mean of the sampling distribution is ____.
  • The standard deviation of the sampling distribution is ______.
  4. Let’s look at the standard deviation a bit more. What happens to the standard deviation as the sample size increases? Try a few examples to verify your conclusion. Then use the formula to explain why your conjecture is true.

If we wanted to cut the standard deviation in half, thus decreasing the variability of $\hat{p}$, what would we need to do in terms of our sample size?
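
One way to try a few examples is to tabulate the standard deviation of the sample proportion for several sample sizes. The sketch below assumes the formula sqrt(p(1 - p)/n) and a made-up population proportion of 0.5; the function name `sd_of_sample_proportion` and the list of sample sizes are chosen only for illustration.

```python
def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat for population proportion p and sample size n."""
    return (p * (1 - p) / n) ** 0.5


if __name__ == "__main__":
    p = 0.5  # a made-up population proportion for illustration
    for n in (25, 50, 100, 200, 400):
        print(f"n = {n:3d}  SD of p-hat = {sd_of_sample_proportion(p, n):.4f}")
```

Comparing pairs of rows in the output shows how much the sample size has to change before the standard deviation is cut in half.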