Teaching Difficult Topics in Statistics

The Statway™ Learning Model

Presented by Rachel Mudge, Andre Freeman, and Scott Guth

In this presentation, we focus on the following ideas:

•  How students learn difficult statistics topics in Statway.

•  What we know about how students learn statistics.

•  How Statway applies the Statistical Reasoning Learning Environment to support student learning.

•  Another Statway activity that applies this environment.

On the following page, you’ll find a snippet from Statway Lesson 7.1.1: Sampling Distributions of Sample Proportions. As we consider this lesson, think about how we implement:

·  A focus on ideas – not procedures

·  Real and motivating data that engage students

·  A classroom activity that supports student reasoning

·  Appropriate technology (not for just getting answers)

·  Meaningful classroom discourse

A Crash Course in Statistics and Sampling Distributions

In case there are a few statistically uninitiated members of the audience, we now take a moment to review a few key ideas.

In statistics, a population is the collection of individuals about whom we need information. Often there are too many individuals to measure or observe, so we make observations from a sample – a subset of the population. In statistics we assume that samples are random, meaning that all population members have the same likelihood of being included.

Measurements from a population are called parameters, and are usually unknown. Measurements from samples are called statistics. Statistics are used to estimate parameters. Samples vary, so statistics from random samples vary as well. A sampling distribution is the distribution of a statistic from varying random samples of a given size.

For example, our population might be all college students in the U.S. A random sample would be a relatively small collection of students chosen without bias. If we compute the average age of all students, we are computing a parameter. If we compute the average age of students in a sample, we are computing a statistic. If we consider average ages from every sample that contains 100 students, we are thinking about a sampling distribution.

Three important learning outcomes in a lesson on sampling distributions emphasize the distribution’s geometry. The key ideas are shape, center, and spread.

Shape: Sampling distributions are often normal (modeled by the bell curve) under certain conditions.

Center: For statistics such as the sample mean and the sample proportion, the mean of the sampling distribution is the parameter being estimated.

Spread: Spread is measured using the standard error (the standard deviation of the statistic). It is related to the population standard deviation. Standard error decreases as sample size increases.
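As a rough illustration of center and spread, the sketch below simulates the sampling distribution of a sample mean. The population of ages is invented for the example (it is not real enrollment data), and the names `population` and `sample_means`, the seed, and the sample sizes are all assumptions. Only the Python standard library is used.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population: 20,000 student ages, invented for illustration.
population = [random.choice(range(18, 41)) for _ in range(20_000)]
mu = statistics.fmean(population)       # parameter: population mean
sigma = statistics.pstdev(population)   # parameter: population standard deviation

def sample_means(n, reps=2000):
    """Return `reps` sample means, each from a random sample of size n."""
    return [statistics.fmean(random.sample(population, n)) for _ in range(reps)]

results = {}
for n in (25, 100, 400):
    means = sample_means(n)
    center = statistics.fmean(means)    # should sit near mu
    spread = statistics.stdev(means)    # simulated standard error
    results[n] = (center, spread)
    print(f"n={n:3d}  center={center:.2f}  spread={spread:.3f}  "
          f"sigma/sqrt(n)={sigma / n ** 0.5:.3f}")
```

In a run of this sketch, the center should stay close to the population mean for every sample size, while the spread should shrink roughly in proportion to the square root of n.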

We now consider Statway Lesson 7.1.1 – the first of several lessons which include the above outcomes. After considering this lesson, we think about its approach.

STATWAY™ student handout

Lesson 7.1.1

Distributions of Sample Proportions

Introduction

The population proportion of all blue M&Ms is an example of a parameter. The proportion of blue M&Ms in a sample is an example of a statistic. Statistics are often used to estimate parameters or to test claims made about population parameters.

In this activity we will gather multiple samples from a population. Each sample will yield a sample proportion. Our many different samples may produce many different sample proportions. These proportions are just a small part of the collection of all sample proportions, which forms a distribution of values called the sampling distribution of sample proportions.

Population / Sample
The collection of all M&Ms / 25 M&Ms
Parameter / Statistic
The proportion of blue candy in the population, p / The proportion of blue candy in the sample, p̂

Try These

You have been given a cup that contains a random sample of 25 M&Ms candies.

1 Count the colors of candies in your sample and fill in the chart below:

Blue / Brown / Green / Orange / Red / Yellow / Total
Number of candies
Proportion of candies

2 Record the proportion of blue candies in your sample in the dotplot on the board.

3 Plot the proportions gathered by the entire class on the dotplot below.

Figure 1: Dotplot for the proportion of blue candies.

4 Did everyone have the same proportion of blue candies?

A sampling distribution of sample proportions is the distribution of all possible sample proportions from samples of a given size. The dotplot that the class constructed is just part of the entire sampling distribution of sample proportions.

5 Describe the variability of the distribution of sample proportions on the board in terms of shape, center, and spread.

6 The unknown parameter is the true proportion of blue candies in the population of all M&Ms candies. What is your best estimate for this population parameter?

NEXT STEPS

In Part 2 of this lesson, we use a computer to further simulate part of the sampling distribution of sample proportions of blue M&Ms. Counting out candies is time consuming and inefficient, but technology can be used to better simulate a distribution of sample proportions. We will now use a computer applet to simulate additional sample proportions. The simulation requires a value for the population proportion, p, of all M&Ms that are blue, and the sample size, n.

Open the Blue M&Ms simulation at

http://www.statway.org/students.

The input fields for the applet are as follows: p is the population proportion, and n is the sample size. The simulation creates and plots 1000 sample proportions, from 1000 different samples of size 25. Once you enter values for the population proportion or sample size, the simulation automatically creates and plots the updated distribution of sample proportions.

TRY THESE

Simulating a Distribution of Sample Proportions

1 Describe the shape, center and spread of the simulated distribution of sample proportions.

2 How does this distribution compare to the one our class constructed on the board in terms of shape, center, and spread?

3 What sample proportions do you think are unlikely or unusual? Why? How do these unusual results compare to the results of our class distribution?

4 Would you agree with the claim that 50% of M&M candies are blue? Explain your answer.

5 Would you agree with the claim that 10% of M&M candies are blue? Explain your answer.

6 Give an interval in which most sample proportions are likely to fall.

END OF LESSON 7.1.1 EXCERPT
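For readers without access to the applet, the Part 2 simulation can be approximated with a short script. This is a sketch, not the Statway applet: the population proportion `p = 0.20` is a placeholder chosen for illustration (the handout leaves p for the student to enter), and the function name `simulate_proportions` is hypothetical. The comparison value sqrt(p(1 - p)/n) is the standard formula for the standard error of a sample proportion.

```python
import random
import statistics

def simulate_proportions(p, n, reps=1000, seed=0):
    """Simulate `reps` sample proportions from samples of size n,
    where each candy is blue with probability p."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

p, n = 0.20, 25   # placeholder inputs, not values taken from the lesson
props = simulate_proportions(p, n)

center = statistics.fmean(props)    # compare with p
spread = statistics.stdev(props)    # compare with the theoretical standard error
theory = (p * (1 - p) / n) ** 0.5

# Interval holding the middle 950 of the 1000 simulated proportions
ordered = sorted(props)
low, high = ordered[25], ordered[974]

print(f"center={center:.3f}  spread={spread:.3f}  theory SE={theory:.3f}")
print(f"about 95% of sample proportions fall between {low:.2f} and {high:.2f}")
```

Changing `p` or `n` and rerunning mirrors what students do with the applet, and the printed interval gives one possible answer to the question about where most sample proportions fall.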

Summarizing Lesson 7.1.1

1.  Lesson 7.1.1 addresses a theoretical topic. Does the lesson make the ideas accessible to students?

2.  How are the central ideas (center, shape, and spread) demonstrated?

3.  Does the activity present a real and relevant data set?

4.  Is the activity effective in supporting the development of student reasoning?

5.  Does the activity integrate the use of appropriate technological tools that allow students to test their conjectures, explore and analyze data, and develop their statistical reasoning? How?

6.  How does the activity promote classroom discourse that includes statistical arguments and sustained exchanges that focus on significant statistical ideas?

7.  It is quite difficult for students to visualize a sampling distribution. Is it likely that the lesson would achieve its desired outcomes (understanding shape, center, and spread for sampling distributions)?

The Statistical Reasoning Learning Environment

Based on prior studies, researchers Garfield and Ben-Zvi developed what they call a Statistical Reasoning Learning Environment. The model is based on six principles of instructional design described by Cobb and McClain (2004). It is similar to but extends the GAISE (2005) recommendations.

I.  Focus on developing central statistical ideas rather than on presenting a set of tools and procedures.

II.  Use real and motivating data sets to engage students in making and testing conjectures.

III.  Use classroom activities to support the development of students' reasoning.

IV.  Integrate the use of appropriate technological tools that allow students to test their conjectures, explore and analyze data, and develop their statistical reasoning.

V.  Promote classroom discourse that includes statistical arguments and sustained exchanges that focus on significant statistical ideas.

VI.  Use assessment to learn what students know and to monitor the development of their statistical learning as well as to evaluate instructional plans and progress.

The objective of this activity is to think about how a few of the more difficult Statway topics fit into this framework. We will use the framework to critique the activities, and think about what role the instructor can play to successfully implement each component.

8.  Think about the six recommendations of the Statistical Reasoning Learning Environment. Can you think of ways that you could supplement the M&M activity to better align with these recommendations? Please explain your ideas.

Applying the Statistical Reasoning Learning Environment to another Context

Another topic that many Statway faculty identified as difficult for students is hypothesis testing. Let's begin again with a quick overview.

Quick Overview

Hypothesis tests examine assumptions about the value of a parameter (a population mean or proportion). If the assumption is true, sample statistics that deviate significantly from the assumed population mean or proportion are unlikely. In a hypothesis test, a P-value is used to measure significance. The P-value is the probability, computed under the assumption being tested, of observing a statistic at least as extreme as the one we actually observed in our random sample. When the P-value is small, the observed statistic (the sample proportion or mean) is significantly different from the assumed value of the parameter it estimates.

In Statway, we provide a four-step process for testing hypotheses. These steps are:

I.  Determine the Hypotheses
This step identifies an assumption that is tested using statistical data.

II.  Collect Data
Here students verify assumptions about randomness and normality. Relevant statistics are computed as well.

III.  Assess the Evidence
Students compute a P-value. When the P-value is small, the statistic is significantly different from the parameter value stated in the null hypothesis.

IV.  State a Conclusion
Students are expected to be able to explain what they discovered in plain words.
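The four steps can be traced in a few lines of code. The sketch below uses an exact binomial calculation rather than a normal approximation, and every number in it is hypothetical: a null proportion of 0.5, a sample of 8 independent observations, 7 observed successes, and a 0.05 cutoff for calling a result unusual. The helper name `binomial_p_value` is also an assumption.

```python
from math import comb

def binomial_p_value(k, n, p0):
    """One-sided P-value: probability of k or more successes in n
    independent trials when the success probability is p0."""
    return sum(comb(n, j) * p0**j * (1 - p0)**(n - j) for j in range(k, n + 1))

# Step I: determine the hypotheses (null: p = 0.5; alternative: p > 0.5)
p0 = 0.5
# Step II: collect data (hypothetical sample, assumed to be random)
n, k = 8, 7
p_hat = k / n
# Step III: assess the evidence
p_value = binomial_p_value(k, n, p0)
# Step IV: state a conclusion in plain words (0.05 cutoff is an assumption)
verdict = "unusual" if p_value < 0.05 else "not unusual"
print(f"p_hat={p_hat:.3f}, P-value={p_value:.4f}: "
      f"such a sample would be {verdict} if p were {p0}")
```

With these made-up numbers the P-value is 9/256, or about 0.035, so a sample this lopsided would be fairly rare if the null hypothesis were true.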

As an example of an activity that introduces hypothesis testing, let's look at the card activity in Lesson 7.3.1.

STATWAY™ INSTRUCTOR NOTES

7.3 Lesson 1

Using Sampling Distributions to Reason on Population Claims

A Probability Experiment

We are about to conduct a probability experiment that will help us see how we use data from samples to make decisions about an entire population. In this activity, the population will be a poker deck of 52 cards. Poker decks have 26 black cards and 26 red cards. We will take a sample from this deck to make a decision about it.

When the activity is done, work on the following questions in your group.

The Fairness of a Deck of Cards

1  Think about the results of the activity we just conducted as you answer the following questions.

A  Before we began the activity, you had no reason to believe that the deck was not fair. Do you believe the deck is fair now?

B  You have only seen 8 cards, and yet you have made a decision about a collection of cards that is actually much bigger. What justifies such a decision?

C  How did probability factor into your decision making process?

NEXT STEPS

You just made a decision about a population that you did not observe completely. Your decision was based on a small sample. In statistics, sample data help us make decisions about populations through a process known as hypothesis testing.

We have just conducted an informal hypothesis test. We began with an assumption about the deck. Then we gathered sample data. Upon examining the data, the likelihood of what we observed led us to make a conclusion about the entire deck of cards.

Hypothesis tests include four steps: determining hypotheses, collecting data, assessing the evidence, and stating a conclusion. The following questions show how the steps fit with the reasoning behind the card drawing activity.

Step 1: Determine the Hypotheses

2  Every hypothesis test makes an assumption about the value of a population parameter. That value is then challenged by the test. The parameter that we consider here is an unknown population proportion. The assumption we make about its value is called the null hypothesis.

A  We usually assume that a deck of cards is fair. What assumption are we making about the value of the population proportion of cards in the deck that are black?

p = ______

This assumption is the null hypothesis for the card drawing activity.

B  You saw 8 random cards in the deck. Did the sample data convince you that the proportion you assumed above might not be true? Explain your answer.

C  Do you feel that it is valid to use data from relatively small samples to make decisions about entire populations? Think about how convinced you were about the fairness of the deck.

D  Do you believe that the true proportion of black cards in the deck is less than, greater than, or simply different from, the value given in question 2A above?

The belief you expressed in question 2D opposes the null hypothesis, so we call it the alternative hypothesis. Normally we choose the alternative hypothesis before gathering data, but here we will make an exception.

We now use what we know about probability to show how strongly our evidence supports this alternative. If the evidence for the alternative hypothesis is strong, we will reject the null hypothesis.

Step 2: Collect the Data

3  Once we have identified the null and alternative hypotheses, we gather and summarize data. We also verify requirements as we proceed.

A  All methods of inference in statistics assume that we are gathering data from simple random samples. Explain if and how your instructor ensured that data were gathered randomly from the population of cards in a single deck.

B  What were the sample size and sample proportion of randomly selected cards that were black?

n = ______    p̂ = ______

C  We have seen that sample proportions have an approximately normal distribution when np ≥ 10 and n(1 – p) ≥ 10. If we assume the deck is fair (half of the cards are black), is a normal model appropriate for this application?
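For instructors, the condition in question 3C can be checked mechanically. This is a minimal sketch with a hypothetical helper name, `normal_model_ok`. It uses the sample of 8 cards mentioned in question 1B and the fair-deck value p = 0.5.

```python
def normal_model_ok(n, p, threshold=10):
    """Return True when both np and n(1 - p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

n, p = 8, 0.5
print(n * p, n * (1 - p), normal_model_ok(n, p))   # 4.0 4.0 False
```

Both products equal 4, which falls short of 10, so the check fails for a sample of only 8 cards.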

Step 3: Assess the Evidence