The Treatment Effect Acquisition

Algebra 1 Summer Institute 2014

Summary
This investigation focuses on the pattern of chance variation that arises from randomly assigning subjects to treatment groups. After considering the amount of chance variation, investigators can probe the experimental results for evidence of a genuine signal - a treatment effect that could not have plausibly arisen from the random assignment process alone. This investigation exposes participants to a very powerful simulation approach that can easily be extended to different scenarios while also developing their intuition about whether an outcome could have occurred by chance and what factors might affect that assessment.

Goals

Formalize intuition about chance variation
Practice a simple simulation approach
Determine if an outcome could have occurred by chance and what factors might affect that assessment.

Participant Handouts

The Treatment Effect Acquisition

Materials
Paper
Pencil

Technology
LCD Projector
Facilitator Laptop
Excel

Source
Focus in High School Mathematics Reasoning and Sense Making
Statistics and Probability

Estimated Time
90 minutes

Mathematics Standards

Common Core State Standards for Mathematics
MAFS.7.SP.1: Use random sampling to draw inferences about a population.
1.1: Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences.
1.2: Use data from a random sample to draw inferences about a population with an unknown characteristic of interest. Generate multiple samples (or simulated samples) of the same size to gauge the variation in estimates or predictions. For example, estimate the mean word length in a book by randomly sampling words from the book; predict the winner of a school election based upon randomly sampled survey data. Gauge how far off the estimate or prediction might be.
Standards for Mathematical Practice

1. Make sense of problems and persevere in solving them
2. Reason abstractly and quantitatively
3. Construct viable arguments and critique the reasoning of others
4. Model with mathematics
5. Use tools appropriately

Instructional Plan

Tell participants that we are going to collect some data from them and revisit the word memorization activity. Make equal numbers of copies of the two lists of words (found at the end of this document) and randomly mix them ahead of time. Distribute them face down to the participants, and tell them that when you say “go”, they can turn over the page and try to memorize as many words as they can. Time them, allowing about 30 seconds (adjust depending on the number of words). When time is up, tell them to stop, and ask them to turn over the paper and write down as many of the words as they can remember. Then ask them to exchange papers with a neighbor to “grade” each other’s performance. (Slide 2)

The participants might be expecting that there are two different treatment groups. As much “control” as possible should be exerted to create uniform testing conditions. Once the data have been compiled, share them with the participants for analysis.

Ask participants to examine both numerical and graphical summaries of the data. In all likelihood, they will see that the group receiving the meaningful words was more successful at the memorization task. The following suggestions use the mean, but you can use the median as an extension. (Slide 3)

In the investigation, treatment denotes the type of words (meaningful or nonsense) that the participants received for memorization, and treatment effect is the difference between the average number of meaningful words and the average number of nonsense words that the participants remembered. In the following text, we will use 4.27 as the difference between the averages. (Slide 4)
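In symbols, the observed treatment effect can be written as below. The notation is ours rather than the source's, and it assumes the subtraction is taken as meaningful minus nonsense, so that the value 4.27 is positive.

```latex
\[
\text{observed treatment effect}
  = \bar{x}_{\text{meaningful}} - \bar{x}_{\text{nonsense}}
  = 4.27 \text{ words}
\]
```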

Ask the participants to consider the following scenario: Assume that the treatment does not make a difference in how many words are memorized. Is it still possible that we would end up seeing a difference in the mean number of words memorized in the two groups? Why? Let participants share their thoughts in a large group discussion. (Slide 5)

Pose the following questions for discussion as well: Do these results convince you that giving people meaningful words will, on average, cause them to remember more words? Or could there be another explanation for what these researchers observed? For example, could the meaningful words group have tended to receive higher scores even if memorizing meaningful words wasn’t any easier than memorizing nonsensical words? Let participants share their thoughts. (Slide 6)

Could it be that we got unlucky in the random assignment, and more of the better memorizers ended up in the meaningful words group than in the nonsense word group? Is it believable that this could result in a difference in the mean scores between the groups of 4.27 words? Listen to participants’ discussions.

It is possible to get data like this even if the treatment really does not make a difference. But how can we decide whether a difference as large as we observed between the groups is probable in this scenario? Is “random chance” a reasonable explanation? In other words, how probable is it that we will get a difference in the means around 4.27 just from getting unlucky in the random assignment? How could we decide this? (Slide 7)

We can repeat the random assignment process, assuming that the scores stay the same no matter which group they are put into, and explore how different the two group means tend to be from each other when we know for a fact that there is no treatment effect.

The simulation uses the memorization scores from the study, and participants perform the simulation in pairs according to the instructions that appear below. After following steps (a) and (b) to run the simulation and pool their results, they answer questions (c) – (f).

Directions (Slide 8)

(a) Work with a partner and put the 40 (assuming 40 participants) scores from the study on 40 separate index cards. Shuffle the cards and then deal out 20 to be group A. The rest will be group B. Determine the mean score for each group. Then calculate the difference of these means (group A – group B). Results could be negative.
(b) Pool your results with those of the other pairs of participants in the class, with each pair putting one dot on the axis on the board for their value.
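For facilitators who want to repeat the card shuffle many more times than a class can do by hand, the two steps above can be sketched in Python. This is a minimal sketch, not part of the original activity: the 40 scores are hypothetical placeholders (substitute the scores collected in class), and the function names and the seed are our own choices.

```python
import random
from statistics import mean

# HYPOTHETICAL scores for 40 participants (placeholders, not data from the study).
scores = [12, 9, 15, 7, 11, 14, 8, 10, 13, 6,
          16, 9, 12, 5, 10, 14, 7, 11, 13, 8,
          15, 6, 9, 12, 10, 4, 11, 13, 7, 14,
          8, 10, 12, 9, 6, 15, 11, 5, 13, 10]


def one_reassignment(scores, group_size, rng):
    """Shuffle the 'cards', deal group_size of them to group A and the
    rest to group B, and return mean(A) - mean(B)."""
    cards = list(scores)  # copy so the original list is left untouched
    rng.shuffle(cards)
    group_a = cards[:group_size]
    group_b = cards[group_size:]
    return mean(group_a) - mean(group_b)


def simulate_differences(scores, group_size, repetitions, seed=None):
    """Repeat the shuffle-and-deal step and collect every difference,
    like pooling the dots from many pairs of participants."""
    rng = random.Random(seed)
    return [one_reassignment(scores, group_size, rng)
            for _ in range(repetitions)]


differences = simulate_differences(scores, group_size=20,
                                   repetitions=1000, seed=2014)
print("center of simulated differences:", round(mean(differences), 2))
print("smallest and largest difference:", min(differences), max(differences))
```

Each value in `differences` plays the role of one dot on the board, so a dot plot or histogram of the list gives the same picture the class builds by hand.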

Questions (Slide 9)

(c) Where are the values obtained from the simulations centered? Why does that make sense?
(d) Where does the researchers’ result (4.27 or the one obtained in the experiment conducted in class) fall in the distribution of differences? Is a difference in means of 4.27 a likely or unlikely occurrence?
(e) What conclusion would you draw from this study? In particular, which of the following conclusions do you consider more believable, and why?
    (i) There is no real difference between the treatments, and the researchers were unlucky in the random assignments to have found the difference of 4.27.
    (ii) We are now convinced that there is a genuine effect from memorizing meaningful words instead of nonsense words.
(f) If the mean for the meaningful word group were only 2 words larger than the mean for the nonsense word group, what would you conclude? Be clear about how you arrived at this conclusion.

Discussions based on the simulation

The values should be centered around zero. Random assignments will not always create equal groups, and they might differ some even when we know for certain that no real treatment effect is present. The order of subtraction does not really matter when there is no treatment effect.
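A short sketch of why zero is the natural center, in notation that is ours rather than the source's: every card is equally likely to land in either group, so each group mean is expected to equal the overall mean of all the scores.

```latex
% Assumed notation: x_1, ..., x_N are the fixed scores, \bar{x} is their overall mean,
% and \bar{X}_A, \bar{X}_B are the two group means after one random deal.
\[
E[\bar{X}_A] = \bar{x}, \qquad E[\bar{X}_B] = \bar{x}
\quad\Longrightarrow\quad
E[\bar{X}_A - \bar{X}_B] = \bar{x} - \bar{x} = 0 .
\]
```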

The difference of 4.27 (or -4.27) should not happen very often in the simulation. No matter how many times the simulation is done, it should be centered closely around zero and 4.27 should be an unlikely result.

The conclusion should be that we are now convinced that there is a genuine effect from memorizing meaningful words instead of nonsense words. If the researchers have confidence in the study’s random assignment, they can be confident that any major difference observed in the results for the two groups is due to the treatment.

What would you conclude if the difference were 2 words? Is it possible to get 2 in the simulation? If so, we don’t have strong evidence against “no treatment effect”. All we can say is that these results are not inconsistent with what we would expect to see when there is no treatment effect; we cannot say that we have proven or supported the conclusion that there is no treatment effect.
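The judgments in this discussion ("unlikely" for 4.27, "possible" for 2) can be made concrete by counting how many of the pooled simulated differences are at least as far from zero as the observed one. The sketch below is illustrative only: the function name is ours, and the ten pooled values are hypothetical stand-ins for the dots on the board.

```python
def proportion_at_least_as_extreme(simulated_differences, observed_difference):
    """Fraction of simulated differences whose size (in either direction)
    is at least as large as the observed difference."""
    threshold = abs(observed_difference)
    extreme = [d for d in simulated_differences if abs(d) >= threshold]
    return len(extreme) / len(simulated_differences)


# HYPOTHETICAL pooled results from ten pairs (not real class data).
pooled = [-1.3, 0.4, 0.9, -0.2, 1.7, -2.1, 0.0, 0.6, -0.8, 1.1]

print(proportion_at_least_as_extreme(pooled, 4.27))  # prints 0.0
print(proportion_at_least_as_extreme(pooled, 2))     # prints 0.1
```

With only ten dots the proportions are coarse; pooling more repetitions gives a steadier estimate. Counting both directions matches the observation above that the order of subtraction does not matter when there is no treatment effect.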

Extension Tasks (Slide 10, 11)

The directions for participants follow:

Suppose that the same study comparing group means for numbers of meaningful and nonsensical words memorized was conducted in two other classrooms. In each case, do you think the results for the new class would provide more convincing evidence or less that the meaningful words list was easier to memorize? Explain how you are deciding.
Compile and analyze the results from the other two sections of the summer institutes. How does this randomization distribution compare to the original study? Are their results more or less probable than the original study? Does this make sense?
Suppose that the difference in median scores were 4 in the original study. How could you assess whether the median score difference of 4 is convincing evidence of a treatment effect? (What would you simulate?)
Suppose that the difference in medians between the two lists is 4 words. Describe circumstances under which you would be convinced there was a treatment effect, and circumstances under which you would not be convinced. (Remember to take into account issues surrounding within-group variability and sample size.)
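For the median tasks above, one possible answer to "What would you simulate?" is to keep the same shuffle-and-deal procedure and swap the mean for the median. The sketch below assumes that approach. The scores are hypothetical placeholders and the function name is ours.

```python
import random
from statistics import median


def randomization_distribution(scores, group_size, statistic,
                               repetitions, seed=None):
    """Shuffle the scores, split them into two groups, and record
    statistic(group A) - statistic(group B) for each repetition."""
    rng = random.Random(seed)
    results = []
    for _ in range(repetitions):
        cards = list(scores)  # fresh copy for each shuffle
        rng.shuffle(cards)
        results.append(statistic(cards[:group_size])
                       - statistic(cards[group_size:]))
    return results


# HYPOTHETICAL scores for 40 participants (placeholders, not study data).
scores = [12, 9, 15, 7, 11, 14, 8, 10, 13, 6,
          16, 9, 12, 5, 10, 14, 7, 11, 13, 8,
          15, 6, 9, 12, 10, 4, 11, 13, 7, 14,
          8, 10, 12, 9, 6, 15, 11, 5, 13, 10]

median_diffs = randomization_distribution(scores, group_size=20,
                                          statistic=median,
                                          repetitions=1000, seed=1)

# Share of shuffles whose median difference is at least 4 in either direction.
share = sum(abs(d) >= 4 for d in median_diffs) / len(median_diffs)
print("share of shuffles with |median difference| >= 4:", share)
```

Rerunning the sketch with scores that are more spread out, or with fewer cards per group, is one way to explore how within-group variability and sample size change how convincing a difference of 4 looks.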

No. / Words / Non-words
1 / Bond / Corm
2 / Bore / Dask
3 / Care / Deto
4 / Dear / Dild
5 / Deep / Dlod
6 / Diet / Fude
7 / Draw / Hart
8 / Ease / Huly
9 / Edge / Jalk
10 / Evil / Jare
11 / Fate / Jolk
12 / Fury / Jort
13 / Gain / Kise
14 / Gaze / Moke
15 / Gulf / Mopy
16 / Harm / Nund
17 / Hate / Pake
18 / Lack / Palk
19 / Leap / Pran
20 / Loft / Puzz
21 / List / Rire
22 / Mama / Rumb
23 / Oily / Sask
24 / Peak / Sere
25 / Poem / Sero
26 / Reef / Talm
27 / Rely / Tike
28 / Ripe / Tolk
29 / Sell / Topy
30 / Soft / Vack
31 / Soon / Vash
32 / Stay / Vess
33 / Tiny / Wint
34 / Tour / Wonl
35 / Trim / Wush
36 / View / Zint
37 / Vote / Zosh

Words source: