STATWAY™ STUDENT HANDOUT
Lesson 1.1.2
Samples, Populations, and Types of Statistical Studies
STUDENT NAME / DATE
INTRODUCTION
In Lesson 1.1.1, we studied the four-step process used in many statistical investigations. Step 1 in that process was “Ask a question that can be answered by collecting data.” We ask questions about one or more characteristics of each subject. In this context, the subjects of a statistical investigation are the individuals involved in the study. The subjects are often people, but can be animals, plants, or things. For example, in the previous lesson, the students in the class were our subjects.
We asked if our personality traits were related to zodiac sign. The characteristics we studied were sets of personality traits and birth date groups. The characteristics that we study about each subject are called variables. In the example, the variables were sets of personality traits and birth date groups.
In a statistical study, we usually ask one of the following types of questions:
Research Questions about Populations
A population is a set or group of people or objects that share certain characteristics.
Research questions about populations include:
§ Estimating a numerical population characteristic.
§ Testing a claim about a population.
§ Comparing two or more populations.
§ Asking whether two variables are related in the population of interest.
Research Questions about how one variable responds as another variable is manipulated or changed
Research questions of this type include:
§ “What is the effect of …?”
§ “What happens when…?”
Try These
1 Here is one of the studies we examined in the previous lesson.
Researchers at the Center for Reproductive Medicine at Brigham and Women’s Hospital wondered what proportion of women who visit a fertility clinic would want the opportunity to choose the sex of a future child. They also wondered if those that would like to choose the sex were more likely to want a boy or girl. The researchers mailed a survey containing 19 questions to women who had visited the Center. One question on the survey asked women whether they would like the option of choosing the sex of a future child. If the response to that question was yes, a follow-up question asked whether they would choose a boy or girl. Of the 229 women who wanted to choose, 89 (38.9%) said they would choose a boy and 140 (61.1%) said they would choose a girl.
Based on their statistical analysis of these data, the researchers concluded that there is convincing evidence of an overall preference for girls among women wanting to choose the sex of a future child. The researchers based this conclusion on this fact: if there were really no preference for girls in the population of women who would like to select the sex of a future child, it would be very unusual to observe a percentage as high as 61.1% in a sample of 229 women.
A What type of question does this study ask? Does it ask a question about a population or a question about how one variable responds as another variable is manipulated?
B What is the population of interest in this study?
C What variable is being studied for each subject in the study?
2 Here is the other study we examined in the previous lesson.
Psychologists believe that people are less likely to do something if they think it will require a lot of effort. But how do people decide which things they think will be hard and which things they think will be easy?
Researchers at the University of Michigan wondered if written directions would affect how hard people thought a task would be. If the written instructions for how to do a task were difficult to read, would that affect how difficult people thought a task was? To investigate this, they performed an experiment.
The researchers randomly divided twenty students into two groups of 10 students each. One group received instructions for an exercise routine printed in a font that was easy to read, and the other group received the same set of instructions printed in a font that was difficult to read. A sample of each font appears below. Each student read the instructions, and then they were asked how many minutes they thought the exercise routine would take.
For the group that read the instructions printed in the easy-to-read font, the average number of minutes they thought the routine would take was 8.23 minutes. For the group that read the same instructions printed in the hard-to-read font, the average was 15.1 minutes.
Based on the study data, the researchers concluded that the difference between these two averages was not likely to be due to chance. There was evidence that people think a task will be harder when the instructions are hard to read.
This is the easy-to-read font that was used in the study.
This is the hard-to-read font that was used in the study.
A What type of question does this study ask? Does it ask a question about a population or a question about how one variable responds as another variable is manipulated?
B What variable is being manipulated?
C What variable responds to the variable being manipulated?
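The "very unusual" claim in the first study above can be checked with a short calculation. The handout does not say which statistical method the researchers used, so the sketch below is only one possible approach: it assumes a simple binomial model in which each of the 229 women is equally likely to choose a boy or a girl, and it computes the chance of seeing 140 or more women choose a girl. The function name `upper_tail_probability` is made up for this illustration.

```python
from math import comb


def upper_tail_probability(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) when X counts successes in n independent trials,
    each with success probability p (a binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Numbers taken from the study description.
n_women = 229      # women who wanted to choose the sex of a future child
chose_girl = 140   # women who said they would choose a girl

sample_proportion = chose_girl / n_women
# Assumption: "no preference" means a 50/50 chance of choosing a girl.
chance = upper_tail_probability(n_women, chose_girl, p=0.5)

print(f"Sample proportion choosing a girl: {sample_proportion:.3f}")
print(f"Chance of 140 or more girls if there is no preference: {chance:.5f}")
```

Under this assumed model the chance comes out well below 1%, which matches the researchers' statement that a percentage as high as 61.1% would be very unusual if there were no preference.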
Next Steps
When we know what type of question a study asks and what the variable(s) are in the study, then we can move on to the second step of a statistical investigation.
Step 2 in this four-step process is “decide what to measure and then collect data.” We measure the variables. This is why it is important to understand what type of question we ask in a study and what the variables are.
There are two types of studies used to collect data: observational studies and experiments.
Observational Study
In an observational study, researchers study the variables of a population. One example is whether a mother who wants to choose the sex of her child would prefer a girl or a boy.
Researchers usually observe a sample of the population. A sample is a subset or a selected number of people from one or more existing populations. For example, in the study above researchers only observe a certain number of all mothers who want to choose the sex of their children. Remember that an observational study is used to answer questions about characteristics of populations. Because the goal of an observational study is typically to learn about the population, it is important that the sample be representative of the population researchers are interested in. The individuals in the sample must be similar to the individuals in the whole population.
Experiment
In an experiment researchers observe how one variable (such as estimated length of time for a task) behaves under different conditions (such as the font of the instructions). The conditions are being actively manipulated by the researcher (e.g., giving some students the instructions in one font and other students the instructions in a different font).
An experiment is used to answer questions about how one variable responds as another is manipulated. Because the goal of an experiment is to learn about the effect of the different experimental conditions, it is important to have similar groups for each of the different experimental conditions.
In summary, there is no attempt to influence the results in an observational study. This is different from an experiment. In an experiment conditions are manipulated to see the response to each condition.
Try These
Now let’s look at two other studies and ask some more detailed questions about them.
3 Imagine that our college is having financial problems. The college announces that it will shorten library hours to save money. The library will be closed in the evenings and on the weekends. Some students think that it is okay to pay $20 more per semester to keep the library open in the evenings and on the weekends.
We are interested in learning about the proportion of students at our college who would pay $20 more per semester to keep the library open in the evenings and on the weekends. We plan to select a sample of 100 students. We will ask each of these students whether he or she agrees with the fee increase.
A What type of question does this study ask? Does it ask a question about a population or a question about how one variable responds as another variable is manipulated?
B Is this an observational study or an experiment?
If it is an observational study, what is the population of interest? What is the question we are asking about the population?
If it is an experiment, what is the response? What is the variable that we think might affect the response?
C Suppose that we collect data by asking 100 students who are entering the library whether they would pay the fee. Is this a good way to collect data? Why do you think this?
D Suppose that we collect data by asking 100 students who are at the school gym if they would pay the fee. Is this a good way to collect data? Why do you think this?
E The goal is to obtain a sample of 100 students that is representative of students at the college. Give a better way to select 100 students than the two ways described in parts C and D. Why do you think your way is better?
4 We are interested in learning whether jogging for longer amounts of time decreases the resting heart rate of college students. We want to see if there is a difference between:
§ The resting heart rate of college students that jog 30 minutes three times a week for six weeks, and
§ The resting heart rate of college students that jog for only 15 minutes three times a week for six weeks.
We plan to use 100 college students who do not currently jog and who have volunteered to participate as subjects in this study. Resting heart rate of each subject will be measured at the start of the study. Fifty of the students will participate in a jogging program where they get together three times a week and jog for 30 minutes. The other 50 students will get together three times a week, but will only jog for 15 minutes. At the end of six weeks, resting heart rate will be measured again.
A What type of question does this study ask? Does it ask a question about a population or a question about how one variable responds as another variable is manipulated?
B Is this an observational study or an experiment?
If it is an observational study, what is the population of interest? What is the question we are asking about the population?
If it is an experiment, what is the response? What is the variable that we think might affect the response?
C Suppose that we create the two groups for this study according to age. We group the 50 youngest volunteers in the 30 minute jogging group. We group the 50 oldest volunteers in the 15 minute jogging group. Is this a good idea? Why do you think this?
D Suppose that we create the two groups for this study according to weight. We group the 50 volunteers that weigh the most into the 30 minute jogging group. We group the 50 volunteers that weigh the least into the 15 minute jogging group. Is this a good idea? Why do you think this?
E The goal is to divide the 100 volunteers into two groups so that there is a “fair” comparison between the 30 minute and 15 minute jogging groups. Give a better way to divide the 100 volunteers into two groups than grouping by age (part C) or by weight (part D). Why do you think your way is better?
You Need to Know
In summary, we ask two types of research questions. Each type of research question is answered by a different type of study. Observational studies are used to answer research questions about characteristics of populations. Experiments are used to answer research questions about how one variable responds as another variable is manipulated.
Introduction
Drawing Conclusions from Statistical Studies
The fourth step in the statistical process is drawing a conclusion. There are two types of conclusions that might be made from a study:
§ Generalize from sample to population.
§ Change in response is caused by experimental conditions (cause-and-effect conclusion).
Both types of conclusions extend the data that are observed. This means that researchers try to say something beyond what was observed. In the conclusion, researchers try to explain what they learned in the study.
One type of conclusion is called “Generalize from sample to population.” Remember that researchers usually only study a sample of a larger population. They try to pick a sample that is representative of the population they want to study. When researchers draw this type of conclusion, they are confident that what they observed in the sample is true for the larger population.
The other type of conclusion is called “Change in response is caused by experimental conditions.” This means that the change in the subjects’ response was caused by the manipulation in an experiment. The researchers manipulated a variable, and if there was an “effect” or response, the change is attributed to the manipulation. In other words, we have observed a significant difference in the response variable between the conditions that were manipulated.
The table below summarizes when each of these types of conclusions is reasonable.
Type of Conclusion / Reasonable When
Generalize from sample to population / Observational study is conducted and the sample is representative of the population
Change in response is caused by experimental conditions (cause-and-effect conclusion) / Experiment is conducted and groups assigned to experimental conditions are similar
The best way to choose a sample that is representative of the population is to use a random sample from the entire population. The best way to ensure similar groups for different experimental conditions is to use random assignment to the experimental groups. You will see more about these ideas in upcoming lessons, but without a random sample in an observational study or random assignment in an experiment, neither type of conclusion can be drawn reliably.
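Random sampling and random assignment can both be carried out with a few lines of code. The sketch below is a minimal illustration and not a prescribed procedure. The roster of 5,000 student ID numbers, the volunteer labels, and the function names `draw_random_sample` and `randomly_assign` are all made up for this example; only the sample size of 100 and the two groups of 50 come from the studies described in this lesson.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size members of the population so that every member
    has the same chance of being chosen (no one is picked twice)."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects, then split them into two halves, so chance
    alone decides which group each subject joins."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the original list is not changed
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Observational study: random sample of 100 students.
# Assumption: a hypothetical roster of 5,000 student ID numbers.
student_ids = list(range(1, 5001))
survey_sample = draw_random_sample(student_ids, 100, seed=1)

# Experiment: random assignment of 100 volunteers to two jogging groups.
volunteers = [f"volunteer_{i}" for i in range(1, 101)]  # hypothetical labels
group_30_min, group_15_min = randomly_assign(volunteers, seed=1)

print(len(survey_sample), len(group_30_min), len(group_15_min))
```

Passing a `seed` makes the selection repeatable so it can be checked later. Leaving it out gives a different random result each time the code runs.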