AP Statistics Course of Study

This course of study follows the description set out by the College Board in its AP Statistics course description. It uses the textbook The Practice of Statistics, 3rd edition, by Yates, Moore, and Starnes (referred to below as YMS), as well as much of the supplemental material provided to teachers by W.H. Freeman, the book’s publisher.

The purpose of this course is to introduce students to the major concepts and tools for collecting, analyzing, and drawing conclusions from data. After describing each conceptual theme, the portions of the YMS book that correspond to that theme will be detailed.

Following the thematic description, this document will go through the book chapter by chapter. First, the major ideas from each chapter will be grouped. Then, each and every learning target from each chapter will be listed by chapter section.

Students in this course will be exposed to four broad conceptual themes:

Exploring Data: Describing patterns and departures from patterns (20%–30%). Exploratory analysis of data makes use of graphical and numerical techniques to study patterns and departures from patterns. Emphasis will be placed on interpreting information from graphical and numerical displays and summaries. This theme is covered in Chapters 1-4 of this course.

Sampling and Experimentation: Planning and conducting a study (10%–15%). Data must be collected according to a well-developed plan if valid information on a conjecture is to be obtained. This plan includes clarifying the question and deciding upon a method of data collection and analysis. This theme is covered in Chapter 5 of this course; ideas regarding planning and conducting a study are presented in Chapter 4 as well.

Anticipating Patterns: Exploring random phenomena using probability and simulation (20%–30%). Probability is the tool used for anticipating what the distribution of data should look like under a given model. This theme is covered primarily in Chapters 7-9 of this course; the t distribution is covered in Chapter 10.

Statistical Inference: Estimating population parameters and testing hypotheses (30%–40%). Statistical inference guides the selection of appropriate models. This theme is covered in Chapters 10-15 of this course.

Chapter One: Exploring Data

Use a variety of graphical techniques to display a distribution. These will include bar graphs, pie charts, stemplots, histograms, ogives, time plots, and boxplots.

Interpret graphical displays in terms of the shape, center, and spread of the distribution, as well as gaps and outliers.

Use a variety of numerical techniques to describe a distribution. These will include mean, median, quartiles, five-number summary, interquartile range, standard deviation, range, and variance.

Interpret numerical measures in the context of the situation in which they occur.

Learn to identify outliers in a data set.

Explore the effects of a linear transformation of a data set.

Section 1.1: Displaying Distributions with Graphs

Describe what is meant by exploratory data analysis.

Explain what is meant by the distribution of a variable.

Differentiate between categorical variables and quantitative variables.

Construct bar graphs and pie charts for a set of categorical data.

Construct a stemplot for a set of quantitative data.

Construct a back-to-back stemplot to compare two related distributions.

Construct a stemplot using split stems.

Construct a histogram for a set of quantitative data, and discuss how changing the class width can change the impression of the data given by the histogram.

Describe the overall pattern of a distribution by its shape, center, and spread.

Explain what is meant by the mode of a distribution.

Recognize and identify symmetric and skewed distributions.

Explain what is meant by an outlier in a stemplot or histogram.

Construct and interpret an ogive (relative cumulative frequency graph) from a relative frequency table.

Construct a time plot for a set of data collected over time.

Section 1.2: Describing Distributions with Numbers

Given a data set, compute the mean and median as measures of center.

Explain what is meant by a resistant measure.

Identify situations in which the mean is the most appropriate measure of center and situations in which the median is the most appropriate measure.

Given a data set, find the quartiles.

Given a data set, find the five-number summary.

Use the five-number summary of a data set to construct a boxplot for the data.

Compute the interquartile range (IQR) of a data set.

Given a data set, use the 1.5 × IQR rule to identify outliers.

Given a data set, compute the standard deviation and variance as measures of spread.

Give two reasons why we use squared deviations rather than just average deviations from the mean.

Explain what is meant by degrees of freedom.

Identify situations in which the standard deviation is the most appropriate measure of spread and situations in which the interquartile range is the most appropriate measure.

Explain the effect of a linear transformation of a data set on the mean, median, and standard deviation of the set.

Use numerical and graphical techniques to compare two or more data sets.
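
As an illustration of the Section 1.2 targets, the following minimal Python sketch computes a five-number summary and applies the 1.5 × IQR rule. The data values and function names are hypothetical. The quartiles are taken as the medians of the lower and upper halves of the ordered data, which is one common textbook convention; calculators and software may use slightly different conventions.

```python
from statistics import mean, median, stdev

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]             # observations below the median position
    upper = xs[half + (n % 2):]   # observations above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def iqr_outliers(data):
    """Return values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

# Hypothetical data set, for illustration only.
scores = [2, 4, 5, 6, 7, 8, 9, 10, 30]
summary = five_number_summary(scores)   # (2, 4.5, 7, 9.5, 30)
outliers = iqr_outliers(scores)         # [30]
print(summary, outliers)
print(round(mean(scores), 2), round(stdev(scores), 2))
```

Here the single large value pulls the mean and standard deviation upward but leaves the median and IQR unchanged, which is why the median and IQR are called resistant measures.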

Chapter Two: Describing Location in a Distribution

Be able to compute measures of relative standing for individual values in a distribution. This includes standardized values (z-scores) and percentile ranks.

Use Chebyshev’s inequality to describe the percentage of values in a distribution within an interval centered at the mean.

Demonstrate an understanding of a density curve, including its mean and median.

Demonstrate an understanding of the Normal distribution and the 68-95-99.7 Rule.

Use tables and technology to find (a) the proportion of values on an interval of the Normal distribution and (b) a value with a given proportion of observations above or below it.

Use a variety of techniques, including construction of a Normal probability plot, to assess the Normality of a distribution.

Section 2.1: Measures of Relative Standing and Density Curves

Explain what is meant by a standardized value.

Compute the z-score of an observation given the mean and standard deviation of a distribution.

Compute the pth percentile of an observation.

Define Chebyshev’s inequality, and give an example of its use.

Explain what is meant by a mathematical model.

Define a density curve.

Explain where the mean and median of a density curve are to be found.

Describe the relative position of the mean and median in a symmetric density curve and in a skewed density curve.
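
The z-score and Chebyshev targets in this section can be sketched in a few lines of Python. The function names and the numbers in the example are hypothetical. The sketch uses the standard definitions z = (x − mean) / SD and the Chebyshev bound of at least 1 − 1/k² of the observations lying within k standard deviations of the mean (informative only for k > 1).

```python
def z_score(x, mean, sd):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

def chebyshev_bound(k):
    """Minimum proportion of observations within k SDs of the mean, any distribution."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

# Hypothetical test score of 86 in a class with mean 80 and SD 4.
z = z_score(86, 80, 4)            # 1.5 standard deviations above the mean
at_least = chebyshev_bound(2)     # 0.75: at least 75% within 2 SDs
print(z, at_least)
```

Unlike the 68-95-99.7 rule, the Chebyshev bound makes no assumption about the shape of the distribution, so it is much weaker than the Normal figures.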

Section 2.2: Normal Distributions

Identify the main properties of the Normal curve as a particular density curve.

List three reasons why Normal distributions are important in statistics.

Explain the 68-95-99.7 rule (the empirical rule).

Explain the notation N(μ, σ).

Define the standard Normal distribution.

Use a table of values for the standard Normal curve to compute the proportion of observations that are (a) less than a given z-score, (b) greater than a given z-score, or (c) between two given z-scores.

Use a table of values for the standard Normal curve to find the proportion of observations in any region given any Normal distribution (i.e., given raw data rather than z-scores).

Use a table of values for the standard Normal curve to find a value with a given proportion of observations above or below it (inverse Normal).

Identify at least two graphical techniques for assessing Normality.

Explain what is meant by a Normal probability plot; use it to help assess the Normality of a given data set.

Use technology to perform Normal distribution calculations and to make Normal probability plots.
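
As one example of using technology for the Normal calculations above, the sketch below uses Python's standard-library statistics.NormalDist class (Python 3.8 or later). The N(64.5, 2.5) distribution of heights is a hypothetical example, not data from the course.

```python
from statistics import NormalDist

# Hypothetical model: heights follow N(64.5, 2.5), in inches.
heights = NormalDist(mu=64.5, sigma=2.5)

# (a) Proportion of observations below a value, or between two values.
below_68 = heights.cdf(68)                        # z = 1.4, about 0.919
between = heights.cdf(67) - heights.cdf(62)       # within 1 SD, about 0.683

# (b) Inverse Normal: the value with 90% of observations below it.
cutoff_90 = heights.inv_cdf(0.90)

# The 68-95-99.7 rule checked on the standard Normal distribution N(0, 1).
standard = NormalDist()
rule = [standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)]

print(round(below_68, 4), round(between, 4), round(cutoff_90, 2))
print([round(p, 4) for p in rule])
```

The same cdf and inv_cdf calls play the roles of the forward and backward table lookups described in the targets above.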

Chapter Three: Examining Relationships

Construct and interpret a scatterplot for a set of bivariate data.

Compute and interpret the correlation r between two variables.

Demonstrate an understanding of the basic properties of the correlation r.

Explain the meaning of a least squares regression line.

Given a bivariate data set, construct and interpret a regression line.

Demonstrate an understanding of how one measures the quality of a regression line as a model for bivariate data.

Section 3.1: Scatterplots and Correlation

Explain the difference between an explanatory variable and a response variable.

Given a set of bivariate data, construct a scatterplot.

Explain what is meant by the direction, form, and strength of the overall pattern of a scatterplot.

Explain how to recognize an outlier in a scatterplot.

Explain what it means for two variables to be positively or negatively associated.

Explain how to add categorical variables to a scatterplot.

Use a graphing calculator to construct a scatterplot. {Construct a scatterplot by hand.} {Construct a scatterplot using computer software.}

Define the correlation r and describe what it measures.

Given a set of bivariate data, use technology to compute the correlation r. {Manually compute r for a small data set.}

List the four basic properties of the correlation r that you need to know to interpret any correlation.

List four other facts about correlation that must be kept in mind when using r.

Section 3.2: Least-Squares Regression

Explain what is meant by a regression line.

Given a regression equation, interpret the slope and y-intercept in context.

Explain what is meant by extrapolation.

Explain why the regression line is called the “least-squares regression line” (LSRL).

Explain how the coefficients of the regression equation, ŷ = a + bx, can be found given r, s_x, s_y, and the means x̄ and ȳ.

Given a bivariate data set, use technology to construct a least-squares regression line. {Manually construct a least-squares regression line for a small data set.}

Define a residual.

Given a bivariate data set, use technology to construct a residual plot for a linear regression.

List two things to consider about a residual plot when checking to see if a straight line is a good model for a bivariate data set.

Explain what is meant by the standard deviation of the residuals.

Define the coefficient of determination, r², and explain how it is used in determining how well a linear model fits a bivariate set of data.

List and explain four important facts about least-squares regression.
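
The regression targets in this section can be illustrated with a short Python sketch. It computes the correlation r from standardized values, then finds the least-squares line from the standard relationships slope b = r · s_y / s_x and intercept a = ȳ − b · x̄, and finally computes the residuals, their standard deviation, and r². The data set and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x, using b = r*s_y/s_x and a = y_bar - b*x_bar."""
    r = correlation(xs, ys)
    b = r * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Hypothetical bivariate data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
a, b = least_squares_line(xs, ys)

# Residual = observed y minus predicted y.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
s_resid = sqrt(sum(e**2 for e in residuals) / (len(xs) - 2))
r_squared = r**2

print(f"y-hat = {a:.2f} + {b:.2f}x, r = {r:.3f}, r^2 = {r_squared:.2f}")
print([round(e, 2) for e in residuals], round(s_resid, 3))
```

For these values the line is ŷ = 2.2 + 0.6x with r² = 0.6, and the residuals sum to zero, as they do for any least-squares line with an intercept.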

Section 3.3: Correlation and Regression Wisdom

Recall the three limitations on the use of correlation and regression.

Explain what is meant by an outlier in bivariate data.

Explain what is meant by an influential observation and how it relates to regression.

Given a scatterplot in a regression setting, identify outliers and influential observations.

Define a lurking variable.

Give an example of what it means to say “association does not imply causation.”

Explain how correlations based on averages differ from correlations based on individuals.

Chapter Four: More about Relationships between Two Variables

Identify settings in which a transformation might be necessary to achieve linearity.

Use transformations involving powers and logarithms to linearize curved relationships.

Explain what is meant by a two-way table, and describe its parts.

Give an example of Simpson’s paradox.

Explain what gives the best evidence for causation.

Explain the criteria for establishing causation when experimentation is not feasible.

Section 4.1: Transforming to Achieve Linearity

Explain what is meant by transforming (re-expressing) data.

Discuss the advantages of transforming nonlinear data.

Tell where the logarithm transformation, y = log(x), fits into the hierarchy of power transformations.

Explain the ladder of power transformations.

Explain how linear growth differs from exponential growth.

Identify real-life situations in which a transformation can be used to linearize data from an exponential growth model.

Use a logarithmic transformation to linearize a data set that can be modeled by an exponential model.

Identify situations in which a transformation is required to linearize a power model.

Use a transformation to linearize a data set that can be modeled by a power model.
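
To illustrate linearizing an exponential model, the sketch below takes data generated exactly from y = 3 · 2^x (a hypothetical model chosen so the answer is known), fits a least-squares line to the points (x, log y), and back-transforms the coefficients. It relies on the fact that if y = a · b^x, then log y = log a + (log b) · x, which is linear in x. The helper fit_line is a hypothetical name.

```python
from math import log10
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept and slope for ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Hypothetical exponential growth data: y = 3 * 2**x.
xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2**x for x in xs]

# Transform the response with a base-10 logarithm, then fit a straight line.
log_ys = [log10(y) for y in ys]
intercept, slope = fit_line(xs, log_ys)

# Back-transform: a = 10**intercept, b = 10**slope.
a, b = 10**intercept, 10**slope
print(f"y = {a:.3f} * {b:.3f}**x")
```

A power model y = a · x^p is handled the same way, except that logarithms are taken of both variables, since log y = log a + p · log x.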

Section 4.2: Relationships between Categorical Variables

Explain what is meant by a two-way table.

Explain what is meant by marginal distributions in a two-way table.

Describe how changing counts to percents is helpful in describing relationships between categorical variables.

Explain what is meant by a conditional distribution.

Define Simpson’s paradox, and give an example of it.
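
A small numerical example can make conditional distributions and Simpson’s paradox concrete. The two-way tables below use hypothetical survival counts for two hospitals, broken down by patient condition; the sketch compares the conditional survival rates within each condition with the overall rates.

```python
# Hypothetical counts: (survived, total) by hospital and patient condition.
counts = {
    "Hospital A": {"good": (594, 600), "poor": (1443, 1500)},
    "Hospital B": {"good": (592, 600), "poor": (192, 200)},
}

def conditional_rates(table):
    """Survival rate within each patient condition (a conditional distribution)."""
    return {cond: survived / total for cond, (survived, total) in table.items()}

def overall_rate(table):
    """Survival rate after combining the conditions."""
    survived = sum(s for s, _ in table.values())
    total = sum(t for _, t in table.values())
    return survived / total

rates = {h: conditional_rates(t) for h, t in counts.items()}
overall = {h: overall_rate(t) for h, t in counts.items()}

for hospital in counts:
    print(hospital, {c: round(p, 4) for c, p in rates[hospital].items()},
          "overall:", round(overall[hospital], 4))
```

With these counts Hospital A has the higher survival rate for patients in good condition and for patients in poor condition, yet Hospital B has the higher overall rate. The reversal arises because Hospital A treats far more patients in poor condition, so patient condition acts as a lurking variable.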

Section 4.3: Establishing Causation

Identify the three ways in which the association between two variables can be explained.

Explain what process provides the best evidence for causation.

Define what is meant by a common response.

Define what it means to say that two variables are confounded.

Discuss why establishing a cause-and-effect relationship through experimentation is not always possible.

Explain what it means to say that a lack of evidence for a cause-and-effect relationship does not necessarily mean that there is no cause-and-effect relationship.

List five criteria for establishing causation when you cannot conduct a controlled experiment.

Chapter Five: Producing Data

Distinguish between, and discuss the advantages of, observational studies and experiments.

Identify and give examples of different types of sampling methods, including a clear definition of a simple random sample.

Identify and give examples of sources of bias in sample surveys.

Identify and explain the three basic principles of experimental design.

Explain what is meant by a completely randomized design.

Distinguish between the purposes of randomization and blocking in an experimental design.

Use random numbers from a table or technology to select a random sample.

Section 5.1: Designing Samples

Define population and sample.

Explain how sampling differs from a census.

Explain what is meant by a voluntary response sample.

Give an example of a voluntary response sample.

Explain what is meant by convenience sampling.

Define what it means for a sampling method to be biased.

Define, carefully, a simple random sample (SRS).

List the four steps involved in choosing an SRS.

Explain what is meant by systematic random sampling.

Use a table of random digits to select a simple random sample.

Define a probability sample.

Given a population, determine the strata of interest, and select a stratified random sample.

Define a cluster sample.

Define undercoverage and nonresponse as sources of bias in sample surveys.

Give an example of response bias in a survey question.

Write a survey question in which the wording of the question is likely to influence the response.

Identify the major advantage of large random samples.
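
When technology is used in place of a table of random digits, sample selection might look like the following Python sketch. The roster, the strata, and the function names are hypothetical. A fixed seed is used only so the example is reproducible.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Choose n individuals so that every set of n is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """Choose a separate SRS of the same size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 30 students labeled 01 to 30.
roster = [f"student_{i:02d}" for i in range(1, 31)]
srs = simple_random_sample(roster, 5, seed=2024)

# Hypothetical strata: the first 15 labels are juniors, the rest seniors.
strata = {"juniors": roster[:15], "seniors": roster[15:]}
stratified = stratified_random_sample(strata, 3, seed=2024)

print(srs)
print(stratified)
```

Labeling the individuals first and then letting a random mechanism pick the labels mirrors the procedure used with a table of random digits.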

Section 5.2: Designing Experiments

Define experimental units, subjects, and treatment.

Define factor and level.

Given a number of factors and the number of levels for each factor, determine the number of treatments.

Explain the major advantage of an experiment over an observational study.

Give an example of the placebo effect.

Explain the purpose of a control group.

Explain the difference between control and a control group.

Discuss the purpose of replication, and give an example of replication in the design of an experiment.

Discuss the purpose of randomization in the design of an experiment.

Given a list of subjects, use a table of random numbers to assign individuals to treatment and control groups.

List the three main principles of experimental design.

Explain what it means to say that an observed effect is statistically significant.

Define a completely randomized design.

For an experiment, generate an outline of a completely randomized design.

Define a block.

Give an example of block design in an experiment.

Explain how block design may be better than a completely randomized design.

Give an example of matched pairs design, and explain why matched pairs are an example of block designs.

Explain what is meant by a study being double blind.

Give an example in which a lack of realism negatively affects our ability to generalize the results of a study.
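
Two of the computational targets in this section, counting treatments and randomly assigning subjects, can be sketched in Python as follows. The subject labels, group names, and function names are hypothetical. The sketch assumes a completely randomized design with groups of equal (or nearly equal) size.

```python
import random
from math import prod

def count_treatments(levels_per_factor):
    """Number of treatments = product of the numbers of levels of the factors."""
    return prod(levels_per_factor)

def randomize(subjects, group_names, seed=None):
    """Completely randomized design: shuffle the subjects, then deal them into groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, subject in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(subject)
    return groups

# Hypothetical experiment: 2 factors with 2 and 3 levels give 6 treatments.
n_treatments = count_treatments([2, 3])

# Hypothetical subjects assigned to a treatment group and a control group.
subjects = [f"subject_{i:02d}" for i in range(1, 11)]
groups = randomize(subjects, ["treatment", "control"], seed=7)

print(n_treatments)
print(groups)
```

In a block design, the same randomize function would be applied separately within each block rather than to the whole list of subjects.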

Chapter Six: Probability and Simulation – The Study of Randomness

Perform a simulation of a probability problem using a table of random numbers or technology.

Use the basic rules of probability to solve probability problems.

Write out the sample space for a random phenomenon, and use it to answer probability questions.

Describe what is meant by the intersection and union of two events.

Discuss the concept of independence.

Use general addition and multiplication rules to solve probability problems.

Solve problems involving conditional probability, using Bayes’s rule when appropriate.

Section 6.1: Simulation

Define simulation.

List the five steps involved in a simulation.

Explain what is meant by independent trials.

Use a table of random digits to carry out a simulation.

Given a probability problem, conduct a simulation in order to estimate the probability desired.

Use a calculator or computer to conduct a simulation of a probability problem.
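
A computer simulation of a probability problem might look like the following Python sketch. The problem (the chance of at least one six in four rolls of a fair die) is a hypothetical example chosen because the exact answer, 1 − (5/6)⁴, is available for comparison. The trials are independent, and the seed is fixed only for reproducibility.

```python
import random

def at_least_one_six(rng, rolls=4):
    """One repetition: roll a fair die `rolls` times; True if any roll shows a 6."""
    return any(rng.randint(1, 6) == 6 for _ in range(rolls))

def estimate_probability(trials=50_000, seed=1):
    """Estimate the probability as the relative frequency of successes."""
    rng = random.Random(seed)
    successes = sum(at_least_one_six(rng) for _ in range(trials))
    return successes / trials

estimate = estimate_probability()
exact = 1 - (5 / 6) ** 4   # about 0.518

print(f"simulated: {estimate:.3f}, exact: {exact:.3f}")
```

With more repetitions the simulated relative frequency tends to settle closer to the exact probability, which is the long-run idea of probability developed in Section 6.2.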

Section 6.2: Probability Models

Explain how the behavior of a chance event differs in the short run and in the long run.

Explain what is meant by a random phenomenon.

Explain what it means to say that the idea of probability is empirical.

Define probability in terms of relative frequency.

Define sample space.

Define event.

Explain what is meant by a probability model.

Construct a tree diagram.

Use the multiplication principle to determine the number of outcomes in a sample space.

Explain what is meant by sampling with replacement and sampling without replacement.

List the four rules that must be true for any assignment of probability.

Explain what is meant by A ∪ B and A ∩ B.

Explain what is meant by each of the regions in a Venn diagram.

Give an example of two events A and B where A ∩ B = ∅.

Use a Venn diagram to illustrate the intersection of two events A and B.

Compute the probability of an event given the probabilities of the outcomes that make up the event.

Explain what is meant by equally likely outcomes.

Compute the probability of an event in the special case of equally likely outcomes.

Define what it means for two events to be independent.

Give the multiplication rule for independent events.

Given two events, determine if they are independent.

Section 6.3: General Probability Rules

State the addition rule for disjoint events.

State the general addition rule for the union of two events.

Given any two events A and B, compute P(A ∪ B).

Define what is meant by a joint event and joint probability.

Explain what is meant by conditional probability.

State the general multiplication rule for any two events.

Use the general multiplication rule to define P(B | A).

Explain what is meant by Bayes’s rule.

Define independent events in terms of a conditional probability.
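
The rules in this section can be checked with exact arithmetic. The Python sketch below uses the standard forms P(A ∪ B) = P(A) + P(B) − P(A ∩ B), P(B | A) = P(A ∩ B) / P(A), and Bayes’s rule for an event and its complement. The card example uses a standard 52-card deck; the diagnostic-test numbers are hypothetical, and the function names are illustrative.

```python
from fractions import Fraction

def union_prob(p_a, p_b, p_a_and_b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

def conditional_prob(p_a_and_b, p_a):
    """P(B | A) = P(A and B) / P(A), defined when P(A) > 0."""
    return p_a_and_b / p_a

def bayes(prior, p_pos_given_a, p_pos_given_not_a):
    """P(A | positive) for an event A and its complement."""
    numerator = prior * p_pos_given_a
    return numerator / (numerator + (1 - prior) * p_pos_given_not_a)

# One card drawn from a standard deck: A = heart, B = face card (J, Q, K).
p_heart = Fraction(13, 52)
p_face = Fraction(12, 52)
p_both = Fraction(3, 52)

p_union = union_prob(p_heart, p_face, p_both)           # 11/26
p_face_given_heart = conditional_prob(p_both, p_heart)  # 3/13
independent = p_face_given_heart == p_face              # True: P(B | A) = P(B)

# Hypothetical test: 1% prevalence, 95% true-positive rate, 5% false-positive rate.
posterior = bayes(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))  # 19/118

print(p_union, p_face_given_heart, independent, posterior)
```

Because P(face | heart) equals P(face), knowing the suit gives no information about whether the card is a face card, which is exactly the conditional-probability definition of independence.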

Chapter Seven: Random Variables

Define what is meant by a random variable.

Define a discrete random variable.

Define a continuous random variable.

Explain what is meant by the probability distribution for a random variable.

Explain what is meant by the law of large numbers.

Calculate the mean and variance of a discrete random variable.

Calculate the mean and variance of distributions formed by combining two random variables.
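
As a sketch of the last two calculations, the Python code below finds the mean and variance of a discrete random variable from its probability distribution, then verifies by enumeration that for the sum of two independent random variables the means add and the variances add. The fair-die distribution is a standard example; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def rv_mean(dist):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(x * p for x, p in dist.items())

def rv_variance(dist):
    """Variance: sum of (value - mean)^2 * probability."""
    mu = rv_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# X = result of one roll of a fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution of X + Y for two independent rolls, built by enumeration.
total = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    total[x + y] = total.get(x + y, 0) + px * py

print(rv_mean(die), rv_variance(die))      # 7/2 and 35/12
print(rv_mean(total), rv_variance(total))  # 7 and 35/6
```

The means add whether or not the variables are independent, but the variances add only when they are independent, as they are for two separate rolls.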