AP Statistics Summer Institute
Exploring Univariate Data
Name: ______
Participant / Gender / Years of teaching experience / Years teaching AP Statistics / Height (inches) / Shoe size
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
A distribution of a variable tells us what values the variable takes and how often it takes these values.
How would you describe the distribution of years of experience? Describe the center. What other characteristics are important to note? How do you describe these?
We begin by looking at graphs and add numerical summaries.
Stem and leaf plot
0
0
1
1
2
2
3
3
4
4
What characteristics of the distribution are evident from the stem and leaf plot?
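As a check on a hand-drawn plot, the split-stem layout above (each stem listed twice, 0, 0, 1, 1, ...) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `split_stem_plot` and the data values are made up for the example and are not the class data.

```python
from collections import defaultdict

def split_stem_plot(values):
    """Return the lines of a stem and leaf plot with split stems.

    Each stem (tens digit) appears twice: first with leaves 0-4,
    then with leaves 5-9, matching the 0, 0, 1, 1, ... layout.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)      # tens digit, ones digit
        half = 0 if leaf < 5 else 1     # which copy of the stem gets the leaf
        rows[(stem, half)].append(str(leaf))
    top_stem = max(values) // 10
    lines = []
    for stem in range(top_stem + 1):
        for half in (0, 1):
            leaves = " ".join(rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines

# Hypothetical years of experience, invented for illustration only.
years = [1, 3, 4, 6, 8, 12, 15, 20, 21, 27, 33]
for line in split_stem_plot(years):
    print(line)
```

Stems with no leaves are still printed, so gaps in the distribution stay visible.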
Back to back Stem and leaf plot
Males / Females
0
0
1
1
2
2
3
3
4
4
Compare and contrast the characteristics of the distributions of years of experience for men and women.
Construct a histogram of the distribution of years of experience on the grid below.
What characteristics of the distribution are evident from the histogram?
Compared to the stem and leaf plot, what detail does the histogram lack?
When would it be beneficial to use a histogram rather than a stem and leaf plot?
Notes regarding shape:
A distribution is said to be skewed to the right if it extends further to the right than it does to the left. (The tail extends to the right.)
A distribution is said to be skewed to the left if it extends further to the left than it does to the right. (The tail extends to the left.)
A distribution is said to be symmetric if the right and left sides of the histogram are approximately mirror images of each other.
Describing distributions with numbers
Measures of Center
Median (M): The median is the midpoint of the distribution: half of the observations are smaller than the median and half are larger. To find the median:
1. Arrange the observations in increasing order.
2. If the number of observations is odd, the median is the middle value.
3. If the number of observations is even, the median is the average of the middle two.
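The three steps above can be sketched in Python as follows. The function name `median` and the data values are chosen only for illustration; they are not the class data.

```python
def median(observations):
    """Find the median by following the three steps above."""
    ordered = sorted(observations)      # step 1: arrange in increasing order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                      # step 2: odd count, take the middle value
        return ordered[middle]
    # step 3: even count, average the two middle values
    return (ordered[middle - 1] + ordered[middle]) / 2

# Made-up observations for illustration only.
print(median([5, 2, 12, 4, 8, 7, 4]))   # 5
print(median([2, 4, 4, 5, 7, 8]))       # 4.5
```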
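Mean ($\bar{x}$): The mean is the average of the set of observations:
$\bar{x} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$
or in sigma notation
$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$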
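As a quick illustration, the mean is the sum of the observations divided by how many there are. The function name `mean` and the data below are made up for the example.

```python
def mean(observations):
    """Add up the observations and divide by the number of observations."""
    return sum(observations) / len(observations)

# Made-up observations for illustration only.
data = [2, 4, 4, 5, 7, 8, 12]
print(mean(data))   # 6.0
```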
Find the median and mean years of teaching experience.
Which measure of center is larger? Why?
Measures of Spread
Range = maximum – minimum
Interquartile Range (IQR): $\text{IQR} = Q_3 - Q_1$.
Quartiles:
The first quartile ($Q_1$) is the value below which 25% of the observations fall. It is the median of the first half of the ordered set of observations.
The third quartile ($Q_3$) is the value below which 75% of the observations fall. It is the median of the second half of the ordered set of observations.
Note: IQR is typically used to describe spread when Median is used to describe center.
Five number summary: Min, $Q_1$, Median, $Q_3$, Max
Outliers: An observation is called an outlier if it lies more than $1.5 \times \text{IQR}$ above $Q_3$ or below $Q_1$.
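The quartile and outlier definitions above can be sketched in Python. Two assumptions are built in and should be checked against your textbook or calculator: when the number of observations is odd, the median itself is left out of both halves, and the outlier cutoff is the usual 1.5 × IQR rule. The function names and data are made up for illustration.

```python
from statistics import median

def five_number_summary(observations):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(observations)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # first half of the ordered data
    upper = ordered[half + n % 2:]      # second half (skips the median when n is odd)
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

def outliers(observations):
    """List observations more than 1.5 * IQR below Q1 or above Q3 (assumed rule)."""
    _, q1, _, q3, _ = five_number_summary(observations)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in observations if x < low_fence or x > high_fence]

# Made-up observations for illustration only.
data = [2, 4, 4, 5, 7, 8, 12]
print(five_number_summary(data))   # (2, 4, 5, 8, 12)
print(outliers(data))              # []
print(outliers(data + [30]))       # [30]
```

Software packages use several slightly different quartile rules, so results for small data sets may differ from this sketch.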
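Variance ($s^2$): The variance is roughly the average of the squared differences between each observation and the mean.
$s^2 = \dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
Or in sigma notation
$s^2 = \dfrac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2$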
Standard deviation (s): The standard deviation is the square root of the variance, $s = \sqrt{s^2}$.
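A minimal sketch of the variance and standard deviation calculation follows. It assumes the sample versions, which divide by n - 1 (the reason the variance is only "roughly" an average). The function names and data are made up for illustration.

```python
import math

def sample_variance(observations):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed divisor)."""
    n = len(observations)
    x_bar = sum(observations) / n
    squared_deviations = [(x - x_bar) ** 2 for x in observations]
    return sum(squared_deviations) / (n - 1)

def sample_std_dev(observations):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(observations))

# Made-up observations for illustration only.
data = [2, 4, 4, 5, 7, 8, 12]
print(sample_variance(data))            # 11.0
print(round(sample_std_dev(data), 3))   # 3.317
```

The list `squared_deviations` plays the same role as the last column of a hand-computed variance table.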
Note: Variance and Standard Deviation are used to measure spread when the mean is used to describe center.
Note: When the distribution is approximately symmetric, the mean and standard deviation are generally used to summarize the distribution. If the distribution is skewed, a five number summary is generally used.
Find each of the following for the distribution of years of experience.
$Q_1$:
$Q_3$:
IQR:
Five number summary:
Are there any outliers in the distribution of years of experience?
Complete the table to find variance and standard deviation.
Participant / x / $x - \bar{x}$ / $(x - \bar{x})^2$ / $s^2$ = ______ $s$ = ______
1
2
3
4
5
6 / Which would be more appropriate in describing the distribution of years of experience: a five number summary or the mean and standard deviation? Why?
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
$\sum (x - \bar{x})^2$ =
Construct a boxplot for the number of years of experience using the grid as a guide.
Construct parallel boxplots for the number of years of experience for men and women using the grid as a guide.
Using the boxplots above, compare and contrast the distributions of years of experience for men and women.
Linear transformations: A linear transformation changes every value of the variable x into a new value given by the equation $x_{\text{new}} = a + bx$.
Original Data (x) / Median / Mean / Range / IQR / St. Dev. / Variance
3, 4, 6, 8, 12, 15, 20
Add 4 to each value in the original data and complete the table.
$x + 4$ / Median / Mean / Range / IQR / St. Dev. / Variance
7, 8, 10, 12, 16, 19, 24
Multiply each value in the original data by 3 and complete the table.
$3x$ / Median / Mean / Range / IQR / St. Dev. / Variance
9, 12, 18, 24, 36, 45, 60
Multiply each value in the original data by 2 and add 3 and complete the table.
$2x + 3$ / Median / Mean / Range / IQR / St. Dev. / Variance
9, 11, 15, 19, 27, 33, 43
How is each summary statistic of x affected by the linear transformation $x_{\text{new}} = a + bx$?
Median=
Mean=
Range=
IQR=
St. Dev.=
Variance=
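Once the tables are complete, conjectures about the six statistics can be checked with a short Python sketch. It assumes the transformation has the form x_new = a + bx (the names a and b follow the final question), and it assumes the median-of-halves rule for the IQR with the median left out of both halves when n is odd. The helper `summarize` and the data are made up for illustration; the worksheet's own data can be substituted for `x`.

```python
import statistics

def summarize(values):
    """Return the six summary statistics used in the tables above."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]               # first half
    upper = ordered[n // 2 + n % 2 :]       # second half (skips the median when n is odd)
    return {
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "range": ordered[-1] - ordered[0],
        "iqr": statistics.median(upper) - statistics.median(lower),
        "st_dev": statistics.stdev(ordered),       # sample standard deviation
        "variance": statistics.variance(ordered),  # sample variance
    }

# Assumed form of the transformation: x_new = a + b * x.
a, b = 3, 2
x = [2, 4, 4, 5, 7, 8, 12]                  # made-up data for illustration only
x_new = [a + b * value for value in x]

before = summarize(x)
after = summarize(x_new)
for name in before:
    print(f"{name:>8}: {before[name]:8.3f} -> {after[name]:8.3f}")
```

Comparing the two columns of output for several choices of a and b shows which statistics respond to a, which respond to b, and how.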
Suppose a teacher gave a test for which and . He wants to apply a linear transformation to “scale” the grades so that and . Find a and b.