Name______Date______
Lesson 11.1: Histograms and Dot Plots Algebra I
Data collected by means of surveys, observation, or research is often summarized by graphs. Data refers to the raw information, the information you want to analyze and understand.
Two types of graphs we can use to analyze data are Histograms and Dot Plots.
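As a rough sketch of what these two graphs summarize, the Python below counts values per interval (the heights of a histogram's bars) and per individual value (the stacks of a dot plot). The `ages` list is made up for illustration and is not the data from any example on this worksheet; the function names are likewise hypothetical.

```python
from collections import Counter

# Hypothetical ages, made up for illustration (not the worksheet's data).
ages = [17, 18, 18, 19, 19, 19, 20, 21, 21, 22, 24]


def frequency_table(data, start, width, bins):
    """Count how many values fall in each interval [low, low + width - 1]."""
    table = []
    for i in range(bins):
        low = start + i * width
        high = low + width - 1
        count = sum(1 for x in data if low <= x <= high)
        table.append((low, high, count))
    return table


def dot_plot_rows(data):
    """Return one text row per whole-number value, with one * per occurrence."""
    counts = Counter(data)
    rows = []
    for value in range(min(data), max(data) + 1):
        # Values that never occur still get a row, so gaps stay visible.
        rows.append((f"{value}: " + "*" * counts[value]).rstrip())
    return rows


# Histogram view: each interval groups several ages together.
for low, high, count in frequency_table(ages, 17, 2, 4):
    print(f"{low}-{high}: {count}")

# Dot plot view: every individual age keeps its own stack.
for row in dot_plot_rows(ages):
    print(row)
```

The histogram view hides the individual values inside each interval, while the dot plot view keeps every value visible. That trade-off is worth keeping in mind when comparing the two graphs.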
Histogram Example: Charlie’s Food Factory currently employs 28 workers whose ages are shown below on a histogram.
(a) How many workers have ages between 19 and 21 years?
(b) How many workers are over 21 years of age?
Dot Plot Example: The following Dot Plot also shows the ages of the workers at Charlie’s Food Factory, but in a different format.
(a) How many workers are 18 years old?
(b) What is the range of the ages of workers?
(c) Would you consider this distribution to be uniform, symmetric, skewed right, or skewed left?
(d) What are some differences between a histogram and a dot plot? When would a histogram be more useful and when would a dot plot be more useful?
Exercise #1: The 2006 – 2007 Arlington High School Varsity Boys’ basketball team had an excellent season, compiling a record of 15 – 5 (15 wins and 5 losses). The total points scored by the team in each of the 20 games are listed below in the order in which the games were played:
76, 55, 76, 64, 46, 91, 65, 46, 45, 53, 56, 53, 57, 67, 62, 64, 67, 52, 58, 62
(a) Complete the frequency table below.
(b) Construct the histogram below.
Exercise #2: A farm is studying the weight of baby chickens (chicks) after 1 week of growth. They find the weight, in ounces, of 20 chicks. The weights are shown below. Construct a dot plot on the axes given.
2, 1, 3, 4, 2, 2, 3, 1, 5, 3, 4, 4, 5, 6, 3, 8, 5, 4, 6, 3
1. The test grades from 18 students at Bolderdash High are listed below.
72, 86, 95, 75, 100, 85, 87, 100, 81, 84, 78, 94, 96, 80, 100, 98, 96, 91
Complete the frequency table. Construct the histogram on the grid below.
Interval / Frequency
71 - 75 /
76 - 80 /
81 - 85 /
86 - 90 /
91 - 95 /
96 - 100 /
2. The numbers below represent the number of minutes that it takes 20 different people to eat breakfast.
0, 5, 5, 0, 7, 8, 8, 6, 0, 0, 9, 5, 7, 6, 8, 3, 0, 5, 9, 10
On the line below, construct your own scale and draw a dot plot to represent the data.
______
3. The heights of a group of 36 people are measured to the nearest inch. The heights range from 47 inches to 76 inches. To construct a histogram showing 10 intervals, the length of each interval should be
A. 2 inches
B. 3 inches
C. 4 inches
D. 5 inches
4. A local marketing company surveyed 30 households to determine how many devices each household contained that family members watched video on (e.g., TVs, tablets, smartphones, etc.). The dot plot of the responses is shown below.
How many households have three devices capable of showing video?
(1) 1 (3) 7
(2) 2 (4) 5
More households had 4 devices to watch video on than any other number. Which of the following is closest to the percent of households that have 4 devices?
(1) 22% (3) 27%
(2) 34% (4) 45%
The marketing company would like to claim that the majority of households have either 3 or 4 screens capable of showing video. Does the information displayed on the dot plot support this claim? Explain your reasoning.
5. On a recent Precalculus quiz, Mr. Weiler found the following distribution of scores, which are arranged in 5-point intervals (with the exception of the last interval).
How many students scored in the 75 to 79 point range?
(1) 8 (3) 25
(2) 10 (4) 5
Students do not pass the quiz if they receive lower than a 70. How many students did not pass?
(1) 8 (3) 7
(2) 5 (4) 15
How many total students took the quiz?
(1) 25 (3) 56
(2) 104 (4) 93
Twenty-two students scored in the 80 to 84 range on this quiz. Does the histogram provide us with enough information to conclude that a student must have scored an 82 on this quiz? Explain your thinking.
Name______Date______
11.1: Histograms and Dot Plots Algebra I
1. The Fahrenheit temperature readings on 30 April mornings in Stormville, New York, are shown below.
41°, 58°, 61°, 54°, 49°, 46°, 52°, 58°, 67°, 43°,
47°, 60°, 52°, 58°, 48°, 44°, 59°, 66°, 62°, 55°,
44°, 49°, 62°, 61°, 59°, 54°, 57°, 58°, 63°, 60°
Using the data, complete the frequency table below.
On the grid below, construct and label a frequency histogram based on the table.
2. Ms. Romano is a health coach and nutritionist. Recently, she encouraged Matthew to eat a healthier breakfast and recommended a cereal with less sugar. There are many different cereals and it seems like the amount of sugar in each type varies widely. Matthew took a trip to the grocery store and recorded the sugar amount that each cereal has in one serving.
Label the number line below using numbers to represent the amount of sugar in one serving and then construct a dot plot.
3. The test scores for 10 students in Ms. Sampson’s homeroom were 61, 67, 81, 83, 87, 88, 89, 90, 98, and 100. Which frequency table is accurate for this set of data?
1) /
2) /
3) /
4) /
4. The students in one social studies class were asked how many brothers and sisters (siblings) they each have. The dot plot here shows the results.
a. How many of the students have six siblings?
b. How many of the students have no siblings?
c. How many of the students have three or more siblings?
d. Would you consider this dot plot to be uniform, symmetric, skewed left, or skewed right?
5. The accompanying histogram shows the heights of the students in Kyra’s health class.
What is the total number of students in the class?
1) / 5
2) / 15
3) / 16
4) / 209
Name______Date______
Lesson 11.2: Measures of Central Tendency Algebra I
In our day to day activities, we deal with many problems that involve related items of numerical information called data. Statistics is the study of sets of such numerical data. When we gather numerical data, besides displaying it, we often want to know a single number that is representative of the data as a whole. We call these types of numbers measures of central tendency. The three most common measures of central tendency are mean, median, and mode.
The mean (or average), sometimes represented by $\bar{x}$, of a number set is found by adding all the numbers and then dividing by the number of numbers in the set. For example, if Matthew’s grades are 80, 92, 85, 91, 95, and 88, the mean is found by using the formula:
$$\bar{x} = \frac{80 + 92 + 85 + 91 + 95 + 88}{6} = \frac{531}{6} = 88.5$$
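The same arithmetic can be checked with a few lines of Python. This is only a sketch; the function name `mean` is chosen here for illustration and is not something the lesson defines.

```python
def mean(values):
    """Add all the numbers, then divide by how many numbers there are."""
    return sum(values) / len(values)


grades = [80, 92, 85, 91, 95, 88]  # Matthew's grades from the example above

print(sum(grades))   # total of the six grades: 531
print(len(grades))   # number of grades: 6
print(mean(grades))  # 531 / 6 = 88.5
```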
The median is the middle value when a set of numbers is arranged in order from least to greatest.
- If there is an odd number of numbers, the median is the middle number. For the numbers 2, 5, 7, 19, 20 the median is 7.
- If there is an even number of numbers, the median is the mean, or average, of the two middle numbers. For the numbers 3, 4, 6, 7, 8, 9, the median is $\frac{6 + 7}{2} = 6.5$.
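The two cases above can be sketched in Python as follows. The function name `median` is an illustrative choice; the two number sets are the ones used in the bullets.

```python
def median(values):
    """Return the middle value of the data after sorting it."""
    ordered = sorted(values)  # least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle number.
        return ordered[mid]
    # Even count: the mean of the two middle numbers.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 5, 7, 19, 20]))    # odd count  -> 7
print(median([3, 4, 6, 7, 8, 9]))   # even count -> 6.5
```

Sorting first matters: the middle position only means something once the numbers are in order.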
The mode is the value that appears the most times. A set of values may have:
- No mode (every number appears the same number of times)
- One mode (one number appears more times than any other number)
- More than one mode (several different numbers are tied for the most times repeated)
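A minimal sketch of the three cases, assuming the convention stated above that a set has no mode when every value appears equally often. The function name `modes` and the three short lists are made up for illustration.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of modes; an empty list means 'no mode'."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if several different values all tie, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 3]))        # no mode            -> []
print(modes([2, 5, 5, 7]))     # one mode           -> [5]
print(modes([1, 1, 4, 4, 6]))  # more than one mode -> [1, 4]
```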
Exercise #1: A survey was taken among 12 people on the number of passwords they currently have to remember. The results are shown below. State the mean, median, and mode (to the nearest tenth).
0, 1, 2, 1, 1, 3, 2, 6, 3, 3, 4, 3
Exercise #2: Students in Mr. Okafor’s algebra class were trying to determine if people speed along a certain section of roadway. They collected speeds of 20 vehicles, as displayed in the table below.
(a) Find the mean, median, and mode for this data set.
(b) The speed limit along this part of the highway is 35 mph. Based on your results from part (a), is it fair to conclude that the average driver speeds on this roadway?
Exercise #3: Determine the mean and median, if possible. If not possible, explain why not.
When conducting a statistical study, it is not always possible to obtain information about every person or situation to which the study applies. Unlike a census, in which every person is counted, some studies use only a sample or portion of the items being investigated. Whenever a sample is taken, it is vital that it be fair; in other words, the sample reflects the overall population.
Exercise #4: To determine which television programs are the most popular in a large city, a poll is conducted by selecting a sample of people at random and interviewing them. Outside which of the following locations would the interviewer be most likely to find a fair sample? Explain your choice and why the others are inappropriate.
(1) A baseball stadium (3) A grocery store
(2) A concert hall (4) A comedy club
Exercise #5: Truong is trying to determine the average height of high school male students. Because he is on the basketball team, he uses the heights of the 14 players on the team, which are given below in inches.
69, 70, 72, 72, 74, 74, 74, 75, 76, 76, 76, 77, 77, 82
(a) Calculate the mean and median for this data set. Round any non-integer answers to the nearest tenth.
(b) Is the data set above a fair sample to use to determine the average height of high school male students? Explain your answer.
Data sets can have members that lie far away from the rest of the data. These members are called outliers, and they can pull the mean away from the typical value, so that it no longer represents the true “average” of the data set.
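To see the effect, the sketch below reuses Matthew’s six grades from Lesson 11.2 and then adds one extra grade of 20. That extra grade is hypothetical, invented only to act as an outlier.

```python
from statistics import mean, median

grades = [80, 92, 85, 91, 95, 88]  # Matthew's grades from earlier in the lesson
with_outlier = grades + [20]       # hypothetical low grade added as an outlier

print(mean(grades), median(grades))                        # 88.5 89.5
print(round(mean(with_outlier), 1), median(with_outlier))  # 78.7 88
```

One low score drops the mean by almost ten points, while the median moves by only a point and a half. This is why the median is often the better summary when an outlier is present.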
Exercise #6: In Mr. Petrovic’s Advanced Calculus Course, eight students recently took a test. Their grades were as follows:
45, 78, 82, 85, 87, 89, 93, 95
(a) Calculate the mean and median of this data set.
(b) What score is an outlier in this data set?
(c) Which value, the mean or the median, is a better measure of how well the average student did on Mr. Petrovic’s test?
Exercise #7: Which statement is true about the data set 3, 4, 5, 6, 7, 7, 10?
1) / mean = mode
2) / mean > mode
3) / mean = median
4) / mean < median
Exercise #8: Mr. Taylor raised all his students’ scores on a recent test by five points. How were the mean and the range of the scores affected?
1) / The mean increased by five and the range increased by five.
2) / The mean increased by five and the range remained the same.
3) / The mean remained the same and the range increased by five.
4) / The mean remained the same and the range remained the same.
Name______Date______
11.2: Measures of Central Tendency Algebra I
1. The Student Government at Arlington High School decided to conduct a survey to determine where to go on a senior field trip. They asked students the following question: “Would you rather go to a sports event or to an IMAX movie?” At which of the following locations would they most likely get a fair sample?
(1) The gym, after a game (3) A randomly chosen study hall
(2) The auditorium after a play (4) At the Nature Club meeting.
2. For the following data set, calculate the mean and median. Any non-integer answers should be rounded to the nearest tenth.
3, 5, 8, 8, 12, 16, 17, 20, 24
3. For the following data set, calculate the mean and median. Any non-integer answers should be rounded to the nearest tenth.
5, 5, 9, 10, 13, 16, 18, 20, 22, 22
4. Which of the following is true about the data set {3, 5, 5, 7, 9}?
(1) median > range (3) mean > median
(2) median = mean (4) median > mean
5. Which of the following data sets has a median of 7.5?
(1) {6, 7, 8, 9, 10} (3) {3, 5, 7, 8, 10, 14}
(2) {1, 3, 7, 10, 14} (4) {2, 7, 9, 11, 14, 17}
6. A survey is taken by an insurance company to determine how many car accidents the average New York City resident has gotten into in the past 10 years. The company surveyed 20 people who are getting off a train at a subway station. The following table gives the results of the survey.
(a) Calculate the mean and median number of accidents for this data set. Remember, there are six 0’s in this data set, eight 1’s, etc.
(b) Are there any outliers in this data set? If so, what data value?
(c) Which number, the mean or the median, better represents the number of accidents an average person in this survey had over this 10-year period? Explain your answer.
(d) Does this sample fairly represent the average number of accidents a typical New York City resident would get into over a 10-year period? Why or why not?
Name______Date______
Lesson 11.3: Box Plots, IQR, Standard Deviation & Variance Algebra I
Recall that the median of a number set divides the set into two equal parts, or halves. Quartiles are the values that separate the data into four parts, each containing one-fourth, or 25%, of the numbers.
Example: 53, 54, 55, 55, 56, 58, 59, 59, 59, 59, 60, 61, 61, 63, 64, 65, 75, 85, 95, 95
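One common way to find the quartiles is to split the ordered data into a lower half and an upper half and take the median of each. The sketch below applies that rule to the example data. The rule itself is an assumption here (other quartile conventions exist and can give slightly different values), and the function name `quartiles` is illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the middle
    upper = ordered[-half:]   # values above the middle
    # With an odd count, the middle value belongs to neither half.
    return median(lower), median(ordered), median(upper)


example = [53, 54, 55, 55, 56, 58, 59, 59, 59, 59,
           60, 61, 61, 63, 64, 65, 75, 85, 95, 95]

print(quartiles(example))  # (57.0, 59.5, 64.5)
```

With 20 values, each half holds 10 numbers, so each quartile is the average of the 5th and 6th values in its half.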
Exercise #1: Determine the 1st quartile, median, and 3rd quartile of the number set below.
53, 39, 51, 54, 40, 46, 41, 50, 49, 39, 50, 51, 39
The five-number summary consists of the minimum value, 1st quartile, median, 3rd quartile, and maximum value.
To Calculate the Five-Number Summary on a Graphing Calculator:
- Press STAT
- Choose 1: Edit
- Enter the data into L1
- Press STAT, then the Right Arrow to CALC, and choose 1: 1-Var Stats
- Press ENTER
- Scroll down to find minX, Q1, Med, Q3, maxX
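If a calculator is not at hand, the five values can be cross-checked with a short Python sketch. It assumes the median-of-halves rule for Q1 and Q3, which should agree with the calculator for data like the lesson's earlier example, and it assumes at least two data values. The function name is hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])   # median of the lower half
    q3 = median(ordered[-half:])  # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]


# The 20-value example from the start of this lesson.
example = [53, 54, 55, 55, 56, 58, 59, 59, 59, 59,
           60, 61, 61, 63, 64, 65, 75, 85, 95, 95]

print(five_number_summary(example))  # (53, 57.0, 59.5, 64.5, 95)
```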
Exercise #2: Use your calculator to determine the five number summary of the data set below.
42, 15, 25, 30, 42, 75, 80, 85, 65, 25, 19, 72, 77, 25
Box Plots are another way to represent data graphically. A Box Plot describes a data set using its quartiles and min/max values.
Example:
Exercise #3: The test scores from Mrs. Gray’s math class are shown below.
72, 73, 66, 71, 82, 85, 95, 85, 86, 89, 91, 92
Construct a box-and-whisker plot to display these data.
Exercise #4: The number of songs fifteen students have on their MP3 players is:
120, 124, 132, 145, 200, 255, 260, 292, 308, 314, 342, 407, 421, 435, 452
State the values of the minimum, 1st quartile, median, 3rd quartile, and maximum. Using these values, construct a box-and-whisker plot using an appropriate scale on the line below.
Measures of Central Tendency are good for describing the typical data value in a set. However, they do not tell us how much variation there is within the data set. A good way to measure variation within a set is with the Interquartile Range (IQR). The Interquartile Range (IQR) is similar to the Range of a data set. Range is the difference between the Max and Min, whereas the Interquartile Range (IQR) is the difference between the 3rd Quartile and 1st Quartile.
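The contrast between the two measures shows up clearly in the 20-value example from the start of this lesson. The sketch below assumes the median-of-halves rule for the quartiles; the variable names are illustrative.

```python
from statistics import median

example = [53, 54, 55, 55, 56, 58, 59, 59, 59, 59,
           60, 61, 61, 63, 64, 65, 75, 85, 95, 95]

ordered = sorted(example)
half = len(ordered) // 2
q1 = median(ordered[:half])   # 1st quartile: median of the lower half
q3 = median(ordered[-half:])  # 3rd quartile: median of the upper half

data_range = ordered[-1] - ordered[0]  # Max - Min
iqr = q3 - q1                          # 3rd Quartile - 1st Quartile

print(data_range)  # 95 - 53 = 42
print(iqr)         # 64.5 - 57.0 = 7.5
```

The few large values (75, 85, 95, 95) stretch the range to 42, but the IQR stays at 7.5 because it looks only at the middle half of the data.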
Exercise #5: The two data sets below each have equal means but differ in the variation within the data set. Determine the Interquartile Range (IQR) of each data set. The IQR is defined as the difference between the third quartile value and the first quartile value.
Data Set #1: 3, 3, 4, 4, 5, 5, 6, 6, 7, 8, 8, 9, 9, 10, 10, 11, 11
Data Set #2: 5, 5, 6, 6, 7, 7, 8, 8, 9, 9
The interquartile range gives a good measure of how spread out the data set is, but the best measure of variation within a data set is the standard deviation. The actual calculation of standard deviation is lengthy, so we will not go into it here and will rely on our calculators instead.
How to find Standard Deviation using the Graphing Calculator:
- Following the same steps used to find the five-number summary (shown previously in this lesson), enter the data into L1 and choose 1-Var Stats
- Standard Deviation is shown by σx (population standard deviation) or Sx (sample standard deviation)
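For readers curious about what the calculator is doing, here is a minimal sketch of the standard textbook formulas. It is not something the lesson requires. The population version divides the sum of squared deviations by n, and the sample version divides by n − 1. The function name is hypothetical, and the data are Matthew’s grades from Lesson 11.2.

```python
from math import sqrt


def standard_deviations(values):
    """Return (population SD, sample SD) for a list of at least two numbers."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared distances from the mean.
    squared = sum((x - mean) ** 2 for x in values)
    population = sqrt(squared / n)    # usually labeled sigma-x on a calculator
    sample = sqrt(squared / (n - 1))  # usually labeled Sx on a calculator
    return population, sample


grades = [80, 92, 85, 91, 95, 88]
population_sd, sample_sd = standard_deviations(grades)

print(round(population_sd, 1))  # 4.9
print(round(sample_sd, 1))      # 5.4
```

The value inside each square root is the variance, so the standard deviation is simply the square root of the variance. The sample value is always a little larger because it divides by a smaller number.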
Exercise #6: A farm is studying the weight of baby chickens (chicks) after 1 week of growth. They find the weight, in ounces, of 20 chicks. The weights are shown below. Calculate the standard deviation for this data set. Round any non-integer values to the nearest tenth. Include appropriate units in your answers.
2, 1, 3, 4, 2, 2, 3, 1, 5, 3, 4, 4, 5, 6, 3, 8, 5, 4, 6, 3
Exercise #7: A marketing company is trying to determine how much diversity there is in the age of people who drink different soft drinks. They take a sample of people and ask them which soda they prefer. For the two sodas, the age of those people who preferred them is given below.
Soda A: 18, 16, 22, 16, 28, 18, 21, 38, 22, 29, 25, 44, 36, 27, 40
Soda B: 25, 22, 18, 30, 27, 19, 22, 28, 25, 19, 23, 29, 26, 18, 20
(a) Explain why standard deviation is a better measure of the diversity in age than the mean.
(b) Which soda appears to have a greater diversity in the age of people who prefer it? How did you decide on this?
(c) Use your calculator to determine the sample standard deviation, normally given as Sx, for both data sets. Round your answers to the nearest tenth.
Exercise #8: Johnny wants to determine how many flowers are on each rose bush in a very large flower garden. He took a sample of the rose bushes and counted the number of flowers in each rose bush that was in his sample. His results are below.
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
(a) If you wanted to determine the standard deviation of Johnny’s results, should you use population standard deviation or sample standard deviation? Why?
(b) Determine the correct standard deviation.
Exercise #9: Paul has 20 rose bushes in his garden. He counted the number of flowers on each bush. His results are below.
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
(a) If you wanted to determine the standard deviation of Paul’s results, should you use population standard deviation or sample standard deviation? Why?
(b) Determine the correct standard deviation.
Note: If the question does not specifically say “sample” use population standard deviation.
Exercise #10: Which of the following data sets would have a standard deviation (population) closest to zero? Do this without your calculator. Explain how you arrived at your answer.
(1) (3)
(2) (4)
1. Using the line provided, construct a box plot for the 12 scores below.
26, 32, 19, 65, 57, 16, 28, 42, 40, 21, 38, 10
Determine the number of scores that lie above the 75th percentile.
2. The box plot below represents the math test scores of 20 students.
What percentage of the test scores are less than 72?
1) / 25
2) / 50
3) / 75
4) / 100
3.