11.1 Find Measures of Central Tendency & Dispersion

Statistics - numerical values used to summarize and compare sets of data. Two important types:

1. Measure of central tendency - a number used to represent the center or middle of a set of data values.

a. mean - the average, written x-bar: the sum of the values divided by the number of values.

b. median - the middle value when the values are written in numerical order (the average of the two middle values if there is an even number of values).

c. mode - the value or values that occur most often; there is no mode if every value appears only once.

2. Measure of dispersion - a number that tells how dispersed, or spread out, the data values are.

a. range - the difference between the greatest and least data values.

b. standard deviation - written sigma; describes the typical difference (or deviation) between a data value and the mean. (see formula)
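The definitions above can be sketched in a few lines of Python. The data set is hypothetical, and the sketch assumes the standard deviation formula the notes refer to divides by the number of values n (the population form), since the formula itself is not reproduced here.

```python
from statistics import mean, median, multimode, pstdev


def summarize(values):
    """Compute the section 11.1 statistics for a list of numbers."""
    # Convention from the notes: no mode when every value appears only once.
    all_unique = len(set(values)) == len(values)
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": [] if all_unique else multimode(values),
        "range": max(values) - min(values),
        # Assumption: population standard deviation (divide by n, not n - 1).
        "std_dev": pstdev(values),
    }


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set
stats = summarize(scores)
print(stats)
```

If the course formula divides by n - 1 instead, `statistics.stdev` would be the matching call.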

Outlier - a value that is much greater than or much less than most of the other values in the data set. Measures of central tendency and dispersion can give a misleading impression of a data set if the set contains one or more outliers.
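As a rough illustration of how an outlier can mislead, the sketch below adds one extreme value to a small hypothetical data set and compares how far the mean, median, and range move.

```python
from statistics import mean, median

typical = [10, 12, 13, 14, 15]   # hypothetical data set
with_outlier = typical + [90]    # 90 is much greater than the other values

# How much each statistic changes once the outlier is included.
mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)
range_before = max(typical) - min(typical)
range_after = max(with_outlier) - min(with_outlier)

print("mean shift:", round(mean_shift, 2))
print("median shift:", median_shift)
print("range before/after:", range_before, range_after)
```

In this example the median barely moves while the mean and range change a great deal, which is why the median is often the safer summary when outliers are present.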

11.2 Apply Transformations to Data

Adding a Constant to Data Values:

The mean, median, and mode of the new data set can be obtained by adding the same constant to the mean, median, and mode of the original data set.

The range and standard deviation are unchanged.
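A quick check of the adding rule, using a hypothetical data set and constant. The standard deviation here is the population form (divide by n), which is an assumption about the course formula.

```python
from statistics import mean, median, mode, pstdev


def data_range(values):
    """Difference between the greatest and least values."""
    return max(values) - min(values)


original = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set
constant = 10                          # hypothetical constant
shifted = [x + constant for x in original]

# Measures of central tendency move by the constant...
print(mean(original), "->", mean(shifted))
print(median(original), "->", median(shifted))
print(mode(original), "->", mode(shifted))
# ...while measures of dispersion stay the same.
print(data_range(original), "->", data_range(shifted))
print(pstdev(original), "->", pstdev(shifted))
```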

Multiplying Data Values by a Constant:

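The mean, median, mode, range, and standard deviation of the new data set can be found by multiplying each original statistic by the same constant. (If the constant is negative, the range and standard deviation are multiplied by its absolute value, since they cannot be negative.)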
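The multiplying rule can be checked the same way. The data set and constant are hypothetical, and the standard deviation is again assumed to be the population form. The last lines also try a negative constant, where the range and standard deviation scale by the absolute value of the constant.

```python
from statistics import mean, median, mode, pstdev


def data_range(values):
    """Difference between the greatest and least values."""
    return max(values) - min(values)


original = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set
constant = 3                           # hypothetical constant
scaled = [x * constant for x in original]

# Every statistic is multiplied by the constant.
print(mean(original), "->", mean(scaled))
print(median(original), "->", median(scaled))
print(mode(original), "->", mode(scaled))
print(data_range(original), "->", data_range(scaled))
print(pstdev(original), "->", pstdev(scaled))

# With a negative constant the center flips sign, but the spread stays positive.
flipped = [x * -constant for x in original]
print(mean(flipped), data_range(flipped), pstdev(flipped))
```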

11.2B (See page 1008 examples)

Ways to Organize Data

1. Line Plot

2. Stem & Leaf Plot

3. Histogram (Tally chart)

4. Box and Whisker Plot

11.3 Use Normal Distributions

A normal distribution, one type of probability distribution, is modeled by a bell-shaped curve called a normal curve that is symmetric about the mean.

Areas Under a Normal Curve:

A normal distribution with mean x-bar and standard deviation (SD) sigma has the following properties:

*the total area under the related normal curve is 1

*about 68% of the area lies within 1 SD of the mean

*about 95% of the area lies within 2 SD of the mean

*about 99.7% of the area lies within 3 SD of the mean
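These percentages can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `area_within` is made up for this illustration.

```python
from statistics import NormalDist


def area_within(k, mean=0.0, sd=1.0):
    """Area under the normal curve within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    print(k, "SD:", round(area_within(k) * 100, 1), "%")
```

The result does not depend on which mean and standard deviation are used, so the same three percentages apply to every normal distribution.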

Standard Normal Distribution is the normal distribution with mean 0 and standard deviation 1.

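z-score - the number of standard deviations that an x-value lies above or below the mean: z = (x - x-bar) / sigma. A positive z-score means the value is above the mean; a negative z-score means it is below.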
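A short sketch of a z-score calculation. The mean, standard deviation, and x-value are hypothetical, and the standard normal distribution is taken from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical example: test scores with mean 70 and standard deviation 5.
z = z_score(80, mean=70, sd=5)

# Area to the left of z under the standard normal curve (mean 0, SD 1),
# i.e. the fraction of values expected to fall below x.
fraction_below = NormalDist().cdf(z)

print("z-score:", z)
print("fraction below:", round(fraction_below, 4))
```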

11.4 Select & Draw Conclusions from Samples

Population - the group of objects or people that you want information about.

Sample - a subset of a population. Use a sample when it is too difficult to collect data from every member of the population.

self-selected sample - members of a population volunteer to be in the sample

convenience sample - only members of a population who are easy to reach are selected

systematic sample - members of a population are selected using a rule (for example, every tenth member)

random sample - every member of a population has an equal chance of being selected
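Two of these sample types are easy to sketch in code. The population of ID numbers and both function names are hypothetical; `random.Random.sample` is the standard-library call that picks members without replacement, each with an equal chance.

```python
import random


def random_sample(population, size, seed=None):
    """Random sample: every member has an equal chance of being selected."""
    rng = random.Random(seed)  # a seed only makes the illustration repeatable
    return rng.sample(population, size)


def systematic_sample(population, step, start=0):
    """Systematic sample: select members by a rule (every step-th member)."""
    return population[start::step]


students = list(range(1, 21))  # hypothetical population: ID numbers 1 to 20

chosen_at_random = random_sample(students, 5, seed=1)
every_fifth = systematic_sample(students, 5)

print("random:", chosen_at_random)
print("systematic:", every_fifth)
```

Self-selected and convenience samples depend on who volunteers or who is easy to reach, so they are not something code can choose fairly.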

Bias in Sampling: (select an unbiased sample to draw accurate conclusions)

unbiased sample - a sample that is representative of the population you want information about

biased sample - a sample that underrepresents or overrepresents part of the population

Sample size: when conducting a survey, make the sample large enough that it accurately represents the population.

Margin of error: gives a limit on how much the responses of the sample would differ from the responses of the population. As the sample size increases, the margin of error decreases.

Margin of error = ± 1/√n, where n is the sample size

If the percent of the sample responding a certain way is p (written as a decimal), then the percent of the population that would respond the same way is likely between p - 1/√n and p + 1/√n.
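The margin of error formula above can be tried out as follows. The survey numbers are hypothetical, and both function names are made up for this sketch.

```python
import math


def margin_of_error(n):
    """Margin of error for a sample of size n, as a decimal: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)


def likely_interval(p, n):
    """Likely range for the population proportion, given sample proportion p."""
    e = margin_of_error(n)
    return p - e, p + e


# Hypothetical survey: 40% of 1600 people respond a certain way.
low, high = likely_interval(0.40, 1600)
print("margin of error:", margin_of_error(1600))   # 1/40, or 2.5%
print("likely between", round(low, 3), "and", round(high, 3))

# A larger sample gives a smaller margin of error.
print(margin_of_error(400), ">", margin_of_error(1600))
```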

11.5 Choose the Best Model from Two-Variable Data

Function        General Form
Linear          y = ax + b
Quadratic       y = ax^2 + bx + c
Cubic           y = ax^3 + bx^2 + cx + d
Exponential     y = ab^x
Power           y = ax^b

1. Make a scatter plot of the data on your calculator.

2. Determine the type of function suggested by the pattern of the graph.

3. Use the regression features of the calculator to find a model of best fit.
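For readers without a graphing calculator, steps 2 and 3 can be roughly imitated in Python. This is only a sketch under stated assumptions: it compares just the linear and exponential forms, fits the exponential model by a least-squares line through (x, ln y), and picks the model with the smaller sum of squared errors. A calculator's regression features may use different methods and report different diagnostics. The data and all function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares line through the points: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_linear(xs, ys):
    """Model y = ax + b."""
    a, b = fit_line(xs, ys)
    return lambda x: a * x + b


def fit_exponential(xs, ys):
    """Model y = ab^x, fitted through ln y = ln a + x ln b (needs y > 0)."""
    slope, intercept = fit_line(xs, [math.log(y) for y in ys])
    a, b = math.exp(intercept), math.exp(slope)
    return lambda x: a * b ** x


def sum_squared_error(model, xs, ys):
    """Total squared vertical distance between the data and the model."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


xs = [0, 1, 2, 3, 4, 5]
ys = [3, 6, 12, 24, 48, 96]   # hypothetical data that doubles at each step

models = {"linear": fit_linear(xs, ys), "exponential": fit_exponential(xs, ys)}
errors = {name: sum_squared_error(f, xs, ys) for name, f in models.items()}
best = min(errors, key=errors.get)
print(errors)
print("best model:", best)
```

The quadratic, cubic, and power forms from the table could be added to the comparison in the same way, each with its own fitting function.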