Chapter Two

Descriptive Statistics

2.1 Describing the Shape of a Distribution

2.2 Describing Central Tendency

2.3 Measures of Variation

2.4 Percentiles, Quartiles, and Box-and-Whiskers Displays

2.5 Describing Qualitative Data

*2.6 Using Scatter Plots to Study the Relationship Between Variables

*2.7 Misleading Graphs and Charts

2.1 Stem and Leaf Display: Car Mileage

Example 2.1: The Car Mileage Case

Stem and Leaf Display: Payment Times

Example 2.2: The Accounts Receivable Case

Histograms

Example 2.4: The Accounts Receivable Case

Frequency Histogram

The Normal Curve

Relative Frequency Histogram

Skewness

Left Skewed

Symmetric

Right Skewed

Dot Plots

Scores on Exams 1 and 2

2.2 Population Parameters and Sample Statistics

A population parameter is a number calculated from all the population measurements that describes some aspect of the population.

The population mean, denoted μ, is a population parameter and is the average of the population measurements.

A point estimate is a one-number estimate of the value of a population parameter.

A sample statistic is a number calculated using sample measurements that describes some aspect of the sample.

Measures of Central Tendency

Mean, μ: The average or expected value

Median, Md: The middle point of the ordered measurements

Mode, Mo: The most frequent value

The Mean

Population: X1, X2, …, XN

Sample: x1, x2, …, xn

Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$

Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

The Sample Mean

The sample mean is defined as

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$

and is a point estimate of the population mean, μ.
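As a minimal sketch of this definition, the snippet below computes a sample mean in Python. The data are the internists' salaries listed in Section 2.3 of this chapter, treated here as a sample; the function name `sample_mean` is an illustrative choice, not something taken from the text.

```python
# Internists' salaries (in thousands of dollars), from the Section 2.3 range example.
salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]


def sample_mean(values):
    """Return the sum of the measurements divided by the number of measurements."""
    if not values:
        raise ValueError("sample_mean requires at least one measurement")
    return sum(values) / len(values)


x_bar = sample_mean(salaries)
print(x_bar)  # 160.0, i.e. $160,000
```

The same value serves as a point estimate of the population mean μ when the 13 salaries are regarded as a sample from a larger population.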

Example: Car Mileage Case

Example 2.5: Sample mean for the first five car mileages from Table 2.1

The Median

The population or sample median is a value such that 50% of all measurements lie above (or below) it.

The median Md is found as follows:

1. If the number of measurements is odd, the median is the middlemost measurement in the ordered values.

2. If the number of measurements is even, the median is the average of the two middlemost measurements in the ordered values.

Example: Sample Median

Example 2.6: Internists’ Salaries (× $1000)

Since n = 13 is odd, the median is the middlemost, or 7th, ordered measurement: Md = 152
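The odd/even rule for the median can be sketched as follows. This is an illustrative implementation, assuming the 13 internists' salaries listed in Section 2.3; the function name `median` is a hypothetical helper, and the even case is demonstrated by simply dropping the largest salary.

```python
def median(values):
    """Median by the two-case rule: middle value if n is odd,
    average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one measurement")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # middlemost measurement
    return (ordered[mid - 1] + ordered[mid]) / 2  # average of the two middlemost


# Internists' salaries (in thousands of dollars).
salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]

print(median(salaries))       # 152 (n = 13, the 7th ordered value)
print(median(salaries[:12]))  # 149.0 (n = 12, average of 146 and 152)
```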

The Mode

The mode, Mo, of a population or sample of measurements is the measurement that occurs most frequently.

Example: Sample Mode

Example 2.2: The Accounts Receivable Case

The value 16 occurs 9 times; therefore Mo = 16
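One way to find a mode programmatically is to count occurrences and keep the most frequent value or values. The sketch below uses a short made-up list, not the payment-time data from the Accounts Receivable Case (which is not reproduced in these slides); the helper name `modes` is likewise hypothetical. It returns a list because a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    if not values:
        raise ValueError("modes requires at least one measurement")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


# Hypothetical payment times in days, for illustration only.
payment_times = [16, 14, 16, 19, 16, 21, 14]

print(modes(payment_times))  # [16]
print(modes([1, 1, 2, 2]))   # [1, 2] (two modes)
```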

Relationships Among Mean, Median and Mode

2.3 Measures of Variation

Range

Largest minus the smallest measurement

Variance

The average of the squared deviations from the mean

Standard Deviation

The square root of the variance

The Range

Range = largest measurement - smallest measurement

Example:

Internists’ Salaries (in thousands of dollars)

127 132 138 141 144 146 152 154 165 171 177 192 241

Range = 241 - 127 = 114 ($114,000)

The Variance

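Population: X1, X2, …, XN

Sample: x1, x2, …, xn

Population Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$

Sample Variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$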

The Standard Deviation

Population Standard Deviation, σ: $\sigma = \sqrt{\sigma^2}$

Sample Standard Deviation, s: $s = \sqrt{s^2}$
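The range, variance, and standard deviation of this section can be computed side by side, as in the sketch below. It reuses the internists' salaries and, purely for comparison, treats the same 13 values first as a whole population (divide by N) and then as a sample (divide by n - 1); that dual treatment is an assumption made for illustration, not something the slides do.

```python
import math

# Internists' salaries (in thousands of dollars).
salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]

n = len(salaries)
data_range = max(salaries) - min(salaries)  # largest minus smallest

mean = sum(salaries) / n
sum_sq_dev = sum((x - mean) ** 2 for x in salaries)  # sum of squared deviations

pop_var = sum_sq_dev / n         # population variance: divide by N
samp_var = sum_sq_dev / (n - 1)  # sample variance: divide by n - 1

pop_sd = math.sqrt(pop_var)      # standard deviation is the square root of the variance
samp_sd = math.sqrt(samp_var)

print(data_range)                              # 114
print(round(pop_var, 2), round(pop_sd, 2))
print(round(samp_var, 2), round(samp_sd, 2))
```

Python's standard library offers `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev` for the same four quantities; the explicit version is shown only to mirror the definitions.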

Example: Population Variance/Standard Deviation

Population of annual returns for five junk bond mutual funds:

Example: Sample Variance/Standard Deviation

Example 2.11: Sample variance and standard deviation for the first five car mileages from Table 2.1

The Empirical Rule for Normal Populations

If a population has mean μ and standard deviation σ and is described by a normal curve, then

68.26% of the population measurements lie within one standard deviation of the mean: [μ - σ, μ + σ]

95.44% of the population measurements lie within two standard deviations of the mean: [μ - 2σ, μ + 2σ]

99.73% of the population measurements lie within three standard deviations of the mean: [μ - 3σ, μ + 3σ]
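These percentages can be checked numerically. For a normal population, the fraction within k standard deviations of the mean equals erf(k / √2), where erf is the error function; the sketch below evaluates it with Python's `math.erf`. The function name `normal_within` is an illustrative choice. The computed values round to 68.27%, 95.45%, and 99.73%; the slightly smaller figures of 68.26% and 95.44% quoted above most likely come from doubling four-decimal normal-table entries (0.3413 and 0.4772).

```python
import math


def normal_within(k):
    """Fraction of a normal population lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * normal_within(k):.2f}%")
```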

Example: The Empirical Rule

Example 2.13: The Car Mileage Case

Chebyshev’s Theorem

Let μ and σ be a population’s mean and standard deviation. Then, for any value k > 1,

at least 100(1 - 1/k²)% of the population measurements lie in the interval

[μ - kσ, μ + kσ]
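A short sketch can make the bound concrete. The code below evaluates Chebyshev's lower bound for a few values of k and then compares it with the fraction actually observed in the 13 internists' salaries from Section 2.3, treated here as a complete population (an assumption made only for this illustration). The helper name `chebyshev_bound` is hypothetical.

```python
import math


def chebyshev_bound(k):
    """Minimum percentage of measurements within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    return 100 * (1 - 1 / k**2)


for k in (1.5, 2, 3):
    print(f"k = {k}: at least {chebyshev_bound(k):.2f}%")

# Compare with real data: salaries in thousands of dollars, treated as a population.
salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]
mu = sum(salaries) / len(salaries)
sigma = math.sqrt(sum((x - mu) ** 2 for x in salaries) / len(salaries))

k = 2
inside = [x for x in salaries if mu - k * sigma <= x <= mu + k * sigma]
observed = 100 * len(inside) / len(salaries)
print(f"observed within {k} sigma: {observed:.1f}% (bound {chebyshev_bound(k):.1f}%)")
```

The theorem holds for any distribution shape, so the guaranteed percentage is usually far below what a particular data set shows.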

2.4 Percentiles and Quartiles

For a set of measurements arranged in increasing order, the pth percentile is a value such that p percent of the measurements fall at or below the value and (100-p) percent of the measurements fall at or above the value.

The first quartile, Q1, is the 25th percentile.

The second quartile (or median), Md, is the 50th percentile.

The third quartile, Q3, is the 75th percentile.

The interquartile range, IQR, is Q3 - Q1.
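Quartiles and the IQR can be sketched with Python's standard library. Note the assumption: `statistics.quantiles` (Python 3.8+) uses its default "exclusive" interpolation method here, and textbooks and software packages follow several different quartile conventions, so Q1 and Q3 may differ slightly from values obtained by a hand procedure. The data are the internists' salaries from Section 2.3, since the 20 customer satisfaction ratings are not reproduced in these slides.

```python
import statistics

# Internists' salaries (in thousands of dollars).
salaries = [127, 132, 138, 141, 144, 146, 152, 154, 165, 171, 177, 192, 241]

# n=4 splits the ordered data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(salaries, n=4)
iqr = q3 - q1  # interquartile range: spread of the middle half of the data

print("Q1 =", q1)
print("Md =", q2)
print("Q3 =", q3)
print("IQR =", iqr)
```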

Example: Quartiles

20 customer satisfaction ratings:

Box and Whiskers Plots

2.5 Describing Qualitative Data

Population and Sample Proportions

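Population: X1, X2, …, XN

Sample: x1, x2, …, xn

Population Proportion: $p = \frac{1}{N}\sum_{i=1}^{N} X_i$

Sample Proportion: $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i$

where each Xi or xi equals 1 if the characteristic is present and 0 if not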

Example: Sample Proportion

Example 2.16: Marketing Ethics Case

117 out of 205 marketing researchers disapproved of the action taken in a hypothetical scenario

X = 117, number of researchers who disapprove

n = 205, number of researchers surveyed

Sample Proportion: $\hat{p} = X/n = 117/205 \approx 0.57$
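The 0/1 coding behind a proportion can be illustrated with the counts from this example. In the sketch below, each surveyed researcher is represented by 1 (disapproves) or 0 (does not), so the sample proportion is simply the mean of the coded values; the variable names are illustrative.

```python
disapprove = 117  # researchers who disapproved
surveyed = 205    # researchers surveyed

# Code each response as 1 if the characteristic is present, 0 if not.
responses = [1] * disapprove + [0] * (surveyed - disapprove)

# The sample proportion is the mean of the 0/1 values.
p_hat = sum(responses) / len(responses)

print(round(p_hat, 4))  # 0.5707, i.e. about 57% disapproved
```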

Bar Chart

Percentage of Automobiles Sold by Manufacturer, 1970 versus 1997

Pie Chart

Percentage of Automobiles Sold by Manufacturer, 1997

Pareto Chart

Pareto Chart of Labeling Defects

2.6 Scatter Plots

Restaurant Ratings: Mean Preference vs. Mean Taste

2.7 Misleading Graphs and Charts: Scale Break

Mean Salaries at a Major University, 1999-2002

Misleading Graphs and Charts: Horizontal Scale Effects

Mean Salary Increases at a Major University, 1999-2002
