Lecture Notes for

Applied Business Statistics

A Training Program for BCBS

Professor Ahmadi, Ph.D.

Chapter 1

Glossary of Terms:

· Statistics

· Data

· Data Set

· Elements

· Variable

· Observations

· Sample and Population

· Descriptive Statistics

· Statistical Inference

· Qualitative and Quantitative Data

Scales of Measurement:

· Nominal Scale

· Ordinal Scale

· Interval Scale

· Ratio Scale

Chapter 2

Summarizing Quantitative Data

Problem 1. Daily earnings of a sample of twelve individuals are shown below:

100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199

Summarize the above data by constructing:

a. a frequency distribution

b. a cumulative frequency distribution

c. a relative frequency distribution

d. a cumulative relative frequency distribution

e. a histogram

f. an ogive

Class / Frequency / Cumulative Frequency / Relative Frequency / Cumulative Relative Frequency
100 - 119
120 - 139
140 - 159
160 - 179
180 - 199
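The worksheet columns above can be filled in with a short computation. The following is a minimal sketch, assuming the five classes listed in the table and treating both class limits as inclusive; the variable names are illustrative only.

```python
# Daily earnings from Problem 1 and the five classes from the worksheet table.
earnings = [100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199]
classes = [(100, 119), (120, 139), (140, 159), (160, 179), (180, 199)]

n = len(earnings)
rows = []
cumulative = 0
for low, high in classes:
    # Count the observations that fall inside this class (limits inclusive).
    freq = sum(1 for value in earnings if low <= value <= high)
    cumulative += freq
    rows.append({
        "class": f"{low} - {high}",
        "frequency": freq,
        "cumulative_frequency": cumulative,
        "relative_frequency": freq / n,
        "cumulative_relative_frequency": cumulative / n,
    })

for row in rows:
    print(
        f'{row["class"]:>9}  {row["frequency"]:>2}  {row["cumulative_frequency"]:>2}  '
        f'{row["relative_frequency"]:.4f}  {row["cumulative_relative_frequency"]:.4f}'
    )
```

The frequency column is what a histogram plots, and the cumulative columns are what an ogive plots against the upper class limits.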

DOT PLOT

Problem 2. In a recent campaign, many airlines reduced their summer fares in order to gain a larger share of the market. The following data represent the prices of round-trip tickets from Atlanta to Boston for a sample of nine airlines:

120 / 140 / 140
160 / 160 / 160
160 / 180 / 180

Construct a dot plot for the above data.

STEM-AND-LEAF DISPLAY

Problem 3. The test scores of 14 individuals on their first statistics examination are shown below:

95 87 52 43 77 84 78

75 63 92 81 83 91 88

a. Construct a stem-and-leaf display for these data.

b. What does the above stem-and-leaf show?
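One way to build the display in part a is sketched below. It assumes the tens digit is the stem and the units digit is the leaf, which suits two-digit scores; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Test scores from Problem 3.
scores = [95, 87, 52, 43, 77, 84, 78, 75, 63, 92, 81, 83, 91, 88]


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)


display = stem_and_leaf(scores)
for stem in range(min(display), max(display) + 1):
    leaves = " ".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because the leaves are sorted within each stem, the printed rows show both the shape of the distribution and the individual values.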


CROSSTABULATION

Problem 4. The following is a crosstabulation of starting salaries (in $1,000's) of a sample of business school graduates by their gender.

Starting Salary
Gender / Less than 30 / 30 up to 35 / 35 and more / Total
Female / 12 / 84 / 24 / 120
Male / 20 / 48 / 12 / 80
Total / 32 / 132 / 36 / 200

a. What general comments can be made about the distribution of starting salaries and the gender of the individuals in the sample?

b. Compute row percentages and comment on the relationship between starting salaries and gender.
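Row percentages divide each cell by its row total, so each gender's row sums to 100%. A minimal sketch, using the counts from the crosstabulation above; the dictionary layout and the helper name `row_percentages` are assumptions made for illustration.

```python
# Counts from the Problem 4 crosstabulation (salaries in $1,000's).
crosstab = {
    "Female": {"Less than 30": 12, "30 up to 35": 84, "35 and more": 24},
    "Male": {"Less than 30": 20, "30 up to 35": 48, "35 and more": 12},
}


def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row_label, counts in table.items():
        row_total = sum(counts.values())
        result[row_label] = {
            column: 100 * count / row_total for column, count in counts.items()
        }
    return result


percentages = row_percentages(crosstab)
for gender, row in percentages.items():
    print(gender, {column: round(value, 1) for column, value in row.items()})
```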

SCATTER DIAGRAM

Problem 5. The average grades of 8 students in Professor Ahmadi’s statistics class and the number of absences they had during the semester are shown below:

Student / Number of Absences (x) / Average Grade (y)
1 / 1 / 94
2 / 2 / 78
3 / 2 / 70
4 / 1 / 88
5 / 3 / 68
6 / 4 / 40
7 / 8 / 30
8 / 3 / 60

Develop a scatter diagram for the relationship between the number of absences (x) and their average grade (y).


Chapter 3 Formulas

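Ungrouped Data

Mean

Sample: $\bar{X} = \frac{\sum X_i}{n}$, where n = sample size
Population: $\mu = \frac{\sum X_i}{N}$, where N = size of population

Interquartile Range

IQR = Q3 - Q1 (the same for a sample and a population)

where: Q3 = third quartile (i.e., 75th percentile)

Q1 = first quartile (i.e., 25th percentile)

Variance

Sample: $s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$ or: $s^2 = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$ or: $\sigma^2 = \frac{\sum X_i^2 - N\mu^2}{N}$

Standard Deviation

Sample: $s = \sqrt{s^2}$
Population: $\sigma = \sqrt{\sigma^2}$

Coefficient of Variation (C.V.)

Sample: $\text{C.V.} = \frac{s}{\bar{X}} \times 100$
Population: $\text{C.V.} = \frac{\sigma}{\mu} \times 100$

Covariance

Sample: $s_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Population: $\sigma_{XY} = \frac{\sum (X_i - \mu_X)(Y_i - \mu_Y)}{N}$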


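Pearson Product Moment Correlation Coefficient

Sample: $r_{XY} = \frac{s_{XY}}{s_X s_Y}$
Population: $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$

where

$r_{XY}$ = sample correlation coefficient; $\rho_{XY}$ = population correlation coefficient

$s_{XY}$ = sample covariance; $\sigma_{XY}$ = population covariance

$s_X$ = sample standard deviation of X; $\sigma_X$ = population standard deviation of X

$s_Y$ = sample standard deviation of Y; $\sigma_Y$ = population standard deviation of Y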

Weighted Mean

$\bar{X}_w = \frac{\sum w_i X_i}{\sum w_i}$

where

Xi = data value i

wi = weight for data value i

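Grouped Data

Mean

Sample: $\bar{X} = \frac{\sum f_i M_i}{n}$
Population: $\mu = \frac{\sum f_i M_i}{N}$

where

fi = frequency of class i

Mi = midpoint of class i

Variance

Sample: $s^2 = \frac{\sum f_i (M_i - \bar{X})^2}{n - 1}$ or $s^2 = \frac{\sum f_i M_i^2 - n\bar{X}^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$ or $\sigma^2 = \frac{\sum f_i M_i^2 - N\mu^2}{N}$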

Chapter 3

Measures of Location & Dispersion (Ungrouped Data)

Problem 1. Hourly earnings (in dollars) of a sample of eight employees of Ahmadi, Inc. are shown below:

Individual / Earning (X)
1 / 12
2 / 15
3 / 15
4 / 17
5 / 18
6 / 19
7 / 22
8 / 26

I. Measures of location

a. Compute the mean, then explain and demonstrate its properties.

b. Determine the median and explain its properties.

c. Determine the 70th percentile.

d. Determine the 25th percentile.

e. Find the mode.


II. Compute the following measures of dispersion for the above data:

a. Range

b. Interquartile range

c. Variance & the Standard deviation

d.  Coefficient of variation

e.  A sample of Chatt, Inc. employees had a mean of $21 and a standard deviation of $5. Which company shows a more dispersed data distribution?

f. Use “Descriptive Statistics” in Excel and determine all the statistical measures.
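The measures requested in parts I and II can be cross-checked with Python's standard `statistics` module. The sketch below is illustrative only. Percentile conventions differ between textbooks and software, so the `percentile` helper is hypothetical and assumes the index rule i = (p/100)n: round up when i is not an integer, and average positions i and i + 1 when it is. Excel's built-in functions may use a different rule and return slightly different quartiles.

```python
import statistics

# Hourly earnings (X) of the eight employees in Problem 1.
earnings = [12, 15, 15, 17, 18, 19, 22, 26]


def percentile(values, p):
    """Return the p-th percentile (integer p, 0 < p < 100).

    Assumed convention: compute i = (p / 100) * n on the sorted data.
    If i is not an integer, round up and take that position;
    if i is an integer, average the values in positions i and i + 1.
    """
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:
        return ordered[whole]  # position whole + 1 in 1-based counting
    return (ordered[whole - 1] + ordered[whole]) / 2


mean = statistics.mean(earnings)
median = statistics.median(earnings)
mode = statistics.mode(earnings)
p70 = percentile(earnings, 70)
q1 = percentile(earnings, 25)
q3 = percentile(earnings, 75)

data_range = max(earnings) - min(earnings)
iqr = q3 - q1
variance = statistics.variance(earnings)  # sample variance, divisor n - 1
std_dev = statistics.stdev(earnings)      # sample standard deviation
cv = std_dev / mean * 100                 # coefficient of variation, in percent

# Chatt, Inc. figures quoted in part e (mean $21, standard deviation $5).
chatt_cv = 5 / 21 * 100

# One property of the mean: the deviations from it sum to zero.
deviation_sum = sum(x - mean for x in earnings)

print(mean, median, mode, p70, q1, data_range, iqr)
print(round(variance, 4), round(std_dev, 4), round(cv, 2), round(chatt_cv, 2))
```

Comparing `cv` with `chatt_cv` answers part e, since the coefficient of variation expresses dispersion relative to each company's own mean.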


Chapter 3

Five-Number Summary

Problem 2. The weights of 12 individuals who enrolled in a fitness program are shown below:

Individual / Weight (Pounds)
1 / 100
2 / 105
3 / 110
4 / 130
5 / 135
6 / 138
7 / 142
8 / 145
9 / 150
10 / 170
11 / 240
12 / 300

a. Provide a five-number summary for the data.

b. Show the box plot for the weight data.
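A five-number summary consists of the smallest value, Q1, the median, Q3, and the largest value. The sketch below assumes the same percentile index rule as a common textbook convention (round up a non-integer position; average two positions when the index is an integer) and the usual 1.5 × IQR limits for flagging outliers on a box plot. Both are assumptions, and the helper name `percentile` is hypothetical.

```python
# Weights (pounds) of the 12 individuals in Problem 2.
weights = [100, 105, 110, 130, 135, 138, 142, 145, 150, 170, 240, 300]


def percentile(values, p):
    """p-th percentile (integer p, 0 < p < 100) under the assumed index rule."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder:
        return ordered[whole]  # non-integer index: round up to the next position
    return (ordered[whole - 1] + ordered[whole]) / 2  # integer index: average


q1 = percentile(weights, 25)
median = percentile(weights, 50)
q3 = percentile(weights, 75)
five_number_summary = (min(weights), q1, median, q3, max(weights))

# Box plot limits: points beyond 1.5 * IQR from the quartiles are flagged.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [w for w in weights if w < lower_limit or w > upper_limit]

print(five_number_summary)
print(lower_limit, upper_limit, outliers)
```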


Chapter 3

Covariance & Coefficient of Correlation

Problem 3. The average grades of a sample of 8 students in Professor Ahmadi’s statistics class and the number of absences they had during the semester are shown below.

Student / Number of Absences (x) / Average Grade (y)
1 / 1 / 94
2 / 2 / 78
3 / 2 / 70
4 / 1 / 88
5 / 3 / 68
6 / 4 / 40
7 / 8 / 30
8 / 3 / 60
TOTAL / 24 / 528

a. Compute the sample covariance and interpret its meaning.

b. Compute the sample coefficient of correlation and interpret its meaning.
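The sample covariance and correlation coefficient can be computed directly from their definitions (divisor n - 1). A minimal sketch with illustrative variable names:

```python
from math import sqrt

# Data from Problem 3: x = number of absences, y = average grade.
absences = [1, 2, 2, 1, 3, 4, 8, 3]
grades = [94, 78, 70, 88, 68, 40, 30, 60]

n = len(absences)
mean_x = sum(absences) / n
mean_y = sum(grades) / n

# Sample covariance: sum of cross-products of deviations, divided by n - 1.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(absences, grades)) / (n - 1)

# Sample standard deviations of x and y.
s_x = sqrt(sum((x - mean_x) ** 2 for x in absences) / (n - 1))
s_y = sqrt(sum((y - mean_y) ** 2 for y in grades) / (n - 1))

# Pearson correlation coefficient: covariance scaled by both standard deviations.
r_xy = s_xy / (s_x * s_y)

print(round(s_xy, 4), round(s_x, 4), round(s_y, 4), round(r_xy, 4))
```

The sign of the covariance gives the direction of the linear relationship, while the correlation coefficient, which is unit-free and lies between -1 and +1, gives its strength.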


Chapter 3

Weighted Mean

Problem 4. The M&A Oil Company has purchased barrels of oil from several suppliers. The purchase price per barrel and the number of barrels purchased are shown below.

Supplier / Price Per Barrel ($) / Number of Barrels
A / 55 / 4,000
B / 49 / 3,000
C / 48 / 9,000
D / 50 / 20,000

Compute the weighted average price per barrel.
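Here the weights are the numbers of barrels, so each price counts in proportion to the volume bought at that price. A minimal sketch of the calculation, with illustrative variable names:

```python
# Problem 4 data: price per barrel ($) and number of barrels, by supplier.
purchases = {
    "A": (55, 4_000),
    "B": (49, 3_000),
    "C": (48, 9_000),
    "D": (50, 20_000),
}

total_cost = sum(price * barrels for price, barrels in purchases.values())
total_barrels = sum(barrels for _, barrels in purchases.values())

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean_price = total_cost / total_barrels

# For contrast, the simple (unweighted) mean ignores purchase volume.
simple_mean_price = sum(price for price, _ in purchases.values()) / len(purchases)

print(round(weighted_mean_price, 2), simple_mean_price)
```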


Chapter 3

Measures of Location & Dispersion (Grouped Data)

Problem 5. The yearly income distribution for a sample of 30 Ahmadi, Inc. employees is shown below.

Yearly Income (In $10,000) / Frequency (fi)
4 - 6 / 2
7 - 9 / 6
10 - 12 / 7
13 - 15 / 10
16 - 18 / 5
Total / n = 30

a. Compute the mean yearly income.

b.  Compute the variance and the standard deviation of the sample.

c.  A sample of Chatt, Inc. employees had a mean income of $132,000 with a standard deviation of $36,000. Which company shows a more dispersed income distribution?
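With grouped data, each class is represented by its midpoint Mi and weighted by its frequency fi. The sketch below assumes the midpoint is the average of the stated class limits and works in the table's units of $10,000; variable names are illustrative.

```python
from math import sqrt

# Problem 5 classes: (lower limit, upper limit, frequency), income in $10,000.
classes = [(4, 6, 2), (7, 9, 6), (10, 12, 7), (13, 15, 10), (16, 18, 5)]

midpoints = [(low + high) / 2 for low, high, _ in classes]
frequencies = [freq for _, _, freq in classes]
n = sum(frequencies)

# Grouped mean: sum of f_i * M_i divided by n.
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Grouped sample variance: sum of f_i * (M_i - mean)^2 divided by n - 1.
variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / (n - 1)
std_dev = sqrt(variance)

# Coefficients of variation (percent) for Ahmadi, Inc. and for Chatt, Inc.
cv_ahmadi = std_dev / mean * 100
cv_chatt = 36_000 / 132_000 * 100

print(mean * 10_000, round(std_dev * 10_000, 2))
print(round(cv_ahmadi, 2), round(cv_chatt, 2))
```

Because the two companies have different mean incomes, part c is best answered by comparing the coefficients of variation rather than the raw standard deviations.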


Chapter 4 Formulas

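Counting Rule for Multiple-step Experiments:

Total number of outcomes = $(n_1)(n_2) \cdots (n_k)$, where $n_j$ is the number of possible outcomes on step j of a k-step experiment

The number of Combinations of N objects taken n at a time:

$C^N_n = \binom{N}{n} = \frac{N!}{n!(N - n)!}$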

Sum of the probability of Event A and its Complement: $P(A) + P(A^c) = 1.0$

Addition Law (the probability of the union of two events):

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Multiplication Law (the probability of the intersection of two events):

$P(A \cap B) = P(A) P(B|A)$ or $P(A \cap B) = P(B) P(A|B)$

Two Events A and B are Independent if:

P(A|B) = P(A) or P(B|A) = P(B)

Multiplication Law for Independent Events: $P(A \cap B) = P(A) P(B)$

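Conditional Probability:

$P(A|B) = \frac{P(A \cap B)}{P(B)}$ or $P(B|A) = \frac{P(A \cap B)}{P(A)}$

Bayes' Theorem in General:

$P(A_i|B) = \frac{P(A_i) P(B|A_i)}{P(A_1) P(B|A_1) + P(A_2) P(B|A_2) + \cdots + P(A_n) P(B|A_n)}$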

Summary of Bayes' Theorem Calculations:

Event / Prior Probabilities $P(A_i)$ / Conditional Probabilities $P(B|A_i)$ / Joint Probabilities $P(A_i \cap B)$ / Posterior Probabilities $P(A_i|B)$


Chapter 4 - Basic Probability Concepts

Problem 1. Assume you have applied to two different universities (let's refer to them as universities A and B) for your graduate work. In the past, 25% of students with credentials similar to yours who applied to university A were accepted, while university B accepted 35% of such applicants. (Assume the two admission decisions are independent of each other.)

a. What is the probability that you will be accepted by both universities?

b. What is the probability that you will be accepted to at least one graduate program?

c. What is the probability that one and only one of the universities will accept you?

d. What is the probability that neither university will accept you?
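Under the stated independence assumption, all four parts follow from the multiplication law, the addition law, and the complement rule. A minimal sketch with illustrative variable names:

```python
# Problem 1: acceptance probabilities, assumed independent as stated.
p_a = 0.25  # P(accepted by university A)
p_b = 0.35  # P(accepted by university B)

# a. Both: multiplication law for independent events.
p_both = p_a * p_b

# b. At least one: addition law.
p_at_least_one = p_a + p_b - p_both

# c. Exactly one: (A and not B) or (not A and B).
p_exactly_one = p_a * (1 - p_b) + (1 - p_a) * p_b

# d. Neither: complement of "at least one".
p_neither = 1 - p_at_least_one

print(round(p_both, 4), round(p_at_least_one, 4),
      round(p_exactly_one, 4), round(p_neither, 4))
```

As a consistency check, the probabilities of "both", "exactly one", and "neither" cover every possible outcome and therefore sum to 1.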

Problem 2. An individual has applied to two different insurance companies for health insurance coverage. The probability that company A will approve her application is 0.63, and the probability that company B will approve her application is 0.55. The probability that both companies will approve her application is 0.3465.

a. What is the probability that company A will approve her application, given that company B has approved her application?

b. Are the approval outcomes independent events? Explain, and substantiate your answer using probability concepts.

c. Are the approval outcomes mutually exclusive? Explain, and substantiate your answer using probability concepts.

d. What is the probability that her application will be approved by at least one of the companies?
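The checks in this problem reduce to comparing a few probabilities. The sketch below is illustrative; it uses `math.isclose` for the independence test because floating-point products are rarely exactly equal.

```python
from math import isclose

# Problem 2: approval probabilities for companies A and B.
p_a = 0.63
p_b = 0.55
p_a_and_b = 0.3465

# Conditional probability of A given B.
p_a_given_b = p_a_and_b / p_b

# Independence holds if P(A and B) = P(A) * P(B), equivalently P(A|B) = P(A).
independent = isclose(p_a_and_b, p_a * p_b)

# Mutually exclusive events cannot occur together, i.e. P(A and B) = 0.
mutually_exclusive = p_a_and_b == 0

# Addition law for the probability of approval by at least one company.
p_a_or_b = p_a + p_b - p_a_and_b

print(round(p_a_given_b, 4), independent, mutually_exclusive, round(p_a_or_b, 4))
```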


Chapter 4 - Conditional Probability

Problem 3. A research study investigating the relationship between smoking and heart disease in a sample of 500 individuals provided the following data:

Heart Disease Record / Smoker / Nonsmoker / Total
Record of Heart Disease / 50 / 40 / 90
No Record of Heart Disease / 100 / 310 / 410
Total / 150 / 350 / 500

a. Show the joint probability table.

b. What is the probability that an individual is a smoker and has a record of heart disease?

c. Compute and interpret the marginal probabilities.

d. Given that an individual is a smoker, what is the probability that this individual has heart disease?

e. Given that an individual is a nonsmoker, what is the probability that this individual has heart disease?

f. Does the research show that heart disease and smoking are independent events? Use probabilities to justify your answer.

g. What conclusion would you draw about the relationship between smoking and heart disease?
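A joint probability table divides every cell count by the sample size; marginal and conditional probabilities then follow from sums and ratios of those cells. A minimal sketch, with the dictionary keys chosen here purely for illustration:

```python
from math import isclose

# Problem 3 counts, keyed by (smoking status, heart disease record).
counts = {
    ("Smoker", "Record"): 50,
    ("Nonsmoker", "Record"): 40,
    ("Smoker", "No record"): 100,
    ("Nonsmoker", "No record"): 310,
}
total = sum(counts.values())

# a. Joint probabilities: each cell divided by the sample size.
joint = {cell: count / total for cell, count in counts.items()}

# c. Marginal probabilities: sum the joint probabilities over the other variable.
p_smoker = joint[("Smoker", "Record")] + joint[("Smoker", "No record")]
p_nonsmoker = joint[("Nonsmoker", "Record")] + joint[("Nonsmoker", "No record")]
p_record = joint[("Smoker", "Record")] + joint[("Nonsmoker", "Record")]

# d, e. Conditional probabilities of a heart disease record.
p_record_given_smoker = joint[("Smoker", "Record")] / p_smoker
p_record_given_nonsmoker = joint[("Nonsmoker", "Record")] / p_nonsmoker

# f. Independence would require P(record | smoker) = P(record).
independent = isclose(p_record_given_smoker, p_record)

print(joint)
print(round(p_smoker, 2), round(p_record, 2))
print(round(p_record_given_smoker, 4), round(p_record_given_nonsmoker, 4), independent)
```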


Chapter 4

BAYES' THEOREM

Problem 4. When Ahmadi, Inc. sets up its drill press machine, it is set up correctly 70% of the time. It is known that if the machine is set up correctly, it produces 90% acceptable parts. On the other hand, when the machine is set up incorrectly, it produces 20% acceptable parts. One item from production is selected and is observed to be acceptable.

a. What is the probability that the machine is set up correctly? That is, we are interested in computing:

P(Correct set up | Acceptable part).

Let the following symbols represent the various events:

E1 = Correct set up

E2 = Incorrect set up

G = Good part (i.e., Acceptable part)

With the above notations we want to determine P(E1 | G).

b. Compute all the posterior probabilities.
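The tabular approach to Bayes' theorem (prior, conditional, joint, posterior) maps directly onto a few lines of code. A minimal sketch using the event symbols defined above; the dictionary names are illustrative.

```python
# Problem 4: E1 = correct setup, E2 = incorrect setup, G = good part.
priors = {"E1": 0.70, "E2": 0.30}       # P(Ei)
conditionals = {"E1": 0.90, "E2": 0.20}  # P(G | Ei)

# Joint probabilities P(Ei and G) = P(Ei) * P(G | Ei).
joints = {event: priors[event] * conditionals[event] for event in priors}

# Total probability of a good part: the sum of the joint probabilities.
p_good = sum(joints.values())

# Posterior probabilities P(Ei | G) = P(Ei and G) / P(G).
posteriors = {event: joints[event] / p_good for event in priors}

for event in priors:
    print(event, priors[event], conditionals[event],
          round(joints[event], 4), round(posteriors[event], 4))
```

Observing a good part raises the probability of a correct setup above its prior of 0.70, because good parts are far more likely under a correct setup.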


Chapter 5 Formulas

Required Conditions for a Discrete Probability Function

$f(x) \ge 0$

$\sum f(x) = 1$

Discrete Uniform Probability Function

f(x) = 1/n

where n = the number of values the random variable may assume

Expected Value of a Discrete Random Variable

$E(x) = \mu = \sum x f(x)$

Variance of a Discrete Random Variable

$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$

Number of Experimental Outcomes Providing Exactly x Successes in n Trials

$\binom{n}{x} = \frac{n!}{x!(n - x)!}$

where n! = n (n - 1) (n - 2) . . . (2)(1) (Remember: 0! = 1)

Binomial Probability Function

$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$ where x = 0, 1, 2, ..., n

The Mean of a Binomial Distribution

µ = n p

The Variance of a Binomial Distribution

$\sigma^2 = n p (1 - p)$


Chapter 5

Discrete Probability Distributions

Problem 1. The manager of the university bookstore has kept records of the number of diskettes sold per day. She provided the following information regarding diskette sales for a period of 60 days:

Number of Diskettes Sold / Number of Days
0 / 6
1 / 9
2 / 12
3 / 18
4 / 12
5 / 3

a. Identify the random variable.

b. Is the random variable discrete or continuous?

c. Develop a probability distribution for the above data.

d. Is the above a proper probability distribution?

e. Develop a cumulative probability distribution.

f. Determine the expected number of daily sales of diskettes.

g. Determine the variance and the standard deviation.

h. If each diskette yields a net profit of 50 cents, what are the expected yearly profits from the sales of diskettes?
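The probability distribution is obtained by dividing each day count by the 60 days observed; the expected value and variance then follow from their definitions. The sketch below uses exact fractions to avoid rounding. For part h it assumes 365 selling days per year, a figure not given in the problem; the name `days_per_year` is hypothetical and should be changed if the bookstore is open fewer days.

```python
from fractions import Fraction
from math import sqrt

# Problem 1: number of diskettes sold per day -> number of days observed.
days_by_sales = {0: 6, 1: 9, 2: 12, 3: 18, 4: 12, 5: 3}
total_days = sum(days_by_sales.values())

# c. Probability function f(x) = (days with x sales) / (total days).
f = {x: Fraction(days, total_days) for x, days in days_by_sales.items()}

# d. A proper distribution has f(x) >= 0 for all x and sums to 1.
is_proper = all(p >= 0 for p in f.values()) and sum(f.values()) == 1

# e. Cumulative distribution: running total of f(x).
cumulative = {}
running = Fraction(0)
for x in sorted(f):
    running += f[x]
    cumulative[x] = running

# f, g. Expected value, variance, and standard deviation.
expected = sum(x * p for x, p in f.items())
variance = sum((x - expected) ** 2 * p for x, p in f.items())
std_dev = sqrt(variance)

# h. Expected yearly profit at $0.50 per diskette.
days_per_year = 365  # assumption: not stated in the problem
yearly_profit = float(expected) * 0.50 * days_per_year

print(float(expected), float(variance), round(std_dev, 4), yearly_profit)
```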


Chapter 5

Introduction to Binomial Distribution

Problem 2. A production process has been producing 10% defective items. A random sample of four items is selected from the production process.

a. What is the probability that the first 3 selected items are non-defective and the last item is defective?

b. If a sample of 4 items is selected, how many outcomes contain exactly 3 non-defective items?

c. What is the probability that a random sample of 4 contains exactly 3 non-defective items?

d. Determine the probability distribution for the number of non-defective items in a sample of four.

e.  Determine the expected number (mean) of non-defectives in a sample of four.

f.  Find the standard deviation for the number of non-defectives.
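Treating a non-defective item as a "success" with p = 0.90, the parts of this problem follow from the binomial probability function with n = 4. A minimal sketch; the function name `binomial_pmf` is illustrative.

```python
from math import comb, sqrt

n = 4    # sample size
p = 0.9  # probability that an item is non-defective (10% defective)


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# a. One specific sequence: three non-defective items, then one defective.
p_sequence = p ** 3 * (1 - p)

# b. Number of outcomes with exactly 3 non-defective items among 4.
outcomes_with_3 = comb(n, 3)

# c. Probability of exactly 3 non-defective items (any order).
p_exactly_3 = binomial_pmf(3, n, p)

# d. Full probability distribution for x = 0, 1, ..., 4.
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# e, f. Mean and standard deviation of a binomial distribution.
mean = n * p
std_dev = sqrt(n * p * (1 - p))

for x, prob in distribution.items():
    print(x, round(prob, 4))
print(round(mean, 2), round(std_dev, 2))
```

Part c equals part b times part a, since each of the equally likely orderings has the same probability as the single sequence in part a.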