Lecture Notes for
Applied Business Statistics
A Training Program for BCBS
Professor Ahmadi, Ph.D.
Chapter 1
Glossary of Terms:
· Statistics
· Data
· Data Set
· Elements
· Variable
· Observations
· Sample and Population
· Descriptive Statistics
· Statistical Inference
· Qualitative and Quantitative Data
Scales of Measurement:
· Nominal Scale
· Ordinal Scale
· Interval Scale
· Ratio Scale
Chapter 2
Summarizing Quantitative Data
Problem 1. Daily earnings of a sample of twelve individuals are shown below:
100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199
Summarize the above data by constructing:
a. a frequency distribution
b. a cumulative frequency distribution
c. a relative frequency distribution
d. a cumulative relative frequency distribution
e. a histogram
f. an ogive
Class / Frequency / Cumulative Frequency / Relative Frequency / Cumulative Relative Frequency
100 - 119
120 - 139
140 - 159
160 - 179
180 - 199
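The table above can be filled in by counting how many observations fall in each class. The following is a minimal Python sketch, assuming the class limits are inclusive at both ends; the variable names are illustrative and not part of the notes.

```python
# Daily earnings from Problem 1 and the class limits from the worksheet table.
earnings = [100, 126, 138, 142, 148, 150, 168, 182, 191, 193, 195, 199]
classes = [(100, 119), (120, 139), (140, 159), (160, 179), (180, 199)]

n = len(earnings)
rows = []
cumulative = 0
for lower, upper in classes:
    # Count the observations inside this class (both limits inclusive).
    freq = sum(lower <= x <= upper for x in earnings)
    cumulative += freq
    rows.append((f"{lower} - {upper}", freq, cumulative, freq / n, cumulative / n))

print("Class / Frequency / Cumulative Frequency / "
      "Relative Frequency / Cumulative Relative Frequency")
for label, freq, cum, rel, cum_rel in rows:
    print(f"{label} / {freq} / {cum} / {rel:.4f} / {cum_rel:.4f}")
```

The frequency column is what a histogram would plot, and the cumulative columns are what an ogive would plot.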
DOT PLOT
Problem 2. In a recent campaign, many airlines reduced their summer fares in order to gain a larger share of the market. The following data represent the prices of round-trip tickets from Atlanta to Boston for a sample of nine airlines:
120 / 140 / 140
160 / 160 / 160
160 / 180 / 180
Construct a dot plot for the above data.
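As a rough illustration, a text-only dot plot can be produced by counting how often each price occurs. This sketch prints the plot sideways (one row per price, one mark per airline), which is only an approximation of the usual drawing with dots stacked above a horizontal axis.

```python
from collections import Counter

# Round-trip ticket prices for the nine airlines in Problem 2.
prices = [120, 140, 140, 160, 160, 160, 160, 180, 180]

counts = Counter(prices)
lines = []
for price in sorted(counts):
    # One "*" per airline charging this price.
    lines.append(f"{price} | " + "* " * counts[price])

print("\n".join(lines))
```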
STEM-AND-LEAF DISPLAY
Problem 3. The test scores of 14 individuals on their first statistics examination are shown below:
95 87 52 43 77 84 78
75 63 92 81 83 91 88
a. Construct a stem-and-leaf display for these data.
b. What does the above stem-and-leaf show?
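One way to check a hand-drawn display is to split each score into a stem (tens digit) and a leaf (units digit). The sketch below assumes that single-digit-leaf convention; the names are illustrative.

```python
from collections import defaultdict

# Test scores from Problem 3.
scores = [95, 87, 52, 43, 77, 84, 78, 75, 63, 92, 81, 83, 91, 88]

stems = defaultdict(list)
for score in sorted(scores):
    stem, leaf = divmod(score, 10)  # e.g. 87 -> stem 8, leaf 7
    stems[stem].append(leaf)

# Print every stem in range, including any stem with no leaves.
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because the scores are sorted first, the leaves on each stem appear in rank order, which makes the shape of the distribution easier to read.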
CROSSTABULATION
Problem 4. The following is a crosstabulation of starting salaries (in $1,000's) of a sample of business school graduates by their gender.
Starting Salary (in $1,000's)
Gender / Less than 30 / 30 up to 35 / 35 and more / Total
Female / 12 / 84 / 24 / 120
Male / 20 / 48 / 12 / 80
Total / 32 / 132 / 36 / 200
a. What general comments can be made about the distribution of starting salaries and the gender of the individuals in the sample?
b. Compute row percentages and comment on the relationship between starting salaries and gender.
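Row percentages divide each cell by its row total, so the two genders can be compared even though the sample sizes differ. A minimal sketch, with illustrative names:

```python
# Cell counts from the crosstabulation in Problem 4 (row totals excluded).
columns = ["Less than 30", "30 up to 35", "35 and more"]
table = {
    "Female": [12, 84, 24],
    "Male": [20, 48, 12],
}

row_pct = {}
for gender, counts in table.items():
    row_total = sum(counts)
    # Each cell as a percentage of its own row total.
    row_pct[gender] = [100 * c / row_total for c in counts]

for gender, pcts in row_pct.items():
    cells = " / ".join(f"{p:.1f}%" for p in pcts)
    print(f"{gender} / {cells}")
```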
SCATTER DIAGRAM
Problem 5. The average grades of 8 students in Professor Ahmadi’s statistics class and the number of absences they had during the semester are shown below:
Student / Number of Absences (x) / Average Grade (y)
1 / 1 / 94
2 / 2 / 78
3 / 2 / 70
4 / 1 / 88
5 / 3 / 68
6 / 4 / 40
7 / 8 / 30
8 / 3 / 60
Develop a scatter diagram for the relationship between the number of absences (x) and their average grade (y).
Chapter 3 Formulas
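Mean
Sample: $\bar{X} = \frac{\sum X_i}{n}$, where n = sample size
Population: $\mu = \frac{\sum X_i}{N}$, where N = size of population
Interquartile Range
Sample: IQR = Q3 - Q1
Population: same as for sample
where: Q3 = third quartile (i.e., 75th percentile)
Q1 = first quartile (i.e., 25th percentile)
Variance
Sample: $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$ or: $S^2 = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$ or: $\sigma^2 = \frac{\sum X_i^2 - N\mu^2}{N}$
Standard Deviation
Sample: $S = \sqrt{S^2}$
Population: $\sigma = \sqrt{\sigma^2}$
Coefficient of Variation (C.V.)
Sample: $\text{C.V.} = \frac{S}{\bar{X}} \times 100$
Population: $\text{C.V.} = \frac{\sigma}{\mu} \times 100$
Covariance
Sample: $S_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Population: $\sigma_{XY} = \frac{\sum (X_i - \mu_X)(Y_i - \mu_Y)}{N}$
Pearson Product Moment Correlation Coefficient
Sample: $r_{XY} = \frac{S_{XY}}{S_X S_Y}$
where
$r_{XY}$ = Sample correlation coefficient
$S_{XY}$ = Sample covariance
$S_X$ = Sample standard deviation of X
$S_Y$ = Sample standard deviation of Y
Population: $\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$
where
$\rho_{XY}$ = Population correlation coefficient
$\sigma_{XY}$ = Population covariance
$\sigma_X$ = Population standard deviation of X
$\sigma_Y$ = Population standard deviation of Y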
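Weighted Mean
$\bar{X} = \frac{\sum w_i X_i}{\sum w_i}$
where
$X_i$ = data value i
$w_i$ = weight for data value i
Grouped Data
Mean
Sample: $\bar{X} = \frac{\sum f_i M_i}{n}$
Population: $\mu = \frac{\sum f_i M_i}{N}$
where
$f_i$ = frequency of class i
$M_i$ = midpoint of class i
Variance
Sample: $S^2 = \frac{\sum f_i (M_i - \bar{X})^2}{n - 1}$ or $S^2 = \frac{\sum f_i M_i^2 - n\bar{X}^2}{n - 1}$
Population: $\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$ or $\sigma^2 = \frac{\sum f_i M_i^2 - N\mu^2}{N}$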
Chapter 3
Measures of Location & Dispersion (Ungrouped Data)
Problem 1. Hourly earnings (in dollars) of a sample of eight employees of Ahmadi, Inc. are shown below:
Individual / Earning (X)
1 / 12
2 / 15
3 / 15
4 / 17
5 / 18
6 / 19
7 / 22
8 / 26
I. Measures of location
a. Compute the mean, and explain and demonstrate its properties.
b. Determine the median and explain its properties.
c. Determine the 70th percentile.
d. Determine the 25th percentile.
e. Find the mode.
II. Compute the following measures of dispersion for the above data:
a. Range
b. Interquartile range
c. Variance & the Standard deviation
d. Coefficient of variation
e. A sample of Chatt, Inc. employees had a mean of $21 and a standard deviation of $5. Which company shows a more dispersed data distribution?
f. Use “Descriptive Statistics” in Excel and determine all the statistical measures.
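For readers without Excel, several of these measures can be cross-checked with Python's standard `statistics` module. This is only a sketch: percentiles and the interquartile range are left out because their values depend on the location rule used, and the notes do not fix one here.

```python
import statistics

# Hourly earnings (dollars) of the eight Ahmadi, Inc. employees.
earnings = [12, 15, 15, 17, 18, 19, 22, 26]

mean = statistics.mean(earnings)
median = statistics.median(earnings)
mode = statistics.mode(earnings)
data_range = max(earnings) - min(earnings)
variance = statistics.variance(earnings)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(earnings)      # sample standard deviation
cv = std_dev / mean * 100                 # coefficient of variation, in percent

# Part II(e): Chatt, Inc. sample with mean $21 and standard deviation $5.
chatt_cv = 5 / 21 * 100

print(f"mean={mean}, median={median}, mode={mode}, range={data_range}")
print(f"variance={variance:.4f}, std dev={std_dev:.4f}, C.V.={cv:.2f}%")
print(f"Chatt C.V.={chatt_cv:.2f}%")
```

Comparing the two coefficients of variation, rather than the two standard deviations, accounts for the different means of the two samples.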
Chapter 3
Five-Number Summary
Problem 2. The weights of 12 individuals who enrolled in a fitness program are shown below:
Individual Weight (Pounds)
1 100
2 105
3 110
4 130
5 135
6 138
7 142
8 145
9 150
10 170
11 240
12 300
a. Provide a five-number summary for the data.
b. Show the box plot for the weight data.
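The quartiles in a five-number summary depend on the percentile rule used. The sketch below assumes the location rule i = (p/100)n that is common in introductory texts (round up when i is not an integer; average positions i and i + 1 when it is). The notes do not state which rule is intended, so treat `percentile` as a hypothetical helper.

```python
def percentile(sorted_data, p):
    """p-th percentile (integer 0 < p < 100) under the assumed rule i = (p/100) * n."""
    n = len(sorted_data)
    position, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up; 0-based index `position` is that value.
        return sorted_data[position]
    # i is an integer: average the values in positions i and i + 1 (1-based).
    return (sorted_data[position - 1] + sorted_data[position]) / 2

# Weights (pounds) of the 12 individuals in Problem 2, already in rank order.
weights = [100, 105, 110, 130, 135, 138, 142, 145, 150, 170, 240, 300]

five_number = (
    min(weights),
    percentile(weights, 25),  # Q1
    percentile(weights, 50),  # median
    percentile(weights, 75),  # Q3
    max(weights),
)
print(five_number)
```

Other software may report slightly different quartiles because it interpolates between positions instead of averaging them.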
Chapter 3
Covariance & Coefficient of Correlation
Problem 3. The average grades of a sample of 8 students in Professor Ahmadi’s statistics class and the number of absences they had during the semester are shown below.
Student / Number of Absences (x) / Average Grade (y)
1 / 1 / 94
2 / 2 / 78
3 / 2 / 70
4 / 1 / 88
5 / 3 / 68
6 / 4 / 40
7 / 8 / 30
8 / 3 / 60
TOTAL / 24 / 528
a. Compute the sample covariance and interpret its meaning.
b. Compute the sample coefficient of correlation and interpret its meaning.
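The sample covariance and correlation can be computed directly from their definitions (deviations from the means, divided by n - 1). A minimal sketch with illustrative names:

```python
import math

# Absences (x) and average grades (y) for the eight students.
absences = [1, 2, 2, 1, 3, 4, 8, 3]
grades = [94, 78, 70, 88, 68, 40, 30, 60]

n = len(absences)
mean_x = sum(absences) / n
mean_y = sum(grades) / n

# Sample covariance: sum of cross-products of deviations over n - 1.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(absences, grades)) / (n - 1)

# Sample standard deviations of x and y.
s_x = math.sqrt(sum((x - mean_x) ** 2 for x in absences) / (n - 1))
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in grades) / (n - 1))

# Sample correlation coefficient.
r_xy = s_xy / (s_x * s_y)

print(f"covariance = {s_xy:.4f}, correlation = {r_xy:.4f}")
```

A negative covariance indicates that grades tend to fall as absences rise; the correlation rescales it to the range -1 to +1 so its strength can be judged.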
Chapter 3
Weighted Mean
Problem 4. The M&A Oil Company has purchased barrels of oil from several suppliers. The purchase price per barrel and the number of barrels purchased are shown below.
Supplier / Price Per Barrel ($) / Number of Barrels
A / 55 / 4,000
B / 49 / 3,000
C / 48 / 9,000
D / 50 / 20,000
Compute the weighted average price per barrel.
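Here the number of barrels acts as the weight on each supplier's price. A short sketch of the weighted mean, with illustrative names:

```python
# (price per barrel in dollars, number of barrels) for suppliers A through D.
purchases = {
    "A": (55, 4_000),
    "B": (49, 3_000),
    "C": (48, 9_000),
    "D": (50, 20_000),
}

total_cost = sum(price * barrels for price, barrels in purchases.values())
total_barrels = sum(barrels for _, barrels in purchases.values())

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean_price = total_cost / total_barrels

print(f"Weighted average price per barrel: ${weighted_mean_price:.2f}")
```

The simple average of the four prices would ignore that supplier D accounts for more than half of the barrels.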
Chapter 3
Measures of Location & Dispersion (Grouped Data)
Problem 5. The yearly income distribution for a sample of 30 Ahmadi, Inc. employees is shown below.
Yearly Income Frequency
(In $10,000) fi
4 - 6 2
7 - 9 6
10 - 12 7
13 - 15 10
16 - 18 5
Totals n = 30
a. Compute the mean yearly income.
b. Compute the variance and the standard deviation of the sample.
c. A sample of Chatt, Inc. employees had a mean income of $132,000 with a standard deviation of $36,000. Which company shows a more dispersed income distribution?
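With grouped data, each class is represented by its midpoint. The sketch below assumes the midpoint is the average of the stated class limits and works in units of $10,000, as in the table.

```python
import math

# (lower limit, upper limit, frequency), income in units of $10,000.
classes = [(4, 6, 2), (7, 9, 6), (10, 12, 7), (13, 15, 10), (16, 18, 5)]

n = sum(f for _, _, f in classes)
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
freqs = [f for _, _, f in classes]

# Grouped mean: sum of f_i * M_i divided by n.
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Grouped sample variance: sum of f_i * (M_i - mean)^2 divided by n - 1.
variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / (n - 1)
std_dev = math.sqrt(variance)

# Part (c): compare relative dispersion using the coefficient of variation.
ahmadi_cv = std_dev / mean * 100
chatt_cv = 36_000 / 132_000 * 100

print(f"mean = {mean} (x $10,000), variance = {variance:.4f}, std dev = {std_dev:.4f}")
print(f"Ahmadi C.V. = {ahmadi_cv:.2f}%, Chatt C.V. = {chatt_cv:.2f}%")
```

The coefficient of variation is unit-free, so the Ahmadi figures can stay in units of $10,000 while the Chatt figures are in dollars.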
Chapter 4 Formulas
Counting Rule for Multiple-step Experiments:
Total number of outcomes = $(n_1)(n_2) \cdots (n_k)$, where $n_j$ = number of possible outcomes on step j
The number of Combinations of N objects taken n at a time:
$C_n^N = \binom{N}{n} = \frac{N!}{n!\,(N - n)!}$
Sum of the probability of Event A and its Complement: $P(A) + P(A^c) = 1.0$
Addition Law (the probability of the union of two events):
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Multiplication Law (the probability of the intersection of two events):
$P(A \cap B) = P(A)\,P(B|A)$ or $P(A \cap B) = P(B)\,P(A|B)$
Two Events A and B are Independent if:
$P(A|B) = P(A)$ or $P(B|A) = P(B)$
Multiplication Law for Independent Events: $P(A \cap B) = P(A)\,P(B)$
Conditional Probability:
$P(A|B) = \frac{P(A \cap B)}{P(B)}$ or $P(B|A) = \frac{P(A \cap B)}{P(A)}$
Bayes' Theorem in General:
$P(A_i|B) = \frac{P(A_i)\,P(B|A_i)}{P(A_1)\,P(B|A_1) + P(A_2)\,P(B|A_2) + \cdots + P(A_n)\,P(B|A_n)}$
Summary of Bayes' Theorem Calculations:
Event / Prior Probabilities $P(A_i)$ / Conditional Probabilities $P(B|A_i)$ / Joint Probabilities $P(A_i \cap B)$ / Posterior Probabilities $P(A_i|B)$
Chapter 4 - Basic Probability Concepts
Problem 1. Assume you have applied to two different universities (let's refer to them as universities A and B) for your graduate work. In the past, 25% of students (with credentials similar to yours) who applied to university A were accepted, while university B accepted 35% of the applicants. Assume the two acceptance decisions are independent of each other.
a. What is the probability that you will be accepted by both universities?
b. What is the probability that you will be accepted to at least one graduate program?
c. What is the probability that one and only one of the universities will accept you?
d. What is the probability that neither university will accept you?
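Because the two decisions are assumed independent, each part reduces to products of the individual probabilities and their complements. A minimal sketch, with illustrative variable names:

```python
p_a = 0.25  # P(accepted by university A)
p_b = 0.35  # P(accepted by university B)

# Independence lets us multiply the individual probabilities.
p_both = p_a * p_b
p_at_least_one = p_a + p_b - p_both            # addition law
p_exactly_one = p_a * (1 - p_b) + (1 - p_a) * p_b
p_neither = (1 - p_a) * (1 - p_b)              # complement of "at least one"

print(f"both = {p_both:.4f}")
print(f"at least one = {p_at_least_one:.4f}")
print(f"exactly one = {p_exactly_one:.4f}")
print(f"neither = {p_neither:.4f}")
```

As a check, "both", "exactly one", and "neither" are mutually exclusive and exhaustive, so their probabilities should sum to 1.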
Problem 2. An individual has applied to two different insurance companies for health insurance coverage. The probability that company A will approve her application is 0.63, and the probability that company B will approve her application is 0.55. The probability that both companies will approve her application is 0.3465.
a. What is the probability that company A will approve her application, given that company B has approved her application?
b. Are the approval outcomes independent events? Explain, and use probability concepts to substantiate your answer.
c. Are the approval outcomes mutually exclusive? Explain, and use probability concepts to substantiate your answer.
d. What is the probability that her application will be approved by at least one of the companies?
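The checks asked for in this problem follow directly from the conditional probability, independence, and addition rules. A short sketch using the stated probabilities (names are illustrative; a tolerance is used because of floating-point rounding):

```python
import math

p_a = 0.63      # P(company A approves)
p_b = 0.55      # P(company B approves)
p_ab = 0.3465   # P(both approve)

# Conditional probability: P(A | B) = P(A and B) / P(B).
p_a_given_b = p_ab / p_b

# Independent if P(A and B) = P(A) * P(B), equivalently P(A | B) = P(A).
independent = math.isclose(p_ab, p_a * p_b)

# Mutually exclusive only if the two events cannot occur together.
mutually_exclusive = p_ab == 0

# Addition law for "at least one".
p_at_least_one = p_a + p_b - p_ab

print(f"P(A|B) = {p_a_given_b:.4f}")
print(f"independent: {independent}, mutually exclusive: {mutually_exclusive}")
print(f"P(A or B) = {p_at_least_one:.4f}")
```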
Chapter 4 - Conditional Probability
Problem 3. A research study investigating the relationship between smoking and heart disease in a sample of 500 individuals provided the following data:
Smoker / Nonsmoker / Total
Record of Heart Disease / 50 / 40 / 90
No Record of Heart Disease / 100 / 310 / 410
Total / 150 / 350 / 500
a. Show the joint probability table.
b. What is the probability that an individual is a smoker and has a record of heart disease?
c. Compute and interpret the marginal probabilities.
d. Given that an individual is a smoker, what is the probability that this individual has heart disease?
e. Given that an individual is a nonsmoker, what is the probability that this individual has heart disease?
f. Does the research show that heart disease and smoking are independent events? Use probabilities to justify your answer.
g. What conclusion would you draw about the relationship between smoking and heart disease?
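The joint, marginal, and conditional probabilities all come from dividing the cell counts by the appropriate total. A minimal sketch, with illustrative keys and names:

```python
# Cell counts from the study of 500 individuals.
counts = {
    ("heart disease", "smoker"): 50,
    ("heart disease", "nonsmoker"): 40,
    ("no heart disease", "smoker"): 100,
    ("no heart disease", "nonsmoker"): 310,
}
total = sum(counts.values())

# Joint probabilities: each cell divided by the grand total.
joint = {cell: count / total for cell, count in counts.items()}

# Marginal probabilities: sums across a row or down a column.
p_smoker = joint[("heart disease", "smoker")] + joint[("no heart disease", "smoker")]
p_nonsmoker = 1 - p_smoker
p_disease = joint[("heart disease", "smoker")] + joint[("heart disease", "nonsmoker")]

# Conditional probabilities: joint divided by the marginal of the given event.
p_disease_given_smoker = joint[("heart disease", "smoker")] / p_smoker
p_disease_given_nonsmoker = joint[("heart disease", "nonsmoker")] / p_nonsmoker

print(f"P(disease) = {p_disease:.4f}")
print(f"P(disease | smoker) = {p_disease_given_smoker:.4f}")
print(f"P(disease | nonsmoker) = {p_disease_given_nonsmoker:.4f}")
```

If the events were independent, both conditional probabilities would equal the marginal P(disease); comparing the three printed values addresses part f.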
Chapter 4
BAYES' THEOREM
Problem 4. Ahmadi, Inc. sets up its drill press machine correctly 70% of the time. It is known that if the machine is set up correctly, it produces 90% acceptable parts. On the other hand, when the machine is set up incorrectly, it produces 20% acceptable parts. One item from the production is selected and is observed to be acceptable.
a. What is the probability that the machine is set up correctly? That is, we are interested in computing:
P(Correct set up | Acceptable part).
Let the following symbols represent the various events:
E1 = Correct set up
E2 = Incorrect set up
G = Good part (i.e., Acceptable part)
With the above notations we want to determine P(E1 | G).
b. Compute all the posterior probabilities.
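The tabular approach to Bayes' theorem (prior, conditional, joint, posterior) can be sketched as follows. The event labels E1, E2, and G follow the notation above; the remaining names are illustrative.

```python
# Prior probabilities of each set-up and P(G | set-up) for a good part.
priors = {"E1": 0.70, "E2": 0.30}
p_good_given = {"E1": 0.90, "E2": 0.20}

# Joint probabilities: P(Ei and G) = P(Ei) * P(G | Ei).
joint = {event: priors[event] * p_good_given[event] for event in priors}

# P(G) is the sum of the joint probabilities.
p_good = sum(joint.values())

# Posterior probabilities: P(Ei | G) = P(Ei and G) / P(G).
posterior = {event: joint[event] / p_good for event in priors}

print("Event / Prior / Conditional / Joint / Posterior")
for event in priors:
    print(f"{event} / {priors[event]:.2f} / {p_good_given[event]:.2f} / "
          f"{joint[event]:.4f} / {posterior[event]:.4f}")
```

The posterior column should sum to 1, which is a quick check on the arithmetic.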
Chapter 5 Formulas
Required Conditions for a Discrete Probability Function
$f(x) \geq 0$
$\sum f(x) = 1$
Discrete Uniform Probability Function
$f(x) = 1/n$
where n = the number of values the random variable may assume
Expected Value of a Discrete Random Variable
$E(x) = \mu = \sum x\,f(x)$
Variance of a Discrete Random Variable
$\text{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
Number of Experimental Outcomes Providing Exactly x Successes in n Trials
$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$
where $n! = n(n - 1)(n - 2) \cdots (2)(1)$ (Remember: 0! = 1)
Binomial Probability Function
$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$ where x = 0, 1, 2, ..., n
The Mean of a Binomial Distribution
$\mu = np$
The Variance of a Binomial Distribution
$\sigma^2 = np(1 - p)$
Chapter 5
Discrete Probability Distributions
Problem 1. The manager of the university bookstore has kept records of the number of diskettes sold per day. She provided the following information regarding diskettes sales for a period of 60 days:
Number of Diskettes Sold    Number of Days
0 6
1 9
2 12
3 18
4 12
5 3
a. Identify the random variable.
b. Is the random variable discrete or continuous?
c. Develop a probability distribution for the above data.
d. Is the above a proper probability distribution?
e. Develop a cumulative probability distribution.
f. Determine the expected number of daily sales of diskettes.
g. Determine the variance and the standard deviation.
h. If each diskette yields a net profit of 50 cents, what are the expected yearly profits from the sales of diskettes?
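The probability distribution comes from dividing each day count by the 60 days observed; the expected value and variance then follow from their definitions. In the sketch below, `SELLING_DAYS_PER_YEAR` is an assumption (the problem does not say how many days per year the bookstore sells diskettes), so the yearly profit figure is illustrative only.

```python
import math

# Number of diskettes sold -> number of days with that many sales (60 days total).
days = {0: 6, 1: 9, 2: 12, 3: 18, 4: 12, 5: 3}
total_days = sum(days.values())

# Probability function f(x) as relative frequencies.
f = {x: d / total_days for x, d in days.items()}

# A proper distribution has f(x) >= 0 for all x and probabilities summing to 1.
is_proper = all(p >= 0 for p in f.values()) and math.isclose(sum(f.values()), 1.0)

# Cumulative distribution: running total of f(x).
cumulative = {}
running = 0.0
for x in sorted(f):
    running += f[x]
    cumulative[x] = running

expected = sum(x * p for x, p in f.items())                    # E(x)
variance = sum((x - expected) ** 2 * p for x, p in f.items())  # Var(x)
std_dev = math.sqrt(variance)

PROFIT_PER_DISKETTE = 0.50   # dollars, from part h
SELLING_DAYS_PER_YEAR = 365  # ASSUMPTION: not stated in the problem
yearly_profit = expected * PROFIT_PER_DISKETTE * SELLING_DAYS_PER_YEAR

print(f"E(x) = {expected:.2f}, Var(x) = {variance:.2f}, std dev = {std_dev:.4f}")
print(f"Expected yearly profit (assuming {SELLING_DAYS_PER_YEAR} days): ${yearly_profit:.2f}")
```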
Chapter 5
Introduction to Binomial Distribution
Problem 2. A production process has been producing 10% defective items. A random sample of four items is selected from the production process.
a. What is the probability that the first 3 selected items are non-defective and the last item is defective?
b. If a sample of 4 items is selected, how many outcomes contain exactly 3 non-defective items?
c. What is the probability that a random sample of 4 contains exactly 3 non-defective items?
d. Determine the probability distribution for the number of non-defective items in a sample of four.
e. Determine the expected number (mean) of non-defectives in a sample of four.
f. Find the standard deviation for the number of non-defectives.
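Treating "non-defective" as the success, each part can be checked with the binomial formulas. A minimal sketch, assuming the four items are independent with a constant 10% defect rate (`binomial_pmf` is an illustrative helper; `math.comb` requires Python 3.8 or later):

```python
import math

n = 4     # sample size
p = 0.90  # P(non-defective) = 1 - 0.10

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# (a) One specific sequence: three non-defectives followed by one defective.
p_specific_sequence = p ** 3 * (1 - p)

# (b) Number of sequences containing exactly three non-defectives.
outcomes_with_three = math.comb(n, 3)

# (c) Probability of exactly three non-defectives, in any order.
p_exactly_three = binomial_pmf(3, n, p)

# (d) Full distribution of the number of non-defectives.
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# (e), (f) Mean and standard deviation of a binomial distribution.
mean = n * p
std_dev = math.sqrt(n * p * (1 - p))

for x, prob in distribution.items():
    print(f"f({x}) = {prob:.4f}")
print(f"mean = {mean:.2f}, std dev = {std_dev:.2f}")
```

Part (c) equals part (b) times part (a), which shows where the combination term in the binomial formula comes from.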