BUS 211 Notes

Chapter 1 Introduction and Data Collection

Categorical Variables – responses are a selection, e.g. Gender (male or female), Class (freshman,
sophomore, junior, senior), Smoke (yes or no), etc.

Numerical Variables – responses are numbers, e.g. Income ($30,000), Age (25), etc.
Can be Discrete (integer values) or Continuous (can have fractional parts).

Chapter 2 Presenting Data in Tables and Charts

Sort Data – Data | Sort

Stem-and-Leaf Graph – PHStat | Descriptive Statistics | Stem-and-Leaf Display

Frequency Distribution - PHStat | Descriptive Statistics | Frequency Distribution

Set up the classes, then enter the upper limit of each class in an array (the bins) for the desired frequency distribution

Be sure to include a label for the array (use Upper Limit)

Relative Frequency distribution – Divide the frequency distribution by the total

Percentage Distribution - Divide the frequency distribution by the total and multiply by 100

Or use Format | Cells… | Percentage

Cumulative Distribution – Sum the frequencies from top to bottom listing each total as you go.
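
Outside Excel, the four distributions above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample data are made up, and each value is counted in the first class whose upper limit is greater than or equal to it.

```python
from bisect import bisect_left
from itertools import accumulate

def frequency_table(data, upper_limits):
    """Return frequency, relative, percentage and cumulative distributions.

    upper_limits must be sorted ascending (one upper limit per class).
    """
    counts = [0] * len(upper_limits)
    for x in data:
        # Index of the first class whose upper limit is >= x
        i = bisect_left(upper_limits, x)
        if i == len(upper_limits):
            raise ValueError(f"{x} is above the largest upper limit")
        counts[i] += 1
    total = sum(counts)
    relative = [c / total for c in counts]      # frequency / total
    percentage = [100 * r for r in relative]    # relative * 100
    cumulative = list(accumulate(counts))       # running sum, top to bottom
    return counts, relative, percentage, cumulative

# Made-up data and class limits, for illustration only
counts, relative, percentage, cumulative = frequency_table(
    [3, 7, 10, 12, 18, 20, 25], [10, 20, 30]
)
```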

Graphs - PHStat does not work well for most graphs; use the chart wizard in Excel

Histogram also known as a Vertical Bar Chart or Column Chart -
Set up the frequency distribution then use the midpoints for labels
Double click the chart icon and select a column graph type
Select the frequency without labels as the data
Select the Series tab, mouse into the X-axis label box then select the midpoints
Select Next to insert the title and axis labels and make any other changes
Select Next to pick a location for the chart then Finish
Double click a bar and select Options, set gap width to 0

Polygon also known as a line graph -
Set up the frequency distribution then use the midpoints for labels.
Insert a class with 0 frequency and an appropriate label at the top and the bottom.
Double click the chart icon and select a line graph type
Select the frequency without labels as the data
Select the Series tab, mouse into the X-axis label box then select the midpoints
Select Next to insert the title and axis labels and make any other changes
Select Next to pick a location for the chart then Finish

Ogive also known as a cumulative line graph or cumulative polygon -
Set up the cumulative frequency distribution, then use the upper class limits for labels.
Insert a class with 0 frequency and an appropriate label at the top but not the bottom.
Double click the chart icon and select a line graph type and complete the steps

XY Scatter - Set up the data in columns with the X values in the first column and the Y values in the second column
Double click the chart icon and select XY Scatter graph
Select both columns as the data, do not select the labels, and complete the steps

Bar Chart - Same as Histogram but for categorical data.
Use the category labels: if not numerical values they can be selected with the data.

Pie Chart - Same as above. Be sure to remove the legend, select Data Labels, and check Category name

Pareto Chart - Raw Data: use a line chart on 2 axes, or
Select Descriptive Statistics | One-Way Tables & Charts…
Be sure to select labels as the model will not work otherwise
Check table of frequencies and Pareto Diagram

Bivariate Categorical Tables and Charts - Use PHStat (also available in Excel - Data | Pivot Wizard)
In PHStat select Descriptive Statistics | Two-Way Tables & Charts

Chapter 3 Numerical Descriptive Measures

Use Tools | Data Analysis | Descriptive Statistics, check the Summary statistics box to get the following:

sample mean, median, mode, standard deviation, variance, range
population mean, median, mode, range

Use the individual functions (fx) for the following measures:

geometric mean (GEOMEAN), population variance (VARP) and standard deviation (STDEVP)

approximate quartiles (QUARTILE), approximate percentiles (PERCENTILE)

Coefficient of variation: Divide the standard deviation by the mean and multiply by 100%
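
For comparison, the same measures can be sketched with Python's standard statistics module. The data are made up, and the "inclusive" quantile method is assumed here to correspond to the interpolation used by QUARTILE.

```python
import statistics as st

data = [12.0, 15.0, 11.0, 18.0, 14.0]  # made-up sample, for illustration only

geo_mean = st.geometric_mean(data)   # counterpart of GEOMEAN
pop_var = st.pvariance(data)         # counterpart of VARP
pop_sd = st.pstdev(data)             # counterpart of STDEVP
sample_sd = st.stdev(data)           # sample standard deviation

# Approximate quartiles (assumed to match QUARTILE's interpolation)
q1, q2, q3 = st.quantiles(data, n=4, method="inclusive")

# Coefficient of variation: standard deviation / mean * 100%
cv_percent = sample_sd / st.mean(data) * 100
```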

Box-and-Whisker Plot and Five-Number Summary

PHStat | Descriptive Statistics | Box-and-Whisker Plot then check Five-Number Summary
Gives the exact quartiles not approximations

Coefficient of Correlation: fx (CORREL), or Tools | Data Analysis | Correlation

Chapter 4 Basic Probability

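Probability of A or B: P(A or B) = P(A) + P(B) - P(A and B)
If A and B are Mutually Exclusive: P(A or B) = P(A) + P(B)

Conditional probability of A given B: P(A | B) = P(A and B) / P(B)
If A and B are Independent: P(A | B) = P(A)

Joint Probability of A and B: P(A and B) = P(A | B) P(B)
If A and B are Independent: P(A and B) = P(A) P(B)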

Bayes' Theorem
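
For an event B and its complement B', Bayes' theorem is usually written P(B | A) = P(A | B) P(B) / [P(A | B) P(B) + P(A | B') P(B')]. A minimal sketch, with a made-up function name and made-up probabilities:

```python
def bayes_posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """Return P(B | A) for an event B and its complement B'."""
    joint_b = p_a_given_b * prior_b                # P(A and B)
    joint_not_b = p_a_given_not_b * (1 - prior_b)  # P(A and B')
    return joint_b / (joint_b + joint_not_b)       # divide by P(A)

# Illustrative numbers only: P(B) = 0.03, P(A | B) = 0.90, P(A | B') = 0.02
posterior = bayes_posterior(0.03, 0.90, 0.02)
```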

Chapter 5 Some Important Discrete Probability Distributions

Combinations:

Binomial distribution: (for an infinite population)
PHStat | Probability & Prob. Distributions | Binomial then check Cumulative Probabilities

Hypergeometric distribution: (for a finite population)
PHStat | Probability & Prob. Distributions | Hypergeometric (no cumulative probabilities available)

Poisson distribution:
PHStat | Probability & Prob. Distributions | Poisson then check Cumulative Probabilities
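
As a cross-check on the PHStat output, the three distributions can be computed from their standard textbook formulas. The function names and parameter names below are illustrative, not taken from PHStat. The cumulative helper also covers the hypergeometric case, for which PHStat gives no cumulative probabilities.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """P(X = x) for n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeometric_pmf(x, n, N, A):
    """P(X = x) successes in a sample of n drawn without replacement
    from a population of N items containing A successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

def poisson_pmf(x, lam):
    """P(X = x) when the expected number of events is lam."""
    return exp(-lam) * lam**x / factorial(x)

def cumulative(pmf, x, *params):
    """P(X <= x): sum the individual probabilities from 0 to x."""
    return sum(pmf(k, *params) for k in range(x + 1))
```

For example, cumulative(hypergeometric_pmf, 1, 2, 5, 2) gives P(X <= 1) for a sample of 2 from 5 items with 2 successes.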

Chapter 6 The Normal Distribution and Other Continuous Distributions

Normal Distribution

PHStat | Probability & Prob. Distributions | Normal then check the desired calculation
To check the normality assumption construct a stem-and-leaf, box-and-whisker, histogram, or a
normal probability plot: PHStat | Probability & Prob. Distributions | Normal Probability Plot

Uniform Distribution

where a and b are the endpoints of the uniform distribution.

Exponential distribution

PHStat | Probability & Prob. Distributions | Exponential

Only returns results for P(X ≤ x); for P(X > x) use 1 - probability; for results between two values find the
probability for each and subtract the smaller from the larger
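
The three cases can be sketched with the standard exponential formula P(X ≤ x) = 1 - e^(-λx). The rate and the x values below are made up for illustration.

```python
from math import exp

def expon_cdf(x, lam):
    """P(X <= x) for an exponential distribution with rate lam."""
    return 1 - exp(-lam * x)

lam = 20                                              # illustrative rate
p_le = expon_cdf(0.1, lam)                            # P(X <= 0.1)
p_gt = 1 - p_le                                       # P(X > 0.1)
p_between = expon_cdf(0.2, lam) - expon_cdf(0.1, lam) # P(0.1 < X <= 0.2)
```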

Sampling distribution of the mean

Calculate the standard deviation of the sampling distribution (also called the standard error of the mean), then use the Normal Distribution calculator if the population is normally distributed, or
the sample size is > 30, or the population distribution is symmetrical and the sample size is > 15

Infinite population Finite population
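
The usual textbook forms are σ/√n for an infinite population and (σ/√n)·√((N-n)/(N-1)) for a finite population of size N. A minimal sketch under that assumption, with made-up numbers:

```python
from math import sqrt
from statistics import NormalDist

def standard_error_mean(sigma, n, N=None):
    """Standard error of the mean; pass N to apply the finite
    population correction sqrt((N - n) / (N - 1))."""
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

# Illustrative values: population mean 50, sigma 10, sample size 25
se = standard_error_mean(10, 25)
# P(sample mean < 52), treating the sampling distribution as normal
p_below_52 = NormalDist(50, se).cdf(52)
```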

Sampling distribution of the proportion:

Calculate the standard deviation of the sampling distribution (Standard Error of the Proportion), then
if np > 5 and n(1-p) > 5 use the Normal Distribution calculator PHStat | Probability & Prob. Distributions | Normal

ps = sample proportion, p = population proportion

Infinite population Finite population

Chapter 7 Confidence Interval Estimation

Interval estimate of the population mean (μ) with σ unknown:

PHStat | Confidence Intervals | Estimate for the Mean, sigma unknown
be sure to check the finite box for finite populations

Interval estimate of the population proportion:

PHStat | Confidence Intervals | Estimate for the Proportion
be sure to check the finite box for finite populations

Interval estimate of the population total:

PHStat | Confidence Intervals | Estimate for the Population Total

Sample size (n) for estimating a mean:

PHStat | Sample Size | Determination for the Mean
be sure to check the finite box for finite populations
Estimate of parameters would be from a preliminary sample

Sample size for estimating a proportion:

PHStat | Sample Size | Determination for the Proportion
be sure to check the finite box for finite populations
Estimate of True Proportion would be the proportion from a preliminary sample

If a preliminary sample is not available use .5
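
A rough sketch of the proportion case, assuming the standard formula n = Z²p(1-p)/e² and the usual finite population adjustment n0·N/(n0 + N - 1). The function name and parameter names are illustrative, not PHStat's.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(confidence, error, p=0.5, N=None):
    """Sample size needed to estimate a proportion within +/- error.

    p defaults to 0.5, the value to use when no preliminary sample exists.
    Pass N to adjust for a finite population.
    """
    z = NormalDist().inv_cdf(confidence + (1 - confidence) / 2)
    n0 = z**2 * p * (1 - p) / error**2
    if N is not None:
        n0 = n0 * N / (n0 + (N - 1))
    return ceil(n0)  # always round up

n_infinite = sample_size_proportion(0.95, 0.05)
n_finite = sample_size_proportion(0.95, 0.05, N=1000)
```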

Chapter 8 Fundamentals of Hypothesis Testing: One-Sample Tests

One Sample numerical data, σ unknown

Hypothesis:
Ho: μ = value   Ha: μ ≠ value   two-tail test
Ho: μ ≤ value   Ha: μ > value   upper-tail test
Ho: μ ≥ value   Ha: μ < value   lower-tail test

Test Statistic: t

Procedure:
Summary Data: PHStat | One-Sample Tests | t Test for the Mean, sigma unknown

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Parentheses indicate information to be taken from the problem

One Sample Categorical Data

Hypothesis:
Ho: p = value   Ha: p ≠ value   two-tail test
Ho: p ≤ value   Ha: p > value   upper-tail test
Ho: p ≥ value   Ha: p < value   lower-tail test

Test Statistic: Z

Procedure:
Summary Data: PHStat | One-Sample Tests | Z Test for the Proportion
Raw Data: No tests available; calculate p and use PHStat

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Parentheses indicate information to be taken from the problem
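
The p-value decision rule for the one-sample proportion test can be sketched as follows. It assumes the standard statistic Z = (ps - p) / sqrt(p(1-p)/n); the function name and the data are made up.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes, n, p0, tail="two"):
    """Return (Z, p-value) for Ho: p = p0; tail is 'two', 'upper' or 'lower'."""
    ps = successes / n                        # sample proportion
    z = (ps - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf(z)
    if tail == "upper":
        p_value = 1 - cdf
    elif tail == "lower":
        p_value = cdf
    else:
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

# Illustrative data: 60 successes in 100 trials, Ho: p = 0.5, alpha = 0.05
z, p_value = z_test_proportion(60, 100, 0.5)
alpha = 0.05
decision = "Reject Ho" if p_value < alpha else "Fail to reject Ho"
```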

Chapter 9 Two-Sample Tests

Procedure to determine the proper two sample mean test for numerical data:

Two Sample test of Means with Paired numerical data

Hypothesis:
Ho: μ1 = μ2   Ha: μ1 ≠ μ2   two-tail test
Ho: μ1 ≤ μ2   Ha: μ1 > μ2   upper-tail test
Ho: μ1 ≥ μ2   Ha: μ1 < μ2   lower-tail test

Procedure:
Summary Data: no PHStat calculation available
Raw Data: Data Analysis | t Test: Paired Two Sample for Means

Test Statistic: t

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Interval estimate of the difference:
To get t use function TINV(1-Confidence, df)
Use Descriptive Statistics to get D̄ (the mean difference) and sD (the standard deviation of the differences)
Or PHStat | Confidence Intervals | Estimate for the Mean, sigma unknown - Select the differences as the data

Two Sample test of Variances with numerical data

Hypothesis:
Ho: σ1² = σ2²   Ha: σ1² ≠ σ2²   two-tail test
Ho: σ1² ≤ σ2²   Ha: σ1² > σ2²   upper-tail test
Ho: σ1² ≥ σ2²   Ha: σ1² < σ2²   lower-tail test

Procedure:
Summary Data: PHStat | Two-Sample Tests | F Test for the Difference in Two Variances
Raw Data: Data Analysis | F Test Two Sample for Variances (do not use; it only gives the lower tail value)

Test Statistic: F

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Two Sample test of Means with numerical data2’s not proven unequal with the F test

HypothesisHo: 1 = 2 a two tail testHo: 12 Ha: 12 upper tail test

Ha: 12Ho: 12 Ha: 12 lower tail test

ProcedureSummary Data:PHStat | Two-Sample Tests | t Test for Differences in Two Means

Raw Data: Data Analysis | t Test: Two Sample Assuming Equal Variances

Test Statistict

Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis

ConclusionIf rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)

Interval estimate of the difference

To get t use function TINV(1-Confidence, df)

Two Sample test of Means with numerical data2‘s proven unequal with the F test

HypothesisHo: 1 = 2 a two tail testHo: 12 Ha: 12 upper tail test

Ha: 12Ho: 12 Ha: 12 lower tail test

ProcedureSummary Data: Use spreadsheet downloaded from the Homework web page

Raw Data: Data Analysis | t Test: Two Sample Assuming Unequal Variances

Test Statistict

Decision Rule If the p-value is less than alpha Reject the Hypothesis
If the p-value is greater than or equal to alpha Fail to Reject the Hypothesis

ConclusionIf rejected – There is sufficient evidence that (Question asked)
If not rejected – There is not sufficient evidence that (Question asked)

Interval estimate of the difference

To get t use function TINV(1-Confidence, df)

Two Sample test of a Proportion with categorical data

Hypothesis:
Ho: p1 = p2   Ha: p1 ≠ p2   two-tail test
Ho: p1 ≤ p2   Ha: p1 > p2   upper-tail test
Ho: p1 ≥ p2   Ha: p1 < p2   lower-tail test

Procedure: PHStat | Two-Sample Tests | Z Test for the Differences in Two Proportions

Test Statistic: Z

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Interval estimate of the difference

To get Z use function NORMSINV(two tail)
where two tail=Confidence+(1-Confidence)/2
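
A sketch of the interval for the difference in two proportions, assuming the standard form (p1 - p2) ± Z·sqrt(p1(1-p1)/n1 + p2(1-p2)/n2). Here inv_cdf plays the role of NORMSINV; the function name and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from success counts x1, x2."""
    two_tail = confidence + (1 - confidence) / 2
    z = NormalDist().inv_cdf(two_tail)        # same role as NORMSINV(two tail)
    p1, p2 = x1 / n1, x2 / n2
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin

# Illustrative counts: 60 of 100 versus 50 of 100, 95% confidence
lower, upper = diff_proportions_interval(60, 100, 50, 100)
```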

Chapter 10 Analysis of Variance (Multi (c) Sample tests with numerical data)

Equality of Variances

Hypothesis:
Ho: σ1² = σ2² = σ3²   a two-tail test
Ha: not all σ²'s are equal

Procedure:
Raw Data: PHStat | Multiple-Sample Tests | Levene’s Test

Test Statistic: F

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

One Factor ANOVA

Hypothesis:
Ho: μ1 = μ2 = μ3 … = μc   c = the number of populations
Ha: not all μ's are equal

Procedure: Tools | Data Analysis | Anova: Single Factor

Test Statistic: F from the computer printout
P-value = The Probability of

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).
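
To see where the F on the printout comes from, the statistic can be computed by hand as MSA / MSW (mean square among groups over mean square within groups). The sketch below uses a made-up function name and made-up data. It does not compute the p-value, which needs the F distribution.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df among, df within) for a list of samples."""
    c = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Sum of squares among groups and within groups
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return msa / msw, c - 1, n - c

# Illustrative data: three groups of three observations
f_stat, df_among, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```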

Tukey's multiple comparison method: (determines which of the c means are different from each other).

Procedure: PHStat | Multiple-Sample Tests | Tukey-Kramer Procedure

Test Statistic: Critical Range

Input Q found in the Studentized Range Table where column = c and row = n-c

c = number of groups, n = total number of data points in all groups

Decision Rule: If the absolute difference between a pair of means is greater than the critical range, that pair is different.

Two Factor With Replication

Hypothesis:
Ho1: μA1 = μA2 = μA3 … = μAr   r = the number of levels in Factor A
Ha1: not all μA's are equal

Ho2: μB1 = μB2 = μB3 … = μBc   c = the number of levels in Factor B
Ha2: not all μB's are equal

Ho3: No Interaction
Ha3: Interaction

Procedure: Tools | Data Analysis | Anova: Two Factor With Replication

Test Statistic: F from the computer printout. p-value = The Probability of
For differences in rows see p-value for the Sample row of the ANOVA
For differences in columns see p-value for the Columns row of the ANOVA
For interaction between factors see p-value for the Interaction row of the ANOVA

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: Ho1 if rejected – There is sufficient evidence of a difference in (factor A)
Ho2 if rejected – There is sufficient evidence of a difference in (factor B)
Ho3 if rejected – There is sufficient evidence of an interaction term
If not rejected – There is not sufficient evidence to make a conclusion about …

Tukey's multiple comparison method for Two Factor ANOVA with replication:
No spreadsheet, hand calculate with the following formulas:

For Factor A:
MSW from ANOVA MS Within
Q table column is r, the number of levels in Factor A
Q table row is rc(n’-1), where c is the number of levels in Factor B and n’ is the number of replications

For Factor B:
MSW from ANOVA MS Within
Q table column is c, the number of levels in Factor B
Q table row is rc(n’-1), where r is the number of levels in Factor A and n’ is the number of replications
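
The hand calculation can be sketched as below. The formulas are an assumption based on the usual textbook form, critical range = Q·sqrt(MSW/(c·n’)) for Factor A and Q·sqrt(MSW/(r·n’)) for Factor B, so check them against the course text. Q must still be looked up in the Studentized Range Table. The function names and numbers are made up.

```python
from math import sqrt

def critical_range_factor_a(q, msw, c, n_prime):
    """Assumed form: Q * sqrt(MSW / (c * n')), c = levels in Factor B."""
    return q * sqrt(msw / (c * n_prime))

def critical_range_factor_b(q, msw, r, n_prime):
    """Assumed form: Q * sqrt(MSW / (r * n')), r = levels in Factor A."""
    return q * sqrt(msw / (r * n_prime))

# Illustrative inputs: Q = 3.0 from the table, MSW = 8.0, n' = 4 replications
range_a = critical_range_factor_a(3.0, 8.0, c=2, n_prime=4)
range_b = critical_range_factor_b(3.0, 8.0, r=2, n_prime=4)
```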

Chapter 11 Chi-Square Tests and Nonparametric Tests

Two Sample test of a Proportion with categorical data (Alternate Procedure)

Hypothesis:
Ho: p1 = p2   Ha: p1 ≠ p2   (no < or > hypothesis)

Procedure: PHStat | Two-Sample Tests | Chi-Square Test for the Differences in Two Proportions

Test Statistic: χ²

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Multi (c) Sample test of Proportions with categorical data

Hypothesis:
Ho: p1 = p2 = p3 … = pc   c = the number of samples
Ha: not all p’s are equal

Procedure: PHStat | Multiple-Sample Tests | Chi-Square Test

Test Statistic: χ²

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Be sure to check the box for the Marascuilo Procedure to determine which proportions are different.

χ² Test of Independence

Hypothesis:
Ho: Two categorical variables are independent
Ha: Two categorical variables are related

Procedure: PHStat | Multiple-Sample Tests | Chi-Square Test

Test Statistic: χ²

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that the variables are related.
If not rejected – There is not sufficient evidence that the variables are related.
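
The χ² statistic itself is the sum of (observed - expected)² / expected over all cells, where each expected frequency is row total × column total / n. A minimal sketch with a made-up function name and a made-up table; it does not compute the p-value, which needs the chi-square distribution.

```python
def chi_square_statistic(table):
    """Return (chi-square, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Illustrative 2 x 2 table of observed frequencies
chi2, df = chi_square_statistic([[10, 20], [20, 10]])
```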

Two Sample test of Medians with numerical data

Hypothesis:
Ho: M1 = M2   Ha: M1 ≠ M2   two-tail test
Ho: M1 ≤ M2   Ha: M1 > M2   upper-tail test
Ho: M1 ≥ M2   Ha: M1 < M2   lower-tail test

Procedure:
Raw Data: PHStat | Two-Sample Tests | Wilcoxon Rank Sum Test
Summary Data: No tests available.

Test Statistic: Z

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Kruskal-Wallis Rank Test for Differences Between c Medians

Hypothesis:
Ho: M1 = M2 = M3 = … = MC
Ha: Not all Mj are equal (j = 1, 2, …, C)

Procedure:
Raw Data: PHStat | Multiple-Sample Tests | Kruskal-Wallis Rank Test
Summary Data: No PHStat or Excel calculation available

Test Statistic: H

Decision Rule: If the p-value is less than alpha, reject Ho.
If the p-value is greater than or equal to alpha, fail to reject Ho.

Conclusion: If rejected – There is sufficient evidence that (Question asked).
If not rejected – There is not sufficient evidence that (Question asked).

Chapter 12 Simple Linear Regression

Linear Regression Model: relationship represented as