Steps in Hypothesis Testing
- State H0 and H1.
- Set alpha (α).
- Gather data for the test.
- Calculate the test statistic.
- Use a Decision Rule or P-value to reach a conclusion.
Z Test of Hypothesis for the Mean (σ known)
(Sections 9.2 and 9.3 of the text)
(Use Z-TEST FOR THE MEAN, SIGMA KNOWN in the EXCEL addin PHSTAT)
1. H0: μ = μ0
   H1: μ ≠ μ0
2. Set alpha (α).
3. Gather data for the test.
4. Test Statistic Z = (X̄ - μ0) / (σ/√n)
5. Look up Z for α/2 in Z table. If the absolute value of the test statistic Z is greater than the Z from the table or [P-value < α], reject H0; that is, there is evidence that the population has a different mean than hypothesized (μ0). If you don’t reject H0, assume the mean is as hypothesized. *
************************************
μ is the true mean in the population.
μ0 is the hypothesized mean in the population (Note: your text just uses the symbol μ in the step 4 equation).
X̄ is the mean in a sample from the population.
σ is the true standard deviation in the population.
n is the number of observations in the sample taken from the population.
************************************
* To convert to one tail, look up Z for α (instead of α/2). Then, for an upper-tail test, reject H0 if the test statistic Z is greater than the Z from the table. For a lower-tail test, reject H0 if the test statistic Z is less than minus the Z from the table.
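The two-tail Z test above can be sketched in Python as follows. This is only an illustration, not the PHSTAT procedure: the function name z_test_mean and the numbers in the example are made up, and the standard-library NormalDist class stands in for the Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tail Z test for the mean when sigma is known (illustrative helper)."""
    z = (xbar - mu0) / (sigma / sqrt(n))            # step 4: test statistic
    z_table = NormalDist().inv_cdf(1 - alpha / 2)   # step 5: table Z for alpha/2 in each tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tail P-value
    reject = abs(z) > z_table                       # same decision as P-value < alpha
    return z, z_table, p_value, reject

# Hypothetical data: sample mean 372.5, hypothesized mean 368, sigma 15, n = 25
z, z_table, p_value, reject = z_test_mean(372.5, 368, 15, 25)
print(f"Z = {z:.2f}, table Z = {z_table:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

With these made-up numbers Z is 1.50, which is below the table value of 1.96, so H0 is not rejected.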
t Test of Hypothesis for the Mean (σ unknown)
(Section 9.4 of the text)
(Use T-TEST FOR THE MEAN, SIGMA UNKNOWN in the EXCEL addin PHSTAT)
1. H0: μ = μ0
   H1: μ ≠ μ0
2. Set alpha (α).
3. Gather data for the test.
4. Test Statistic t = (X̄ - μ0) / (s/√n)
5. Look up t (column = α/2, row = n-1) in t table. If the absolute value of the test statistic t is greater than the t from the table, reject H0; that is, there is evidence that the population has a different mean than hypothesized (μ0). If you don’t reject H0, assume the mean is as hypothesized. *
************************************
μ is the true mean in the population.
μ0 is the hypothesized mean in the population. (Note: your text just uses the symbol μ in the step 4 equation).
X̄ is the mean in a sample from the population.
s is the estimate of the standard deviation from a sample of the population.
n is the number of observations in the sample taken from the population.
************************************
* To convert to one tail, look up t for column α and row n-1. Then, for an upper-tail test, reject H0 if the test statistic t is greater than the t from the table. For a lower-tail test, reject H0 if the test statistic t is less than minus the t from the table.
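As a rough sketch of step 4, the t statistic and the table row (n-1) can be computed in Python. The function name t_statistic_mean and the sample values are hypothetical. The table lookup itself is left to the t table, since the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic_mean(sample, mu0):
    """Return the test statistic t and the table row (degrees of freedom, n - 1)."""
    n = len(sample)
    xbar = mean(sample)    # sample mean
    s = stdev(sample)      # sample standard deviation (divides by n - 1)
    t = (xbar - mu0) / (s / sqrt(n))
    return t, n - 1

# Hypothetical sample of 8 observations, tested against a hypothesized mean of 4
sample = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = t_statistic_mean(sample, mu0=4)
print(f"t = {t:.4f}, table row = {df}")
```

For a two-tail test at α = 0.05, the t table entry for column 0.025 and row 7 is about 2.365. Compare the absolute value of t with that number.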
Z Test of Hypothesis for the Proportion (Section 9.5 of the text)
(Use Z-TEST FOR THE PROPORTION in the EXCEL addin PHSTAT)
1. H0: π = π0
   H1: π ≠ π0
2. Set alpha (α).
3. Gather data for the test.
4. Test Statistic Z = (p - π0) / √(π0 (1 - π0)/n).
5. Look up Z for (1 - α/2) in Z table. If the absolute value of the test statistic Z is greater than the Z from the table or [P-value < α], reject H0; that is, there is evidence that the population has a different proportion than hypothesized (π0). If you don’t reject H0, assume the population proportion is as hypothesized. *
***********************************
π is the true proportion in the population.
π0 is the hypothesized proportion in the population. (Note: your text just uses the symbol π in the step 4 equation).
p is the proportion as estimated from a sample of the population.
n is the number of observations in the sample taken from the population.
************************************
* To convert to one tail, look up Z for α (instead of α/2). Then, for an upper-tail test, reject H0 if the test statistic Z is greater than the Z from the table. For a lower-tail test, reject H0 if the test statistic Z is less than minus the Z from the table.
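The proportion test follows the same pattern as the Z test for the mean. Below is a minimal Python sketch under assumed inputs: the function name z_test_proportion and the counts are hypothetical, and NormalDist replaces the Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(successes, n, pi0, alpha=0.05):
    """Two-tail Z test for a single proportion (illustrative helper)."""
    p = successes / n                                # sample proportion
    z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n)        # step 4: test statistic
    z_table = NormalDist().inv_cdf(1 - alpha / 2)    # step 5: table Z
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-tail P-value
    return z, z_table, p_value, abs(z) > z_table

# Hypothetical data: 60 successes in a sample of 100, hypothesized proportion 0.5
z, z_table, p_value, reject = z_test_proportion(60, 100, 0.5)
print(f"Z = {z:.2f}, table Z = {z_table:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Here Z is 2.00, which exceeds the table value of 1.96, so H0 is rejected at α = 0.05.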
To Determine Which T-Test to Use,
First use this test:
F Test for Differences in Two Variances
(Be sure the first variable has the largest variance)[1] (section xx)
(Use F TEST TWO SAMPLE FOR VARIANCES in TOOLS in EXCEL)
- H0: σ1² = σ2²
  H1: σ1² ≠ σ2²
- Set alpha (α).
- Gather data for the test.
- Test statistic F = s1²/s2².
- If [F > F (page = α/2, column = n1-1, row = n2-1)] or [P-value < α], reject H0; that is, assume variances are not equal. If you don’t reject H0, assume the variances are equal.
************************************
σ1 and σ2 are the standard deviations in population 1 and 2.
s1 and s2 are the standard deviations in sample 1 and 2.
n1 and n2 are the number of observations in sample 1 and 2.
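A small Python sketch of the F statistic, putting the larger sample variance on top as recommended above. The function name f_statistic and both samples are hypothetical. The F table lookup is still done by hand.

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return F with the larger sample variance on top, plus the table column and row."""
    var_a, var_b = variance(sample_a), variance(sample_b)   # sample variances (n - 1 divisor)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    # Swap the roles so that "sample 1" is the one with the larger variance
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples
sample_a = [10, 12, 14, 16, 18]
sample_b = [1, 5, 9, 13, 17, 21]
f, column, row = f_statistic(sample_a, sample_b)
print(f"F = {f:.2f}, table column = {column}, table row = {row}")
```

Because the second sample has the larger variance here, it becomes sample 1, so the table column is n1-1 = 5 and the row is n2-1 = 4.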
Pooled-Variance T Test for Differences in Two Means (It assumes the variances are equal)
(Often called the pooled variance t-test) (Text Section xx)
(Use T-TEST TWO SAMPLE ASSUMING EQUAL VARIANCE in TOOLS in EXCEL )
1. H0: μ1 - μ2 = d
   H1: μ1 - μ2 ≠ d
2. Set alpha (α).
3. Gather data for the test.
4. t = ((X̄1 - X̄2) - d) / √(sp²/n1 + sp²/n2)
where
sp² = [(n1-1)s1² + (n2-1)s2²] / [n1 + n2 - 2]
5. Look up t (column = α/2, row = n1 + n2 - 2) in t table. If the test statistic t falls outside the interval from -table t to +table t (that is, |t| > table t) or [P-value < α], reject H0. That is, there is evidence that the groups have different means. If you don’t reject H0, assume the means are equal.
************************************
μ1 and μ2 are the true means in population 1 and 2.
d is the hypothesized difference between the means in population 1 and 2. (This is usually 0.)
X̄1 and X̄2 are the means in sample 1 and 2.
sp² is the pooled variance from combining groups 1 and 2.
s1 and s2 are the standard deviations in sample 1 and 2.
n1 and n2 are the number of observations in sample 1 and 2.
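The pooled variance and the t statistic from step 4 can be sketched in Python like this. The function name pooled_t and the two samples are hypothetical, and the t table lookup is left to the reader.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2, d=0.0):
    """Return t, the pooled variance sp2, and the table row n1 + n2 - 2."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    t = ((mean(sample1) - mean(sample2)) - d) / sqrt(sp2 / n1 + sp2 / n2)
    return t, sp2, n1 + n2 - 2

# Hypothetical samples, testing a hypothesized difference of d = 0
sample1 = [10, 12, 14, 16, 18]
sample2 = [7, 9, 11, 13, 15]
t, sp2, df = pooled_t(sample1, sample2)
print(f"t = {t:.2f}, pooled variance = {sp2:.2f}, table row = {df}")
```

Both samples have variance 10, so the pooled variance is also 10 and t works out to 1.50 with 8 degrees of freedom.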
T Test for Differences in Two Means
(It assumes the variances are NOT equal)
(Not fully explained in section xx in the text)
(Use T-TEST TWO SAMPLE ASSUMING UNEQUAL VARIANCE in TOOLS in EXCEL)
- H0: μ1 - μ2 = d
  H1: μ1 - μ2 ≠ d
- Set alpha (α).
- Gather data for the test.
- Test Statistic t = ((X̄1 - X̄2) - d) / √(s1²/n1 + s2²/n2).
- Look up t (column = α/2, row = df) in t table. If the test statistic t falls outside the interval from -table t to +table t (that is, |t| > table t) or [P-value < α], reject H0. That is, there is evidence that the groups have different means. If you don’t reject H0, assume the means are equal. In the table lookup, the row number must be revised (from n1 + n2 - 2) due to unequal variances; calculate it as follows:
df = [s1²/n1 + s2²/n2]² / [((s1²/n1)²/(n1-1)) + ((s2²/n2)²/(n2-1))]
************************************
μ1 and μ2 are the true means in population 1 and 2.
d is the hypothesized difference between the means in population 1 and 2. (This is usually 0.)
X̄1 and X̄2 are the means in sample 1 and 2.
s1 and s2 are the standard deviations in sample 1 and 2.
n1 and n2 are the number of observations in sample 1 and 2.
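The revised row number is tedious to work out by hand, so here is a short Python sketch of the t statistic and the df formula above. The function name separate_variance_t and the samples are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def separate_variance_t(sample1, sample2, d=0.0):
    """Return t and the revised degrees of freedom for unequal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1    # s1 squared over n1
    v2 = variance(sample2) / n2    # s2 squared over n2
    t = ((mean(sample1) - mean(sample2)) - d) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical samples with clearly different spreads
sample1 = [10, 12, 14, 16, 18]
sample2 = [1, 5, 9, 13, 17, 21]
t, df = separate_variance_t(sample1, sample2)
print(f"t = {t:.4f}, df = {df:.2f}")
```

The computed df is usually not a whole number. A t table needs a whole-number row, and one common convention (an assumption here, so check what your text and EXCEL do) is to use the integer part.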
F Test for Differences in C Means
(Often called Analysis of Variance F Test or ANOVA)
(Use ANOVA: SINGLE FACTOR in TOOLS in EXCEL)
(Section 10.5)
- H0: μ1 = μ2 = ... = μc
  H1: at least one mean is different.
- Set alpha (α).
- Gather data for the test.
- Test statistic F = (MS Between)/(MS Within)
- If [F > F table value (page = α, column = c-1, row = n-c)] or [P-value < α], reject H0. That is, there is evidence that the groups have different means. If you don’t reject H0, assume the means are equal.
If you find that there is a difference between groups using ANOVA, find the right Q in (table = α, column = c and row = n-c) and use the TUKEY-KRAMER in PHSTAT to find which groups are different.
************************************
μi is the mean in population i.
c is the number of different groups.
n is the total number of data points in all samples combined.
MS Between is the Mean Squared Error Between Groups
MS Within is the Mean Squared Error Within Groups
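EXCEL reports MS Between and MS Within directly, but the following Python sketch shows where they come from. The function name anova_f and the three groups are hypothetical, and the F table lookup is not included.

```python
from statistics import mean

def anova_f(groups):
    """Return F = MS Between / MS Within, plus the table column (c - 1) and row (n - c)."""
    c = len(groups)                              # number of groups
    n = sum(len(g) for g in groups)              # total number of data points
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]
    # Between groups: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: how far each data point is from its own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (c - 1)
    ms_within = ss_within / (n - c)
    return ms_between / ms_within, c - 1, n - c

# Hypothetical data: three groups of three observations
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, column, row = anova_f(groups)
print(f"F = {f:.2f}, table column = {column}, table row = {row}")
```

For these made-up groups MS Between is 13 and MS Within is 1, so F is 13 with column 2 and row 6.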
Z Test for the Difference in Two Proportions
(Text Section xx)
(Use Z-TEST FOR THE DIFFERENCES IN TWO PROPORTIONS in PHSTAT IN EXCEL )
1. H0: p1 - p2 = pd
   H1: p1 - p2 ≠ pd
2. Set alpha (α).
3. Gather data for the test.
4. Z = ((ps1 - ps2) - pd) / √(p̄ (1 - p̄) (1/n1 + 1/n2)) where
p̄ = (n1 ps1 + n2 ps2) / (n1 + n2)
5. Look up Z for (1 - α/2) in Z table. If the test statistic Z falls outside the interval from -table Z to +table Z (that is, |Z| > table Z) or [P-value < α], reject H0. That is, there is evidence that the groups have different proportions. If you don’t reject H0, assume the proportions are equal.
************************************
p1 and p2 are the true proportions in population 1 and 2.
pd is the hypothesized difference between the proportions in population 1 and 2. (This is usually 0.)
ps1 and ps2 are the proportions in sample 1 and 2.
p̄ is the average (pooled) proportion in sample 1 and 2 combined.
n1 and n2 are the number of observations in sample 1 and 2.
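A minimal Python sketch of the two-proportion Z test, using the pooled proportion from step 4. The function name z_test_two_proportions and the counts are hypothetical, and NormalDist replaces the Z table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_proportions(x1, n1, x2, n2, pd=0.0, alpha=0.05):
    """Two-tail Z test for the difference in two proportions (illustrative helper)."""
    ps1, ps2 = x1 / n1, x2 / n2                          # sample proportions
    p_bar = (n1 * ps1 + n2 * ps2) / (n1 + n2)            # pooled proportion
    z = ((ps1 - ps2) - pd) / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z_table = NormalDist().inv_cdf(1 - alpha / 2)        # table Z
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))         # two-tail P-value
    return z, z_table, p_value, abs(z) > z_table

# Hypothetical data: 60 of 100 in sample 1 and 40 of 100 in sample 2
z, z_table, p_value, reject = z_test_two_proportions(60, 100, 40, 100)
print(f"Z = {z:.2f}, table Z = {z_table:.2f}, P-value = {p_value:.4f}, reject H0: {reject}")
```

Z is about 2.83 here, outside the interval from -1.96 to +1.96, so H0 is rejected.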
Chi Squared (χ²) Test for Differences in c Proportions
(Often just called chi squared test) (Text Section 11.2)
(Use CHI SQUARED TEST under C SAMPLE TESTS in the PHSTAT addin for EXCEL. Do that first, then enter the number of rows and columns of data and finally add the values in each cell.)
1. H0: π1 = π2 = ... = πc
   H1: at least one proportion is different.
2. Set alpha (α).
3. Gather data for the test.
4. Test statistic is χ² = Σ (fo - fe)²/fe, summed over all cells.
5. If [χ² > χ² (column = α, row = (r-1)(c-1))] or [P-value < α], reject H0. That is, there is evidence that the groups have different proportions; if so, you should be able to use the Marascuilo procedure. If you don’t reject H0, assume the proportions are equal.
************************************
πi is the proportion in population i.
fo is the observed frequency in each cell
fe is the expected frequency in each cell.
c is the number of columns (groups) in the cross tabulation table.
r is the number of rows in the cross tabulation table.
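The expected frequencies are the step most often done wrong by hand, so here is a Python sketch of the chi squared statistic. It assumes the usual rule that the expected frequency in a cell is (row total × column total) / grand total. The function name chi_squared_statistic and the table of counts are hypothetical.

```python
def chi_squared_statistic(observed):
    """Return chi squared and the table row (r - 1)(c - 1) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / total   # expected frequency for this cell
            chi2 += (fo - fe) ** 2 / fe
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical 2 x 3 table: rows are success/failure, columns are the c = 3 groups
observed = [[20, 30, 40],
            [30, 20, 10]]
chi2, df = chi_squared_statistic(observed)
print(f"chi squared = {chi2:.2f}, table row = {df}")
```

For α = 0.05 and row 2 the chi squared table value is 5.991, so the statistic of about 16.67 from this made-up table would lead to rejecting H0.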
T Test for the Slope
(Hypothesis Test on a Regression Coefficient)
(An option to request when using Regression in the PHSTAT addin in EXCEL).
(Text Section 12.7)
1. H0: βi = 0 (no linear relationship)
   H1: βi ≠ 0 (a linear relationship)
2. Set alpha (α).
3. Gather data for the test.
4. Test statistic t = bi / sbi
5. Look up t (column = α/2, row = n-2) in t table. If the absolute value of the test statistic t is greater than the t from the table or [P-value < α], reject H0. That is, there is evidence that there is a linear relationship. If you don’t reject H0, there is not enough evidence to show that there is a linear relationship.
************************************
βi is the actual regression coefficient in the population.
bi is the regression coefficient in the sample.
sbi is the standard error of the sample regression coefficient.
n is the number in the sample.
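EXCEL prints bi and sbi in the regression output. For the one-predictor case only, the Python sketch below shows how they can be computed with the usual least-squares formulas. The function name slope_t and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def slope_t(x, y):
    """Simple linear regression: return slope b1, its standard error sb1, t, and row n - 2."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    ssx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ssx   # sample slope
    b0 = ybar - b1 * xbar                                               # sample intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))       # sum of squared residuals
    sb1 = sqrt(sse / (n - 2)) / sqrt(ssx)                               # standard error of the slope
    return b1, sb1, b1 / sb1, n - 2

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b1, sb1, t, df = slope_t(x, y)
print(f"b1 = {b1:.2f}, sb1 = {sb1:.4f}, t = {t:.4f}, table row = {df}")
```

Compare the absolute value of t with the t table entry for column α/2 and row n-2, as in step 5.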
Test for the Significance of the Multiple Regression Model
(Hypothesis Test on the Overall Regression)
(An option with Multiple Regression in the PHSTAT addin in EXCEL).
(Text Section 13.4)
1. H0: β1 = β2 = ... = βp = 0 (No linear relationships at all)
   H1: at least one linear relationship.
2. Set alpha (α).
3. Gather data for the test.
4. Test Statistic F = (MS Regression) / (MS Residual)
5. If [F > the F in table (page = α, column = p, row = n - p - 1)] or [P-value < α], reject H0. That is, there is evidence that at least one linear relationship exists. If you don’t reject H0, there is not enough evidence to show any linear relationships exist.
************************************
βi is the actual regression coefficient in the population.
p is the number of predictors.
n is the number in the sample.
MS Regression is the Mean Squares explained by the Regression.
MS Residual is the Mean Squares of the remaining Error.
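If only the sums of squares are at hand, the F statistic follows from dividing each by its degrees of freedom. The Python sketch below assumes the regression and residual sums of squares have already been read from the regression output. The function name overall_f and the numbers are hypothetical.

```python
def overall_f(ss_regression, ss_residual, n, p):
    """Return F = MS Regression / MS Residual, plus the table column (p) and row (n - p - 1)."""
    ms_regression = ss_regression / p           # mean square explained by the regression
    ms_residual = ss_residual / (n - p - 1)     # mean square of the remaining error
    return ms_regression / ms_residual, p, n - p - 1

# Hypothetical output: SS Regression = 120, SS Residual = 60, n = 23 observations, p = 2 predictors
f, column, row = overall_f(120, 60, n=23, p=2)
print(f"F = {f:.2f}, table column = {column}, table row = {row}")
```

With these made-up values MS Regression is 60 and MS Residual is 3, so F is 20 with column 2 and row 20.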
Durbin-Watson Statistic.
(Also known as the Durbin-Watson Test)
(An option with Regression in the PHSTAT addin in EXCEL).
(Text Section 12.6)
1. H0: The Residuals are Independent
H1: The Residuals are not Independent
2. Set alpha (α).
3. Gather data for the test as part of a regression analysis.
4. Test Statistic D = Σ (ei - e(i-1))² / Σ ei², where the top sum runs over i = 2 to n and the bottom sum runs over i = 1 to n.
5. If [D < dl (table = α, column = p (use dl subcolumn), row = n in table E.9)], reject H0. That is, there is evidence that the residuals are not independent. If you don’t reject H0, assume the residuals are independent.
************************************
ei is the residual (error) for observation i.
n is the number in the sample.
p is the number of predictors.
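The D statistic is simple to compute once the residuals are listed in time order. A short Python sketch follows. The function name durbin_watson and the residuals are hypothetical.

```python
def durbin_watson(residuals):
    """Return D: squared differences of successive residuals over the sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals in time order (runs of the same sign suggest positive autocorrelation)
residuals = [1, 2, 1, -1, -2, -1]
d = durbin_watson(residuals)
print(f"D = {d:.4f}")
```

A small D like this one (about 0.67) is what step 5 is looking for. Whether it is below dl still has to be checked in the table for the given n and p.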
T-Test for Significance of Association.
(Hypothesis Test on a Correlation Coefficient)
(Not available in EXCEL).
(Text Section 12.7)
1. H0: ρ = 0 (No linear relationship)
   H1: ρ ≠ 0 (A linear relationship exists)
2. Set alpha (α).
3. Gather data for the test.
4. Test Statistic t = r / √((1 - r²)/(n - 2))
5. Look up t (column = α/2, row = n-2) in t table. If the absolute value of the test statistic t is greater than the t from the table, reject H0. That is, there is evidence that there is a linear relationship. If you don’t reject H0, you do not have enough evidence to show that there is a linear relationship.
************************************
ρ is the actual correlation coefficient in the population.
r is the correlation coefficient in the sample.
n is the number in the sample.
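Since this test is not available in EXCEL, a small Python sketch may help. It computes the sample correlation coefficient r with the usual formula and then the t statistic from step 4. The function name correlation_t and the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def correlation_t(x, y):
    """Return the sample correlation r, the test statistic t, and the table row n - 2."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    r = sxy / sqrt(sxx * syy)                  # sample correlation coefficient
    t = r / sqrt((1 - r ** 2) / (n - 2))       # step 4: test statistic
    return r, t, n - 2

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, t, df = correlation_t(x, y)
print(f"r = {r:.4f}, t = {t:.4f}, table row = {df}")
```

Compare the absolute value of t with the t table entry for column α/2 and row n-2, as in step 5.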
October 25, 2005
[1] Note: This is a little different than the text. But by putting the largest variance first it is easier to look up the F statistic and you only have to look up one F statistic rather than two. Also then the procedure is like that used in the EXCEL addin.