Chapters 8 and 9 Handout Hypothesis Testing
Student Handout
Chapters 8 and 9 provide statistical methods that use sample data to evaluate a hypothesis about a population parameter. Hypothesis testing is a process for determining whether a sample is likely to have come from a given population or whether it came from a different population because of a treatment. We test a hypothesis by determining whether a sample is likely or unlikely to occur, and we make this determination by finding the z-score. In general, if the z-score is close to zero, the hypothesis is likely to be true; if the z-score is far out in a tail, it is not likely to be true. In these chapters we will decide whether something is likely or unlikely based on the level of significance, that is, on how much error we are willing to accept.
When we test a hypothesis we always write a report. The test report consists of a claim statement, hypothesis statement, level of significance, a sketch, critical values, the test, the p-value, and a conclusion. Each of these will be discussed below. We will complete the Hypothesis Test Report for every test we do.
Hypothesis testing will use the z-distribution and the t-distribution. If the population variance, standard deviation (σ), or sum of squares (SS) is known, we use the z-distribution and calculate the z-score as we did in Chapter 7. If the population variance or standard deviation is not known and the data appear normal, we use the t-distribution. To calculate the t-score we use the following formulas: $t = \frac{M - \mu}{s_M}$, where $s_M = \frac{s}{\sqrt{n}}$. Note that this is the same formula as the z-score, with the population standard deviation (σ) replaced by the sample standard deviation (s), which is the best estimator of σ. To obtain critical t-scores we use Table B.2 on page A-27 in your text. To use Table B.2 you must know the degrees of freedom (n − 1) and the level of significance, which is provided in the problems. If the level of significance is not provided, use 0.05 (5%).
What is likely? If one has a distribution of sample means taken from some population and a sample of size n is selected from this population, what do you think the value of the sample mean will be?
What do you think the value of the sample mean will be 68% or 95% or 99% of the time?
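One way to explore this question numerically is sketched below. It assumes a hypothetical population (μ = 100, σ = 15) and samples of size n = 25; none of these numbers come from the handout. The function name `sample_mean_range` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_range(mu, sigma, n, coverage):
    """Return the (low, high) range that holds the sample mean `coverage` of the time."""
    standard_error = sigma / sqrt(n)                # spread of the distribution of sample means
    z = NormalDist().inv_cdf(0.5 + coverage / 2)    # z that cuts off the central `coverage`
    return mu - z * standard_error, mu + z * standard_error

# Hypothetical population: mu = 100, sigma = 15, samples of n = 25
for coverage in (0.68, 0.95, 0.99):
    low, high = sample_mean_range(100, 15, 25, coverage)
    print(f"{coverage:.0%} of sample means fall between {low:.2f} and {high:.2f}")
```

Sample means outside the 95% or 99% range are the ones we will later call unlikely.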
Suppose we applied a treatment to the population. What do you think will happen to the population mean? For example, if you were studying the effect of vitamin C on the common cold, what effect would you expect on the number of colds a person gets?
The philosophy of hypothesis testing is to determine whether a random sample comes from the population described by the hypothesis we believe is true. We do this by writing a Test of Hypothesis report. The sequence of steps is as follows:
1. The Claim – a statement to the effect of what we think is true. This statement is written as a simple sentence. We usually find the claim in the problem statement. Once we state our claim we must prove or disprove it.
Examples of claims could be:
· Tide gets clothes whiter.
· Take these pills to reduce the number of common colds one will get.
· This gasoline additive will not change the efficiency of a family car.
· Take this blue pill and you will increase the chances of having a baby boy.
· If a student takes this test in a yellow room, his/her grade will be better.
2. The Hypothesis: The hypothesis is written in two parts – the null hypothesis (H0) and the alternate hypothesis (H1). The null hypothesis is a statement of what we will test against; what we believe is true. The alternate hypothesis is what is true if the null hypothesis is not true. The hypothesis is written as a mathematical inequality and/or an equality statement. The null hypothesis and alternate hypothesis always occur in pairs and include the entire domain of the variable. The null hypothesis always contains equality whereas the alternate hypothesis never has equality.
The Null Hypothesis (H0): The null hypothesis is what we test against, what we believe is true, and what we base our conclusions on. We determine the hypotheses by looking at the claim. If the claim includes equality, it becomes the null hypothesis. If the claim does not include equality, it becomes the alternate hypothesis. The null hypothesis will always take one of these forms: H0: μ = a, H0: μ ≤ a, or H0: μ ≥ a. Each of these statements contains equality.
The Alternate Hypothesis (H1): The alternate hypothesis is what we believe is true if the null hypothesis is rejected. H1 never includes equality and is paired with a null hypothesis statement as follows:
If H0:     μ = a     μ ≤ a     μ ≥ a
then H1:   μ ≠ a     μ > a     μ < a
Notice that the null hypothesis and the alternate hypothesis are complements of each other and together include the total domain.
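The pairing rule above can be sketched as a small lookup. This is only an illustration: the function name `hypotheses` and the ASCII spellings of the relations ("<=", ">=", "!=") are assumptions, not notation from the text.

```python
# Each relation and its complement: H0 always carries the equality, H1 never does.
COMPLEMENT = {"=": "!=", "<=": ">", ">=": "<", "!=": "=", ">": "<=", "<": ">="}

def hypotheses(claim_relation, a):
    """Return (H0, H1) as strings for a claim of the form 'mu <relation> a'."""
    other = COMPLEMENT[claim_relation]
    if claim_relation in ("=", "<=", ">="):     # claim contains equality: it is H0
        null_rel, alt_rel = claim_relation, other
    else:                                       # claim has no equality: it is H1
        null_rel, alt_rel = other, claim_relation
    return f"H0: mu {null_rel} {a}", f"H1: mu {alt_rel} {a}"

# Claim: "the mean is greater than 150" has no equality, so it becomes H1
print(hypotheses(">", 150))
# Claim: "the mean is 51" contains equality, so it becomes H0
print(hypotheses("=", 51))
```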
3. Level of Significance (LOS): The level of significance is the proportion, percent, or probability of error we are willing to accept. The LOS is written as the Greek letter α (alpha). It is usually stated in the problem; if it is not, assume α = 0.05. The LOS is the probability, or percent of the time, that we could reject a true null hypothesis. It is not the probability that H0 itself is wrong. If α = 0.04, we are willing to accept rejecting a true H0 4% of the time.
4. The Sketch: The sketch is a picture of the normal curve showing the population mean (μ), the standard deviation (σ or s), and the critical areas. The critical areas are the areas where the null hypothesis will be rejected. They are determined from the alternate hypothesis (H1). If H1 is μ ≠ a, the test is two-tailed and α is divided equally between the left and right tails.
If H1 is μ < a, the test is one-tailed and all of α is in the left tail. The inequality points to the rejection tail.
If H1 is μ > a, the test is one-tailed and the rejection area is the right tail, since the inequality points to the right.
5. Critical Value(s): The critical values (CV) are the z or t values that separate the critical area (or rejection area) from the acceptance area. We obtain the CV from Table B.1 in the back of the book for a z-test or from Table B.2 for a t-test.
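For a z-test, the table lookup can be checked with Python's standard library. The sketch below is an illustration only: the function name `critical_z` and the strings used to label the form of H1 are assumptions. It does not cover t-tests, which still need the t table.

```python
from statistics import NormalDist

def critical_z(alpha, h1):
    """Return the critical z value(s) for an H1 of type '!=', '<' or '>'."""
    std_normal = NormalDist()                   # mean 0, standard deviation 1
    if h1 == "!=":                              # two tails: alpha / 2 in each tail
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    if h1 == "<":                               # all of alpha in the left tail
        return (std_normal.inv_cdf(alpha),)
    if h1 == ">":                               # all of alpha in the right tail
        return (std_normal.inv_cdf(1 - alpha),)
    raise ValueError("h1 must be '!=', '<' or '>'")

print(critical_z(0.05, "!="))   # two-tailed critical values
print(critical_z(0.05, "<"))    # left-tail critical value
```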
6. The Test: The test calculates the z-score or the t-score. If we know σ, we use the z-score and the formula $z = \frac{M - \mu}{\sigma_M}$, where $\sigma_M = \frac{\sigma}{\sqrt{n}}$. If we do not know σ and the data appear normal, we use the t-score. The formula is $t = \frac{M - \mu}{s_M}$.
Remember that M is the sample mean and μ is the population mean. The population mean, μ, is the value in the null hypothesis (H0), σ is the population standard deviation, and s is the sample standard deviation. The term $s_M = \frac{s}{\sqrt{n}}$ is the estimated standard error of the sample mean.
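As a minimal sketch, the two test statistics can be written as functions. The numbers in the usage lines are hypothetical (H0 says μ = 50, and a sample of n = 16 has M = 53); they are not taken from the text.

```python
from math import sqrt

def z_score(M, mu, sigma, n):
    """z = (M - mu) / sigma_M, with sigma_M = sigma / sqrt(n)."""
    return (M - mu) / (sigma / sqrt(n))

def t_score(M, mu, s, n):
    """t = (M - mu) / s_M, with s_M = s / sqrt(n)."""
    return (M - mu) / (s / sqrt(n))

# Hypothetical numbers: H0 says mu = 50; a sample of n = 16 has M = 53
print(z_score(53, 50, sigma=8, n=16))   # sigma known: z-test
print(t_score(53, 50, s=8, n=16))       # sigma unknown, s from the sample: t-test
```

The arithmetic is identical. What changes is the table used for the critical value: the z table when σ is known, the t table with n − 1 degrees of freedom when it is not.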
7. The p-value: The p-value is the proportion of the z- or t-distribution in the tail beyond the z-score or t-score, multiplied by the number of tails in the critical areas. If the p-value is greater than α, the test value is not extreme and is not in the critical area. If the p-value is less than α, the test value is extreme and is in the critical (rejection) area.
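For a z-test, the p-value described above can be sketched as follows. The function name `p_value_z` is hypothetical. The sketch assumes that, for a one-tailed test, the z-score already lies on the side of the mean that H1 points to.

```python
from statistics import NormalDist

def p_value_z(z, tails):
    """Tail area beyond |z| under the standard normal curve, times the number of tails."""
    one_tail = 1 - NormalDist().cdf(abs(z))     # area in one tail beyond the test value
    return tails * one_tail

alpha = 0.05
p = p_value_z(2.10, tails=2)                    # hypothetical two-tailed test value
print(f"p = {p:.4f}:", "extreme" if p < alpha else "not extreme")
```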
8. The Conclusion: The conclusion is a statement based on the results of the test, the hypothesis, and the critical values. It states the reasoning for rejecting or not rejecting the null hypothesis. If the test value is in the critical region, we reject (do not accept) the null hypothesis. If the test value is not in the critical region, we do not reject the null hypothesis. Your conclusion must state the test value, how it relates to the critical values, and whether you reject or do not reject the null hypothesis. The conclusion always refers back to the null hypothesis H0.
Example: The Hudson Valley Bottling Company distributes root beer in bottles labeled 32 oz. The bottling company states that it is distributing soda with a mean of μ = 32 oz and a standard deviation of σ = 0.75 oz. The Bureau of Weights and Measures randomly selected 50 bottles and measured their contents. The sample mean was 31.8 oz. The Bureau of Weights and Measures claims the company is cheating its customers. Write a report using the α = 0.05 level of significance. Should charges be filed?
Claim:
Hypothesis:
Level of Significance:
Critical Value:
Test:
p-value:
Conclusion:
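Once the report is filled in by hand, the arithmetic can be checked with a short sketch. It assumes the claim "cheating customers" is read as H1: μ < 32 (a left-tail test), with H0: μ ≥ 32. That reading is an interpretation of the problem statement, not something the handout states outright.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.0, 0.75      # values stated by the bottling company (H0: mu >= 32)
n, M = 50, 31.8             # the Bureau's sample
alpha = 0.05

std_normal = NormalDist()
critical_value = std_normal.inv_cdf(alpha)      # left-tail test (assumed H1: mu < 32)
z = (M - mu) / (sigma / sqrt(n))                # test value
p_value = std_normal.cdf(z)                     # area in the left tail below z

decision = "reject H0" if z < critical_value else "do not reject H0"
print(f"CV = {critical_value:.3f}, z = {z:.3f}, p = {p_value:.4f}: {decision}")
```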
Example: A “AAA” battery manufacturer believes that its product lasts an average of 51 hours with a standard deviation of 4 hours. The manufacturer hopes the mean time is not less than 51 hours (because customers will buy the competitor's batteries) nor more than 51 hours (because customers will buy fewer batteries). A random sample of 35 batteries is tested and yields a sample mean of 52.2 hours. Test the hypothesis at the 2% significance level (α = 0.02).
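A sketch for checking the arithmetic of this two-tailed z-test is given below. It assumes H0: μ = 51 and H1: μ ≠ 51, which follows from the manufacturer wanting the mean to be neither less than nor more than 51 hours.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 51, 4          # H0: mu = 51, population standard deviation known
n, M = 35, 52.2             # sample size and sample mean
alpha = 0.02

std_normal = NormalDist()
cv = std_normal.inv_cdf(1 - alpha / 2)          # critical values are -cv and +cv
z = (M - mu0) / (sigma / sqrt(n))               # test value
p_value = 2 * (1 - std_normal.cdf(abs(z)))      # one tail area times two tails

decision = "reject H0" if abs(z) > cv else "do not reject H0"
print(f"CV = ±{cv:.3f}, z = {z:.3f}, p = {p_value:.4f}: {decision}")
```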
Example: A bus company advertised a mean time of 150 minutes from Baltimore to Philadelphia. A consumer group, having received many complaints from travelers, claims the actual time is greater than 150 minutes. A random sample of 28 trips shows a mean time of 153 minutes and a standard deviation of 7.5 minutes. Using a 5% level of significance, is there sufficient evidence to support the consumer group's claim?
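Because σ is unknown here, this is a t-test. The sketch below checks the test value only. The critical value 1.703 is typed in by hand from a standard t table (one tail, α = 0.05, df = 27), since Python's standard library has no t-distribution; confirm it against Table B.2.

```python
from math import sqrt

mu0 = 150                   # advertised mean time, H0: mu <= 150 (minutes)
n, M, s = 28, 153, 7.5      # sample size, sample mean, sample standard deviation
df = n - 1                  # 27 degrees of freedom

# Read from a t table by hand: one tail, alpha = 0.05, df = 27
critical_value = 1.703

t = (M - mu0) / (s / sqrt(n))                   # s_M = s / sqrt(n)
decision = "reject H0" if t > critical_value else "do not reject H0"
print(f"df = {df}, CV = {critical_value}, t = {t:.3f}: {decision}")
```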
Example: Destructive testing usually uses small samples because of the expense. Five BMW automobiles were crash tested under standard conditions to determine repair costs. The repair costs were $797, $571, $904, $1147, and $418. Use a 0.05 significance level to test the claim that the mean repair cost for all BMW cars is less than $1000. Should BMW advertise this claim?
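Here the sample mean and standard deviation must first be computed from the raw data. The sketch below does that and then forms the t-score. It assumes H1: μ < 1000 (the claim has no equality) and uses the critical value −2.132, typed in by hand from a standard t table (one tail, α = 0.05, df = 4).

```python
from math import sqrt
from statistics import mean, stdev

costs = [797, 571, 904, 1147, 418]      # repair costs in dollars
mu0 = 1000                              # H0: mu >= 1000, H1: mu < 1000
n = len(costs)

M = mean(costs)                         # sample mean
s = stdev(costs)                        # sample standard deviation (divides SS by n - 1)

# Read from a t table by hand: one tail, alpha = 0.05, df = 4, left tail
critical_value = -2.132

t = (M - mu0) / (s / sqrt(n))
decision = "reject H0" if t < critical_value else "do not reject H0"
print(f"M = {M:.1f}, s = {s:.2f}, t = {t:.3f}, CV = {critical_value}: {decision}")
```

Note that the claim is H1 here, so the decision about H0 determines whether the sample supports the claim.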
Types of Errors: There are two types of errors one can commit when testing a hypothesis: the α-error (Type I error) and the β-error (Type II error).
The α-error, or Type I error, is rejecting a true null hypothesis (failing to accept a true hypothesis); α is the probability of making this error. If our test result is in the critical region, we reject the null hypothesis. Because it is possible to obtain an extreme sample by chance alone, we will sometimes reject a true hypothesis. The α-error is easy to control by adjusting the level of significance: if we make the level of significance smaller, we reduce the chance of rejecting a true hypothesis.
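A small simulation can illustrate the α-error. The sketch below draws many samples from a population for which H0 is true and counts how often a two-tailed z-test rejects it anyway. The population values (μ = 50, σ = 8), the sample size, and the function name `type_i_error_rate` are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(mu, sigma, n, alpha, trials, seed=0):
    """Fraction of two-tailed z-tests that reject H0 when H0 is actually true."""
    rng = random.Random(seed)                       # fixed seed so runs are repeatable
    cv = NormalDist().inv_cdf(1 - alpha / 2)        # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]   # H0 is true for this population
        z = (sum(sample) / n - mu) / (sigma / sqrt(n))
        if abs(z) > cv:                             # test value landed in a critical area
            rejections += 1
    return rejections / trials

rate = type_i_error_rate(mu=50, sigma=8, n=25, alpha=0.05, trials=4000)
print(f"Rejected a true H0 in {rate:.1%} of samples")
```

The rejection rate should come out close to α, and lowering α in the call lowers it.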
The β-error, or Type II error, is accepting a false hypothesis, that is, failing to reject an H0 that is really false; β is the probability of making this error. It occurs when the null hypothesis is false and, by chance, the test result falls in the acceptance area, so our conclusion accepts the null hypothesis. The β-error is difficult to control because we do not know whether the null hypothesis is wrong. To reduce the chance of a β-error we increase the sample size, which reduces the standard error. The larger the sample size (n), the smaller the chance of committing a β-error.
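The effect of sample size on β can be sketched for a right-tailed z-test. The calculation needs a specific true mean, which in practice we do not know. The values below (H0: μ ≤ 50, true mean 53, σ = 8) are hypothetical, as is the function name `beta_error`.

```python
from math import sqrt
from statistics import NormalDist

def beta_error(mu0, mu_true, sigma, n, alpha):
    """P(fail to reject H0: mu <= mu0) when the true mean is mu_true > mu0."""
    std_normal = NormalDist()
    cv = std_normal.inv_cdf(1 - alpha)              # right-tail critical z
    shift = (mu_true - mu0) / (sigma / sqrt(n))     # true effect measured in standard errors
    return std_normal.cdf(cv - shift)               # area left of the CV under the true curve

for n in (16, 36, 64):
    print(n, round(beta_error(50, 53, 8, n, 0.05), 3))
```

As n grows, the standard error shrinks and β falls.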
DECISION TABLE
Decision      H0 is True                  H0 is False
Reject H0     Type I error (α-error)      Correct decision
Accept H0     Correct decision            Type II error (β-error)
Measuring Effect Size: One concern with hypothesis testing is that it does not evaluate the absolute size of the treatment effect. To address this we calculate Cohen’s d as follows: $d = \frac{M - \mu}{\sigma}$. Cohen’s d expresses the effect of the treatment in standard deviation units. If we are performing a t-test, use the sample standard deviation, s, in the denominator. A small effect is d < 0.2, a medium effect is 0.2 < d < 0.8, and a large effect is d > 0.8. If Cohen’s d = 0.5, we say that the treatment changed the scores by one-half of a standard deviation.
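A minimal sketch of the calculation follows. Two choices in it are assumptions rather than rules from the text: it compares the absolute value of d with the cutoffs so that negative effects are labeled by size, and it assigns the exact boundary values 0.2 and 0.8 to the larger category. The usage line reuses the numbers from the bottling example.

```python
def cohens_d(M, mu, sd):
    """Effect size in standard deviation units (sd is sigma for a z-test, s for a t-test)."""
    return (M - mu) / sd

def effect_label(d):
    """Label an effect using the cutoffs 0.2 and 0.8 given in the text."""
    size = abs(d)                   # assumption: judge the size of the effect, not its direction
    if size < 0.2:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

d = cohens_d(M=31.8, mu=32, sd=0.75)    # numbers from the bottling example
print(f"d = {d:.3f} ({effect_label(d)})")
```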
Explained variance r²: The explained variance for a t-test is determined by the formula $r^2 = \frac{t^2}{t^2 + df}$. The value of r² is the proportion of the variance accounted for by the treatment. A small effect is r² < 0.09, a medium effect is 0.09 < r² < 0.25, and a large effect is r² > 0.25. The value of r² cannot exceed 1.0. If r² = 0.785, we say that 78.5% of the variance is accounted for by the treatment.
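The formula can be checked with a one-line function. The t-score and degrees of freedom in the usage line are hypothetical values chosen to make the arithmetic easy to follow.

```python
def r_squared(t, df):
    """Proportion of variance accounted for by the treatment: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

# Hypothetical result: t = 3.0 with df = 16 gives 9 / (9 + 16)
print(r_squared(3.0, 16))
```

Because t² is never negative and df is positive, the result always lies between 0 and 1.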
J. Gilbert
11/1/2006