252onesl 9/20/07 (Open this document in 'Outline' view!)
B. HYPOTHESIS TESTS FOR ONE SAMPLE
1. The Meaning of Hypothesis Testing
A hypothesis is a statement about the characteristics of a population.
To be of any use to us it must be quantifiable and testable.
The hypothesis to be tested is usually called the Null Hypothesis, $H_0$.
A rival hypothesis is called the Alternative or Research Hypothesis,
$H_1$ or $H_A$. It is usually true that the null hypothesis will be a hypothesis
of "no difference" and the alternative hypothesis covers all other
possibilities.
To start out ask "What do I want to know about a population or
populations?" Can I state this in terms of population parameters?
Can I state this in terms of a testable hypothesis (null hypothesis)?
Does my null hypothesis say that these parameters or differences
between these parameters are insignificant (i.e. not distinct from
zero)?
Next ask "What am I assuming about the population or populations?"
Are the parameters that I am testing appropriate to the type of
population that I am assuming? What can I use to test my
hypothesis? Can I find a sample statistic or statistics to do the job?
How many samples do I need? Can I calculate the sample statistic
or statistics? What distribution does the test statistic have? Is this
in accord with the null hypothesis? What errors am I likely to make?
Usually there are three approaches to hypothesis testing
involving a statement about a parameter of a population:
(i) The Test Ratio Method, in which a ratio involving an
estimate of a parameter is tested against a well known
distribution like t;
(ii) The Critical Value Method, in which values of
estimates of a parameter are found which could lead to
rejection of $H_0$;
and
(iii) The Confidence Interval Method, in which a
confidence interval is constructed for the parameter and
compared to the value in the null hypothesis.
If I use a test ratio, what is the probability of getting values as
extreme or more extreme than I actually got, assuming that the null
hypothesis is true? (This probability is the p-value: the lower it is,
the harder the data are to reconcile with the null hypothesis. If the
p-value falls below the significance level,
I can say that I reject the null hypothesis.)
If I use a test statistic and my significance level is 5% or 1%,
did the value fall among the most likely 95% or 99% of values? Or
was it a very unlikely value?
If I use a confidence interval, did the parameter value in my
null hypothesis fall in the confidence interval?
Remember:
a. A null hypothesis is usually a statement
about a parameter of a population. It is
never a statement about a sample statistic.
A sample statistic is used to test the
hypothesis.
b. A null hypothesis usually contains an
equality; an alternate hypothesis does not
contain an equality.
c. A null hypothesis often says that a
parameter or a difference between parameters
is insignificant. If a result is significant we
reject the null hypothesis.
2. Steps for Testing a Hypothesis Applied to Testing
for a Population Mean
a. Outline
i. State the problem as two hypotheses.
ii. Quantify the hypotheses.
iii. Identify the statistic, ratio or interval to be used.
iv. Determine the sampling distribution of the statistic to
be used.
v. Select a level of significance.
vi. Find a value or values of the test ratio or statistic that
would lead to rejection of the null hypothesis.
vii. Compute a value of the statistic or ratio from a
random sample.
viii. By comparing the results of (vii) with the values found
in (vi), reject or do not reject (‘accept’) the null hypothesis.
b. Application to a Population Mean
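To test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$. Assume that we
have computed $\bar{x}$ and $s$ from our sample, and that we do not know
$\sigma$. Let $s_{\bar{x}} = \frac{s}{\sqrt{n}}$.
i. Test Ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}}$
ii. Critical Value: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$
iii. Confidence Interval: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$
Note: If $\sigma$, the population standard deviation, is
known, replace $s_{\bar{x}}$ and $t$ with $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ and $z$.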
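The three methods for a population mean can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_sample_t_methods` and the sample figures (a mean of 103, a standard deviation of 10, 25 observations, a hypothesized mean of 100) are hypothetical, and the two-sided 5% critical value 2.064 for 24 degrees of freedom is read from a t table and passed in by hand rather than computed.

```python
from math import sqrt

def one_sample_t_methods(xbar, s, n, mu0, t_crit):
    """Apply the test ratio, critical value and confidence interval
    methods to a two-sided test of H0: mu = mu0 (sigma unknown).

    t_crit is the table value t_{alpha/2} with n - 1 degrees of freedom.
    """
    s_xbar = s / sqrt(n)                      # standard error of the mean
    t_ratio = (xbar - mu0) / s_xbar           # (i) test ratio
    cv = (mu0 - t_crit * s_xbar, mu0 + t_crit * s_xbar)    # (ii) critical values for xbar
    ci = (xbar - t_crit * s_xbar, xbar + t_crit * s_xbar)  # (iii) interval for mu
    return {
        "t_ratio": t_ratio,
        "critical_values": cv,
        "confidence_interval": ci,
        "reject_by_ratio": abs(t_ratio) > t_crit,
        "reject_by_cv": not (cv[0] <= xbar <= cv[1]),
        "reject_by_ci": not (ci[0] <= mu0 <= ci[1]),
    }

# Hypothetical data, not taken from the course examples.
result = one_sample_t_methods(xbar=103.0, s=10.0, n=25, mu0=100.0, t_crit=2.064)
print(result)
```

All three methods use the same standard error and the same table value, so they should always lead to the same decision.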
c. One-sided Tests.
To test $H_0: \mu \ge \mu_0$ against $H_1: \mu < \mu_0$ or $H_0: \mu \le \mu_0$ against
$H_1: \mu > \mu_0$, if you use a critical value or a confidence
interval you must use a one-sided one. Replace $t_{\alpha/2}$ with $t_{\alpha}$.
One-sided tests take more thinking than two-sided tests, and the
most common error is in stating the null hypothesis. In a problem
statement, the question asked is often the alternative hypothesis,
not the null. Always ask yourself if the statement contains a strict
inequality. If it does, it cannot be a null hypothesis.
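Examples:
Question: Is the mean income less than 20000? $H_0: \mu \ge 20000$, $H_1: \mu < 20000$
Question: Is the mean income at least 20000? $H_0: \mu \ge 20000$, $H_1: \mu < 20000$
Question: Is the mean income more than 20000? $H_0: \mu \le 20000$, $H_1: \mu > 20000$
Question: Is the mean income at most 20000? $H_0: \mu \le 20000$, $H_1: \mu > 20000$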
3. The Use of p-value Instead of Significance Levels.
A p-value is a measure of the credibility of the null hypothesis and is
defined as the probability that a test statistic or ratio as extreme as
or more extreme than the observed statistic or ratio could occur,
assuming that the null hypothesis is true.
Note: If we have a p-value and want to do a conventional hypothesis
test, we can reject the null hypothesis if the p-value is below the
significance level. The p-value can thus be said to represent the
smallest level of significance at which the null hypothesis can be
rejected.
Other interpretations are:
a) (i) If we strongly doubt ,
(ii) If we somewhat doubt , and
(iii) If we cannot doubt ; or
b) (i) If results are very significant,
(ii) If results are significant,
(iii) If results are marginally significant
and
(iv) if results are not significant.
For an example using t see 252onesx0. For an example using z replace t with z in this paragraph and see 252doctor
4. Type One and Type Two Errors
a. Definitions
A Type one error is rejecting $H_0$ when $H_0$ is true.
A Type two error is not rejecting $H_0$ when $H_0$ is false.

                      $H_0$ is true    $H_0$ is false
Do not reject $H_0$   Not an error     Type II error
Reject $H_0$          Type I error     Not an error
b. Probabilities
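$\alpha = P(\text{Type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$, the significance level.
$\beta = P(\text{Type II error}) = P(\text{do not reject } H_0 \mid H_0 \text{ is false})$; $1 - \beta$ is the power of the test.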
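A small numerical sketch may make the two error probabilities concrete. It assumes a right-sided z test of a mean with a known population standard deviation; the function name `type_two_error_prob` and all numbers (hypothesized mean 100, true mean 103, standard deviation 10, 25 observations, 5% significance level) are hypothetical. The probability of a Type I error is fixed by the significance level, while the probability of a Type II error depends on where the true mean actually lies.

```python
from math import sqrt
from statistics import NormalDist

def type_two_error_prob(mu0, mu1, sigma, n, alpha):
    """Probability of not rejecting H0: mu <= mu0 when the true mean is mu1.

    Assumes a right-sided z test with known population standard deviation.
    """
    sigma_xbar = sigma / sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    xbar_cv = mu0 + z_alpha * sigma_xbar      # reject H0 when xbar exceeds this
    # Type II error: xbar falls at or below the critical value although mu = mu1.
    return NormalDist(mu1, sigma_xbar).cdf(xbar_cv)

# Hypothetical numbers chosen only for illustration.
beta = type_two_error_prob(mu0=100, mu1=103, sigma=10, n=25, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

With these figures the true mean is only 1.5 standard errors above the hypothesized mean, so the test misses the difference more often than it finds it.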
5. Hypotheses about a Proportion
a. Tests: (For an example see 252onesx1)
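To test $H_0: p = p_0$ against $H_1: p \ne p_0$, where $\bar{p} = \frac{x}{n}$ is the observed proportion and $q_0 = 1 - p_0$:
i. Test Ratio: $z = \frac{\bar{p} - p_0}{\sigma_{\bar{p}}}$, where $\sigma_{\bar{p}} = \sqrt{\frac{p_0 q_0}{n}}$
ii. Critical Value: $\bar{p}_{cv} = p_0 \pm z_{\alpha/2} \sigma_{\bar{p}}$
iii. Confidence Interval: $p = \bar{p} \pm z_{\alpha/2} s_{\bar{p}}$, where $s_{\bar{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}}$ and $\bar{q} = 1 - \bar{p}$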
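As a rough sketch of the test ratio for a proportion, the following computes z and a two-sided p-value from the Normal distribution. The function name `proportion_z_test` and the sample (60 successes in 100 trials, tested against a hypothesized proportion of .5) are hypothetical, and the sketch assumes the sample is large enough for the Normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0):
    """Two-sided z test of H0: p = p0 using the observed proportion x / n."""
    p_bar = x / n
    sigma_p = sqrt(p0 * (1 - p0) / n)   # standard error under the null hypothesis
    z = (p_bar - p0) / sigma_p
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical sample, used only to show the mechanics.
z, p_value = proportion_z_test(x=60, n=100, p0=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Note that the standard error in the test ratio uses the hypothesized proportion, not the observed one, because the ratio is computed on the assumption that the null hypothesis is true.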
(b. Continuity Correction.
The continuity correction acts to expand the 'accept' interval
by in each direction. It should be used if .
i. Test Ratio:
This is the same as testing
against
ii. Critical Value:
iii. Confidence Interval: )
6. The Sign Test
a. The Sign Test for a Median.
To test $H_0: \eta = \eta_0$ against $H_1: \eta \ne \eta_0$.
In any distribution outside of the normal distribution, it is
usually easier to use the p-value approach. For example, let
us assume that we are testing the hypotheses $H_0: \eta = 25$
and $H_1: \eta \ne 25$, where $\eta$ is, as before, the median. The
most important fact to know about testing for a median is that
numbers above and below a median are equally likely to
occur in a random sample.
A test of the median is a test of the proportion of points above
or below the alleged median.
So let us use $p$ as the proportion of the observations
in our population that are above 25. ($p$ could just as easily be
the proportion below 25.) If the null hypothesis is true, and we are working
with a continuous distribution, our hypotheses become
$H_0: p = .5$ and $H_1: p \ne .5$. Now let us assume that we
take a sample of $n = 20$ and that we find that $x$, the number
of points above 25, is 5. We expect that half of our points,
or 10, will be above 25, so 5 seems low. We thus use a
binomial table to find $P(x \le 5)$ for $n = 20$ and $p = .5$.
The table tells us that $P(x \le 5) = .0207$. Since this is a
two-sided test, we double this probability to .0414 and use
this as our p-value. If our confidence level is 95%, our
significance level must be 5%, and, since the p-value is below
5%, we reject the null hypothesis.
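The binomial table lookup in this example can be checked with a short calculation. This is a minimal sketch: the helper name `binom_cdf` is hypothetical, and the sample size of 20 is inferred from the statement that half of the points would be 10.

```python
from math import comb

def binom_cdf(x, n, p=0.5):
    """P(X <= x) for a binomial distribution with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, x = 20, 5                     # 5 of 20 sample points lie above the alleged median
p_value = 2 * binom_cdf(x, n)    # doubled because the test is two-sided
print(round(p_value, 4))
```

The result agrees with the doubled table probability of .0414 quoted in the text.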
(But if we are to repeat tests, it may be wise to
define acceptance and rejection regions by defining two
critical values, $x_{cv1}$ and $x_{cv2}$, and saying that if $x \le x_{cv1}$ or $x \ge x_{cv2}$, we
will reject the null hypothesis. Again assume that the
significance level is $\alpha = .05$.
We can use the p-value approach by saying that, if we
would reject the null hypothesis using the p-value approach
for some value of $x$, that value is in our rejection region.
Starting from the bottom, try $x = 0$. From the table for
$n = 20$ and $p = .5$, $2P(x \le 0) = 2(.0000) = .0000$. Since this p-value
is below $\alpha$, we would reject $H_0$ if $x$ were 0. We come to a
similar conclusion if $x$ takes values of 1, 2, 3 or 4. If $x = 5$,
we have already seen that $2P(x \le 5) = .0414$, and that we
would still reject the null hypothesis. But if we try $x = 6$,
we find $2P(x \le 6) = 2(.0577) = .1154$, which is above $\alpha$, so we
‘accept’ $H_0$ if $x$ is 6 or larger.
But, since this is a two-sided test, it is also possible that $x$ is
too large. For example if $x$ is 16,
$P(x \ge 16) = 1 - P(x \le 15) = 1 - .9941 = .0059$, so the p-value is $2(.0059) = .0118$.
Since this is below $\alpha$, we would reject the null hypothesis if
$x$ were 16 or larger. So try $x = 15$. $2P(x \ge 15) = 2(1 - .9793) = .0414$. This is still too
low for acceptance, so try 14. $2P(x \ge 14) = 2(1 - .9423) = .1154$. Since this is above $\alpha$, we would
not reject the null hypothesis if $x$ were 14. We can thus say
that we do not reject the null hypothesis if $x$ is between 6 and
14, or that our critical values for $x$ are 5 and 15. If we now
look back at the cumulative binomial table, we see that we
rejected the null hypothesis for probabilities below .025
and above .975.) {bin}
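The search for critical values described above can also be done by brute force: compute the two-sided p-value for every possible count and keep the counts that would be rejected. The sketch below assumes a sample of 20 and a 5% significance level, as in the example; the function names are hypothetical.

```python
from math import comb

def two_sided_p_value(x, n):
    """Doubled tail probability for a sign test with p = .5."""
    lower = sum(comb(n, k) for k in range(0, x + 1)) / 2**n   # P(X <= x)
    upper = sum(comb(n, k) for k in range(x, n + 1)) / 2**n   # P(X >= x)
    return min(1.0, 2 * min(lower, upper))

def sign_test_critical_values(n, alpha):
    """Largest low count and smallest high count that lead to rejection."""
    rejected = [x for x in range(n + 1) if two_sided_p_value(x, n) < alpha]
    lower_cv = max(x for x in rejected if x < n / 2)
    upper_cv = min(x for x in rejected if x > n / 2)
    return lower_cv, upper_cv

critical_values = sign_test_critical_values(n=20, alpha=0.05)
print(critical_values)
```

This reproduces the critical values of 5 and 15 found from the table, with counts from 6 through 14 forming the 'accept' region.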
Let's try a one-sided problem. Suppose that our null
hypothesis is that median income in a region is at least $20000
and that we take a sample with the results shown below.
Let $\alpha = .05$.

Index   Income
1       10132
2       11252
3       13475
4       14260
5       16871
6       19357
7       19438
8       23010
9       30278
10      35932

Our hypotheses are $H_0: \eta \ge 20000$ and $H_1: \eta < 20000$. Let $p$
be the proportion of numbers in the population below
20000. If the median is exactly 20000, $p$ will be exactly
0.5. But if the median is above 20000, $p$ will be below .5.
We can replace our original hypotheses with $H_0: p \le .5$ and
$H_1: p > .5$. We see that $x$, the quantity of numbers in the
sample below 20000, is 7. Our expected number of items
below 20000 is $np = 10(.5) = 5$, so 7 is high and our p-value
will be $P(x \ge 7) = 1 - P(x \le 6) = 1 - .8281 = .1719$.
Since the p-value is above the significance level,
we accept the null hypothesis.
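The one-sided income example can be reproduced directly from the ten sample incomes. This is a minimal sketch; the variable names are hypothetical, and it assumes, as the text does, that the count of incomes below 20000 is tested against a binomial distribution with a success probability of one half.

```python
from math import comb

# The ten sample incomes listed in the example.
incomes = [10132, 11252, 13475, 14260, 16871, 19357, 19438, 23010, 30278, 35932]

n = len(incomes)
x = sum(1 for income in incomes if income < 20000)   # observations below the alleged median

# Right-tail probability P(X >= x) for a binomial with p = .5.
p_value = sum(comb(n, k) for k in range(x, n + 1)) / 2**n
print(x, round(p_value, 4))
```

Seven of the ten incomes fall below 20000, and a count that high or higher occurs about 17% of the time when the median really is 20000, so the null hypothesis is not rejected at conventional significance levels.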
(If we wish to set up accept and reject zones for this
one-sided test, we need to try higher values of $x$. A value of
$x = 8$ is still too small; it gives a probability of .0547, which
is above $\alpha$, so try $x = 9$. According to the binomial table for
$n = 10$ and $p = .5$, $P(x \ge 9) = 1 - P(x \le 8) = 1 - .9893 = .0107$,
which is below $\alpha$. So 9 is our critical value, and we will
reject the hypothesis if $x \ge 9$.)
To clarify the correspondence between hypotheses about a
median and hypotheses about a proportion, let us assume that
$p$ is the proportion of the data above 20000. If 20000 is the
median, then, by definition of the median, $p$ is one half. But
let us assume that the median is above 20000, say 21000. Then
one half of the data must be above 21000, so that more than one
half of the data must be above 20000, which means less than one
half of the data is below 20000. Since a hypothesis about a
median is a hypothesis about a proportion, $\eta > 20000$
corresponds to $p > .5$. The table below shows these
correspondences depending on the definition of $p$.

Hypotheses about        Hypotheses about a proportion
a median                If $p$ is the          If $p$ is the
                        proportion above       proportion below
$H_0: \eta = \eta_0$    $H_0: p = .5$          $H_0: p = .5$
$H_1: \eta \ne \eta_0$  $H_1: p \ne .5$        $H_1: p \ne .5$

$H_0: \eta \ge \eta_0$  $H_0: p \ge .5$        $H_0: p \le .5$
$H_1: \eta < \eta_0$    $H_1: p < .5$          $H_1: p > .5$

$H_0: \eta \le \eta_0$  $H_0: p \le .5$        $H_0: p \ge .5$
$H_1: \eta > \eta_0$    $H_1: p > .5$          $H_1: p < .5$
b. The Sign Test more Generally.
This technique can be used in other ways. For
instance let us say that we wish to check the effectiveness of a
product brochure. A sample of 17 clients is asked about their
impression of a product. Then they read the brochure and once
again are asked their impression. We write a (+) if their
impression has improved and a (-) if it is worse. A zero
indicates no change. Our results are as follows:
Client  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Sign    +  +  +  +  0  -  +  0  -  +  +  +  -  +  0  +  +
Since we are hoping for a positive effect, count the zeros as
minuses. We will use the brochure if we believe that the
majority of the population will respond favorably. Let $p$ be the
proportion of plusses in the population. Our hypotheses are
$H_0: p \le .5$ and $H_1: p > .5$. There are 11 plusses so that we must
find $P(x \ge 11)$ when $n = 17$ and $p = .5$. A large binomial table
says that this value is .166, so that we must accept the null
hypothesis and not use the brochure.
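For readers without a large binomial table, the brochure p-value can be computed directly. The sketch below reads the blank cells in the sign row as minuses, which is an assumption based on the description of the coding, and counts only the plusses as successes, as the text instructs.

```python
from math import comb

# Signs for the 17 clients; blank cells in the sign row are read as "-" (an assumption).
signs = ["+", "+", "+", "+", "0", "-", "+", "0", "-",
         "+", "+", "+", "-", "+", "0", "+", "+"]

n = len(signs)
x = signs.count("+")          # zeros and minuses both count against the brochure

# Right-tail probability P(X >= x) for a binomial with p = .5.
p_value = sum(comb(n, k) for k in range(x, n + 1)) / 2**n
print(n, x, round(p_value, 3))
```

The computed value matches the table figure of .166, which is well above the usual significance levels.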
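In the absence of a binomial table we must use the
Normal approximation to the binomial distribution. If $\bar{p} = \frac{x}{n}$
is our observed proportion, we use $z = \frac{\bar{p} - p_0}{\sqrt{p_0 q_0 / n}}$. But for the
sign test, $p_0 = q_0 = .5$. So $z = \frac{x/n - .5}{\sqrt{.25/n}} = \frac{2x - n}{\sqrt{n}}$.
(For relatively small values of $n$, a continuity correction is
advisable, so try $z = \frac{2x \pm 1 - n}{\sqrt{n}}$, where the + applies if
$x < \frac{n}{2}$, and the $-$ applies if $x > \frac{n}{2}$. In the problem above $z = \frac{2(11) - 1 - 17}{\sqrt{17}} = 0.97$,
where $P(z \ge 0.97) = .5 - .3340 = .1660$.
Since this is a p-value, if $\alpha$ takes a
typical value like .05 or .10, we can say p-value $> \alpha$ and not
reject the null hypothesis.)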
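The Normal approximation with a continuity correction can be sketched as follows. The function name `sign_test_z` is hypothetical; the formula used is the standard one for a sign test with a null proportion of one half, $z = \frac{2x \pm 1 - n}{\sqrt{n}}$, applied here to 11 plusses among 17 clients.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(x, n, continuity=True):
    """z ratio for a sign test (null proportion .5), optionally continuity corrected."""
    correction = 0
    if continuity:
        if x < n / 2:
            correction = 1     # move a low count half a unit toward n / 2
        elif x > n / 2:
            correction = -1    # move a high count half a unit toward n / 2
    return (2 * x + correction - n) / sqrt(n)

z = sign_test_z(x=11, n=17)
p_value = 1 - NormalDist().cdf(z)     # right-sided test
print(round(z, 2), round(p_value, 3))
```

The approximate p-value is very close to the exact binomial value of .166, which suggests the correction is doing its job even with only 17 observations.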
7. Hypothesis Test for Means - Rare Events
In statistics, ‘rare events’ is a code word for the Poisson distribution. The
easiest way to approach Poisson results is to use a p-value. For example
if you wish to test against and you have a result
that says , the p-value is .
(For an example, see 252oneslx2) {poiss}
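A Poisson p-value is easy to compute without a table. The following is a sketch only: the hypothesized mean of 3 and the observed count of 7 are hypothetical numbers, not the ones from the course example, and the test is assumed to be right-sided, so the p-value is the probability of a count at least as large as the one observed.

```python
from math import exp, factorial

def poisson_cdf(x, mean):
    """P(X <= x) for a Poisson distribution with the given mean."""
    return sum(exp(-mean) * mean**k / factorial(k) for k in range(x + 1))

def poisson_upper_p_value(x, mean0):
    """P(X >= x) when the null hypothesis mean is mean0 (right-sided test)."""
    return 1 - poisson_cdf(x - 1, mean0)

# Hypothetical values chosen for illustration.
p_value = poisson_upper_p_value(x=7, mean0=3.0)
print(round(p_value, 4))
```

For a left-sided test the p-value would instead be `poisson_cdf(x, mean0)`.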
8. Hypothesis Tests for a Variance.
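To test $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 \ne \sigma_0^2$, with $DF = n - 1$ (For an example, see
252oneslx2) {chiSq}
i. Test Ratio: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ or
$z = \sqrt{2\chi^2} - \sqrt{2(DF) - 1}$ for large samples
ii. Critical Value: $s^2_{cv} = \frac{\chi^2 \sigma_0^2}{n-1}$, using $\chi^2_{1-\alpha/2}$ for the lower and $\chi^2_{\alpha/2}$ for the upper critical value
(Don't try this for large samples.)
iii. Confidence Interval: $\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$ or
$\frac{s\sqrt{2(DF)}}{z_{\alpha/2} + \sqrt{2(DF)}} \le \sigma \le \frac{s\sqrt{2(DF)}}{-z_{\alpha/2} + \sqrt{2(DF)}}$ for large samples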
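The test ratio for a variance can be sketched as below. The chi-squared ratio is the usual $(n-1)s^2/\sigma_0^2$; the large-sample z is the common approximation $\sqrt{2\chi^2} - \sqrt{2(DF) - 1}$, which is assumed here rather than confirmed as the formula intended in these notes. The function name and the sample figures (a sample variance of 130 from 51 observations, tested against a hypothesized variance of 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def variance_test_ratio(s2, n, sigma0_sq):
    """Chi-squared test ratio for H0: sigma^2 = sigma0_sq, plus a large-sample z."""
    df = n - 1
    chi_sq = df * s2 / sigma0_sq
    # Large-sample Normal approximation to the chi-squared ratio (an assumption).
    z = sqrt(2 * chi_sq) - sqrt(2 * df - 1)
    return chi_sq, z

# Hypothetical sample values.
chi_sq, z = variance_test_ratio(s2=130.0, n=51, sigma0_sq=100.0)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided, from the approximation
print(round(chi_sq, 2), round(z, 3), round(p_value, 3))
```

For small samples the chi-squared ratio should be compared with chi-squared table values for $n - 1$ degrees of freedom instead of relying on the Normal approximation.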
Appendix: One-sided and Two Sided Tests.
Assume the following:
so that .
A 2-sided Test:
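(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = -0.678$. We test this against
two values of t, $t_{\alpha/2}^{(n-1)}$ and $-t_{\alpha/2}^{(n-1)}$.
We reject $H_0$ if $t$ is above $t_{\alpha/2}^{(n-1)}$ or below $-t_{\alpha/2}^{(n-1)}$. In this case $t$ falls between them and we
do not reject $H_0$. {ttable}
If we use p-value: $p\text{-value} = 2P(t \le -0.678)$. On the t-table 0.678 is
between $t_{.30}^{(n-1)}$ and $t_{.25}^{(n-1)}$. So $P(t \le -0.678)$ is between
.25 and .30 and the p-value is between .50 and .60. Since the p-value is
above $\alpha$ we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: $\bar{x}_{cv} = \mu_0 \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$.
We reject $H_0$ if $\bar{x}$ is above the upper critical value
or below the lower critical value.
In this case $\bar{x}$ lies between the two critical values and we do not reject $H_0$.
(iii) Confidence interval: $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)} s_{\bar{x}}$. This interval is 11.28 to 12.72. Since
$\mu_0$ is between these two limits, we do not reject $H_0$.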
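A Left-sided Test:
(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = -0.678$. (We use the same data
as the two-sided problem.) We test this against one value of t,
$-t_{\alpha}^{(n-1)}$. We reject $H_0$ if $t$ is below $-t_{\alpha}^{(n-1)}$. In this
case $t$ is above $-t_{\alpha}^{(n-1)}$ and we do not reject $H_0$.
If we use p-value: $p\text{-value} = P(t \le -0.678)$. On the t-table 0.678 is between
$t_{.30}^{(n-1)}$ and $t_{.25}^{(n-1)}$. So $P(t \le -0.678)$ is between .25 and .30
and the p-value is between .25 and .30. Since the p-value is above $\alpha$
we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: We reject $H_0$ if $\bar{x}$ is below $\bar{x}_{cv} = \mu_0 - t_{\alpha}^{(n-1)} s_{\bar{x}}$. In
this case $\bar{x}$ is above $\bar{x}_{cv}$ and we do not reject $H_0$.
(iii) Confidence interval. Since the alternate hypothesis is $H_1: \mu < \mu_0$,
use $\mu \le \bar{x} + t_{\alpha}^{(n-1)} s_{\bar{x}}$. Since this interval
does not contradict $H_0: \mu \ge \mu_0$ we do not reject $H_0$.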
A Right-sided Test:
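(i) Test ratio: $t = \frac{\bar{x} - \mu_0}{s_{\bar{x}}} = -0.678$. We test this against
one value of t, $t_{\alpha}^{(n-1)}$. We reject $H_0$ if $t$ is above $t_{\alpha}^{(n-1)}$.
In this case $t$ is below $t_{\alpha}^{(n-1)}$ and we do not reject $H_0$.
If we use p-value: $p\text{-value} = P(t \ge -0.678)$. On the t-table 0.678 is between
$t_{.30}^{(n-1)}$ and $t_{.25}^{(n-1)}$. So $P(t \le -0.678)$ is between .25 and .30,
$P(t \ge -0.678)$ is between .70 and .75 and the p-value is between .70 and
.75. Since the p-value is above $\alpha$ we do not reject $H_0$.
(ii) Critical value for $\bar{x}$: We reject $H_0$ if $\bar{x}$ is above $\bar{x}_{cv} = \mu_0 + t_{\alpha}^{(n-1)} s_{\bar{x}}$.
In this case $\bar{x}$ is below $\bar{x}_{cv}$ and we do not reject $H_0$.
(iii) Confidence interval: Since the alternate hypothesis is
$H_1: \mu > \mu_0$, use $\mu \ge \bar{x} - t_{\alpha}^{(n-1)} s_{\bar{x}}$. Since this interval does not contradict
$H_0: \mu \le \mu_0$, we do not reject $H_0$.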
More on p-value
Let’s say that you have gotten one of the following results for a test of a
mean with
a) $t = 1.000$
b) $t = -1.000$
c) $z = 1.000$
d) $z = -1.000$
The values of $z$ could also come from tests of proportions or
variances.
A 2-sided Test
A p-value is defined as the probability that a test statistic or ratio
as extreme as or more extreme than the observed statistic or ratio
could occur, assuming that the null hypothesis is true.
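a) $t = 1.000$. You want $p\text{-value} = 2P(t \ge 1.000)$.
To find $P(t \ge 1.000)$, look at the t table. The degrees of freedom are
$n - 1$.
Look at the $n - 1$ line. You will find that 1.000 is between
$t_{.20}^{(n-1)}$ and $t_{.15}^{(n-1)}$. This means that $P(t \ge t_{.20}^{(n-1)}) = .20$
and $P(t \ge t_{.15}^{(n-1)}) = .15$. Since 1.000 is between these values we can say $.15 < P(t \ge 1.000) < .20$. So $.30 < 2P(t \ge 1.000) < .40$,
which means $.30 < p\text{-value} < .40$.
b) $t = -1.000$. You want
$p\text{-value} = 2P(t \le -1.000)$. You found in a) that 1.000 is between
$t_{.20}^{(n-1)}$ and $t_{.15}^{(n-1)}$. This means that $P(t \ge t_{.20}^{(n-1)}) = .20$
and $P(t \ge t_{.15}^{(n-1)}) = .15$, but, since the t
distribution is symmetrical, we can also say $P(t \le -t_{.20}^{(n-1)}) = .20$
and $P(t \le -t_{.15}^{(n-1)}) = .15$. Since $-1.000$ is between these values we
can say $.15 < P(t \le -1.000) < .20$. So $.30 < 2P(t \le -1.000) < .40$,
which means $.30 < p\text{-value} < .40$.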
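c) $z = 1.000$. You want $p\text{-value} = 2P(z \ge 1.000)$. Make a diagram for $z$ with a center at zero and shade the area above
1.000. Use the Normal table. $P(0 \le z \le 1.00) = .3413$, so $p\text{-value} = 2(.5 - .3413) = 2(.1587) = .3174$.
d) $z = -1.000$. You want
$p\text{-value} = 2P(z \le -1.000)$. Make a diagram for $z$ with a center at zero
and shade the area below -1.000. $P(-1.00 \le z \le 0) = .3413$,
so $p\text{-value} = 2(.5 - .3413) = 2(.1587) = .3174$.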
A Left-sided Test
A p-value is defined as the probability that a test statistic or ratio
as low as or lower than the observed statistic or ratio could occur,
assuming that the null hypothesis is true.
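a) $t = 1.000$. You want $p\text{-value} = P(t \le 1.000)$. You found in 2-sided Test a)
that 1.000 is between $t_{.20}^{(n-1)}$ and $t_{.15}^{(n-1)}$. This means
that $P(t \ge t_{.20}^{(n-1)}) = .20$ and $P(t \ge t_{.15}^{(n-1)}) = .15$. Since 1.000 is
between these values we can say $.15 < P(t \ge 1.000) < .20$. But you
want $P(t \le 1.000)$, so subtract these probabilities from 1.
So $.80 < p\text{-value} < .85$.
b) $t = -1.000$. You want $p\text{-value} = P(t \le -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t distribution is symmetrical,
we can also say $.15 < P(t \le -1.000) < .20$ or $.15 < p\text{-value} < .20$.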
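c) $z = 1.000$. You want $p\text{-value} = P(z \le 1.000)$. Make a diagram for $z$ with
a center at zero and shade the area below 1.000. $P(0 \le z \le 1.00) = .3413$,
so $p\text{-value} = .5 + .3413 = .8413$.
d) $z = -1.000$. You want $p\text{-value} = P(z \le -1.000)$. Make a diagram for $z$
with a center at zero and shade the area below -1.000. $P(-1.00 \le z \le 0) = .3413$,
so $p\text{-value} = .5 - .3413 = .1587$.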
A Right-sided Test
A p-value is defined as the probability that a test statistic or ratio
as high as or higher than the observed statistic or ratio could occur,
assuming that the null hypothesis is true.
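a) $t = 1.000$. You want $p\text{-value} = P(t \ge 1.000)$. You found in 2-sided Test a)
that 1.000 is between $t_{.20}^{(n-1)}$ and $t_{.15}^{(n-1)}$. This means
that $P(t \ge t_{.20}^{(n-1)}) = .20$ and $P(t \ge t_{.15}^{(n-1)}) = .15$. Since 1.000 is
between these values we can say $.15 < P(t \ge 1.000) < .20$.
So $.15 < p\text{-value} < .20$.
b) $t = -1.000$. You want $p\text{-value} = P(t \ge -1.000)$. You found in a) that $.15 < P(t \ge 1.000) < .20$. Since the t distribution is symmetrical,
we can also say $.15 < P(t \le -1.000) < .20$. But you want
$P(t \ge -1.000)$, so subtract these probabilities from 1.
So $.80 < p\text{-value} < .85$.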
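c) $z = 1.000$. You want $p\text{-value} = P(z \ge 1.000)$. Make a diagram for $z$ with
a center at zero and shade the area above 1.000. $P(0 \le z \le 1.00) = .3413$,
so $p\text{-value} = .5 - .3413 = .1587$.
d) $z = -1.000$. You want $p\text{-value} = P(z \ge -1.000)$. Make a diagram for $z$
with a center at zero and shade the area above -1.000. $P(-1.00 \le z \le 0) = .3413$,
so $p\text{-value} = .5 + .3413 = .8413$.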
Note that, since every one of these p-values is above 1%, 5% and
10%, you would not reject the null hypothesis if you used any of
these significance levels.
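The Normal-table p-values for z ratios of 1.000 and -1.000 can be checked with a short sketch. The function name `z_p_value` and the tail labels are hypothetical. Exact values may differ from table-based ones in the fourth decimal place because the table entries are rounded.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value for an observed z ratio under a left, right or two-sided test."""
    cdf = NormalDist().cdf(z)            # P(Z <= z)
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    if tail == "two-sided":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'left', 'right' or 'two-sided'")

results = {
    (z, tail): z_p_value(z, tail)
    for z in (1.0, -1.0)
    for tail in ("two-sided", "left", "right")
}
for key, value in results.items():
    print(key, round(value, 4))
```

Every one of these p-values is above .10, which is why none of the usual significance levels leads to rejection.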
© 2005 R. E. Bove