Stat 280 Lab 9: Significance Tests and Confidence Intervals

Objectives: This lab introduces hypothesis testing, with a focus on significance tests and confidence intervals.

Directions: Follow the instructions below, answering all questions. Your answers should be in the form of a brief report (MS Word), to be handed in to the instructor before you leave. Please include plots and descriptive statistics in your report.

Confidence Interval: A level C confidence interval for a parameter is an interval computed from sample data by a method that has probability C of producing an interval containing the true value of the parameter. There are two important features of confidence intervals:

i) It is an interval of the form (a, b), where a and b are numbers computed from the data.

ii) It has a property called the confidence level, which gives the probability that the interval covers the parameter.

We can choose the confidence level; it is most often 90% or higher, because we usually want to be quite sure of our conclusions.

Confidence Interval for a Population Mean: Choose a simple random sample (SRS) of size n from a population having unknown mean μ and known standard deviation σ. A level C confidence interval for μ is

xbar ± z*σ/sqrt(n),

where xbar is the sample mean. This confidence interval is of the form

xbar ± margin of error.

The confidence interval for a population mean will have a specified margin of error, m, when the sample size is

n= (z*/ m)^2.

Example: You want to estimate the mean SAT-Math score for the more than 250,000 high school seniors in California. You know better than to trust data from the students who choose to take the SAT. Only about 45% of California students take the SAT. These self-selected students are planning to attend college and are not representative of all California seniors. Suppose you give the test to a simple random sample of 500 California high school seniors. The mean score for your sample is xbar=461. What can you say about the mean score μ in the population of all 250,000 seniors?

The sample mean, xbar, is the natural estimator of the unknown population mean μ. Suppose we know that if the entire population of SAT scores has mean μ and standard deviation σ, then in repeated samples of size 500 the sample mean xbar follows the Normal(μ, σ/sqrt(500)) distribution. The standard deviation σ of SAT-Math scores in our California population is σ = 100, so the standard deviation of xbar is 100/sqrt(500) ≈ 4.5. The 95% confidence interval is then

xbar ± z*σ/sqrt(n) = 461 ± 1.96*4.5 = 461 ± 8.82 ≈ 461 ± 9.
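As a quick check of this arithmetic, the same interval can be computed in Python (scipy is not part of the lab, just a convenient way to verify the numbers from the example above).

    import math
    from scipy.stats import norm

    xbar, sigma, n = 461, 100, 500        # values from the example above
    z_star = norm.ppf(0.975)              # about 1.96 for 95% confidence
    moe = z_star * sigma / math.sqrt(n)   # margin of error, about 8.8
    print(xbar - moe, xbar + moe)         # roughly (452, 470)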

Consider:

  • The 68-95-99.7 rule says that the probability is about 0.95 that xbar will be within 9 points (two standard deviations of xbar) of the population mean μ.
  • To say that xbar lies within 9 points of μ is the same as saying μ is within 9 points of xbar.
  • So 95% of all samples will capture the true μ in the interval from xbar – 9 to xbar + 9.

A 95% confidence interval for μ means that if we take many simple random samples from the same population and construct a confidence interval from each sample, then in the long run 95% of those intervals will cover the true mean μ.
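This long-run coverage claim can be illustrated with a short simulation. The sketch below (Python with numpy, not part of the lab itself) repeatedly draws SRSs from a normal population with the example's σ = 100 and counts how often the 95% interval captures the true mean; the value mu = 500 is an arbitrary illustrative choice.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, z_star = 500, 100, 500, 1.96   # mu = 500 is an arbitrary illustrative value
    covered = 0
    for _ in range(1000):                        # 1000 simple random samples of size n
        xbar = rng.normal(mu, sigma, n).mean()
        moe = z_star * sigma / np.sqrt(n)
        if xbar - moe <= mu <= xbar + moe:       # does this interval cover the true mean?
            covered += 1
    print(covered / 1000)                        # close to 0.95 in the long run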

In our example, we say that we are 95% confident that the unknown mean score μ for all California seniors lies in the interval (452, 470). To understand the grounds for this confidence, realize that there are only two possibilities:

  1. The interval between 452 and 470 contains the true μ.
  2. Our SRS was one of the few samples for which xbar is not within 9 points of the true μ. Only 5% of all samples give such inaccurate results.

Tests of Significance: A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about the parameters in a population or model. The statement being tested in a test of significance is called the null hypothesis. The test of significance is designed to assess the strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of “no effect” or “no difference”.

The probability, computed assuming that the null hypothesis, Ho, is true, that the test statistic would take a value as extreme as or more extreme than the value actually observed is called the p-value of the test. The smaller the p-value, the stronger the evidence against Ho provided by the data. The cutoff value used for the decision is called the significance level, denoted by α. If the p-value is as small as or smaller than α, we say that the data are statistically significant at level α.
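For a z statistic, the p-value is just a normal tail probability. A minimal Python illustration follows; the value z = 2.1 is made up for the sake of the example.

    from scipy.stats import norm

    z = 2.1                      # illustrative observed test statistic
    p = 2 * norm.sf(abs(z))      # two-sided p-value: P(|Z| >= 2.1) under Ho
    print(p)                     # about 0.036, significant at alpha = 0.05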

The four steps for conducting a test of significance are as follows:

  1. State the null hypothesis, Ho, and the alternative hypothesis, Ha. The test is designed to assess the strength of the evidence against Ho; Ha is the statement that we will accept if the evidence enables us to reject Ho.
  2. Calculate the value of the test statistic on which the test will be based. This statistic usually measures how far the data are from Ho.
  3. Find the p-value for the observed data. This is the probability, calculated assuming that Ho is true, that the test statistic would weigh against Ho at least as strongly as it does for these data.
  4. State a conclusion. One way to do this is to choose a significance level α. If the p-value is less than or equal to α, you reject Ho in favor of the alternative hypothesis; if it is greater than α, you conclude that the data do not provide sufficient evidence to reject the null hypothesis. Another way to arrive at a conclusion is to use your 100(1-α)% confidence interval. If the hypothesized value of the parameter in the null hypothesis falls outside the interval, then you reject Ho.
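The sketch below walks through these four steps for a one-sample z test in Python; the hypothesized mean 64 and σ = 3 echo exercise 2 below, while the eight data values are invented purely for illustration.

    import math
    from scipy.stats import norm

    # Step 1: Ho: mu = 64 versus Ha: mu != 64 (two-sided)
    mu0, sigma = 64, 3
    data = [63.1, 66.4, 61.8, 65.0, 64.7, 62.9, 67.2, 63.5]   # made-up sample

    # Step 2: test statistic z = (xbar - mu0) / (sigma / sqrt(n))
    n = len(data)
    xbar = sum(data) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))

    # Step 3: two-sided p-value, computed assuming Ho is true
    p = 2 * norm.sf(abs(z))

    # Step 4: compare the p-value to a chosen significance level alpha
    alpha = 0.10
    print(z, p, "reject Ho" if p <= alpha else "fail to reject Ho")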

1. Answer the following questions for the data set pulse.mtw (pulse.xls). These are pulse rates for a group of students in a statistics class.

(a) What is the estimate of the population mean?

(b) Construct a 95% confidence interval for the true population mean.

(c) What sample size do we need in order to reduce the above interval width (2 * margin of error) by half?

(d) Repeat parts (a) and (b) above for a 90% confidence interval.

(e) What can you say about the effect of changing the level of confidence and the sample size on the width of the confidence interval?

2. Imagine choosing n=16 women at random from a large population and measuring their heights. Assume the heights of the women in this population are normally distributed with mean 64 inches and standard deviation 3 inches. Suppose you then test the null hypothesis that the population mean is 64 against the alternative that it is different from 64, using level of significance 0.10. Simulate the results of doing this test 20 times by choosing Calc->Random Data->Normal and generating 16 rows of data in C1-C20 with 64 as the mean and 3 as the standard deviation.

(i) Do hypothesis tests using Stat->Basic Statistics->1-Sample Z with the Test mean option, and specify a mean of 64 and a sigma of 3.

(ii) Construct 90% C.I.s and 95% C.I.s (hint: the procedure is the same as in (i); just change the Test mean option to the confidence interval option and specify the confidence level, 90.0 or 95.0).
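For anyone working outside Minitab, here is a rough Python sketch of the same simulation (numpy and scipy are assumptions here; the lab itself uses the Minitab menus above). It generates 20 samples of 16 observations from Normal(64, 3) and runs a two-sided 1-sample z test of mu = 64 on each.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    mu0, sigma, n, alpha = 64, 3, 16, 0.10
    p_values = []
    for _ in range(20):                                 # 20 simulated samples (like C1-C20)
        x = rng.normal(mu0, sigma, n)
        z = (x.mean() - mu0) / (sigma / np.sqrt(n))     # 1-sample z statistic
        p_values.append(2 * norm.sf(abs(z)))            # two-sided p-value
    rejections = sum(p <= alpha for p in p_values)
    print("tests rejecting Ho at alpha = 0.10:", rejections)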

(a) What are the null hypothesis and the alternative (research) hypothesis?

(b) In how many tests did you fail to reject the null hypothesis? That is, how many times did you make the "correct decision"?

(c) How many times did you make an "incorrect decision" (that is, reject the null hypothesis)? On average, how many times out of 20 would you expect to make the wrong decision?

(d) How many of the 90% confidence intervals cover the true mean value 64?

(e) Is the frequency with which you make the correct decision the same as the frequency with which the 90% C.I.s cover 64?

(f) Plot a histogram of the p-values and comment on its shape.

(g) Suppose you used alpha = 0.05 instead of alpha = 0.10. Does this change any of your decisions to reject or not? Should it in some cases?