ST 361 Estimation --- Interval Estimation for $\mu$ (§7.2, 7.4)

Topics:

I. Interval estimation: confidence interval

II. (Two-sided) Confidence interval for estimating population mean

(a) When the population SD is known: use Z distribution (§7.2)

(b) When the population SD is NOT known: use t distribution (§7.4)

III. (Two-sided) confidence interval for estimating population proportion (§7.3)

IV. Two-sided confidence interval for estimating population mean difference (§7.5)

(a) when the population SD’s are known

(b) when the population SD’s are NOT known

------

I. Interval Estimate----Confidence Interval (CI)

v  What is it?

A confidence interval is an interval calculated from a sample so that it contains the true value of a population parameter (such as the population mean $\mu$) with a specified probability (called the confidence level).

v  Why?

·  Because of sampling variability, the point estimate is almost never exactly equal to the correct value for the parameter

·  Point estimates don’t tell us how close they are to the actual parameter

⇒  So we use an interval called a confidence interval to report the likely range for the parameter of interest

Confidence interval for the population mean

Consider a sample $X_1, \dots, X_n$ ($n \ge 30$) that is randomly selected from a population with mean $\mu$ and SD $\sigma$. To estimate $\mu$ we use the sample mean $\bar{X}$. Our goal here is to find the likely range of $\mu$.

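·  Recall that $\bar{X}$ is approximately normal with mean $\mu$ and SD $\sigma/\sqrt{n}$, no matter what distribution X has (by the CLT).

·  Based on this normal distribution of $\bar{X}$, we can show that the middle 95% of the $\bar{X}$ values fall within $\mu \pm 1.96\,\sigma/\sqrt{n}$, or equivalently, $[\mu - 1.96\,\sigma/\sqrt{n},\ \mu + 1.96\,\sigma/\sqrt{n}]$. Notice that this is an interval centered at the parameter $\mu$.

·  However, in reality we don’t know $\mu$, and instead we only observe $\bar{x}$ from the sample collected. So what we really want is an interval centered at $\bar{x}$ with the same length: $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$, or equivalently $[\bar{x} - 1.96\,\sigma/\sqrt{n},\ \bar{x} + 1.96\,\sigma/\sqrt{n}]$.

·  Meaning of the interval “$\bar{x} \pm 1.96\,\sigma/\sqrt{n}$”: It contains $\mu$ with 95% probability (over repeated random samples)

Thus we have 95% confidence that $\mu$ is covered by the range $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$

·  This is the concept of Confidence Interval (CI)----We call such an interval “the 95% confidence interval for $\mu$”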
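The step from an interval centered at the parameter to one centered at the sample mean can be written out as a short rearrangement. This is a sketch that assumes the usual notation ($\bar{X}$ for the sample mean, $\mu$ and $\sigma$ for the population mean and SD) and the standard 95% z value of 1.96; the symbols themselves did not survive in these notes.

```latex
\begin{align*}
0.95 &\approx P\!\left(\mu - 1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\!\left(-1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
     &= P\!\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
\end{align*}
```

The last line says that the random interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ covers $\mu$ in about 95% of samples.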

Note that the fundamental assumption for constructing the CI for $\mu$ is that:

$\bar{X}$ has a normal distribution (automatically true if X has a normal distribution. Otherwise the sample size n has to be large)
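One way to see what "95% confidence" means is to simulate many samples and count how often the interval covers the true mean. The sketch below is illustrative only: the population values (mean 10, SD 2), the sample size 30, the number of repetitions, and the function name `coverage` are all assumptions chosen for the demonstration, not values from these notes.

```python
import math
import random


def coverage(mu=10.0, sigma=2.0, n=30, reps=5000, z_star=1.96, seed=361):
    """Fraction of simulated intervals xbar +/- z* sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)                    # fixed seed so the run is repeatable
    half_width = z_star * sigma / math.sqrt(n)   # sigma is treated as known
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


print(coverage())  # should land near 0.95
```

Each simulated sample gives a different interval; roughly 95% of them should contain the true mean, and about 5% should miss it.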

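II.  Confidence interval for $\mu$; assume $\bar{X}$ is normally distributed

v  If $\sigma$ is known

If $\sigma$ is known, the CI for $\mu$ at a given confidence level is $\bar{x} \pm z^* \, \sigma/\sqrt{n}$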

·  4 components:

(a)  The point estimator $\bar{x}$

(b)  Confidence level, which determines the critical value z*

(c)  The SE of the point estimator, $\sigma/\sqrt{n}$

(d)  The requirement that $\bar{X}$ follows a normal distribution
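The components above can be assembled into a small helper. This is a minimal sketch: the function name `z_interval` and the example numbers are hypothetical, and the critical value is taken from Python's standard-library `statistics.NormalDist` rather than from a printed table.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, level=0.95):
    """CI for the mean when the population SD sigma is known."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # (b) critical value from the level
    se = sigma / math.sqrt(n)                       # (c) SE of the point estimator
    return xbar - z_star * se, xbar + z_star * se   # (a) centered at the point estimate


# Made-up inputs, only to show the call:
low, high = z_interval(xbar=50.0, sigma=10.0, n=64, level=0.95)
print(f"[{low:.2f}, {high:.2f}]")
```

Component (d) is not something the code can check: the result is only meaningful if the sample mean is (approximately) normally distributed.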

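Ex1. $X \sim N(\mu, \sigma)$ (mean $\mu$, SD $\sigma$). The 90% CI for $\mu$ with sample size n is $\bar{x} \pm 1.645\,\sigma/\sqrt{n}$

Ex2. $X \sim N(\mu, \sigma)$. The 95% CI for $\mu$ with sample size n is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$

Ex3. $X \sim N(\mu, \sigma)$. The 99% CI for $\mu$ with sample size n is $\bar{x} \pm 2.576\,\sigma/\sqrt{n}$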

►  Comment: the higher the confidence level is, the wider or narrower (choose one) a CI becomes.
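The z critical values for the three confidence levels in Ex1 to Ex3 can be checked numerically. This sketch uses the standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(level):
    """Two-sided z critical value: the middle `level` of N(0,1) lies within +/- z*."""
    return NormalDist().inv_cdf(0.5 + level / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
```

Because the half-width of the interval is z* times the SE, comparing the three z* values shows directly how the width responds to the confidence level.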

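Ex4. X has mean $\mu$ and SD $\sigma$ (known). A sample of size n = 100 is collected. What is the 95% CI for $\mu$? It is $\bar{x} \pm 1.96\,\sigma/\sqrt{100} = \bar{x} \pm 0.196\,\sigma$

Suppose that from a sample we got $\bar{x}$ = 3.4, and $\sigma$ is assumed to be 2.5. Then a 95% CI for $\mu$ is $3.4 \pm 1.96 \times 2.5/\sqrt{100} = 3.4 \pm 0.49$, i.e. [2.91, 3.89]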

Ex5. What is the confidence level for the interval ?

Since is the confidence level.

v  If $\sigma$ is unknown

·  In practice, most of the time $\sigma$ is not known. To calculate a CI for $\mu$, we have to use the sample SD $s$ instead of $\sigma$.

·  When $\sigma$ is known, $\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$ ~ Z (when X has a normal distribution or the sample size n is large), and hence we use a z critical value.

·  When $s$ is used, the distribution of $\frac{\bar{X}-\mu}{s/\sqrt{n}}$ depends on the sample size:

(a)  If n is large ($n \ge 30$), $\frac{\bar{X}-\mu}{s/\sqrt{n}}$ is approximately distributed as N(0,1). So we can still use the earlier result, replacing $\sigma$ with the sample SD $s$.

(b)  If n is small ($n < 30$), we have to assume X has a normal distribution with mean $\mu$ and SD $\sigma$ (even though its value is unknown). Then $\frac{\bar{X}-\mu}{s/\sqrt{n}}$ has a t-distribution with (n-1) degrees of freedom

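When $\sigma$ is unknown, the CI for $\mu$ at a given confidence level is $\bar{x} \pm t^* \, s/\sqrt{n}$, where $t^*$ is the critical value of the t distribution with (n-1) degrees of freedom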

·  The t distribution with (n-1) degrees of freedom (graph on the last page of the textbook)

⇒  The t distribution is similar to the standard normal distribution (the Z distribution) in many respects:

(1) all values are possible

(2) symmetric around zero

(3) bell-shaped

⇒  However, it has heavier tails than the Z distribution. Different sample sizes give different tail thickness in a t distribution: the smaller the sample size (and hence the degrees of freedom), the thicker the tails.

⇒  Each t distribution is defined through its degrees of freedom (df), and the corresponding t distribution is denoted by $t_{df}$
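The heavier tails can be seen in a small simulation: draw many normal samples, compute the statistic (sample mean minus true mean) divided by s/sqrt(n), and count how often it falls beyond ±1.96. This is only a sketch; the function name `tail_fraction`, the cutoff, the seed, and the number of repetitions are assumptions made for the illustration.

```python
import math
import random


def tail_fraction(n, reps=10000, cutoff=1.96, seed=361):
    """Share of simulated t statistics (df = n-1) with absolute value above cutoff."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]      # true mean is 0
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample SD
        t_stat = xbar / (s / math.sqrt(n))
        if abs(t_stat) > cutoff:
            extreme += 1
    return extreme / reps


for n in (3, 6, 12, 30):
    print(f"n = {n:2d} (df = {n - 1:2d}): fraction beyond 1.96 = {tail_fraction(n):.3f}")
```

For a standard normal statistic the fraction would be 0.05. The simulated fractions should be larger than that for small n and approach 0.05 as n grows, which is why the z critical value is too small when n is small and s replaces the population SD.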

·  Use the t table to find the critical value

Page 566 Table IV

Ex6. Use the t table to find the 95% and 99% t critical values for each of the following sample sizes:

Sample size n / Degrees of freedom (df) = n-1 / t* (t critical value), 95% / t* (t critical value), 99%
3 / 2 / 4.303 / 9.925
6 / 5 / 2.571 / 4.032
12 / 11 / 2.201 / 3.106
30 / 29 / 2.045 / 2.756
∞ / ∞ / 1.96 / 2.576
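The table entries can be reproduced without a printed table. The sketch below is one possible approach, not the textbook's method: it integrates the t density numerically (Simpson's rule) and searches for the critical value by bisection, using only the standard library. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def central_prob(t, df, steps=1000):
    """P(-t < T < t): Simpson's rule on [0, t], doubled by symmetry (steps is even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3


def t_critical(level, df):
    """Value t* with P(-t* < T < t*) = level, found by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if central_prob(mid, df) < level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for n in (3, 6, 12, 30):
    df = n - 1
    print(f"n = {n:2d}, df = {df:2d}: "
          f"95% t* = {t_critical(0.95, df):.3f}, 99% t* = {t_critical(0.99, df):.3f}")
```

As df grows, the computed values shrink toward the z critical values in the last row of the table.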

Ex7. X~Normal distribution. n=25, $\bar{x}$=8 and s=2. What is the 95% CI for the population mean $\mu$?

Ex8. X= # of claims received (per week) by an insurance company. Based on 41 weeks of samples, $\bar{x}$=18.5 and s=20.0. What is the 95% CI for $\mu$?

Here n=41 is large enough for us to use the formula $\bar{x} \pm z^* \, s/\sqrt{n}$: $18.5 \pm 1.96 \times 20.0/\sqrt{41}$ = [12.38, 24.62]
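Both examples follow the same recipe: estimate ± (critical value) × s/sqrt(n). The sketch below works through them with a hypothetical helper `mean_ci`. Two inputs are assumptions rather than values stated in the notes: the t critical value 2.064 for df = 24 is the standard table value for Ex7, and the sample mean 18.5 for Ex8 is inferred as the midpoint of the stated interval [12.38, 24.62].

```python
import math


def mean_ci(xbar, s, n, crit):
    """CI for the mean from summary statistics and a chosen critical value."""
    half_width = crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Ex7: small sample from a normal population, so use t* with df = 25 - 1 = 24.
ex7 = mean_ci(xbar=8.0, s=2.0, n=25, crit=2.064)   # 2.064 assumed from a t table
print("Ex7:", tuple(round(v, 2) for v in ex7))

# Ex8: n = 41 is large, so the z critical value 1.96 is used with s in place of sigma.
ex8 = mean_ci(xbar=18.5, s=20.0, n=41, crit=1.96)  # xbar inferred from the stated answer
print("Ex8:", tuple(round(v, 2) for v in ex8))
```

The Ex8 result matches the interval given in the notes, which supports the reading that the z critical value, not the t value, was used there.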
