MAT 120 (Brief) Notes, Definitions, Formulas chapters 9-10.1 (so far)
9.1
Here, we begin to explore inferential statistics, where we do not know crucial population parameters like mean or standard deviation, and wish to make inferences about their values from sample data.
The first technique we looked at was building a confidence interval for the population mean, given a sample mean, i.e. an interval of numbers, centered on the sample mean, where we think the population mean lies. From the Central Limit Theorem, we know that the means of samples (of sufficient size) are approximately normally distributed, and we can take advantage of the predictable nature of the normal curve and its associated probabilities. Crucial to this technique is the recognition that there is give-and-take, or (more formally) margin of error. This is reflected by the level of confidence, or $(1 - \alpha) \cdot 100\%$.
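If the population standard deviation $\sigma$ is known, then we construct the confidence interval as $\bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$, where $\bar{x}$ is the sample mean and $n$ is the sample size.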
The value $z_{\alpha/2}$ can be obtained in a 'reverse lookup' from Table V or by using invNorm on the calculator. Here is a list of commonly used values of this type, corresponding to different levels of confidence:
level of confidence    area in each tail    $z_{\alpha/2}$
90%                    0.05                 1.645
95%                    0.025                1.96
99%                    0.005                2.575
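As a rough check on the table and the interval formula, here is a minimal Python sketch. It assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator's invNorm; the function names `z_critical` and `z_interval` and the sample numbers are made up for illustration and are not from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a confidence level given as a fraction, e.g. 0.95."""
    alpha = 1 - confidence
    # Reverse lookup: the z-value with area alpha/2 in the right tail
    return NormalDist().inv_cdf(1 - alpha / 2)


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Recompute the commonly used critical values
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  z = {z_critical(level):.3f}")

# Hypothetical sample: xbar = 50, sigma = 10, n = 25
low, high = z_interval(50, 10, 25, 0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

With these made-up numbers the margin of error is about 3.92, so the interval runs from roughly 46.08 to 53.92. The computed 99% value rounds to 2.576; the 2.575 in the table is the usual value read from a printed table.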
9.2
A trickier (and more realistic) problem is constructing a confidence interval for the population mean when the population standard deviation is unknown. It seems reasonable to just replace the population standard deviation with the sample standard deviation s. But this approach doesn't work as is: the Central Limit Theorem applies only to sample means, not sample standard deviations, and s itself varies from sample to sample. Thus, we can't use the well-known probabilities associated with the normal curve. We have to use a different distribution under these conditions, known as the Student's t-distribution, whose values are depicted in Table VI. We don't use z-values here, but rather t-values:
$t_{\alpha/2}$, with n – 1 degrees of freedom (n being the sample size) also a crucial player in the table. The confidence interval is then constructed as: $\bar{x} \pm t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
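The t-interval can be sketched the same way in Python. This is only an illustration under stated assumptions: the helper name `t_interval` and the sample numbers are hypothetical, and the critical value is passed in by hand after looking it up in a t-table, since the Python standard library has no inverse t function.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, read from a t-table.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical sample: n = 25, xbar = 50, s = 10.
# For 95% confidence and 24 degrees of freedom, a t-table gives 2.064.
low, high = t_interval(50, 10, 25, 2.064)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Here the margin of error is 2.064 · 10/5 = 4.128. With the same numbers and a known population standard deviation of 10, the z-value 1.96 would give a margin of 3.92, so the t-interval is slightly wider.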
10.1
I'm just going to list some definitions here:
A hypothesis is a statement regarding a characteristic of one or more populations.
Hypothesis testing is a procedure, based on sample evidence and probability, used to test statements regarding a characteristic of one or more populations.
Steps in hypothesis testing:
1. A statement is made regarding the nature of the population.
2. Evidence (sample data) is collected to test the statement.
3. The data are analyzed to assess the plausibility of the statement.
The null hypothesis, denoted $H_0$, is a statement to be tested. The null hypothesis is a statement of no change, no effect, or no difference. The null hypothesis is assumed true until evidence indicates otherwise.
The alternative hypothesis, denoted $H_1$, is a statement that we are trying to find evidence to support.
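The three ways to set up the null and alternative hypotheses (concerning the mean $\mu$; note that we could also use proportion p or standard deviation $\sigma$ where the mean $\mu$ is), with notation, where $\mu_0$ is the hypothesized value:
$H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$ ('two-tailed')
$H_0: \mu = \mu_0$ versus $H_1: \mu < \mu_0$ ('left-tailed')
$H_0: \mu = \mu_0$ versus $H_1: \mu > \mu_0$ ('right-tailed')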
Four possible outcomes of hypothesis testing:
1. Reject $H_0$ when $H_1$ is true. This decision would be correct.
2. Do not reject $H_0$ when $H_0$ is true. This decision would be correct.
3. Reject $H_0$ when $H_0$ is true. This decision would be incorrect. This type of error is called a Type I error.
4. Do not reject $H_0$ when $H_1$ is true. This decision would be incorrect. This type of error is called a Type II error.
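The four outcomes can also be laid out as a small lookup, which may help keep the two error types straight. This is just an illustrative Python sketch; the function name `classify_outcome` is made up, and in practice we never know which hypothesis is actually true.

```python
def classify_outcome(reject_h0, h0_is_true):
    """Label a test decision given the (normally unknown) truth about H0."""
    if reject_h0 and h0_is_true:
        return "Type I error"
    if not reject_h0 and not h0_is_true:
        return "Type II error"
    return "correct decision"


# Print all four combinations of decision and truth
for reject in (True, False):
    for truth in (True, False):
        decision = "reject H0" if reject else "do not reject H0"
        state = "H0 true" if truth else "H1 true"
        print(f"{decision:17} | {state} | {classify_outcome(reject, truth)}")
```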
The level of significance, $\alpha$, is the probability of making a Type I error.