Test of a Hypothesis About μ When Value of Sigma Is Unknown I

P2010 Two Population Tests

Corty – Ch 8,9

Prequel: Ways of conducting Research involving two populations

Suppose I’ve developed a new pain reliever that I want to compare with Tylenol. The research will involve administration of the pain reliever, a waiting period for it to take effect, the administration of a standard pain (forcing the participants to listen to a statistics lecture on sampling distributions), then administration of an “Absence of Pain” questionnaire, with high scores indicating little pain felt. So higher scores are better.

The research can be conducted in three different ways.

Independent groups Design

Two separate (independent; not paired or matched) groups are used.

One group receives the new pain reliever- it's the experimental group.

The other receives the standard pain reliever - it's the control group.

Matched participants Design

The groups consist of matched pairs - each person has a "matched" twin in the other group.

Matching is performed with respect to one or more pretest variables related to the dependent variable.

One group receives the new pain reliever- it's the experimental group.

The other receives the standard pain reliever - it's the control group.

Participants as their own controls Design

One group is identified.

Participants in the group are given the treatment at one time.

They're given the control condition at some other time.

Statistical analyses: looking ahead.

The independent groups design requires the independent groups t-test.

The matched groups and participants as their own controls designs are known collectively as correlated groups designs. They require the correlated groups t-test or, as SPSS calls it, the dependent samples t-test or, as Corty calls it, the paired samples t-test.
Independent samples Formulas

Moving from one sample to two independent samples

One sample

$$t = \frac{\text{Observed mean} - \text{Expected mean}}{S / \sqrt{N}}$$

Two independent samples – equal sample sizes

$$t = \frac{\text{Observed mean}_1 - \text{Observed mean}_2 - 0}{\sqrt{\dfrac{S_1^2 + S_2^2}{N}}}$$

Here N is the number of scores in each group.

Two independent samples – unequal sample sizes

$$t = \frac{\text{Observed mean}_1 - \text{Observed mean}_2 - 0}{\sqrt{\dfrac{(N_1-1)S_1^2 + (N_2-1)S_2^2}{N_1 - 1 + N_2 - 1}\left(\dfrac{1}{N_1} + \dfrac{1}{N_2}\right)}}$$

Obviously, the equal sample-sizes formula is simpler than the unequal sample sizes formula.

But since we NEVER compute t-statistics by hand, the distinction between them is irrelevant in the computer age.

Since the unequal sample sizes formula yields the same number as the equal sample sizes formula when the sample sizes happen to be equal, the computer programs that do the computations for us always use the unequal sample sizes formula.
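To see that the two formulas agree when the groups are the same size, here is a minimal Python sketch. The function names and the two small data sets are hypothetical, made up only for illustration; the formulas are the two independent-samples formulas above.

```python
import math
from statistics import mean, variance  # variance() is the sample variance (N - 1 in the denominator)

def t_equal_n(x1, x2):
    """Equal-sample-sizes formula: denominator is sqrt((S1^2 + S2^2) / N)."""
    if len(x1) != len(x2):
        raise ValueError("this formula requires equal sample sizes")
    n = len(x1)  # number of scores in each group
    return (mean(x1) - mean(x2) - 0) / math.sqrt((variance(x1) + variance(x2)) / n)

def t_unequal_n(x1, x2):
    """Unequal-sample-sizes (pooled variance) formula; also valid for equal sizes."""
    n1, n2 = len(x1), len(x2)
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 - 1 + n2 - 1)
    return (mean(x1) - mean(x2) - 0) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# Hypothetical scores, five per group (not data from these notes)
group1 = [21, 25, 19, 23, 22]
group2 = [27, 24, 29, 26, 28]

t_a = t_equal_n(group1, group2)
t_b = t_unequal_n(group1, group2)
print(t_a, t_b)  # the two values agree up to floating-point rounding
```

With equal group sizes the pooled variance is just the average of the two sample variances, which is why the two formulas coincide.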

Independent Groups t-test Example Problem

Based on an example given in Minium et al., p. 251.

A student is interested in whether fragrances enhance memory. He has participants read a passage from a text.

Half the participants read the passage in the presence of a pleasant but unfamiliar fragrance.

The other half read the passage with no experimenter-provided scent present.

One week later, all participants are brought back to the lab, and are given a test of their memory for facts from the passage they had read. The Scent group was given the test on a sheet of paper scented with the same fragrance they had experienced when reading the passage. The other group was given the test with no experimenter-provided scent. The interest was in a comparison of performance of the two groups.

The data are as follows . . .

Performing the analysis using SPSS . . .

Analyze -> Compare Means -> Independent-Samples T Test

The dialog box

The output

Argh!!!! Reading the independent groups t output. Start here on 10/15/15.

SPSS gives three tests of significance including TWO t values. You have to pick the correct one.

Three tests of significance are presented in the table.

The first is a test that compares the variances of the two groups. It’s the F on the very left side of the table.

The result of this test determines which of the following two t-tests is to be used.

The second is the “equal variances assumed” t, in the upper row of the table.

The third is the “equal variances not assumed” t, in the lower row of the table.

Use the following decision rule

If the leftmost “Sig.” is > .05, we retain the assumption that the population variances are equal, so we use the equal variances t in the top row of the table.

If the leftmost “Sig.” is <= .05, we conclude that the population variances are not equal, so we use the equal variances not assumed t in the bottom row of the table.

(The reasons for this complexity are beyond the scope of this course.)

Symbolically . . .

In this example, the p value for the variances test is .978 which is larger than .05, so we retain the hypothesis of equal variances and use the equal variances t.

In this particular example, both the equal variance t and the unequal variances t are the same value, -3.027. But they won’t always be equal, and their p-values will not always be equal.
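The decision rule above can be written as a tiny function. This is only a sketch of the rule as stated in these notes; the function name `choose_t_row` is made up for illustration and is not part of SPSS.

```python
def choose_t_row(levene_sig, alpha=0.05):
    """Return the row of the SPSS Independent Samples Test table to read.

    levene_sig is the leftmost "Sig." value, the p-value of the test
    that compares the two group variances.
    """
    if levene_sig > alpha:
        return "Equal variances assumed"      # top row
    return "Equal variances not assumed"      # bottom row

# The fragrance example: the variances test has p = .978
row = choose_t_row(0.978)
print(row)  # Equal variances assumed
```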

Bottom Line: For this course, we will use the “Equal Variances Assumed” t value – the one at the top.

The Hypothesis Testing Answer Sheet for the Independent Groups t-test

Give the name and the formula of the test statistic that will be employed to test the null hypothesis.

Independent Groups t-test. (If you get the name correct, I’ll assume you could write the formula.)

Check the assumptions of the test

Distributions appear to be approximately US within each group.

Null Hypothesis:______

Alternative Hypothesis:______

What significance level will you use to separate "likely" values from "unlikely" values of the test statistic?

Significance Level = ______.05______

What is the value of the test statistic computed from your data and the p-value?

t = -3.027, p-value = .005 (from the Equal Variances Assumed line of the SPSS output)

What is your conclusion? Do you reject or not reject the null hypothesis?

Reject the null

What are the upper and lower limits of a 95% confidence interval appropriate for the problem? Present them in a sentence, with standard interpretive language.

Lower Limit = -8.05 Upper Limit = -1.55

We can be 95% sure that the difference in population means is between -8.05 and -1.55.

State the implications of your conclusion for the problem you were asked to solve. That is, relate your statistical conclusion to the problem.

Recall associated with scents is apparently better than recall with no scent associated with it.
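As a check on the confidence limits in the answer sheet, the interval can be rebuilt by hand. This is a sketch under stated assumptions: the mean difference of -4.8 comes from the group means 21.4 and 26.2 used in the effect size computation in these notes, the group sizes are taken to be 15 each (so df = 28), and 2.048 is the two-tailed .05 critical value of t for df = 28 from a standard t table.

```latex
SE_{\text{diff}} = \frac{\text{mean difference}}{t} = \frac{-4.8}{-3.027} \approx 1.586

95\%\ \text{CI} = -4.8 \pm (2.048)(1.586) = -4.8 \pm 3.25
\quad\Rightarrow\quad \text{Lower} \approx -8.05,\ \ \text{Upper} \approx -1.55
```

The interval does not contain 0, which agrees with the decision to reject the null.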

Effect size from the conduct of the Independent Groups t-test

Alas, SPSS does not print the effect size, nor does it print a quantity that can be easily converted to the effect size.

Corty discusses effect size on pages 248-249. He does not straightforwardly present the formula.

I won’t require it, but if you wish to compute the effect size, the formula is

$$d = \frac{\text{Observed mean}_1 - \text{Observed mean}_2}{\sqrt{\dfrac{(N_1-1)S_1^2 + (N_2-1)S_2^2}{N_1 - 1 + N_2 - 1}}}$$

Just for kicks, let’s compute it.

$$S_{pooled} = \sqrt{\frac{(N_1-1)S_1^2 + (N_2-1)S_2^2}{N_1 - 1 + N_2 - 1}} = \sqrt{\frac{14(4.256^2) + 14(4.427^2)}{14 + 14}} = 4.34$$

$$d = \frac{21.4 - 26.2}{4.34} = \frac{-4.8}{4.34} \approx -1.11$$

That is a huge effect size.
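The same effect size arithmetic can be checked in a few lines of Python. This is a minimal sketch: the means and standard deviations are the ones used in the hand computation above, and the group sizes of 15 are an assumption inferred from the 14s (N - 1) in that computation.

```python
import math

# Values from the hand computation; n1 = n2 = 15 is inferred from N - 1 = 14
n1, n2 = 15, 15
s1, s2 = 4.256, 4.427      # sample standard deviations
m1, m2 = 21.4, 26.2        # observed means

# Pooled variance, then its square root (the pooled standard deviation)
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 - 1 + n2 - 1)
pooled_sd = math.sqrt(pooled_var)

# Effect size d: mean difference in pooled-standard-deviation units
d = (m1 - m2) / pooled_sd

print(round(pooled_sd, 2), round(d, 2))
```

Note that the denominator of d is the pooled standard deviation (the square root of the pooled variance), not the pooled variance itself.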
A Second Example of the Independent Groups t-test

Who uses social media more – males or females?

A psychologist is studying the use of social media – such as Facebook – among young adults. She is at the beginning of her research program, and right now is simply interested in discovering who uses social media more – males or females. She decides that the population to which she would like to generalize her results is the population of university students, specifically the university at which she works.

She takes a convenience sample of students eating lunch at the university center and asks them to fill out a simple questionnaire. One of the questions is, “On average, how many times a day do you check your favorite ‘social media’ account?”

She obtained the following data . . .

Females

12 32 40 28 54 35 29 40 27 53 37 23 19

Males

23 18 42 28 35 18 33 29 30 21 17 8 37 25

Test the hypothesis that the mean number of times social media is used by the population of male students is equal to the mean number of times social media is used by the population of female students.

Performing the Analysis

1. Enter the data into the computer

SPSS Data Entry Excel Data Entry

Performing the Analysis cont’d . . .

2. Invoke the Independent t procedure

Carrying out the analysis using the SPSS Independent Groups t-test

Carrying out the analysis using Excel Two-Sample t Assuming Equal Variances

Note that Excel is much more flexible than SPSS concerning where the data can be located in the spreadsheet.

SPSS requires that ALL the score values be in the same column and REQUIRES that you create a second column to indicate which group each score is in. Excel does not.
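The two data layouts can be illustrated with the social media scores. This is only a sketch in Python, not SPSS or Excel syntax. The column names `uses` and `sex` follow the SPSS output in these notes; the coding 1 = female, 0 = male is an assumption inferred from the group sizes (13 and 14) in that output.

```python
females = [12, 32, 40, 28, 54, 35, 29, 40, 27, 53, 37, 23, 19]
males = [23, 18, 42, 28, 35, 18, 33, 29, 30, 21, 17, 8, 37, 25]

# Excel-style ("wide") layout: each group sits in its own column
wide = {"Females": females, "Males": males}

# SPSS-style ("long") layout: one row per person, with the score in one
# column (uses) and a group code in a second column (sex)
long_rows = [{"uses": score, "sex": 1} for score in females] + \
            [{"uses": score, "sex": 0} for score in males]

print(len(long_rows))   # 27 rows, one per participant
print(long_rows[0])     # {'uses': 12, 'sex': 1}
```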

But SPSS gives you a LOT more statistical capability than Excel. Small price to pay.

3. Examine the Results

SPSS results

T-Test

Group Statistics

| | sex | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| uses | 1 | 13 | 33.00 | 12.159 | 3.372 |
| | 0 | 14 | 26.00 | 9.215 | 2.463 |

Independent Samples Test

(F and the first Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.)

| uses | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .671 | .420 | 1.694 | 25 | .103 | 7.000 | 4.133 | -1.511 | 15.511 |
| Equal variances not assumed | | | 1.676 | 22.347 | .108 | 7.000 | 4.176 | -1.652 | 15.652 |

Excel Results

The key results from the two programs are the same, just presented differently.
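For anyone curious where the SPSS numbers come from, here is a minimal Python sketch that recomputes both t values from the raw scores using only the standard library. It is not what SPSS or Excel run internally; the "equal variances not assumed" part uses the usual Welch formulas for the standard error and degrees of freedom, which is my assumption about that row of the output.

```python
import math
from statistics import mean, variance  # variance() uses N - 1

females = [12, 32, 40, 28, 54, 35, 29, 40, 27, 53, 37, 23, 19]
males = [23, 18, 42, 28, 35, 18, 33, 29, 30, 21, 17, 8, 37, 25]

n1, n2 = len(females), len(males)
m1, m2 = mean(females), mean(males)
v1, v2 = variance(females), variance(males)

# Equal variances assumed: pooled-variance (unequal sample sizes) formula
pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 - 1 + n2 - 1)
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = (m1 - m2) / se_pooled
df_pooled = n1 + n2 - 2

# Equal variances not assumed: Welch standard error and degrees of freedom
se_welch = math.sqrt(v1 / n1 + v2 / n2)
t_welch = (m1 - m2) / se_welch
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(round(t_pooled, 3), df_pooled, round(se_pooled, 3))
print(round(t_welch, 3), round(df_welch, 3), round(se_welch, 3))
```

The printed values should match the t, df, and Std. Error Difference columns of the SPSS table. The p-values need the t distribution, which the Python standard library does not provide, so they are left to SPSS or Excel.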

4. Present the Results

Corty’s Hypothesis Testing Answer Sheet

1. Give the name and the formula of the test statistic that will be employed to test the null hypothesis.

Independent Groups t-test. (Equal population variances assumed.)

2. Do the data meet the assumptions of the test? Provide evidence.

I see no drastic violations of the assumptions.

3. The null and alternative hypotheses.

Null Hypothesis: Population mean use of social media in males and females is equal.

Alternative Hypothesis: There is a difference in male and female population mean use of social media.

4. What significance level will you use to separate "likely" values from "unlikely" values of the test statistic?

Significance Level = .05

5. What is the value of the test statistic computed from your data and the p-value?

Test statistic value = 1.69 p-value = .10

6. What is your conclusion? Do you reject or not reject the null hypothesis?

I fail to reject the null. The evidence is not strong enough to conclude that the population means differ.

7. What are the upper and lower limits of a 95% confidence interval appropriate for the problem?

Lower Limit = -1.51 Upper Limit = 15.51

8. State the implications of your conclusion for the problem you were asked to solve. That is, relate your statistical conclusion to the problem.

We have no convincing evidence that male and female students at this university differ in how often they use social media.
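The confidence limits in item 7 can be reconstructed from the SPSS output. This is a sketch, assuming 2.060 as the two-tailed .05 critical value of t for df = 25 (from a standard t table); the mean difference and standard error are taken from the Equal variances assumed row.

```latex
95\%\ \text{CI} = \text{Mean Difference} \pm t_{crit} \times \text{Std. Error Difference}
= 7.000 \pm (2.060)(4.133) = 7.000 \pm 8.51
\quad\Rightarrow\quad \text{Lower} \approx -1.51,\ \ \text{Upper} \approx 15.51
```

Because the interval contains 0, it is consistent with the decision not to reject the null.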

Paired-Samples t Test: Overview Start here on 10/22/15.

If persons in the two conditions are paired so that each person in one condition has a “match” in the other condition, a different test is required.

Names for this research design . . .

Matched Participants Design

Each person in one condition is matched with a person in the other condition on some test.

Participants as Their Own Controls Design

Each person serves in both conditions, so each person is matched with himself/herself

PrePost Designs

A version of the above in which persons are tested twice – first before a treatment, then after it

Longitudinal Designs

A version of the above in which persons are tested, then after a prespecified period of time, tested again.

Repeated Measures Designs

Any design in which the same persons are tested more than once

The research designs presented above are more efficient than the independent groups designs.

There are two major differences

1. The two conditions are more likely to be equivalent at the outset, because matched people or the same people are in both.

This means that any difference between the groups can be attributed to the treatment and not to pre-existing group differences.

2. If there is a difference due to whatever treatment is being evaluated, it will be more likely to be detected if a paired-samples design such as these is used.
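The second point can be illustrated with a small Python sketch. The scores below are hypothetical, invented only to show the idea: five people differ a lot from one another, but each improves by a similar small amount under treatment. The paired t is computed from the difference scores (mean difference divided by its standard error), which is a common way to compute it; the independent groups t uses the pooled-variance formula.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical scores for the same five people under two conditions
control = [10, 14, 18, 22, 26]
treatment = [12, 15, 21, 24, 29]
n = len(control)

# Paired-samples t: analyze the within-person difference scores
diffs = [t - c for t, c in zip(treatment, control)]
t_paired = mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Independent groups t on the same numbers, ignoring the pairing
pooled_var = ((n - 1) * variance(treatment) + (n - 1) * variance(control)) / (n - 1 + n - 1)
t_indep = (mean(treatment) - mean(control)) / math.sqrt(pooled_var * (1 / n + 1 / n))

print(round(t_paired, 2), round(t_indep, 2))
```

The mean difference is the same in both analyses. Only the denominator changes: the paired analysis removes the large person-to-person differences, so its t is much larger in this made-up example.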

Test Statistic: Paired-Samples t Test

The official formula is presented here for completeness. But we won’t use it.
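One common textbook form of the correlated groups t is sketched here. It uses the means and standard deviations defined in the list that follows; the symbols r (the correlation between the paired scores) and N (the number of pairs) are assumptions on my part.

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2 + S_2^2 - 2\,r\,S_1 S_2}{N}}}
```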

Where (more than you ever wanted to know about the correlated groups t formula)

$\bar{X}_1$ = Mean of the sample from the first population

$\bar{X}_2$ = Mean of the sample from the second population

S1 = Standard deviation of the sample from the first population.

S2 = Standard deviation of the sample from the second population.