Statistical Package Assignment
SLIS 5080 – Research Methods
Benjamin Baron
1. Describe the distribution of items checked out per year for the total sample:
For the sample (n = 20), the mean is 14.4000 items per year, with a variance of 52.1474 and a standard deviation of 7.2213 items. The distribution is slightly positively skewed (skewness = 0.3298) and slightly platykurtic (kurtosis = -0.6239). The median number of items checked out per year is 15, and the data set is bimodal (modes at 15 and 19 items).
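The figures above can be reproduced from the raw data set with a short script. This is a minimal sketch, not the statistical package's own procedure: the skewness and kurtosis lines assume the bias-adjusted sample formulas found in many spreadsheet packages, which happen to match the reported values, and the variable names are illustrative.

```python
from statistics import mean, median, multimode, stdev, variance

# Items checked out per year, copied from the raw data set (subjects 1-20).
items = [15, 18, 11, 8, 5, 19, 3, 7, 15, 25,
         19, 9, 28, 15, 9, 27, 14, 19, 16, 6]

n = len(items)
m = mean(items)
s = stdev(items)  # sample standard deviation (n - 1 denominator)
z = [(x - m) / s for x in items]  # standardized scores

# Assumed formulas: bias-adjusted sample skewness and excess kurtosis.
skewness = n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)
kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

print(f"mean={m:.4f} variance={variance(items):.4f} sd={s:.4f}")
print(f"median={median(items)} modes={multimode(items)}")
print(f"skewness={skewness:.4f} kurtosis={kurtosis:.4f}")
```

A positive skewness means the longer tail lies toward higher item counts, and a negative excess kurtosis means the distribution is flatter than a normal curve.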
2. In the population, do the elderly and adolescent reading ability scores differ?
Based on the results of the Independent T-test, we cannot be at least 95% certain that the scores differ in the population (see hypothesis testing worksheet: question 2).
3. If the elderly scores are normally distributed, what percentage of the population would fall between 30.45 and 42.07?
The mean for elderly reading scores is 30.4545, with a standard deviation of 11.6221. Roughly 34% of the population would have reading scores between 30.45 and 42.07.
This is shown through the following calculations:
Z1 = (x – X) / S
Z1 = (30.45 – 30.4545) / 11.6221
Z1 = (-0.0045) / 11.6221
Z1 = -0.00038719

Z2 = (x – X) / S
Z2 = (42.07 – 30.4545) / 11.6221
Z2 = (11.6155) / 11.6221
Z2 = 0.9994
Using the Cumulative Normal Frequency Distribution table, the area under the normal curve between the mean and Z1 is 0.01%, and the area between the mean and Z2 (read from the table at z = 0.99) is 33.89%. Because Z1 lies just below the mean and Z2 lies above it, the two areas are added, giving a total area under the normal curve of 33.90%.
Alternatively, since the lower threshold (30.45) is essentially equal to the mean and the upper threshold (42.07) is almost exactly one standard deviation above the mean, the area can be estimated as the area between the mean and one standard deviation, or 34.13%. Either way, roughly 34% of the population would have reading scores between 30.45 and 42.07.
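The same area can be checked without a printed table. The sketch below uses the normal distribution in Python's standard library and the mean and standard deviation quoted above; the variable names are illustrative, and the result differs slightly from the table-based figure because the table is read at two decimal places of z.

```python
from statistics import NormalDist

mean_score = 30.4545   # elderly mean reading score, from the text
sd_score = 11.6221     # elderly standard deviation, from the text
lower, upper = 30.45, 42.07

# Standardize both thresholds.
z1 = (lower - mean_score) / sd_score
z2 = (upper - mean_score) / sd_score

# Area between the two z-scores under the standard normal curve.
std_normal = NormalDist()
area = std_normal.cdf(z2) - std_normal.cdf(z1)

print(f"z1={z1:.6f} z2={z2:.4f} area={area:.4f}")
```

The computed area is close to 0.34, in line with both the table-based total and the one-standard-deviation estimate.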
4. What is the relationship in the population of age (elderly versus adolescents) and preference for obtaining materials?
Based on the data, we cannot be at least 95% certain that there is a relationship between age group and the preferred method of obtaining materials (see hypothesis testing worksheet: question 4).
5. Is there a significant difference in the population between ages (elderly versus adolescent) in the number of items checked out per year?
Based on the results of the Independent T-test, we can be at least 95% certain that the number of items checked out per year by the elderly and by adolescents differs in the population (see hypothesis testing worksheet: question 5).
6. For the whole group of subjects, what is the relationship between number of items checked out and reading ability? Will a relationship be found in the population?
Based on the T-test for Pearson’s r, there does not appear to be a relationship between the number of items checked out per year and reading scores in the sample. Moreover, a relationship cannot be inferred for the population (see hypothesis testing worksheet: question 6).
Hypothesis Testing Worksheet: Question 2
1. Research Question: In the population, do the elderly and adolescent reading scores differ?
2. Null Hypothesis Based on Research Question: The reading scores of elderly and adolescent subjects do not differ in the population.
3. Collect and Observe the Data and All Descriptive Statistics: Elderly reading scores consist of 11 cases with a mean of 30.4545 and a standard deviation of 11.6221. Adolescent reading scores consist of 9 cases with a mean of 28.2222 and a standard deviation of 10.3051.
4. Based on observing the data and descriptive statistics, what do you think about the research question? The means and standard deviations for each group appear too similar to draw any clear conclusions in favor of or in opposition to the research question.
5. Select probability level (usually .05): A probability level α = 0.05 has been selected.
6. Select the inferential statistical tool that you will use to infer from the sample to its population: An Independent T-test will be used since the means for each group will be studied as they relate to the population.
7. Calculate the statistic: t = 0.4549
8. Calculate the degrees of freedom for the statistical tool: df = 18
9. Look up the table statistic (for the calculated df and selected probability): the table statistic is 2.101 @ 95% confidence.
10. If the obtained statistic (the one you calculated) is greater than the table statistic, then reject the null hypothesis. If it is not greater, then do not reject the null hypothesis. Write the concluding statement: As the derived statistic (t = 0.4549) is less than the table statistic (2.101), the null hypothesis cannot be rejected. We are not at least 95% certain that there is a difference between the scores of elderly and adolescents in the population.
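Steps 7 through 10 of this worksheet can be sketched in code. The worksheet does not say which standard-error formula the statistical package used; this sketch assumes the separate-variance form, sqrt(s1²/n1 + s2²/n2), because it reproduces the reported t. The critical value 2.101 is taken from step 9 rather than computed, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Reading scores from the raw data set, split by age group.
elderly = [26, 35, 19, 41, 15, 45, 29, 17, 45, 21, 42]
adolescent = [24, 37, 39, 23, 19, 38, 9, 36, 29]

def t_statistic(a, b):
    """Difference in means divided by the separate-variance standard error."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

t = t_statistic(elderly, adolescent)
df = len(elderly) + len(adolescent) - 2   # degrees of freedom as in step 8
critical = 2.101                          # table value from step 9 (alpha = 0.05)
reject_null = abs(t) > critical

print(f"t={t:.4f} df={df} reject_null={reject_null}")
```

Comparing the absolute value of t with the table value corresponds to the two-tailed decision rule in step 10.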
Hypothesis Testing Worksheet: Question 4
1. Research Question: Is there a relationship between age group (elderly versus adolescents) and a preference for obtaining materials (mail versus bookmobile)?
2. Null Hypothesis Based on Research Question: There is no relationship between age group and preference for obtaining materials.
3. Collect and Observe the Data and All Descriptive Statistics: Of the 20 subjects, 11 are classed as elderly and 9 are classed as adolescents. Overall, 11 subjects prefer to obtain materials by mail while 9 prefer to obtain materials from a bookmobile. Among elderly subjects, 8 prefer mail and 3 prefer the bookmobile. Among adolescent subjects, 3 prefer mail and 6 prefer the bookmobile.
4. Based on observing the data and descriptive statistics, what do you think about the research question? At a glance it appears that there is a preference based on age. However, since the study is dealing with a small pool of subjects, such a conclusion may not be appropriate.
5. Select probability level (usually .05): A probability level α = 0.05 has been selected.
6. Select the inferential statistical tool that you will use to infer from the sample to its population: Since the data collected for this research question is nominal, and the relationship within the sample is being studied, a chi-square test of independence will be employed.
7. Calculate the statistic: χ² = 1.7162, as computed with Yates's correction.
8. Calculate the degrees of freedom for the statistical tool: df = 1
9. Look up the table statistic (for the calculated df and selected probability): The table statistic is 3.841 @ 95% confidence.
10. If the obtained statistic (the one you calculated) is greater than the table statistic, then reject the null hypothesis. If it is not greater, then do not reject the null hypothesis. Write the concluding statement: As the derived statistic (χ² = 1.7162) is less than the table statistic (3.841), the null hypothesis cannot be rejected. We are not at least 95% certain that there is a relationship between age group and a preference for obtaining materials.
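The chi-square calculation in this worksheet can be sketched as follows, using the 2 x 2 counts from step 3. Each expected count is the row total times the column total divided by the sample size, and Yates's correction subtracts 0.5 from each absolute difference before squaring. The critical value 3.841 is taken from step 9, and the variable names are illustrative.

```python
# Observed counts: rows = age group (elderly, adolescent),
# columns = preference (mail, bookmobile).
observed = [[8, 3],
            [3, 6]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        # Yates's continuity correction for a 2 x 2 table.
        chi_square += (abs(count - expected) - 0.5) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical = 3.841   # table value from step 9 (alpha = 0.05, df = 1)
reject_null = chi_square > critical

print(f"chi_square={chi_square:.4f} df={df} reject_null={reject_null}")
```

The smallest expected count here is 4.05, which is one reason a continuity correction is commonly applied to a table of this size.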
Hypothesis Testing Worksheet: Question 5
1. Research Question: Is there a significant difference in the population between ages (elderly versus adolescent) in the number of items checked out per year?
2. Null Hypothesis Based On Research Question: There is no difference in the population in the number of items checked out per year between elderly and adolescents.
3. Collect and Observe the Data and All Descriptive Statistics: Elderly subjects (11 cases) checked out a mean of 17.8182 items per year with a standard deviation of 7.0967. Adolescents (9 cases) checked out a mean of 10.2222 items per year with a standard deviation of 5.0194.
4. Based on observing the data and descriptive statistics, what do you think about the research question? It appears that elderly patrons check out more items per year, based on the descriptive statistics.
5. Select probability level (usually .05): A probability level α = 0.05 has been selected.
6. Select the inferential statistical tool that you will use to infer from the sample to its population: An Independent T-test will be used since the means for each group will be studied as they relate to the population.
7. Calculate the statistic: t = 2.7965
8. Calculate the degrees of freedom for the statistical tool: df = 18
9. Look up the table statistic (for the calculated df and selected probability): the table statistic is 2.101 @ 95% confidence.
10. If the obtained statistic (the one you calculated) is greater than the table statistic, then reject the null hypothesis. If it is not greater, then do not reject the null hypothesis. Write the concluding statement: As the derived statistic (t = 2.7965) is greater than the table statistic (2.101), the null hypothesis can be rejected. We are at least 95% certain that there is a difference between the number of items checked out per year by elderly and adolescents in the population.
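The t statistic in this worksheet can also be recovered from the summary statistics in step 3 alone, without the raw data. As a hedged sketch, the code below assumes the separate-variance standard error, sqrt(s1²/n1 + s2²/n2), which reproduces the reported value; the worksheet itself does not name the formula, and the function name is illustrative.

```python
from math import sqrt

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic from group means, standard deviations, and sizes,
    using the separate-variance standard error (an assumption)."""
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Summary statistics from step 3: items checked out per year.
t = t_from_summary(17.8182, 7.0967, 11,   # elderly
                   10.2222, 5.0194, 9)    # adolescents
df = 11 + 9 - 2
critical = 2.101   # table value from step 9 (alpha = 0.05)
reject_null = abs(t) > critical

print(f"t={t:.4f} df={df} reject_null={reject_null}")
```

Working from rounded summary statistics can shift the last decimal place of t, but not enough to change the decision against the table value.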
Hypothesis Testing Worksheet: Question 6
1. Research Question: For the whole group of subjects, what is the relationship between number of items checked out and reading ability? Will a relationship be found in the population?
2. Null Hypothesis Based On Research Question: A relationship will not be found in the population between number of items checked out and reading ability for the whole group of subjects.
3. Collect and Observe the Data and All Descriptive Statistics: The mean for items checked out per year for all groups is 14.4000 with a standard deviation of 7.2213. The mean for the reading scores for all groups is 29.4500 with a standard deviation of 10.8214.
4. Based on observing the data and descriptive statistics, what do you think about the research question? Looking at a scatter plot of the items checked out/reading score data, there does not appear to be a strong relationship between the two variables.
5. Select probability level (usually .05): A probability level α = 0.05 has been selected.
6. Select the inferential statistical tool that you will use to infer from the sample to its population: The T-test for Pearson’s r will be used for this question. The relationship between ratio variables in the sample as it relates to the population is being studied.
7. Calculate the statistic: r = 0.2474, t = 1.0835
8. Calculate the degrees of freedom for the statistical tool: df = 18
9. Look up the table statistic (for the calculated df and selected probability): the table statistic is 2.101 @ 95% confidence.
10. If the obtained statistic (the one you calculated) is greater than the table statistic, then reject the null hypothesis. If it is not greater, then do not reject the null hypothesis. Write the concluding statement: Since the derived statistic (t = 1.0835) is not greater than the table statistic (2.101), we cannot reject the null hypothesis. We are not at least 95% certain that there would be a non-zero relationship between number of items checked out per year and reading scores in the population. Moreover, the sample correlation is weak (r = 0.2474), so no relationship between the two variables can be claimed.
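The correlation and its t statistic can be sketched from the raw data set as follows. Pearson's r is the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations, and the test statistic is t = r * sqrt(n - 2) / sqrt(1 - r²) with n - 2 degrees of freedom. The critical value 2.101 is taken from step 9, and the variable names are illustrative.

```python
from math import sqrt

# Paired observations in subject order (subjects 1-20).
items = [15, 18, 11, 8, 5, 19, 3, 7, 15, 25,
         19, 9, 28, 15, 9, 27, 14, 19, 16, 6]
scores = [26, 35, 24, 37, 19, 41, 39, 23, 15, 45,
          19, 38, 29, 9, 17, 45, 36, 21, 42, 29]

n = len(items)
mean_x = sum(items) / n
mean_y = sum(scores) / n

# Sums of squares and cross-products of deviations from the means.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(items, scores))
sxx = sum((x - mean_x) ** 2 for x in items)
syy = sum((y - mean_y) ** 2 for y in scores)

r = sxy / sqrt(sxx * syy)
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
df = n - 2
critical = 2.101   # table value from step 9 (alpha = 0.05)
reject_null = abs(t) > critical

print(f"r={r:.4f} t={t:.4f} df={df} reject_null={reject_null}")
```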
Raw Data Set:
Subject / Age / Preference / Items per year / Score on scale (0-50)
1 / E / M / 15 / 26
2 / E / B / 18 / 35
3 / A / B / 11 / 24
4 / A / B / 8 / 37
5 / E / M / 5 / 19
6 / E / M / 19 / 41
7 / A / M / 3 / 39
8 / A / B / 7 / 23
9 / E / M / 15 / 15
10 / E / B / 25 / 45
11 / A / B / 19 / 19
12 / A / B / 9 / 38
13 / E / M / 28 / 29
14 / A / M / 15 / 9
15 / E / M / 9 / 17
16 / E / M / 27 / 45
17 / A / B / 14 / 36
18 / E / M / 19 / 21
19 / E / B / 16 / 42
20 / A / M / 6 / 29
Notes:
Age: A = Adolescent, E = Elderly
Preference: B = Bookmobile, M = Mail delivery
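For readers who want to rerun the analyses, the raw data set can be loaded from its slash-delimited form and summarized by group. This is a minimal sketch; the field names and the `group_mean` helper are illustrative and do not come from the statistical package used for the assignment.

```python
from collections import Counter
from statistics import mean

RAW = """\
1 / E / M / 15 / 26
2 / E / B / 18 / 35
3 / A / B / 11 / 24
4 / A / B / 8 / 37
5 / E / M / 5 / 19
6 / E / M / 19 / 41
7 / A / M / 3 / 39
8 / A / B / 7 / 23
9 / E / M / 15 / 15
10 / E / B / 25 / 45
11 / A / B / 19 / 19
12 / A / B / 9 / 38
13 / E / M / 28 / 29
14 / A / M / 15 / 9
15 / E / M / 9 / 17
16 / E / M / 27 / 45
17 / A / B / 14 / 36
18 / E / M / 19 / 21
19 / E / B / 16 / 42
20 / A / M / 6 / 29
"""

# Each record: (subject, age group, preference, items per year, reading score).
records = []
for line in RAW.splitlines():
    subject, age, pref, items, score = (part.strip() for part in line.split("/"))
    records.append((int(subject), age, pref, int(items), int(score)))

# Cross-tabulation of age group by preference.
counts = Counter((age, pref) for _, age, pref, _, _ in records)

def group_mean(age_code, column):
    """Mean of one numeric column (3 = items, 4 = score) for one age group."""
    return mean(rec[column] for rec in records if rec[1] == age_code)

print(counts)
print(f"elderly items={group_mean('E', 3):.4f} adolescent items={group_mean('A', 3):.4f}")
print(f"elderly score={group_mean('E', 4):.4f} adolescent score={group_mean('A', 4):.4f}")
```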