The SoupData Analysis
Due: Thursday, December 3rd, 2015
In the accompanying Excel file, you are provided with 8 columns' worth of data. Columns A & B indicate the soup flavor, where 1 = chicken noodle, 2 = vegetable and 3 = tomato.
Column C indicates the soup type where 1 = canned/condensed, 2 = canned/ready to serve, 3 = dry/cook-up and 4 = dry/instant. The other columns indicate the following:
Column D = Cost
Column E = Calories
Column F = Fat
Column G = Calories from Fat
Column H = Sodium
The following points are meant to get you started; the list is not exhaustive (you will want to focus on other issues as well):
1. Which variables are skewed? Which are more skewed than others?
2. Compare the variation among the variables.
3. What other types of relationships exist between the variables?
4. Compare and contrast the characteristics of different soup flavors.
5. Compare and contrast the characteristics of different soup types.
6. Does tomato soup cost on average more than vegetable?
7. How would you describe the characteristics of the expensive soups?
8. Do calories vary more in chicken noodle soups or canned soups?
9. What about the correlation amongst the variables?
10. Are there any observations that display unusual behavior (i.e., more than 3 standard deviations from the mean)?
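For points 1 and 10, the underlying arithmetic can be sketched in a few lines of Python, for anyone using software other than Excel. This is only an illustration under stated assumptions: the function names `sample_skewness` and `flag_outliers` are made up for this sketch, the skewness formula is the adjusted Fisher-Pearson coefficient (the same one Excel's SKEW function computes), and the sodium values are invented placeholders, not values from the soup file.

```python
import statistics


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (the formula behind Excel's SKEW)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed


def flag_outliers(values, threshold=3.0):
    """Return (row index, value) pairs lying more than `threshold`
    sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(i, v) for i, v in enumerate(values) if abs(v - mean) > threshold * sd]


# Invented sodium values (mg), for illustration only; NOT from the soup file.
sodium = [100] * 19 + [1000]

print("Skewness:", sample_skewness(sodium))
print("Observations beyond 3 SD:", flag_outliers(sodium))
```

A positive skewness value indicates a longer right tail, a negative value a longer left tail. Note that in a group of 10 or fewer observations, no value can lie more than 3 sample standard deviations from the mean, so the 3-standard-deviation rule is only informative for larger groups.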
Objective: To apply statistical procedures learned in the course to the soup data set.
Please include approximately 6-8 different types of analysis. Examples include but are not limited to:
1. Descriptive statistics on a variable with interpretations.
2. At least one test of hypothesis involving two populations.
3. At least one confidence interval involving two populations.
4. At least one test of hypothesis for means or proportions involving three or more populations.
5. Inferential analysis involving the relationship between variables (regression & correlation).
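As one possible illustration of a two-population test (item 2), the sketch below computes a two-sample t statistic for a question such as "Does tomato soup cost more on average than vegetable?", with H0: mean tomato cost <= mean vegetable cost and H1: mean tomato cost > mean vegetable cost. Several things here are assumptions rather than requirements of the assignment: the unequal-variance (Welch) form of the test is just one reasonable choice, the function name `welch_t` is hypothetical, and the cost values are invented placeholders, not values from the soup file.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Two-sample t statistic and approximate degrees of freedom,
    without assuming the two populations have equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Each sample variance divided by its sample size
    v_a = statistics.variance(sample_a) / n_a
    v_b = statistics.variance(sample_b) / n_b
    # Difference in sample means divided by its standard error
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t, df


# Invented cost values, for illustration only; NOT from the soup file.
tomato_cost = [1, 2, 3, 4, 5]
vegetable_cost = [2, 3, 4, 5, 6]

t_stat, df = welch_t(tomato_cost, vegetable_cost)
print("t statistic:", t_stat)
print("degrees of freedom:", df)
```

The t statistic would then be compared with the one-tailed critical value at the 5% level for the computed degrees of freedom (from a t table or statistical software), and the conclusion restated in plain language.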
Each analysis needs to be clearly and professionally presented and formatted. Any statistical analysis or spreadsheet must be discussed in a non-technical form so that anyone not involved in the quantitative analysis would understand it. Results need to be interpreted and the implications of your conclusions discussed. You'll want to communicate the results of your analysis in a clear, articulate, professional, easily understood presentation.
For each test of hypothesis, clearly state the null and alternative hypotheses, and interpret your conclusion in non-statistical terms. Use a 5% level of significance. Where appropriate, use Excel’s Functions & Formulas and/or Data Analysis Tools or other statistical software. Feel free to include any extra insight and graphics/diagrams that would help your presentation.
Evaluation depends on how skillfully the student performs all of the above, not simply on the ability to calculate a certain result or provide a basic solution. Papers will be graded in relation to each other resulting in a hierarchy of grades based on performance.
The written parts of the report should be restricted to a maximum of eight (8) typed pages. While some tables, graphs, computer output, etc. can be part of your paper, most results should be attached in an appendix. Your paper should then periodically refer to these.