STAT 110 – Summer 2014 Assignment #8 ( pts.)
Due Tuesday, June 17th
- A researcher is interested in determining if smoking during pregnancy increases the risk of having an infant with a low birth weight (< 2,500g). Use the data file NC Birth (n = 10000).JMP.
Research Question: Do infants whose mothers smoked during pregnancy have a greater chance of having a low birth weight than infants whose mothers did not smoke during pregnancy?
- There are two categorical variables of interest in this study: Low Birth? (Low or Norm) and Smoker (Smoker (Cigs) or Nonsmoker (No Cigs)). Which is the response variable and which is the predictor variable? (2 pts.)
- Complete the 2 X 2 contingency table below using the information above. (2 pts.)
Low Birth?
Smoker / Norm / Low / Row Totals
Smoker (Cigs) / / /
Nonsmoker (No Cigs) / / /
Column Totals / / / n =
- Use JMP to create a mosaic plot with the relevant percentages shown and paste it below. What are the relevant proportions/percentages for this study? (3 pts.)
- What is the difference in the proportion of infants with low birth weight in the two groups determined by smoking status during pregnancy? Does this seem large or small? (2 pts.)
- Give a 95% confidence interval for the difference in the proportion of infants with low birth weight between the two populations being studied. Interpret. (3 pts.)
- What is the relative risk (RR) for low birth weight associated with a mother smoking during pregnancy? Use this value in a sentence that explains how it is to be interpreted. (3 pts.)
- Find a 95% confidence interval for the relative risk (RR) above. Does this confidence interval suggest there is increased risk of having a low birth weight infant associated with smoking during pregnancy? Explain how this confidence interval can be used to answer this question. (3 pts.)
- Next, use JMP to carry out Fisher’s Exact Test for these data: (8 pts.)
Step 1: Convert the research question into H0 and Ha. State each in words and in terms of the population parameters (p).
H0:
Ha:
Step 2: Determine the p-value and make a decision concerning H0.
p-value:
Step 3: Write a conclusion in terms of the original research question.
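For anyone who wants to check the JMP output by hand, the following is a minimal Python sketch of a one-sided Fisher's exact p-value. It assumes the alternative that the low-birth-weight proportion is higher among smokers, and it uses counts from the table given later in this problem (152 low-birth-weight infants among 1,100 smokers; 886 low-birth-weight infants among 9,958 births). The function name and its arguments are illustrative and are not part of JMP.

```python
from fractions import Fraction
from math import comb


def fisher_upper_tail(a, group_total, outcome_total, n):
    """One-sided Fisher's exact p-value, P(X >= a).

    With all table margins fixed, the count X in the cell of interest
    follows a hypergeometric distribution under H0.
    """
    k_max = min(group_total, outcome_total)
    # Count the tables with cell value k, summed over k = a, ..., k_max.
    tail = sum(
        comb(outcome_total, k) * comb(n - outcome_total, group_total - k)
        for k in range(a, k_max + 1)
    )
    # Exact rational arithmetic avoids float overflow with large counts.
    return float(Fraction(tail, comb(n, group_total)))


# Counts read from the assignment's table (an assumed reading of the cells).
p_value = fisher_upper_tail(a=152, group_total=1100, outcome_total=886, n=9958)
print(f"one-sided p-value = {p_value:.3g}")
```

JMP reports left-, right-, and two-sided versions of this test; the sketch corresponds to only one of the one-sided results, so match it against the line in the output that agrees with your Ha.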
Smoker / Normal / Low / Row Totals
Smoker (Cigs) / 948 ( ) / 152 ( ) / 1100
Nonsmoker (No Cigs) / 8124 ( ) / 734 ( ) / 8858
Column Totals / 9072 / 886 / n = 9958
- Find the expected value for each cell in the table above and put them inside the parentheses. (4 pts.)
- Use a chi-square test to determine if the proportion of infants with low birth weight differs between those born to mothers who smoked during pregnancy and those born to mothers who did not. (3 pts.)
- What percentage of mothers in this sample smoked during pregnancy? Is this a statistic or a parameter? Find a confidence interval for this percentage in the population of pregnant women in North Carolina. Interpret. (4 pts.)
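As a cross-check on the last part of this problem, here is a minimal Python sketch of a confidence interval for a single proportion. It assumes a 95% level (the question does not state one) and the normal-approximation (Wald) formula; JMP may report a slightly different interval if it uses another method. The function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for one proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# 1,100 smokers out of 9,958 mothers, taken from the row totals in the table above.
p_hat, lower, upper = wald_interval(1100, 9958)
print(f"sample proportion = {p_hat:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```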
- This problem describes a slightly larger-scale version of the smoke exposure and tumor growth study in the notes.
Twenty-four male strain A/J mice (6–8 weeks old) were randomly assigned to be exposed to simulated environmental tobacco smoke that consisted of a mixture of 89% sidestream and 11% mainstream smoke from Kentucky 1R4F reference cigarettes, at a chamber concentration of 87 mg/m³ total suspended particulate matter. All mice were exposed in 0.44 m³ stainless steel inhalation chambers for 6 h/day, 5 days/week for 5 months. Another 24 mice were randomly assigned to be kept in clean air during this time period. Then, all of the mice were allowed to recover for a further 4 months in filtered air before being killed for analysis of lung tumor incidence. Of the mice exposed to smoke, 20 out of 24 had a tumor; of the controls kept in air, only 9 of 24 had a tumor.
- Create a contingency table for these data using the template below. (3 pts.)
Developed Tumor?
Treatment / Tumor / No Tumor / Row Total
Smoke / / /
Control / / /
Column Total / / / n =
- Enter these data into JMP and obtain a mosaic plot of the results. Also find the following: (2 pts.)
- Use Fisher’s exact test to determine whether these data provide strong evidence that the tumor incidence was greater in the group exposed to smoke. Be sure to clearly state your p-value and conclusion in the context of the problem. (3 pts.)
- Calculate a 95% confidence interval for the difference in proportions. You can use either JMP or the formula discussed in the notes to find this interval. (3 pts.)
- Provide a clear interpretation of this confidence interval to discuss the degree of the difference between the proportions. (2 pts.)
- Find the relative risk for developing tumors associated with smoke exposure and find a confidence interval for the RR. Provide clear explanations of these for the researchers. (3 pts.)
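The two interval calculations in this problem can be sketched in Python as follows. This is a rough illustration under stated assumptions: the difference in proportions uses the standard normal-approximation (Wald) formula, and the relative risk interval is built on the log scale, which may or may not match the formula in the notes or the method JMP uses. With only 24 mice per group, these approximations are crude. Function names are illustrative.

```python
from math import exp, log, sqrt

Z_95 = 1.96  # standard normal critical value for 95% confidence


def diff_ci(x1, n1, x2, n2, z=Z_95):
    """Wald interval for the difference in proportions p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff, diff - z * se, diff + z * se


def relative_risk_ci(x1, n1, x2, n2, z=Z_95):
    """Relative risk p1 / p2 with an interval computed on the log scale."""
    rr = (x1 / n1) / (x2 / n2)
    se_log = sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    return rr, exp(log(rr) - z * se_log), exp(log(rr) + z * se_log)


# Smoke group: 20 of 24 mice with tumors; control group: 9 of 24.
diff, d_lo, d_hi = diff_ci(20, 24, 9, 24)
rr, rr_lo, rr_hi = relative_risk_ci(20, 24, 9, 24)
print(f"difference = {diff:.3f}, 95% CI = ({d_lo:.3f}, {d_hi:.3f})")
print(f"relative risk = {rr:.2f}, 95% CI = ({rr_lo:.2f}, {rr_hi:.2f})")
```

When interpreting the output, check whether the interval for the difference excludes 0 and whether the interval for the relative risk excludes 1.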
- Consider the following contingency table which classifies subjects according to both their profession and their frequency of alcohol use. These data are in the file Profession Alcohol.JMP.
Alcohol Use
Profession / High / Low
Clergy / 32 / 268
Educators / 51 / 199
Executives / 67 / 233
Retailers / 83 / 267
- Construct a mosaic plot for these data, and describe the differences you see in the frequency of alcohol use across professions based on this sample. (2 pts.)
- Calculate the expected frequency and the contribution to the overall chi-square statistic for each of the following cells: (4 pts.)
Retailers, High
E =
Contribution =
Clergy, Low
E =
Contribution =
- Carry out an appropriate hypothesis test to determine whether there is evidence of a significant relationship between profession and alcohol use. Be sure to clearly state hypotheses, your p-value, and the conclusion in the context of the problem. (4 pts.)
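The expected-frequency and chi-square calculations in this problem follow one pattern: E = (row total × column total) / n, and each cell contributes (O − E)² / E to the statistic. A minimal Python sketch, using the profession table above and a hypothetical helper name, is shown here. The printed example uses a cell that is not one of the two asked about.

```python
def expected_and_contributions(table):
    """Return expected counts and (O - E)^2 / E for every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    contrib = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    return expected, contrib


# Rows: Clergy, Educators, Executives, Retailers; columns: High, Low.
observed = [[32, 268], [51, 199], [67, 233], [83, 267]]
expected, contrib = expected_and_contributions(observed)
chi_square = sum(sum(row) for row in contrib)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Example cell: Educators, High (row index 1, column index 0).
print(f"Educators/High: E = {expected[1][0]:.2f}, contribution = {contrib[1][0]:.3f}")
print(f"chi-square = {chi_square:.2f} on {df} df")
```

The sketch stops at the test statistic and degrees of freedom; the p-value comes from the chi-square distribution with that many degrees of freedom, which JMP reports directly.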
- Computer Preference and Year in School
Is there a relationship between a WSU student's year in school and the laptop platform they choose? Use the data in the file Student Survey 2.JMP to answer this research question.
The variables you need to use are:
- Mac or PC
- Year in School – Freshman, Sophomore, Junior, Senior (all other classifications have been removed due to low numbers of respondents)
- State the hypotheses, find the p-value from the appropriate test, and summarize your findings. (4 pts.)
- If you conclude there is a significant relationship between year in school and laptop preference, summarize any important differences you see, citing appropriate percentages in your discussion. Can you think of a reason why you see this pattern? (Hint: students can get a new laptop every two years) (4 pts.)
- Eye Color and Hair Color
Is there a relationship between hair color and eye color of WSU students? Use the data in the file Student Survey 2.JMP to answer this question.
The variables you need to use are:
- Eye color – Blue, Brown, Green, Other
- Hair color – Black, Brown, Blond, Other
- State the hypotheses, find the p-value from the appropriate test, and summarize your findings. (4 pts.)
- If you conclude there is a significant relationship between hair color and eye color, summarize any important differences you see, citing appropriate percentages in your discussion. (2 pts.)
- Select Correspondence Analysis from the Contingency Analysis … pull down menu. A new plot will appear in your output below the test results. Examine this plot and comment on what you think it shows. Look at your mosaic plot at the same time; does it convey similar information? (3 pts.)