Assignment 12 – NURS 701 ~ Additional Problems (Part II)
Multiple Regression & ANCOVA (35 pts.)
3 – Cancer, Weight Loss, and Protein Turnover
Data File: CancerWeight Loss
Weight loss among cancer patients is a well-known phenomenon. Of interest to clinicians is the role played in the process by metabolic abnormalities. One investigation into the relationships among these variables yielded data on whole-body protein turnover (Y) and percentage of ideal body weight for height (X). Subjects were lung cancer patients and healthy controls of the same age.
a) Perform and summarize your findings from an appropriate t-test (pooled or non-pooled) to compare the mean protein turnover of lung cancer patients to that for healthy controls. Quantify the size of the difference in these means using a 95% CI for the difference in the population means. Interpret this interval. (3 pts.)
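As a rough illustration of what separates the pooled and non-pooled versions of the test in part (a), the sketch below computes the standard error and degrees of freedom both ways. The two small samples are made up for illustration and are not from the assignment data file; the function names are hypothetical, and the t quantile is passed in by hand because the Python standard library does not provide one.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_summary(a, b):
    """Difference in means with pooled and Welch (non-pooled) SE and df."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 divisor)
    diff = mean(a) - mean(b)

    # Pooled version: assumes the two populations share a common variance.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se_pooled = sqrt(sp2 * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Non-pooled (Welch) version: allows unequal variances.
    se_welch = sqrt(v1 / n1 + v2 / n2)
    df_welch = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )
    return diff, se_pooled, df_pooled, se_welch, df_welch


def conf_int(diff, se, t_crit):
    """Confidence interval: estimate plus or minus t quantile times SE."""
    return diff - t_crit * se, diff + t_crit * se


# Made-up samples, for illustration only.
group_a = [2.0, 3.0, 4.0, 5.0]
group_b = [1.0, 2.0, 3.0]
diff, se_p, df_p, se_w, df_w = two_sample_summary(group_a, group_b)

# 2.571 is the 0.975 t quantile for 5 df (from a t table); it matches df_p here.
lower, upper = conf_int(diff, se_p, t_crit=2.571)
```

When the two sample variances are similar, the two versions give nearly the same standard error; they diverge when the variances and group sizes differ markedly.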
b) Construct a scatter plot of these data using a different plotting symbol for each of the two groups, cancer and healthy controls. Fit a separate regression line for each group (use the Group By… option in the Bivariate Fit pull-down menu selecting Disease Status as the Grouping Column from the list of variables). Discuss any differences and similarities between these two regression lines. (3 pts.)
c) Fit a model that incorporates both ideal body weight for height and cancer status. Be sure to allow both the intercepts and the slopes to differ (i.e., fit the unrelated lines model to these data). To do this in JMP, select Analyze > Fit Model and set the dialog box up as shown below. To include the interaction term, which allows for different slopes, highlight Disease Status and Ideal Body Weight and select Full Factorial from the Macros pull-down menu. Is the interaction term needed? Explain. (2 pts.)
d) Fit the parallel lines model to these data and interpret the parameter estimates. To do this, drop the interaction term from the previous model, or simply put both Disease Status and Ideal Body Weight into the model effects box. May one conclude that the two sampled populations differ with respect to mean protein turnover when percentage of ideal weight is taken into account? (3 pts.)
e) Give and interpret a 95% CI for the difference between lung cancer patients and healthy controls, adjusting for the percentage of ideal body weight. To do this you will need to look at the parameter estimates section of the output from the model in part (d) and use the estimated coefficient for Disease Status along with its standard error. Don’t forget to multiply the resulting CI by 2 to account for the +1 and -1 coding (see the PowerPoint for details). How does this interval compare with the confidence interval you found in part (a)? (3 pts.)
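The reason for the factor of 2 in part (e): with +1/-1 coding, the two groups' predicted means at the same percentage of ideal body weight are (intercept + b) and (intercept - b), so they differ by 2b. The sketch below shows only the arithmetic. The function name is hypothetical, the numbers are made up rather than taken from the assignment data, and the t quantile (based on the model's error degrees of freedom) is assumed to come from a table or software.

```python
def difference_ci(coef, se, t_crit):
    """CI for the adjusted group difference under +1/-1 coding.

    coef   : estimated coefficient for the +1/-1 coded group term
    se     : its standard error
    t_crit : t quantile for the desired confidence level (supplied by user)
    """
    lower = 2 * (coef - t_crit * se)
    upper = 2 * (coef + t_crit * se)
    return 2 * coef, lower, upper


# Hypothetical values for illustration only (not results from the data file).
estimate, lower, upper = difference_ci(coef=0.5, se=0.2, t_crit=2.0)
```

Doubling the endpoints is equivalent to doubling both the coefficient and its standard error before forming the interval.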
f) Check model assumptions for these data. Discuss your findings. (3 pts.)
4 – Caregiver Burden of Senile Dementia Patients
Son et al., in their paper “Korean Adult Child Caregivers of Older Adults with Dementia” in the Journal of Gerontological Nursing (2003), examined the relationship between burden on the family caregivers and general characteristics of the family member with senile dementia. The dependent variable or response is caregiver burden as measured by the Korean Burden Inventory (BURDEN). Scores on this response range from 28 to 140, with higher scores indicating higher burden. Explanatory variables or predictors were the following:
- X1 = CGAGE – caregiver age (years)
- X2 = CGINCOME – caregiver income (Won-Korean currency)
- X3 = CGDUR – duration of caregiving (months)
- X4 = ADL – total activities of daily living where low scores indicate the elderly perform activities independently.
- X5 = MEM – memory and behavioral problems with higher scores indicating more problems
- X6 = COG – cognitive impairment with lower scores indicating a greater degree of cognitive impairment.
- X7 = SOCIALSU – total score of perceived social support (25 – 175, higher values indicating more support)
- Y = BURDEN - caregiver burden as measured by the Korean Burden Inventory.
Data File: Caregiver Burden.JMP
a) Examine a scatter plot matrix for these data. Which predictors exhibit the strongest correlation with the response? Comment on anything else that seems interesting in the scatter plot matrix. (4 pts.)
b) Find the partial correlations of the potential predictors with the response. Which predictors have the strongest adjusted relationship with the response? Explain. (3 pts.)
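To illustrate what "adjusted relationship" means in part (b), the sketch below computes a partial correlation between a predictor x and the response y after adjusting for a single other variable z, using the standard first-order formula. This is a simplified case with one adjusting variable; partial correlations in a problem with several predictors adjust for more than one variable at once and are best obtained from software. The function name and input correlations are hypothetical.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up pairwise correlations, for illustration only.
r_adjusted = partial_corr(r_xy=0.5, r_xz=0.5, r_yz=0.5)
```

In this made-up case the adjusted correlation is smaller than the raw correlation of 0.5, because part of the x-y association is shared with z.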
c) Fit the multiple regression model using all of the available predictors. What is the R-square? What is the adjusted R-square? Interpret these quantities. (3 pts.)
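For reference when interpreting part (c), the adjusted R-square penalizes R-square for the number of predictors in the model: adjusted R-square = 1 - (1 - R-square)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. A minimal sketch of that formula follows; the R-square and sample size are made-up inputs, not results from the data file.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-square for a model with p predictors fit to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical inputs: R-square of 0.50, 101 observations, 7 predictors.
adj = adjusted_r2(r2=0.50, n=101, p=7)
```

Unlike R-square, the adjusted version can decrease when a predictor that contributes little is added to the model.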
d) Reduce the full model from part (c) by dropping predictors one at a time. List the predictors in order of removal along with their p-values. (3 pts.)
          Dropped variable     p-value     Model R2     Change in R2
Step 1
Step 2
Step 3
Etc…
e) Check and discuss model assumptions. Include the appropriate graphical displays. (3 pts.)
f) Use the information provided below along with your final model to predict the caregiver burden score for the following individuals. (2 pts.)
Weng Kee Wong
CGAGE = 31, CGINCOME = 200, CGDUR = 48, ADL = 30, MEM = 7, COG = 17, SOCIALSU = 129
Shang Kai Shek
CGAGE = 25, CGINCOME = 600, CGDUR = 24, ADL = 47, MEM = 29, COG = 3, SOCIALSU = 137
In doing this, use your final reduced model from part (d). To find the predicted caregiver burden score, you need to plug the appropriate predictor values for these individuals into your model. For example, suppose your final model included CGAGE and ADL only (it won’t; this is hypothetical). You would then have the following:
Caregiver Burden (Y) = β₀ + β₁·CGAGE + β₂·ADL
where the values for the intercept (β₀) and “slopes” (β₁, β₂) will be found in the Parameter Estimates section. You would then plug the CGAGE and ADL values for the individual whose burden score you are trying to predict into the equation to obtain the predicted burden score.
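The plug-in step described above can be sketched as follows for the hypothetical CGAGE-and-ADL model. The intercept and slope values here are invented purely to show the arithmetic; the real values come from the Parameter Estimates section of your own final model. The function name is also hypothetical.

```python
def predict_burden(coefs, values):
    """Intercept plus the sum of (slope x predictor value) for each model term."""
    slopes = {name: b for name, b in coefs.items() if name != "Intercept"}
    return coefs["Intercept"] + sum(b * values[name] for name, b in slopes.items())


# Invented coefficients, for illustration only (not fitted to the data file).
coefs = {"Intercept": 10.0, "CGAGE": 0.5, "ADL": 1.0}

# Predictor values for the first individual listed in part (f).
person = {"CGAGE": 31, "CGINCOME": 200, "CGDUR": 48, "ADL": 30,
          "MEM": 7, "COG": 17, "SOCIALSU": 129}

predicted = predict_burden(coefs, person)  # 10.0 + 0.5*31 + 1.0*30 = 55.5
```

Only the predictors that remain in the model are used; the individual's other values are ignored.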