STAT 601 ~ Assignment #1 (Due Monday, June 4th by 11:59 PM)

75 points

Note: this assignment has been shortened from the one shown in the walk-through; problems 2, 5, and 8 have been removed.

Review the following:

Narrated PowerPoint Lectures under headings 1 – 4

Non-narrated PowerPoint Lectures under headings 1 – 4 (optional)

Lecture Handouts 1 – 4 (both blank and annotated versions are available)

1. An excerpt taken from “Lactation suppression: a pilot study” published in the Australian Journal of Advanced Nursing (1986) is as follows:

“The first method (experimental group) was based on a new approach which claimed that post-partum breast discomfort in non-breastfeeding women would be reduced if milk was extracted from the breasts. The second approach (hospital policy) involved adhering to the normal hospital methods. … Non-breastfeeding women admitted to Ward A were assigned to the experimental group (n = 95) and non-breastfeeding women admitted to Ward B acted as the control group or hospital policy group (n = 57). … Three times daily, participants entered details on the questionnaire relating to self-assessment of pain, use of analgesia and methods employed in suppressing lactation.”

Describe any design problems that are evident. Also discuss biases or confounding that might have arisen from any design deficiencies. (4 pts.)

3. An experiment was carried out to measure the effect of growth hormone on girls affected by a growth disorder called Turner’s syndrome. All 34 girls in the study were given the growth hormone. Their heights were measured at the time the hormone was first administered and again one year later. What are some problems with a study conducted in this way? (2 pts.)

4. A survey of nurse managers at major hospitals was taken. The survey revealed that one nurse manager in five was under medication for stress and almost half had visited doctors because of the pressure they were under. These figures came from the 250 questionnaires returned from the 2500 that were sent out. How reliable do you think the results are and why? (3 pts.)

6. As part of a clinical study several variables were measured for each patient. Classify each of the variables measured according to its data type (C = Continuous, O = Ordinal or N = Nominal). (6 pts.)

a) Age

b) Gender

c) Marital Status

d) Race

e) Previous hospitalization?

f) Anxiety Score

g) Cholesterol level

h) Smoking status

i) Alcohol consumption

j) Family history of cancer?
k) Blood pressure

l) White blood cell count

7. Using the article labeled “VA Cost Study” on the D2L site, answer the following questions using Table 1 on pg. 1378 of the paper. (8 pts.)

a) Classify each of the variables according to data type (C = continuous, O = ordinal, N = nominal).

b) How many survey responders were there? Nonresponders?

c) How many subjects and what percentage of the responders had a prior stroke?

d) Which poststroke Modified Rankin Scale score was most prevalent among the responders?

e) How many subjects and what percentage of the nonresponders had cognitive impairment at discharge?

f) On which characteristics/variables do the responders and nonresponders differ significantly? Later in the course we will examine different statistical methods that are appropriate for making these types of decisions based upon our data.

Problems 9 – 12 - Right Heart Catheterization

EFFECTIVENESS OF RIGHT HEART CATHETERIZATION IN
CRITICALLY ILL PATIENTS (JAMA, 1996), Connors et al.

An excerpt from the abstract….
OBJECTIVE: To examine the association between the use of right heart catheterization (RHC) during the first 24 hours of care in the intensive care unit (ICU) and subsequent survival, length of stay, intensity of care, and cost of care.

DESIGN: Prospective cohort study conducted at U.S. teaching hospitals between 1989 and 1994.

SUBJECTS: A total of 5735 critically ill adult patients receiving care in an ICU for 1 of 9 pre-specified
disease categories (MOSF w/ sepsis, MOSF w/ malignancy, lung cancer, COPD, coma, colon cancer, cirrhosis, CHF, ARF).

The variables that make up this database are described below:

Demographics and Disease Category

Variable name   Variable definition

Age             Age
Sex             Sex
Race            Race (white, black, other)
Edu             Years of education
Income          Income (Under $11k, $11 – $25k, $25 – $50k, > $50k)
Ninsclas        Medical insurance (Private, Private & Medicare, No Insurance, Medicare, Medicaid, Medicaid & Medicare)
Cat1            Primary disease category (MOSF w/ sepsis, MOSF w/ malignancy, lung cancer, COPD, coma, colon cancer, cirrhosis, CHF, ARF)

Categories of Admission Diagnosis
Diagnosis variables are all coded as (Y or N)

Resp            Respiratory Diagnosis
Card            Cardiovascular Diagnosis
Neuro           Neurological Diagnosis
Gastr           Gastrointestinal Diagnosis
Renal           Renal Diagnosis
Meta            Metabolic Diagnosis
Hema            Hematologic Diagnosis
Seps            Sepsis Diagnosis
Trauma          Trauma Diagnosis
Ortho           Orthopedic Diagnosis

Das2d3pc        DASI (Duke Activity Status Index)
Dnr1            DNR status on day 1 (Yes or No)
Ca              Cancer (Yes, No, Metastatic)
Surv2md1        Logistic model estimate of the probability of surviving 2 months
dth30           Patient died within 30 days? (Yes or No)

Aps1            APACHE score
Scoma1          Glasgow Coma Score
Wtkilo1         Weight
Temp1           Temperature
Meanbp1         Mean blood pressure
Resp1           Respiratory rate
Hrt1            Heart rate
Pafi1           PaO2/FIO2 ratio
Paco21          PaCO2
Ph1             pH
Wblc1           WBC
Hema1           Hematocrit
Sod1            Sodium
Pot1            Potassium
Crea1           Creatinine
Bili1           Bilirubin
Alb1            Albumin

Categories of Comorbid Illness (0 = no, 1 = yes)

Cardiohx        Acute MI, Peripheral Vascular Disease, Severe Cardiovascular Symptoms (NYHA-Class III), Very Severe Cardiovascular Symptoms (NYHA-Class IV)
Chfhx           Congestive Heart Failure
Dementhx        Dementia, Stroke or Cerebral Infarct, Parkinson’s Disease
Psychhx         Psychiatric History, Active Psychosis or Severe Depression
Chrpulhx        Chronic Pulmonary Disease, Severe Pulmonary Disease, Very Severe Pulmonary Disease
Renalhx         Chronic Renal Disease, Chronic Hemodialysis or Peritoneal Dialysis
Liverhx         Cirrhosis, Hepatic Failure
Gibledhx        Upper GI Bleeding
Malighx         Solid Tumor, Metastatic Disease, Chronic Leukemia/Myeloma, Acute Leukemia, Lymphoma
Immunhx         Immunosuppression, Organ Transplant, HIV Positivity, Diabetes Mellitus Without End Organ Damage, Diabetes Mellitus With End Organ Damage, Connective Tissue Disease
Transhx         Transfer (> 24 Hours) from Another Hospital
Amihx           Definite Myocardial Infarction

Swang1*         Right Heart Catheterization (RHC vs. No RHC)
Sadmdte         Study Admission Date
Dthdte          Date of Death
Lstctdte        Date of Last Contact
Dschdte         Hospital Discharge Date
Death*          Death at any time up to 180 Days (Yes or No)
Ptid            Patient ID (for labeling purposes only)

The dataset in JMP format is available under the Datasets section of the website and is called Right Heart Catheterization.

QUESTIONS AND TASKS

9) Complete the following table for comparing and contrasting the two treatment groups
in this study on the following demographics. (10 pts.)

Characteristics           No RHC (n = 3551)       RHC (n = 2184)
of the Subjects           n          %            n          %

Sex
   Female
   Male
Race
   Black
   White
   Other
Income
   < $11,000
   $11,000 - $25,000
   $25,000 - $50,000
   > $50,000

10) Use appropriate descriptive methods to examine the distribution of the APACHE score (variable name = aps1). Discuss and/or report the following summary statistics:

a) Mean, median, range, SD, 25th percentile (Q1), 75th percentile (Q3), interquartile range (IQR), and the coefficient of variation (CV) (3 pts.)

b) How would you characterize the distributional shape for the APACHE scores? (1 pt.)

c) Which is a better measure of typical value, the mean or the median, or doesn’t it matter? Explain. (2 pts.)
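If you want to check the definitions behind the JMP output in part (a), the statistics can be sketched in Python as shown below. This is only an illustrative sketch and is not part of the required JMP workflow. The function name describe and the list of scores are hypothetical and are not taken from the RHC dataset. Quartile conventions differ between software packages, so Q1 and Q3 computed this way may not match JMP exactly.

```python
import statistics

def describe(values):
    """Return the summary statistics listed in part (a) for a list of numbers."""
    # "inclusive" interpolates between observed values; other packages may
    # use a different quartile rule.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    return {
        "mean": mean,
        "median": median,
        "range": max(values) - min(values),
        "sd": sd,
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
        "cv_percent": 100 * sd / mean,  # CV = SD as a percentage of the mean
    }

# Hypothetical scores for illustration only; NOT taken from the RHC dataset.
scores = [30, 42, 47, 51, 55, 58, 63, 70, 84]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Because the CV divides the SD by the mean, it has no units, which is what makes it useful for comparing the spread of variables measured on different scales.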

The next questions ask you to examine the relationship between two variables from this study. In question 11 you will examine the relationship between 30-day mortality (dth30) and whether or not a patient had a right heart catheter/Swan-Ganz line (swang1) put in. In question 12 you will compare the APACHE scores of those who had a heart catheter put in to the scores of those who did not. You will probably want to review the PowerPoint lecture on examining relationships between variables first.

11) Compare the 30-day mortality rates of patients who had a right heart catheter put in
to those who did not. Construct an appropriate graphical display and summary
table to do this.
(Use variables: dth30 = 30-day mortality indicator & swang1 = RHC or No RHC)

Summarize the important findings from this analysis in one or two sentences. Be
sure to incorporate all relevant computer output in your assignment that you turn in.
(6 pts.)
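The summary table asked for in problem 11 is a two-way table of counts with percentages computed within each treatment group. The Python sketch below shows how such a table is built. It is an illustration only and does not replace the JMP output you turn in. The function name two_way_table is hypothetical, and the ten records are made up and are not taken from the RHC dataset.

```python
from collections import Counter

def two_way_table(groups, outcomes):
    """Count each (group, outcome) pair and compute row percentages within group."""
    counts = Counter(zip(groups, outcomes))
    group_totals = Counter(groups)
    table = {}
    for (group, outcome), n in counts.items():
        # Store the count and its percentage of that group's total.
        table.setdefault(group, {})[outcome] = (n, 100 * n / group_totals[group])
    return table

# Hypothetical records for illustration only; NOT taken from the RHC dataset.
swang1 = ["RHC", "RHC", "RHC", "RHC",
          "No RHC", "No RHC", "No RHC", "No RHC", "No RHC", "No RHC"]
dth30 = ["Yes", "No", "Yes", "No",
         "No", "No", "Yes", "No", "No", "No"]

table = two_way_table(swang1, dth30)
for group, row in table.items():
    for outcome, (n, pct) in row.items():
        print(f"{group:7s} dth30={outcome:3s} n={n} ({pct:.1f}%)")
```

The percentages are taken within each treatment group (row percentages) because the two groups differ in size, so raw counts alone cannot be compared directly.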

12) Compare the APACHE scores of patients who had a right heart catheter put in to
the scores of those who did not. Use an appropriate graphical display and
supporting summary statistics to compare these two groups of patients.
(Use variables: aps1 = APACHE score & swang1 = RHC or No RHC)

a) Summarize your findings from this analysis in one or two sentences,
citing appropriate numerical results in your discussion. Be sure to incorporate all
relevant computer output in your assignment when you turn it in. (6 pts.)

b) Given your results from problems 11 and 12(a), what can you say about the possible
confounding effect of the APACHE score? (2 pts.)
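The supporting summary statistics in problem 12 are the same descriptive measures computed separately for each treatment group. A minimal Python sketch of that idea follows. It is an illustration only and is not the required JMP output. The function name summarize_by_group is hypothetical, and the values are made up and are not taken from the RHC dataset.

```python
import statistics
from collections import defaultdict

def summarize_by_group(groups, values):
    """Split values by group label and summarize each group separately."""
    by_group = defaultdict(list)
    for group, value in zip(groups, values):
        by_group[group].append(value)
    return {
        group: {
            "n": len(vals),
            "mean": statistics.mean(vals),
            "median": statistics.median(vals),
            "sd": statistics.stdev(vals),  # sample SD (n - 1 in the denominator)
        }
        for group, vals in by_group.items()
    }

# Hypothetical values for illustration only; NOT taken from the RHC dataset.
swang1 = ["RHC", "No RHC", "RHC", "No RHC", "No RHC",
          "RHC", "No RHC", "RHC", "No RHC"]
aps1 = [60, 48, 52, 55, 41, 71, 50, 57, 56]

for group, stats in summarize_by_group(swang1, aps1).items():
    print(group, {name: round(value, 2) for name, value in stats.items()})
```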

13) Birth Weights of Infants Born in North Carolina

All of the parts of this question also require the use of JMP and the use of the low birth weight dataset described on pg. 62 of the Daniels text (if you don’t have the book don’t worry, you don’t need it). The dataset is called NCbirth.JMP and is available on the course website under Datasets.

Data description:

The North Carolina State Center for Health Statistics and Howard W. Odum Institute for Research in Social Science at the University of North Carolina at Chapel Hill make publicly available birth and infant death data for all children born in the state of North Carolina. These data can be accessed at:

The data contained in NCbirth.JMP represent a random sample of n = 800 births in North Carolina in 2001. The variables and their coding are described in the table on the following page.

In addition to those described above, the following variables were created and added to the file NCbirth.JMP:

White?          Coded as White or Non-White (dichotomous version of RACEMOM)

Hispanic?       Coded as Non-Hisp or Hisp (dichotomous version of HISPMOM)

a) Construct a display and obtain summary statistics for comparing the birth weight
in grams (tgrams) of infants born to mothers who smoked during pregnancy vs.
those who did not. Discuss in a short written paragraph the important results from
your descriptive analysis. (3 pts.)

b) A researcher believes that smoking during pregnancy is associated with increased risk of having a baby prematurely (gestational age < 36 weeks). What do these data suggest? Construct an appropriate display and give supporting statistics to answer this question. Summarize your findings with one or two sentences. (3 pts.)

c) The researcher also believes that smoking during pregnancy is associated with an increased risk of having an infant with a low birth weight. What do these data suggest? Construct an appropriate display and give supporting statistics to answer this question. Summarize your findings with one or two sentences. (3 pts.)

d) Which variable has more variation, weight gained during pregnancy by mother or
birth weight of infant in grams? Justify your answer using the appropriate statistics.
(3 pts.)

g) Complete the table below for comparing white mothers to non-white mothers. For
nominal variables report frequency (with % in parentheses) and for numeric variables
report the mean (with SD in parentheses) (10 pts.)

                                     White mothers      Non-white mothers
Variable                             (n = 604)          (n = 195)

Mother’s Age (yrs.)
Gestation (weeks)
Birth weight (g)
Marital Status      Married
                    Not married
Smoking Status      Smoked
                    Did not smoke
Low Birth Weight?   Yes
                    No
Premature?          Yes
                    No
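Part (b) of this problem defines prematurity through a cutoff on a numeric variable (gestational age < 36 weeks). The Python sketch below shows how a numeric variable can be turned into a Yes/No indicator and summarized by group. It is an illustration only and is not part of the required JMP work. The function name premature_rate_by_group, the list names, and the values are all hypothetical; they are not the variable names or data in NCbirth.JMP.

```python
def premature_rate_by_group(gestation_weeks, smoked, cutoff=36):
    """Proportion of births with gestation below the cutoff, per smoking group."""
    totals = {}
    premature = {}
    for weeks, group in zip(gestation_weeks, smoked):
        totals[group] = totals.get(group, 0) + 1
        # (weeks < cutoff) is True/False, which adds as 1/0.
        premature[group] = premature.get(group, 0) + (weeks < cutoff)
    return {group: premature[group] / totals[group] for group in totals}

# Hypothetical values for illustration only; NOT taken from NCbirth.JMP.
gestation_weeks = [39, 35, 40, 33, 38, 36, 41, 34]
smoked = ["No", "Yes", "No", "Yes", "No", "No", "Yes", "No"]

rates = premature_rate_by_group(gestation_weeks, smoked)
for group, rate in rates.items():
    print(f"Smoked = {group}: {100 * rate:.1f}% premature")
```

Note that a gestational age of exactly 36 weeks is not counted as premature under the strict "< 36" definition given in part (b).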
