ONLINE RESOURCE for Intensive Care Medicine

Development and Validation of Pediatric Risk Estimate Score for Children Using Extracorporeal Respiratory Support (Ped-RESCUERS)

Ryan P. Barbaro, MD, MSc, Philip S. Boonstra, PhD, Matthew L. Paden, MD,Lloyd A. Roberts, MBBS, Gail M. Annich, MD, MS,Robert H. Bartlett, MD, Frank W. Moler, MD, MS, and Matthew M. Davis, MD, MAPP

Affiliations: 1Department of Pediatrics, University of Michigan, Ann Arbor; 2Child Health Evaluation and Research (CHEAR) Unit, University of Michigan, Ann Arbor; 3School of Public Health Department of Biostatistics, University of Michigan, Ann Arbor 4Division of Pediatric Critical Care, Emory University, Atlanta, Georgia; 5Intensive Care Department, Alfred Hospitaland School of Public Health and Preventative Medicine, Monash University Melbourne, Australia;6Critical Care Medicine, University of Toronto, Toronto, Canada;7Department of Surgery, University of Michigan, Ann Arbor; 8Department of Internal Medicine, University of Michigan, Ann Arbor, and 9Gerald R. Ford School of Public Policy and Department of Health Management and Policy, School of Public Health, University of Michigan, Ann Arbor

Address correspondence to: Ryan Barbaro, University of Michigan, 1500 East Medical Center Drive, Mott F-6790/Box 5243, Ann Arbor, MI 48109, , (734) 764-5302

Supplemental Methods

Multiple Imputations

Primary Diagnosis and Comorbid Conditions

Model Fitting

Model Selection

Supplemental Tables

Table e1.Proportion ofmissing data among candidate variables

Table e2.International Classification of Diseases-9- Clinical Modification (ICD-9-CM) Definitions of primary diagnosis, cardiac arrest and acute renal failure

Table e3.Pre-ECMO characteristics of patients in the validation (2013-2014) dataset

Table e4. Blood gas and supportive therapies prior to extracorporeal membrane oxygenation (ECMO) support in the validation (2013-2014) dataset

Supplemental Tables

Figure e1. Expected versus observed mortality in pediatric respiratory ECMO

Supplemental Methods

Multiple Imputations

To make use of the partial information available in the observations with missing data, we used multiple imputation [1], specifically, iterative chained equations[2, 3] using the MICE statistical software [4, 5]. For continuous- and mixed continuous/discrete-type (`numeric') variables, for which a proportion of measurements may be exactly equal to a single number, we used predictive mean matching (PMM) [6], which imputes missing values with observed values from other similar observations. PMM has been found to be competitive with alternative imputation strategies [7, 8]. For binary- and categorical-type variables, we used logistic and categorical regression models. We evaluated 34 candidate variables for potential inclusion in the imputation analysis model. These are presented in Table 1, together with the percentages of missing values.

Logical relationships between variables were enforced during the course of the imputations, e.g. if ventilator type was imputed as high frequency oscillatory ventilation (HFOV), a measure for positive end expiratory pressure(PEEP) is not defined. Derived variables, such as the ratio of arterial partial pressure of oxygen to fraction of inspired oxygen (PF ratio), were imputed deterministically when one or more of their ingredients was missing. In order to prevent successive imputations of ventilator type from getting “stuck" at an initialized value due to high correlation between ventilator type and ventilator setting [rate, peak inspiratory pressure (PIP), PEEP, and/or mean airway pressure (MAP)], the imputation equation for ventilator type ignored the current imputed ventilator settings. In general, all covariates considered for inclusion in the final analysis model, which is described below, were used in the imputation model, plus percent oxygen saturation of hemoglobin in arterial blood, systolic blood pressure, diastolic blood pressure, fraction of inspired oxygen, and the arterial partial pressure of oxygen (PaO2).

Imputed datasets were individually analyzed via Bayesian methods, with the results combined across imputations as described in Gelman et al. [9] and Zhou and Reiter [10]. Based on results in Zhou and Reiter [10] we imputed 100 complete datasets for analysis.

We re-ran our entire analysis strategy to investigate sensitivity to unusual observed values of ventilator settings, which may impact the imputation and/or subsequent statistical analysis.

Unusual ventilator settings were defined as:

  1. Conventional ventilator peak inspiratory pressure > 50 and set respiratory rate <8 breaths per minute in a patient who does not have either asthma or an arterial partial pressure of carbon dioxide >100. (8 patients)
  2. HFOV with an amplitude < 10 cm H2O. (9 patients)
  3. HFOV with an frequency > 15 hertz (25 additional patients)
  4. HFOV with a MAP < 15 cm H2O or > 55 cm H2O (14 additional patients)
  5. HFOV with a recorded PEEP (25 additional patients)

In the sensitivity analysis, for these patients, all ventilator-related predictors (ventilator type, rate/frequency, PIP/amplitude, PEEP, and MAP) were assumed to be missing and multiply imputed. Additionally, we set to missing the PaO2 measurement for 6 patients having PaO2< 10 mmHg.

Candidate Variables

We defined primary diagnostic categories based on the International Classification of Disease-9-Clinical Modification (ICD-9-CM) diagnostic codes for primary diagnoses in the Extracorporeal Life Support Organization (ELSO) registry. We based our diagnostic categories on the previous work by Zabrocki and colleagues that identified 16 primary diagnoses. We excluded pneumocystis pneumonia and fungal pneumonia because these diagnoses were present in less than 1% of cases (or less < 16 cases). Since the diagnoses of influenza, bacterial pneumonia and viral pneumonia had similar outcomes and pathology, we combined them into a single category. We also added pulmonary hypertension and cardiac disease as diagnostic categories. Finally, if a patient was classified as acute respiratory failure, we used the secondary diagnosis to reclassify them into asthma, aspiration pneumonia, bronchiolitis, pertussis, pneumonia, pulmonary hemorrhage, chronic respiratory failure, congenital airway anomaly, drowning or inhalation injury, non-pulmonary infection, trauma or post-operative, pulmonary hypertension, cardiac disease, other respiratory disease or other.

We defined acute renal failure and cardiac arrest based on the ICD-9-CM codes. While someone could develop organ failure or cardiac arrest on ECMO we tried ensure we were capturing pre-ECMO disease by excluding those patients that had acute renal failure or cardiac arrest listed as a complication of ECMO. We defined comorbidities using ICD-9-CM as defined by Feudtner et al [11].

Model Fitting

After interacting ventilator settings with ventilator type, re-coding categorical variables into dummy indicator variables, and adding one prospectively identified covariate-by-covariate interaction (age in months on the log-scale by mean arterial pressure), a total of 53 potential predictors were considered for inclusion in the final analysis model. These are listed in Table 2. Not all possible ventilator-type-by-setting interactions were considered.

We location-scale transformed each imputed dataset to a common standard. For location, we used the observed mean (ignoring missing values) of each covariate. For scale, we used twice the standard deviation of the observed data. This is recommended by Gelman[12] in the presence of a mixture of continuous and binary covariates, because a change in two standard deviations for a binary predictor with probability 0.5 is interpreted as the difference in association between 0 and 1.

Next, we fit a multiple logistic regression model,

Equation (1): logit Pr(Y = 1|X) = μ + XTβ

where logit(x) = log(x/[1-x]), Y indicates whether the patient died, X is a length-53vector of potential variables, and β={β1,…,β53) is the vector of log-odds ratios (ORs). Based on our location-scale choice, βk is interpreted as the log-OR for mortality comparing a patient whose measurement is two standard deviations above the mean to a patient whose measurement is equal to the mean.

We employed a novel Bayesian variable selection strategy to include only those predictors whose association with mortality, measured by the log-OR, βk appears to be considerably far from zero. To do so, we used the so-called `horseshoe prior' [13, 14]. The horseshoe places a hierarchical shrinkage prior on each βk , given by βk N(0, [λkτ]2), λkC+(0,1), k=1,…,53, and τ~ C+(0,1), where N(0,σ2) is the normal distribution with standard deviation σ and C+(0,1) is the standard half-Cauchy distribution whose density is proportional to f(x)=1/(x2+1), for x ϵ (0,∞). As we will discuss, our model selection strategy depends on a prior that can shrink small observed associations to zero. Intuitively, the horseshoe is able to do so because of the heavy-tailed nature of the Cauchy distribution and the dual local/global shrinkage induced by the scale product λkτ . A small value of the global parameter τ can have a large shrinkage effect on small associations, pulling the corresponding βk’s to zero, while, simultaneously, the predictor-specific local parameters λk may be large, allowing for the flexibility to leave alone the βk’s corresponding to large empirical associations.

We fit the model using Hamiltonian Monte Carlo methods [15] with the STAN software interface in R [5, 16]. For each of the 100 imputed datasets, we ran the sampler for a warm-up period of 400 iterations, keeping the subsequent 300 draws of β, and combining these across all 100 imputed datasets, so as to propagate imputation uncertainty [9, 10]. This yields a combined 30,000 draws of βfrom the posterior distribution.

Model Selection: Development of Ped-RESCUERS

Model estimation after model selection can invalidate distributions of test statistics under

the null hypothesis [17, 18] and negatively impact out- of-sample predictions due to overfitting[19]. Because of this, we designed our strategy in such a way that the values of estimated log-ORs in the fit model did not depend on which variables were ultimately selected and which were not. To summarize our approach, we fit a single Bayesian logistic regression model given in equation (1). The kth predictor (out of 53 total) was selected ifPr(βk 0.08) 0.75 or Pr(βk < −0.08) 0.75.

The details of our strategy are as follows. We adaptively divided the set of predictors into X = {X(1), X(2)}, where X(1)denotes those that have a sufficiently large posterior probability of either being risk or protective factors, i.e. satisfying the criteria above. For each predictor, these probabilities are estimated by determining the proportion of draws, out of 30,000, for which βk exceeds 0.08 (risk factor) or falls below −0.08 (protective factor). Given a hypothetical baseline probability of mortality of logit-1(-0.405) = 0.40, where logit-1(x) = 1/[1 + exp(−x)], a log-OR of t changes this probability to logit-1(-0.405+t). Our choice of cutoff t = ±0.08 means that X(1)comprises those predictors for which the data are highly suggestive (with probability 0.75) of a two-standard deviation increase changing a baseline probability of 0.40 by at least 0.02, since logit-1(-0.405+0.08) = 0.42 and logit-1(-0.405-0.08) = 0.38. The set of log-ORs corresponding to X(1), collectively denoted by β(1), was estimated by the posterior mean, E[β(1)], the average of the 30,000 draws.

The remaining predictors are in X(2), comprising those that explain little variation in mortality. Accordingly, we make the simplifying assumption that the effect of X(2) is captured as a population-wide constant: X(2)Tβ2 ≈ E[X(2)]Tβ(2). In fact, because each covariate is standardized to have mean zero, E[X(2)]Tβ(2) = 0 by construction, meaning that these covariates may be dropped entirely.

To summarize, a patient’s model-based risk of mortality p(X) = logit-1(µ + XTβ) is estimated by pˆ(X) = pˆ∗(X(1)) = logit-1(E[µ] + X(1) TE[β(1)]).

1

Table e1. Proportion ofmissing data among candidate variables

Variable / Development (N=1,611) / Validation (N=847)
Missing / % / Missing / %
Mean Arterial Pressure,a mm Hg / 336 / 20.9 / 158 / 18
Time between admission and ECMO, hours / 100 / 6.2 / 65 / 9.1
Weight, kilograms / 56 / 3.5 / 25 / 3.1
Age, months / 0 / 0 / 0
Female / 10 / 0.6 / 22
Primary Diagnosis / 0 / 0 / 0 / 0
Comorbiditiesb / 0 / 0 / 0
Complications of Acute Illness
Acute Renal Failurec / 0 / 0 / 0 / 0
Cardiac Arrestc / 0 / 0 / 0 / 0
pHa / 86 / 5.3 / 74 / 7.1
PaCO2a / 88 / 5.5 / 78 / 7.3
Bicarbonatea / 174 / 10.8 / 120 / 12
PF ratioa / 173 / 10.7 / 113 / 8
pre-ECMO duration of mechanical ventilation / 90 / 5.6 / 82 / 10
Ventilator Type / 233 / 14.5 / 121 / 12.7
Rate / 253 / 15.7 / 152 / 17.8
Peak Inspiratory Pressure or Amplitude, as appropriate / 325 / 20.2 / 171 / 19.6
Positive End Expiratory Pressure, if appropriated / 243 / 15.1 / 137 / 15.1
Mean Airway Pressure / 462 / 28.7 / 263 / 28.9
Hand Bagging / 40 / 2.5 / 0 / 0
Inhaled Nitric Oxide / 0 / 0 / 0 / 0
Neuromuscular Blockade / 0 / 0 / 0 / 0
Vasoactive Infusions / 0 / 0 / 0 / 0
Milrinone Infusions / 0 / 0 / 0 / 0
Continuous Renal Replacement Therapy / 0 / 0 / 0 / 0

PaCO2= the arterial partial pressure of carbon dioxide, HCO3=serum bicarbonate, PF ratio = ratio of the arterial pressure of oxygen to the fraction of inspired oxygen, ECMO= Extracorporeal Membrane Oxygenation,

aMost abnormal value recorded within 6 hours of receipt of ECMO support

bComorbidities are defined by Feudtner et al.[11]

cPre-ECMO renal failure and pre-ECMO cardiac arrest are defined by ICD-9-CM codes plus the respective absence of renal failure or cardiac arrest as an ECMO complication.

dChildren on the high frequency ventilator were not expected to have a positive end expiratory pressure and so they did not contribute towards the tally of missing positive end expiratory pressure values.

1

Table e2.International Classification of Diseases-9- Clinical Modification (ICD-9-CM) Definitions of primary diagnosis, cardiac arrest and acute renal failure

Diagnosis / ICD-9-CM / Description
Asthma / 493-493.12 / extrinsic and intrinsic asthma
493.9-493.92 / unspecified asthma
Bronchiolitis / 79.6, 480.1 / respiratory syncytial virus (RSV)
466.1-466.19 / bronchiolitis
Pertussis / 033-033.9, 484.3 / whooping cough and pertussis pneumonia
Cardiac Arrest / 427.5 / cardiac arrest
779.85 / cardiac arrest of newborn
v12.53 / personal history of cardiac arrest
Acute Renal Failure / 584-584.9 / acute kidney failure

1

Table e3.Pre-ECMO characteristics of patients in the validation time period, 2013-2014

Variable / All, (N=847) / Survived, (N=525) / Died, (N=322)
Median (Interquartile Range)
Mean Arterial Pressure,a mm Hg / 58 (48-70) / 59 (48-71) / 56 (45-68)
Time between admission and ECMO, hours / 48 (10-172) / 39 (9-142) / 62 (15-221)
Weight, kilograms / 14 (7-40) / 13 (7-34) / 16 (6-46)
Age, months / 32 (7-134) / 30 (8-115) / 39 (5-155)
Number (%)
Female / 392 (46.3) / 246 (46.9) / 146 (45.3)
Primary Diagnosis
Asthma / 45 (5.3) / 39 (7.4) / 6 (1.9)
Aspiration Pneumonia / 9 (1.1) / 9 (1.7) / 0 (0)
Bronchiolitis / 73 (8.6) / 53 (10.1) / 20 (6.2)
Pertussis / 14 (1.7) / 4 (0.8) / 10 (3.1)
Viral or Bacterial Pneumonia / 195 (23.0) / 133 (25.3) / 62 (19.3)
Pulmonary Hemorrhage / 9 (1.1) / 8 (1.5) / 1 (0.3)
Chronic Respiratory Failure / 22 (2.6) / 13 (2.5) / 9 (2.8)
Congenital Airway Anomaly / 15 (1.8) / 6 (1.1) / 9 (2.8)
Other Respiratory Disease / 157 (18.5) / 87 (16.6) / 70 (21.7)
Drowning, Inhalation, or Foreign Body / 18 (2.1) / 12 (2.3) / 6 (1.9)
Infection / 101 (11.9) / 54 (10.3) / 47 (14.6)
Trauma or Postoperative / 38 (4.5) / 22 (4.2) / 16 (5.0)
Pulmonary Hypertension / 20 (2.4) / 9 (1.7) / 11 (3.4)
Cardiac Disease / 51 (6.0) / 28 (5.3) / 23 (7.1)
Other / 80 (9.5) / 48 (9.1) / 32 (9.9)
Comorbidities
Neuromuscular / 5 (0.6) / 1 (0.2) / 4 (1.2)
Cardiovascular / 92 (10.9) / 46 (8.8) / 46 (14.3)
Respiratory / 63 (7.4) / 39 (7.4) / 24 (7.4)
Renal / 7 (0.8) / 5 (0.9) / 2 (0.6)
Gastrointestinal / 13 (1.5) / 5 (0.9) / 8 (2.5)
Hematology / 8 (0.9) / 6 (1.1) / 2 (0.6)
Immunodeficiency / 23 (2.7) / 14 (2.7) / 9 (2.8)
Metabolic / 10 (1.2) / 4 (0.8) / 6 (1.9)
Other Congenital or Genetic Defects / 57 (6.7) / 32 (6.1) / 25 (7.8)
Malignancy / 50 (5.9) / 19 (3.6) / 31 (9.6)
Complications of Acute Illness
Acute Renal Failureb / 17 (2.0) / 8 (1.5) / 9 (2.8)
Cardiac Arrestb / 38 (4.5) / 23 (4.4) / 15 (4.7)

aMost abnormal value recorded within 6 hours of receipt of ECMO support

bPre-ECMO renal failure and pre-ECMO cardiac arrest are defined by ICD-9-CM codes plus the respective absence of renal failure or cardiac arrest as an ECMO complication. Comorbidities as defined by Feudtner et al [11].

1

Table e4: Blood gas and supportive therapies prior to extracorporeal membrane oxygenation (ECMO) support in the validation time period, 2013-2014

Variable / All, (N=847) / Survived, (N=525) / Died, (N=322)
Median (IQR)
pHa / 7.19 (7.06-7.31) / 7.20 (7.06-7.32) / 7.17 (7.07-7.29)
PaCO2,a mm Hg / 67 (50-92) / 67 (50-93) / 67 (51-90)
Bicarbonate,ammol/L / 26 (20-33) / 26 (21-33) / 25 (19-34)
PF Ratioa / 56 (43-80) / 58 (45-84) / 52 (40-75)
Pre-ECMO duration of mechanical ventilation, hours / 40 (14-128) / 39 (13-117) / 42 (15-150)
Conventional Ventilator Settings
Rate†, breaths per minute / 26 (20-35) / 25 (20-32) / 29 (20-36)
Peak Inspiratory Pressure,a cm H2O / 34 (29-40) / 34 (28-40) / 34 (30-40)
Positive End Expiratory Pressure,a cm H2O / 10 (6-12) / 10 (6-12) / 10 (7-12)
Mean Airway Pressure,a cm H2O / 19 (15-24) / 18 (15-23) / 21 (16-25)
High Frequency Oscillatory Ventilator Settings
Frequency,a Hertz / 7 (6-8) / 7 (6-8) / 7 (5-9)
Amplitude,a cm H2O / 55 (45-70) / 56 (45-70) / 55 (45-65)
Mean Airway Pressure,a cm H2O / 28 (24-32) / 27 (24-31) / 30 (25-34)
Other Ventilatorb Settings
Mean Airway Pressure,a cm H2O / 20 (12-34) / 19 (14-34) / 28 (6-32)
Number (%)
Ventilator Type
Conventional Ventilator / 380 (44.9) / 231 (44.0) / 149 (46.3)
High Frequency Oscillatory Ventilator / 329 (38.8) / 203 (38.7) / 126 (39.1)
Other Ventilatorb / 17 (2.0) / 11 (2.1) / 6 (1.9)
Missing / 121 (14.3) / 80 (15.2) / 41 (12.7)
Other Pre-ECMO Supportive Therapies
Hand Ventilation / 75 (8.9) / 49 (9.3) / 26 (8.1)
Inhaled Nitric Oxide / 366 (43.2) / 221 (42.1) / 145 (45.0)
Neuromuscular Blockade / 484 (57.1) / 301 (57.3) / 183 (56.8)
Vasoactive Infusions / 343 (40.5) / 208 (39.6) / 135 (41.9)
Milrinone Infusion / 121 (14.3) / 77 (14.7) / 44 (13.7)
Continuous Renal Replacement Therapy / 14 (1.7) / 7 (1.3) / 7 (2.2)

PaCO2= the arterial partial pressure of carbon dioxide, PF ratio=the ratio of arterial partial pressure of oxygen to fraction of inspired oxygen

aMost abnormal value recorded within 6 hours prior to the receipt of ECMO support

bOther Ventilator in the Extracorporeal Life Support Organization Registry is defined as other high frequency ventilator and usually corresponds to the jet ventilator.

Figure e1. Expected versus observed mortality in pediatric respiratory ECMO

Patients in the 2014 cohort were sorted by order of increasing estimated risk of mortality. The x-axis is the expected probability of mortality, in a moving window of 25 patients, and the y-axis gives the difference between expected and observed probability of mortality. Values above (below) the line y = 0 indicate over-(under-)estimation of mortality risk. The dashed bars are the point-wise 95% confidence bands based on the null hypothesis of correct calibration.

1

References

1.Rubin DB (2004) Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons.

2.Van Buuren S, Brand JPL, Groothuis-Oudshoorn CGM, Rubin DB, (2006) Fully conditional specication in multivariate imputation. J Stat Comput Simul 76 (12): 1049-1064.

3.White IR, Royston P, Wood AM, (2011) Multiple imputation using chained equations: Issues and guidance for practice. Stat Med 30: 377-399.

4.Van Buuren S, Groothuis-Oudshoorn K, (2011) mice: Multivariate Imputation by Chained Equations in R. J Stat Software 45 1-67.

5.R Development Core Team (2013) R: A language environment for statistical computing. In: Editor (ed)^(eds) Book R: A language environment for statistical computing. R Foundation for Statistical Computing.

6.Little RJ, (1988) Missing-data adjustments in large surveys. J Bus Econ Stat 6: 287-296.

7.Marshall A, Altman DG, Royston P, Holder RL, (2010) Comparison of techniques for handling missing covariate data within prognostic modelling studies: a simulation study. BMC Med Res Methodol 10: 7

8.Vink G, Frank LE, Pannekoek J, van Buuren S, (2014) Predictive mean matching imputation of semicontinuous variables. Stat Neerl 68: 61-90.

9.Gelman A, Carlin JB, Stern HS, Rubin DB (2004) Bayesian data analysis. Taylor & Francis.

10.Zhou X, Reiter JP, (2010) A Note on Bayesian Inference After Multiple Imputation. Am Stat 64: 159-163.

11.Feudtner C, Hays RM, Haynes G, Geyer JR, Neff JM, Koepsell TD, (2001) Deaths attributed to pediatric complex chronic conditions: national trends and implications for supportive care services. Pediatrics 107: E99.

12.Gelman A, (2008) Scaling regression inputs by dividing by two standard deviations. Stat Med 27: 2865-2873.

13.Carvalho CM, Polson NG, Scott JG, (2009) Handling Sparsity via the Horseshoe. J Mach Learn Res 5: 73-80.

14.Carvalho CM, Polson NG, Scott JG, (2010) The horseshoe estimator for sparse signals. Biometrika 97: 465-480.

15.Neal RM (2011) Mcmc using hamiltonian dynamics. In: Brooks S, Gelman A, Jones GL, Meng X (eds) Handbook of Markov Chain Monte Carlo. Chapman & Hall/CRC, Boca Raton, FL

16.Stan Development Team (2014) RStan: the R interface to Stan, Version 2.5.0.

17.Leeb H, ouml, tscher BM, (2005) Model selection and inference: Facts and fiction. Econometric Theory 21: 21-59.

18.Berk R, Brown L, Zhao L, (2010) Statistical inference after model selection. J Quant Criminol 26: 217-236.

19.Breiman L, (1996) Bagging predictors. Mach Learn 24: 123-140.