
18-Dec-14

Supplementary material

1 Impact of complete case analysis

This example is based on a prospective observational study aiming to identify predictive factors for insufficient peak serum concentration (Cmax) in a population of critically ill patients treated with amikacin (15). The final multivariate analysis of predictive factors of amikacin Cmax < 60 mg/L included the 24-h fluid balance, the BMI and cirrhosis (Table below). Among the 181 episodes analyzed, 37 (20%) had no information on the 24-h fluid balance. A Complete Case Analysis (CCA), which excludes patients with missing data, and a Multiple Imputation (MI) approach were both performed. We reproduce here the table of this multivariate analysis.

First, CCA implies a loss of information and thus a loss of power. This is illustrated here by the wider 95% confidence interval (CI) for the odds ratio (OR) of the BMI under CCA than under MI. Second, if the excluded patients are not a representative subsample of the whole sample, their exclusion can bias the results. In this example, the conclusions regarding the effect of cirrhosis on the outcome differ according to the method: with CCA, the 95% CI of the OR for cirrhosis includes 1 (i.e., the effect is not statistically significant), whereas the CI based on MI does not (and the effect is thus significant). In contrast, the 24-h fluid balance has a significant effect based on CCA and no significant effect based on MI. This illustrates the substantial impact that CCA can have on conclusions and, thus, on the performance of predictive risk models in the ICU.
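The two mechanisms described above (fewer usable observations, and a non-representative complete-case subsample) can be illustrated with a small simulation. This is only a sketch on synthetic data, not a reanalysis of the amikacin study: the variables `x` and `z`, the missingness probabilities and the function name `cca_mean_bias` are all assumptions chosen for illustration.

```python
import random
import statistics


def cca_mean_bias(n=50000, seed=0):
    """Compare the full-data mean of x with its complete-case mean.

    Synthetic setting (assumed for illustration): z is always observed,
    x depends on z, and x is missing more often when z is high, so the
    complete cases are not a representative subsample.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]    # fully observed covariate
    x = [zi + rng.gauss(0, 1) for zi in z]     # variable subject to missingness

    # Missingness depends only on the observed z: 60% missing when z > 0,
    # 10% missing otherwise.
    observed = [rng.random() > (0.6 if zi > 0 else 0.1) for zi in z]

    full_mean = statistics.fmean(x)
    cca_mean = statistics.fmean(xi for xi, o in zip(x, observed) if o)
    n_complete = sum(observed)
    return full_mean, cca_mean, n_complete


full_mean, cca_mean, n_complete = cca_mean_bias()
print(f"full-data mean: {full_mean:.3f}")
print(f"complete-case mean: {cca_mean:.3f} (n = {n_complete})")
```

In this setting the complete-case mean is shifted downward because episodes with high `z` (and therefore high `x`) are over-represented among the excluded records, and the estimate also rests on fewer observations.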

de Montmollin E, Bouadma L, Gault N, Mourvillier B, Mariotte E, Chemam S, Massias L, Papy E, Tubach F, Wolff M, Sonneville R. Predictors of insufficient amikacin peak concentration in critically ill patients receiving a 25 mg/kg total body weight regimen. Intensive Care Med. 2014 Jul;40(7):998-1005.

2 Variables to include in imputation model

An important consideration in choosing the imputation model is the selection of variables that will contribute to the imputation process. Obviously, variables with missing values should be included. However, an important principle of MI is compatibility of the imputation model with the analysis model: it is desirable that the two models both be consistent with a single overall model for the data (12). This requires that all the variables in the analysis model (including the outcome variable) be included in the imputation model, even those that have no missing values. Additionally, one may wish to include 'auxiliary' variables that are correlated with the incomplete variables, in order to make the MAR assumption more credible. The MAR assumption means that, although missingness in a variable may depend on the value of that variable, any such dependence disappears once the other observed variables are conditioned on. Thus, although the MAR assumption cannot be validated from the data alone, it becomes more plausible as more covariates are included in the imputation model.

One of the most common mistakes in MI is to omit the outcome from the imputation model, which results in potentially biased estimates. In the specific setting where some individuals have missing outcomes, it has been suggested that these individuals should be included in the imputation but then excluded from the analysis (2). However, including all individuals in the analysis (even those with a missing outcome) is worthwhile for reducing bias and/or improving efficiency when auxiliary variables are highly correlated with the outcome (12).
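The consequence of omitting the outcome from the imputation model can be sketched on synthetic data. The example below assumes a linear model with a single covariate `x` that is missing completely at random, and it uses a simplified single stochastic-regression imputation (no parameter draws, no pooling over several datasets), so it is not a full MI procedure. The helper names `ols` and `simulate` are hypothetical.

```python
import random
import statistics


def ols(x, y):
    """Return (intercept, slope, residual SD) of the least-squares line y ~ x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [b - intercept - slope * a for a, b in zip(x, y)]
    return intercept, slope, statistics.pstdev(resid)


def simulate(n=20000, beta=1.0, p_missing=0.5, seed=1):
    """Estimate the slope of y on x after imputing x without / with the outcome y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 1) for xi in x]
    missing = [rng.random() < p_missing for _ in range(n)]
    x_obs = [xi for xi, m in zip(x, missing) if not m]
    y_obs = [yi for yi, m in zip(y, missing) if not m]

    # Imputation model WITHOUT the outcome: draw from the marginal of x.
    mu, sd = statistics.fmean(x_obs), statistics.pstdev(x_obs)
    x_no_y = [rng.gauss(mu, sd) if m else xi for xi, m in zip(x, missing)]

    # Imputation model WITH the outcome: regress x on y among observed
    # cases, then impute the prediction plus random residual noise.
    a, b, s = ols(y_obs, x_obs)
    x_with_y = [a + b * yi + rng.gauss(0, s) if m else xi
                for xi, yi, m in zip(x, y, missing)]

    return ols(x_no_y, y)[1], ols(x_with_y, y)[1]


slope_no_y, slope_with_y = simulate()
print(f"slope, outcome omitted from imputation:  {slope_no_y:.2f}")
print(f"slope, outcome included in imputation:   {slope_with_y:.2f}")
```

Under these assumptions the slope is attenuated toward zero when the outcome is left out, because the imputed values of `x` are unrelated to `y`, whereas including the outcome keeps the estimate close to the true value of `beta`.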

3 Choice of number of imputed datasets

Readers of articles using MI will notice that m, the number of imputed datasets, varies between applications. The larger m is, the more efficient the analysis, but the longer it takes to carry out. Early articles on MI suggested that values of m as small as 3 to 5 were adequate, since the efficiency gain from using a larger m was minimal. However, more recent advice is that m should be at least equal to the percentage of incomplete cases (12).
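Both pieces of advice can be expressed numerically. The sketch below uses the classical relative-efficiency approximation usually attributed to Rubin, 1 / (1 + λ/m), where λ is the fraction of missing information, together with the rule of thumb quoted above. The function names are hypothetical, and the value λ = 0.2 is an illustrative assumption, not a quantity reported by the study.

```python
import math


def relative_efficiency(fmi: float, m: int) -> float:
    """Efficiency of m imputations relative to infinitely many.

    fmi is the fraction of missing information (between 0 and 1).
    """
    return 1.0 / (1.0 + fmi / m)


def min_imputations(n_incomplete: int, n_total: int) -> int:
    """Rule of thumb: m at least equal to the percentage of incomplete cases."""
    return max(1, math.ceil(100 * n_incomplete / n_total))


# Illustrative fraction of missing information (assumed, not from the study).
print(round(relative_efficiency(0.2, 5), 3))   # efficiency with m = 5

# Section 1 example, assuming the 24-h fluid balance is the only
# incomplete variable: 37 incomplete episodes out of 181.
print(min_imputations(37, 181))
```

With λ = 0.2, five imputations already reach about 96% relative efficiency, which is the reasoning behind the early recommendation. The percentage rule applied to 37 incomplete episodes out of 181 gives m = 21.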