Electronic supplementary material

Evaluation of the user-friendliness of 7 new generation intensive care ventilators

Laurence Vignaux, Didier Tassaux, Philippe Jolliet

Service des soins intensifs, Hôpitaux universitaires de Genève, Geneva, Switzerland

Word count: 2'904

Running title: ICU ventilator user-friendliness

Keywords: mechanical ventilation; user-friendliness; ergonomics

Address for correspondence
Laurence Vignaux

Service des soins intensifs
Hôpital Cantonal Universitaire
1211 Geneva 14, Switzerland
Phone number: (+41 22) 372.90.47
Telefax: (+41 22) 372.91.05
Email:

Material and methods

All machines tested were stock models with the most recent software version, without modifications, and all were tested in operating conditions conforming to the manufacturer's specifications.

All physicians participating in the study had prior knowledge of and training in the general principles and practice of mechanical ventilation, and had at least ten years of experience with ventilators both in their specialty and in the ICU. However, none were familiar with the particular machines tested. The pulmonologists were in charge of home ventilation care and the anesthesiologists routinely used operating room ventilators. To further ensure homogeneity among the physicians, all received a lexicon of the current terms used in mechanical ventilation, as well as a chapter summarizing the principles of mechanical ventilation, one week prior to the test. Immediately before the first ventilator test, each physician took an 8-question multiple-choice exam on mechanical ventilation. The score obtained was recorded (maximum 40 points).

Test procedure: All machines were tested in random order, determined by drawing a folded slip of paper with the name of the machine from a box.
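Drawing slips from a box without replacement is equivalent to a random permutation of the machines. A minimal sketch of that equivalence follows; the function name `draw_test_order` and the use of a seeded generator are illustrative assumptions, not part of the study procedure, which used paper slips.

```python
import random

# The seven ventilators named in this supplement.
VENTILATORS = [
    "Avea", "Elysée", "Engström Carestation", "Evita XL",
    "G5", "PB840", "Servoi",
]

def draw_test_order(rng=None):
    """Return the ventilators in random order (slips drawn without replacement).

    `rng` is an optional random.Random instance; passing a seeded one makes
    the order reproducible (an assumption for illustration only).
    """
    rng = rng or random.Random()
    order = list(VENTILATORS)  # copy so the reference list stays untouched
    rng.shuffle(order)
    return order

if __name__ == "__main__":
    print(draw_test_order())
```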

Each ventilator was equipped with a standard double limb circuit and connected to a test lung model (Pneu View AI 2601I TTL, Michigan Instruments, Grand Rapids, MI, USA) described in previous studies [1, 2]. The mechanics of the lung chamber were set to normal elastance (E, 20 cmH2O.L-1) and resistance (R, 5 cmH2O.L-1.s).

The tasks were indicated one at a time, orally, by the observer, once the previous task had been accomplished. The physicians could request any clarification needed as to the nature of the task, to the exclusion of any information likely to facilitate the procedure on the specific ventilator (e.g. knob location, menu access). The time to perform each task was measured with a precision stopwatch by the tester and recorded. An arbitrary upper limit of 3 minutes was set, at and above which task failure was declared.

The tester was an experienced respiratory physiotherapist (LV) with in-depth knowledge of each machine and task to be accomplished. The tester's time to accomplish each task was used as the reference [3]. For each task, the tester gave the starting signal and stopped the timing as soon as the goal had been attained, or at the 3-minute task failure limit.
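The timing rule described above can be sketched as follows. The names `FAILURE_LIMIT_S` and `score_task` are hypothetical, and capping the recorded time at the limit is our reading of "stopped the timing ... at the 3 minute task failure limit", not a detail stated explicitly.

```python
FAILURE_LIMIT_S = 180  # 3-minute limit, at and above which failure is declared

def score_task(elapsed_s):
    """Return (recorded_time_s, failed) for one timed task.

    Timing stops at the failure limit, so the recorded time is assumed
    never to exceed FAILURE_LIMIT_S.
    """
    failed = elapsed_s >= FAILURE_LIMIT_S
    recorded = min(elapsed_s, FAILURE_LIMIT_S)
    return recorded, failed

if __name__ == "__main__":
    for t in (45, 180, 200):
        print(t, score_task(t))
```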

The tasks to accomplish were the following:

1. Turning the ventilator on

With the ventilator completely assembled and connected to the power supply, the tester had to start the ventilator; the stop signal was given at the first insufflation produced by the ventilator.

2. Recognizing mode and parameters

With the ventilator turned on and running in a preset mode, the tester had to indicate on a written chart: a) whether the machine was in pressure or volume control; b) what the set parameters were [volume control: inspired oxygen fraction (FIO2), tidal volume (VT), respiratory rate (RR), inspiratory flow, plateau time (tplat) and positive end-expiratory pressure (PEEP); pressure control: FIO2, inspiratory pressure level, RR, inspiratory flow, tplat and PEEP] and what the measured parameters were [volume control: plateau pressure, inspiratory:expiratory ratio (I:E) and minute volume (MV); pressure control: VT, MV]. The stop signal was given when the observer had filled the chart in.

The first machine tested was in pressure control, then pressure and volume control were alternated on each subsequent ventilator. The settings were: volume control: FIO2 0.4, VT 500 ml, RR 12/min, inspiratory flow 60 L/min, tplat 0.5 s, PEEP 5 cm H2O; pressure control: FIO2 0.3, inspiratory pressure level 20 cm H2O, RR 15/min, PEEP 4 cm H2O.

3. Recognizing and setting alarms

With the ventilator turned on and running in a preset mode, the tester changed one alarm setting to trigger an alarm signal, and the physician had to stop the alarm, identify its cause, correct the level of the modified parameter, and reset the alarm so that the message disappeared from the display. The timing stop was given when the latter occurred. The alarms were: high pressure, low MV, high RR, low VT, alternated in that order between machines.

4. Mode change

With the ventilator turned on and running in a preset mode, the physician had to change from volume control to pressure control, using the settings listed above. The first machine tested was in pressure control, then pressure and volume control were alternated on each subsequent ventilator. The timing stop was given with the first insufflation of the new mode.

5. Finding and activating the pre-oxygenation function

This setting, which delivers an FIO2 of 1.0 for two minutes, had to be turned on. The timing stop was given as soon as the function was activated.

6. Pressure support setting

With the ventilator turned on and running in a preset volume control mode using the settings listed above, the physician had to set a given level of pressure support and PEEP, pressurization slope, inspiratory trigger and cycling (if adjustable). The settings were the following: pressure support level (PSL) 15 cm H2O, PEEP 5 cm H2O, slope 100 ms, inspiratory trigger 3 L/min, cycling 30 % of the peak inspiratory flow. The timing stop was given with the first insufflation of the new mode.

7. Stand by

With the ventilator turned on and running in a preset volume control mode using the settings listed above, the physician had to activate the standby mode (except for the PB840 which has no standby mode). The timing stop was given as soon as the mode was activated.

8. Finding and activating the NIV mode

With the ventilator in standby mode, the physician had to locate and activate the specific NIV function. The timing stop was given as soon as the function was activated.

Besides the time taken to accomplish the tasks, physicians were asked to grade their subjective impression of the overall difficulty of interacting with each device on a scale of 0 (very easy) to 10 (very difficult) [3]. Finally, a difficulty index (DI) was computed for each ventilator: DI = (total time for 8 tasks (s) × n failures) / 1000. The rationale for this index was to take into account both the time required for performing all tasks on each machine and the number of failures, i.e. the higher the DI, the more difficult the interaction with the machine. The purpose was to correlate the subjective impression of difficulty reported by the physicians with the objective assessment reflected by the DI.
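As a worked illustration of the DI formula, the sketch below applies it to one set of eight task times. The times and failure count are invented for the example and are not study data; the function name `difficulty_index` is hypothetical. Whether times and failures were pooled over all physicians before applying the formula is not specified here, so the sketch simply takes both as inputs. Note that, as written, the formula yields a DI of 0 whenever there are no failures, whatever the total time.

```python
def difficulty_index(task_times_s, n_failures):
    """DI = (total time for the tasks, in seconds x number of failures) / 1000."""
    total_time_s = sum(task_times_s)
    return total_time_s * n_failures / 1000

if __name__ == "__main__":
    # Hypothetical times (s) for the 8 tasks on one ventilator; one task hit
    # the 180 s failure limit.
    times = [60, 90, 45, 100, 20, 180, 15, 30]  # total: 540 s
    print(difficulty_index(times, n_failures=1))
    print(difficulty_index(times, n_failures=0))
```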

Statistics

Time values are expressed as median (interquartile range). Other values are expressed as mean (SD). Times for all tasks were compared between anesthesiologists and pulmonologists, and between observers' and reference times, by a Student's t test or a Mann-Whitney test depending on the normal or non-normal nature of the distribution. Differences between ventilators for each task were assessed by a Kruskal-Wallis analysis of variance. The proportion of failures for each task was evaluated for the seven ventilators, and compared between ventilators by means of Fisher's exact test.
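The "median (interquartile range)" summary used for time values can be sketched with the Python standard library. The quartile convention is an assumption (the "inclusive" method of `statistics.quantiles`), since the method used in the study is not stated; the function name `median_iqr` and the sample values are hypothetical.

```python
import statistics

def median_iqr(times_s):
    """Return (median, q1, q3) of a list of task times in seconds.

    Quartiles use the 'inclusive' method; other conventions can give
    slightly different q1 and q3 on small samples.
    """
    q1, med, q3 = statistics.quantiles(times_s, n=4, method="inclusive")
    return med, q1, q3

def format_median_iqr(times_s):
    """Format as 'median s (q1-q3)', the style used in the Results."""
    med, q1, q3 = median_iqr(times_s)
    return f"{med:.0f} s ({q1:.0f}-{q3:.0f})"

if __name__ == "__main__":
    print(format_median_iqr([10, 20, 30, 40, 50]))
```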

Difficulty scores given by the physicians were compared between ventilators by means of Student's t test. The relation between the DI and the difficulty score was analyzed by linear regression.

All p values < 0.05 were considered significant.

Results

Overall results

No significant difference was observed between anesthesiologists and pulmonologists in terms of time or task failure. The score obtained by the physicians for the pre-study multiple-choice test was 35 ± 3 (maximum 40).

Table 2 in the main article shows the mean values for the difficulty score given by the physicians for each ventilator. Figure 1 documents the total time needed per ventilator for the execution of all 8 tasks by the 10 physicians. Figure 2 shows the correlation between the difficulty score determined by the 10 physicians and the difficulty index (DI). The number of task failures of each participating physician for all tasks to be performed is shown in figure 3. The time taken by each participating physician for all tasks to be performed is shown in figure 4.

Specific tasks

1. Turning the ventilator on (Figure 1 in main article)

The shortest time was obtained with the Elysée, with a median of 41 s (36-52), significantly different from the others (p<0.0001). Failure to turn on the Servoi occurred with 50 % of physicians.

2. Recognizing mode and parameters (Figure 1 in main article)

The times needed, 60 s (38-75) for the Engström Carestation and 80 s (74-97) for the Evita XL, were significantly shorter than with all other ventilators (p < 0.001). On those two machines, the short time was due to the fact that the main screen clearly displays the mode and relevant parameters, without requiring access to sub-menus. The most commonly encountered types of error were: confusion between adjusted and measured parameters, error on FIO2 (Elysée), and confusion between P max and P controlled above PEEP (Evita XL and BIPAP mode). The most often omitted parameters were I:E (Elysée, G5) and plateau pressure (Avea and PB840). On the Engström Carestation, Avea and PB840, most observers (6/10) read the plateau pressure on the display curve because they could not find the numerical value. This was one of the most often failed tasks (table 1 in main article).

3. Recognizing and setting alarms (Figure 1 in main article)

Very few failures were observed for this task (table 1 in main article), the shortest time of 34 s (27-48) being observed with the Engström Carestation (p < 0.01) and the longest 133 s (73-145) with the Avea (p < 0.05).

Stopping the alarm and recognizing its origin were rapid, as a clear message indicating the cause appears on the screen, while the symbol to stop the alarm is obvious and easy to locate. A variable amount of time was needed to modify and validate the relevant alarm limit, mainly because of the need for double validation. Indeed, physicians often thought they had correctly set the alarm limits, whereas no change had occurred due to the double validation requirement.

4. Mode change (Figure 1 in main article)

The shortest time was documented with the Engström Carestation at 49 s (42-74), but this did not reach statistical significance. The longest times were seen with the Avea and PB840, at 100 s (89-131) and 114 s (105-130), respectively (p < 0.01).

The most frequently observed reason for delayed mode change was searching for the appropriate “change mode” command on three machines (Servoi, Avea, PB840). On the PB840, seven steps are necessary to change the ventilatory mode, and on the Avea, all settings are represented by symbols.

5. Finding and activating the pre-oxygenation function (Figure 1 in main article)

Only 1 failure was observed (table 1 in main article). This is the task where the median time was the closest to the reference time. No significant difference between ventilators was found.

6. Pressure support setting (Figure 1 in main article)

No statistically significant difference was observed between ventilators. All physicians failed to set pressure support on the Avea. The most common types of errors were the choice of the mode (PB840), trigger adjustment (Servoi, G5), the slope and the cycling setting (Avea).

7. Stand by (Figure 1 in main article)

Significantly lower times of 11 s (7-18) and 17 s (6-20) were measured for the Servoi and G5, respectively, compared to all other ventilators (p<0.001). Six physicians failed the task with the Elysée.

8. Finding and activating the NIV mode (Figure 1 in main article)

Significantly lower times of 18 s (6-22) and 12 s (8-6) were measured for the Servoi and G5, respectively, compared to all other ventilators (p<0.001).

Difficulty rating and DI (Figure 1 in main article)

There was a strong correlation between the rating by physicians and the DI (p < 0.001).

Limitations of the study

First, our study population was made up of pulmonologists and anesthesiologists who do not use ICU ventilators in their daily practice. However, it is nearly impossible to find experienced ICU physicians without knowledge of at least some of the ICU ventilators used in the study. One alternative is to test young ICU physicians in training, as they have had less time to familiarize themselves with these machines, as done in a recent study [4]. However, task difficulties and failures might be attributable to their lack of overall experience rather than to the shortcomings of ventilator ergonomics. Therefore we chose to test physicians with a solid background in ventilation (mean 10 years of experience) and its principles, ensuring homogeneity through the pre-test reading material and the multiple-choice question assessment. That the latter goal was achieved is suggested by the absence of difference in time needed or failed tasks between the two groups. In the same line of thought, one might question the rationale of choosing physicians without prior training on the ventilators tested, as ICUs normally do not introduce a new ventilator into routine practice without proper training being provided to the staff. However, not all staff get to attend the training sessions, and even so, receiving basic training on a ventilator does not necessarily imply that one is immune to the confusing aspects of certain interface shortcomings, as what might appear clear when explained during a training session might be less so at 2 AM in an emergency.

Second, the tasks we chose were arbitrarily defined, given the absence of a validated protocol for such tests. However, in our view they reflected the tasks that most often need to be performed by a physician under usual clinical conditions. Furthermore, they were comparable to those used in two recent trials addressing this issue [3, 4].

Third, the DI has not been formally validated. Nonetheless, its construction rests on the two parameters that reflect the difficulties encountered, i.e. task failures and time, and it is strongly correlated with the subjective rating.

Fourth, failure was defined as the need for ≥ 3 min to complete the task, which might seem arbitrary. This duration was chosen because it was used in the study by Gonzalez Bermejo et al. [3] and because it seems clinically relevant to consider that, in an emergency, 3 min is so long to adjust a ventilator that it must be considered a failure to reach the intended goal.

Fifth, the reference time was that of an experienced physiotherapist familiar with the ventilators tested, which might not reflect routine clinical conditions. While this is of course true, it is not the main focus of the study, which outlines the large differences between machines and the very long times that are needed to accomplish seemingly simple tasks.

Sixth, one might question the relevance of our results compared with those that might have been obtained in clinical conditions with caregivers involved in everyday practice with these machines. This is certainly true from a theoretical standpoint, because clinical data would be more relevant than information stemming from a laboratory or bench study. However, several issues precluded our going down that path: a) it would have been very difficult for us to ensure that the level of clinical experience among our subjects was homogeneous; b) the physicians we studied were not part of our permanent ICU staff, making it difficult to have them spend several days in the ICU doing clinical work with the ventilators; c) even if we had dropped the idea of studying outside physicians as we did in this study, and taken an audit approach with our permanent staff, other difficulties would have arisen: population homogeneity, possible previous experience with one of the ventilators, and difficulty in isolating problems due to the interface from those stemming from other confounding factors present in the patient environment; d) only two of the ventilators were part of our routine equipment, raising loan availability issues. Conversely, in a laboratory setting it is possible to standardize study conditions so as to obtain results that, although more limited in scope and clinical relevance, have at least the merit of homogeneity and thereby contribute a smaller but clearer piece of the puzzle.

Of note, the learning curve effect is a potentially confounding factor in our study, as some of the interface difficulties might not have led to large delays had the physicians been experienced with the device. Supporting this is the very short time needed to accomplish the tasks by the reference physiotherapist, who was experienced with all the ventilators tested. Nonetheless, our goal was to reproduce the situation of a physician called upon to interact with an unfamiliar ventilator, a not so infrequent occurrence, e.g. during a night shift call. In this type of situation, there can be no learning curve effect, and the ability to solve the issues at hand will depend both on the experience of the physician with mechanical ventilation in general and on the user-friendliness of the machine. The first issue was addressed by ensuring a homogeneous basic knowledge of mechanical ventilation, confirmed by the results of the pre-test MCQ. The second point was the subject of the study. The study purposely does not address the learning curve effect, to avoid testing two hypotheses at once. The effect would have been very difficult to test from a practical standpoint, given that we would have had to standardize the learning process to ensure that all physicians reached the same level of proficiency in a before-after study design.