Differences in the interpretation of FEV1 using four commonly used reference formulas in asthmatic patients

Barbosa M., Barbosa T., Brito T., Campos J., Carvalho L., Carvalho R., Costa A., Dias J., Maciel C., Mosca A., Pires C., Santos M., Silva F., Viana D.

ADVISER Tiago António Queirós Jacinto | CLASS 2

Department of Biostatistics and Medical Informatics, Faculty of Medicine, University of Oporto, 4200-319 Oporto, Portugal

ABSTRACT

BACKGROUND Forced Expiratory Volume in the first second (FEV1) values are used to assess the lung function of asthmatic patients. The reference values against which they are compared are obtained with different equations that integrate several variables, including individuals' characteristics. However, changes in populations over time were not always taken into account. Consequently, misdiagnosis can occur.

AIM To analyze the differences between four of the most commonly used reference equations in a consecutive sample of asthmatic patients performing spirometry, and to determine whether these differences cause misclassification of obstructive defects.

METHODS Data on 235 asthmatic people attending the Allergology Department of Hospital de S. João was consecutively collected with the same technique (spirometry) and instrument (spirometer), by the same researcher. This study is observational, transversal and analytical, and the unit of analysis is the individual. The other selection criteria were being an adult and having a medical diagnosis of asthma. We obtained Forced Vital Capacity (FVC), FEV1, Tiffeneau Index, age, height, weight and gender.

RESULTS We found statistically and clinically significant differences among the results obtained with different FEV1 reference formulas in patients with asthma.

CONCLUSION Depending on the equation used, one patient may receive several diagnoses and subsequently be submitted to different therapies. The equations must therefore be applied to the population for which they were developed, choosing the right one for each patient, in order to allow a more accurate clinical approach.

KEY-WORDS Asthma; spirometry; reference values

INTRODUCTION

Pulmonary diseases, such as asthma and Chronic Obstructive Pulmonary Disease (COPD), can be diagnosed and monitored using spirometry (Lynne, E. et al, 2008). This test analyzes how well a patient can breathe, measuring the amount of air the patient can blow out of the lungs and how fast that can be done (Enright, P., European Lung Foundation). Several parameters can be obtained. One of the most studied is Forced Expiratory Volume during the first second (FEV1) (Huib, A. et al, 1997). Comparisons between these results and reference standards (calculated using specific equations) allow doctors and researchers to make an accurate interpretation of spirometry (American Thoracic Society, 1991; Miller, M. et al, 2005; Miller, M. et al, 2009).

However, there are several different equations, based on different variables, and the choice among them might change the classification from normal to abnormal (Quadrelli, S. et al, 1999). Furthermore, some of these formulas were calculated more than 20 years ago (Crapo et al, 1981; Knudson et al, 1983; Morris et al, 1971; ECCS, 1993) and, therefore, interpersonal differences (ethnicity, age, etc.) are not taken into account (Memon, M. A. et al, 2007). This may lead to misinterpretation of the results and, consequently, errors in diagnosis (Collen, J. et al, 2008). Several attempts have been made to homogenize reference values, but none has successfully solved this problem (Marek, W. et al, 2009).

Pulmonologists and other physicians interpreting spirometry need to be aware of the presence and nature of these differences. When observing patients longitudinally, they should know that changes in interpretation may be due to differences in the reference standard, even when the spirometric measurements are equivalent (Collen, J. et al, 2008).

The objective of this work is to analyze the differences between four of the most commonly used reference equations in a consecutive sample of asthmatic patients performing spirometry, and to determine whether these differences cause misclassification of obstructive defects.

RESEARCH QUESTION AND AIMS

How can the different reference values of FEV1 affect the treatment and diagnosis of asthmatics?

Aims:

  • Analyze the reference value of FEV1 in asthmatics;
  • Explore the use of different reference values of FEV1;
  • Establish an association between the use of different reference values and the diagnosis and treatment of asthmatics;
  • Outline that these reference formulas are not always applicable to the characteristics of a patient.

METHODS

STUDY PARTICIPANTS

Our target population is asthmatic patients from the Allergology Department of Hospital de S. João, Porto. The sample consists of 235 adult patients with a medical diagnosis of asthma consecutively performing spirometry in that department. The inclusion criteria were: (1) age >= 18 years, (2) medical diagnosis of asthma and (3) spirometry performed according to the ATS/ERS criteria. The unit of analysis is the individual asthmatic participant.

STUDY DESIGN

This study is transversal (the data was obtained at a specific moment in time and the evolution of the patients' status was not followed), observational and analytical (a flow chart of the study design is presented in Appendix 1).

DATA COLLECTION METHODS

All spirometry data was consecutively collected with the same technique (spirometry, following the ATS/ERS criteria) (Miller, MR. et al, 2005) and instrument (spirometer: Masterscreen IOS, Carefusion USA). Other information, such as age, height and gender, was obtained by asking the patients directly.

VARIABLES DESCRIPTION

The variables collected were:
(1) identification (each patient was numbered from 1 to 235);
(2) age (in years); all the participants are adults (eighteen years or older);
(3) age group: the variable age was recoded from 1 to 8 to analyze the agreement values according to age groups (1: 18-30; 2: 31-35; 3: 36-40; 4: 41-45; 5: 46-50; 6: 51-55; 7: 56-60; 8: 61-70);
(4) sex (male or female);
(5) height (in cm);
(6) all the values obtained through spirometry (real values): FVC (the change in lung volume between a full inspiration to total lung capacity and a maximal expiration to residual volume, measured during forceful exhalation), FEV1 (the maximum volume of air that a person can blow out during the first second of a forced exhalation, which reflects how fast the lungs can be emptied and how well they work) and Tiffeneau Index (FEV1/FVC) (a value below 0,7 indicates a possible obstructive lung disease; in this case, a bronchodilator test might be necessary to confirm the diagnosis);
(7) the predicted values for FEV1, FVC and Tiffeneau Index calculated using the reference formulas (for each equation, the appropriate variables of a patient were selected in order to make the calculation);
(8) the percent predicted values, given by the quotient (Real value / Predicted value) × 100;
(9) each percentage (FEV1 and FVC) recoded from 0 to 5 in order to analyze the agreement among the equations (0: 0-35% - very severe; 1: 36-50% - severe; 2: 51-60% - moderate/severe; 3: 61-70% - moderate; 4: 71-79% - mild; 5: ≥80% - normal) and from 0 to 1 (0: <80% - obstructed; 1: ≥80% - healthy);
(10) differences between the means of each group of predicted values for FEV1 and FVC (6 combinations: Knudson/Morris, Knudson/Crapo, Knudson/ECCS, Morris/Crapo, Morris/ECCS and Crapo/ECCS).
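The percent predicted calculation and the two recodings described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the original SPSS analysis. The published bands are given in whole percentages, so the handling of fractional values that fall between two bands (for example 35,5%) is an assumption: here each band's upper limit is treated as inclusive, and everything below 80% that exceeds 70% is coded as mild.

```python
def percent_predicted(measured, predicted):
    """Percent predicted = (real value / predicted value) x 100."""
    if predicted <= 0:
        raise ValueError("predicted value must be positive")
    return measured / predicted * 100.0


def severity_code(pct):
    """Recode a percent predicted value into the 0-5 severity scale.

    Assumption: band upper limits are inclusive, so values between the
    published integer bands fall into the higher (milder) band.
    """
    if pct >= 80:
        return 5  # normal
    if pct > 70:
        return 4  # mild
    if pct > 60:
        return 3  # moderate
    if pct > 50:
        return 2  # moderate/severe
    if pct > 35:
        return 1  # severe
    return 0      # very severe


def binary_code(pct):
    """Recode into 0 (<80%, obstructed) or 1 (>=80%, healthy)."""
    return 1 if pct >= 80 else 0


# Hypothetical patient: measured FEV1 of 2.0 L against a predicted 4.0 L.
pct = percent_predicted(2.0, 4.0)
print(pct, severity_code(pct), binary_code(pct))
```

Because the predicted value is the denominator, two equations that predict different volumes for the same patient can place the same measured FEV1 in different severity bands.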

STATISTICAL ANALYSIS

The distribution of each variable was assessed using the Kolmogorov-Smirnov (K-S) test and visual analysis of histograms. As the FEV1, FVC and Tiffeneau Index predicted values did not follow a normal distribution when the total sample was analyzed, we repeated the analysis with the sample split by sex (normal distribution obtained). This is explained by the use of different formulas for each gender.

To verify the statistical significance of the differences between the means of each group of predicted values for FEV1 and FVC, we ran one-sample t-tests. The significance level used was 0,05.
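As an illustration only, the sketch below computes the one-sample t statistic in Python. It assumes that the test was applied to the per-patient differences between the predicted values of two equations, tested against a mean difference of zero; the text does not spell out this detail, and the function name is hypothetical. The p-value is not computed, because the t distribution is not available in the Python standard library.

```python
import math
import statistics


def one_sample_t(differences, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(differences)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical differences (in liters) between two equations' predictions.
t_stat, df = one_sample_t([0.21, 0.35, 0.18, 0.40, 0.27])
print(t_stat, df)
```

The resulting statistic would then be compared with the critical value of the t distribution with n - 1 degrees of freedom at the 0,05 level.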

Using the recoded variables (0 and 1), we performed a chi-square test to determine whether the differences between the percentages of healthy people obtained with different formulas (for FVC and FEV1) were statistically significant (6 combinations). With the values recoded from 0 to 5, we analyzed the level of agreement, obtaining the kappa value (<0 – no agreement; 0-0,2 – slight agreement; 0,21-0,4 – fair agreement; 0,41-0,6 – moderate agreement; 0,61-0,8 – substantial agreement; 0,81-1 – almost perfect agreement) for each of the 6 combinations for FEV1 values. These two steps were applied to the total sample, as they do not require a normal distribution.
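The agreement measure can be sketched in Python as follows. This is a generic implementation of Cohen's kappa, simple and linearly weighted, written for illustration; it is not the SPSS or online procedure actually used, and the function names are hypothetical. Ratings are assumed to be integer codes from 0 to n_categories - 1, as in the 0 to 5 recoding, and the labels follow the bands listed above.

```python
def cohen_kappa(ratings_a, ratings_b, n_categories, linear_weights=False):
    """Cohen's kappa for two classifications of the same patients."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2:
        raise ValueError("at least two categories are required")
    n = len(ratings_a)
    k = n_categories
    # Cross-tabulate the two classifications.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Linear weights penalize a disagreement by its distance in categories.
        if linear_weights:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    obs = sum(disagreement(i, j) * observed[i][j] for i, j in cells) / n
    exp = sum(disagreement(i, j) * row_totals[i] * col_totals[j]
              for i, j in cells) / (n * n)
    if exp == 0:
        return 1.0  # both classifications use a single, identical category
    return 1.0 - obs / exp


def agreement_label(kappa):
    """Map a kappa value to the agreement bands listed in the text."""
    if kappa < 0:
        return "no agreement"
    if kappa <= 0.2:
        return "slight"
    if kappa <= 0.4:
        return "fair"
    if kappa <= 0.6:
        return "moderate"
    if kappa <= 0.8:
        return "substantial"
    return "almost perfect"


# Hypothetical severity codes for six patients under two equations.
eq_a = [0, 1, 2, 3, 4, 5]
eq_b = [0, 1, 2, 3, 4, 4]
print(cohen_kappa(eq_a, eq_b, 6), cohen_kappa(eq_a, eq_b, 6, linear_weights=True))
```

With linear weights, a disagreement between adjacent categories (for example mild versus normal) counts less than one between distant categories, which is why the weighted value is usually higher than the simple one when most discordant pairs are near misses.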

The K linear weighting test was calculated directly in the following site

Creating a new recoded variable (1 to 8), we divided the sample into groups according to age and represented the agreement in a group scatter chart. This procedure was done with the sample split by sex.

These procedures were accomplished using the Statistical Analysis Software SPSS (Statistical Package for the Social Sciences).

RESULTS

Table 1 describes the sample, showing the mean, the standard deviation and the range of each variable in the whole sample and within each sex.

Sample (n=235) [100%] / Male subjects (n=32) [13,6%] / Female subjects (n=203) [86,4%]
Variables / Mean (SD) / Range / Mean (SD) / Range / Mean (SD) / Range
Age, yr / 47,3 (10,8) / 18 – 70 / 45 (13,9) / 18 – 63 / 47,7 (10,2) / 19 – 70
Height, cm / 164,9 (5,2) / 144 – 194 / 169,6 (6,3) / 158 – 194 / 164,2 (4,6) / 144 – 165
FEV1, L / 2,9 (0,8) / 0,62 – 7,5 / 3,6 (1,0) / 0,99 – 7,5 / 2,8 (0,7) / 0,62 – 4,2
FVC, L / 3,9 (0,8) / 1,33 – 8,94 / 4,6 (1,1) / 2,93 – 8,94 / 3,7 (0,7) / 1,33 – 5,56
Tiffeneau index / 74,9 (9,5) / 33,79 – 100 / 76,3 (13,2) / 33,79 – 100 / 74,6 (8,9) / 40,63 – 97,84

Table 1 - Sample Description

ONE SAMPLE TEST

This test showed statistically significant differences for all the pairs of equations (p<0,05).

In Table 2, we can observe which pairs differ the most and by how much. For men, the largest discrepancy in both the FVC and FEV1 predicted values is the mean difference between the Knudson and Morris equations. For women, it is found between the Crapo and ECCS equations for FVC and between Crapo and Morris for FEV1.

Clinically significant differences (>0,2 L) (Miller, MR. et al, 2005) are detected mainly for FVC values, although 5 of the 6 pairs also show such differences for FEV1, all in men.

FVC / FEV1
Male / Female / Male / Female
Knudson - Morris / -1,76 / -0,16 / -1,16 / 0,08
ECCS - Morris / -1,66 / -0,29 / 0,06 / 0,06
Crapo - Morris / -1,39 / 0,00 / 0,29 / 0,19
Crapo - Knudson / 0,37 / 0,16 / 0,45 / 0,11
Crapo - ECCS / 0,27 / 0,30 / 0,24 / 0,13
ECCS - Knudson / -0,10 / -0,14 / -0,22 / -0,02

Table 2 - Mean differences, in liters, between the FVC and FEV1 values predicted by each pair of equations, for men and women
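The count of clinically significant differences can be checked directly from the values in Table 2. The short Python sketch below applies the 0,2 L threshold to the absolute mean differences; the names are hypothetical and the data are transcribed from the table.

```python
THRESHOLD_L = 0.2  # clinically significant difference, in liters

COLUMNS = ("FVC male", "FVC female", "FEV1 male", "FEV1 female")

# Mean differences transcribed from Table 2 (liters).
TABLE_2 = {
    "Knudson - Morris": (-1.76, -0.16, -1.16, 0.08),
    "ECCS - Morris": (-1.66, -0.29, 0.06, 0.06),
    "Crapo - Morris": (-1.39, 0.00, 0.29, 0.19),
    "Crapo - Knudson": (0.37, 0.16, 0.45, 0.11),
    "Crapo - ECCS": (0.27, 0.30, 0.24, 0.13),
    "ECCS - Knudson": (-0.10, -0.14, -0.22, -0.02),
}


def count_clinically_significant(table, threshold=THRESHOLD_L):
    """Count, per column, the pairs whose absolute difference exceeds the threshold."""
    counts = dict.fromkeys(COLUMNS, 0)
    for differences in table.values():
        for column, diff in zip(COLUMNS, differences):
            if abs(diff) > threshold:
                counts[column] += 1
    return counts


print(count_clinically_significant(TABLE_2))
```

Applied to Table 2, this gives 5 pairs for FVC in men, 2 for FVC in women, 5 for FEV1 in men and none for FEV1 in women, which matches the statement that the differences appear mainly for FVC and, for FEV1, only in men.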

CHI SQUARE TEST

There are differences (all statistically significant) between the equations in the proportion of FEV1 values classified as normal (Table 3). For example, there is a difference of 4,3 percentage points between the Crapo and Morris percentages.

Knudson / Crapo / Morris / ECCS
Normal (FEV1≥80%) / 89,4 (%) / 86,8 (%) / 91,1 (%) / 90,6 (%)
Low FEV1 / 10,6 (%) / 13,2 (%) / 8,9 (%) / 9,4 (%)
Normal (FVC≥80%) / 95,3 (%) / 93,2 (%) / 86,0 (%) / 95,7 (%)
Low FVC / 4,7 (%) / 6,8 (%) / 14,0 (%) / 4,3 (%)

Table 3 - Percentages of normal and low FEV1 and FVC obtained with different formulas
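For illustration, a Pearson chi-square test on a 2×2 cross-classification of two equations can be sketched in Python as below. This is one common formulation, without continuity correction, and is not necessarily the exact procedure run in SPSS. The row and column totals in the example are derived from Table 3 (Crapo: 204 normal, 31 low; Morris: 214 normal, 21 low), but the split of patients across the four cells is not reported in the study and is invented here purely to make the example runnable.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value).
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: Crapo normal / low. Columns: Morris normal / low.
# HYPOTHETICAL cell counts; only the margins agree with Table 3.
chi2, p = chi_square_2x2(204, 0, 10, 21)
print(chi2, p)
```

The p-value is then compared with the 0,05 significance level used throughout the analysis.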

AGREEMENT TEST

According to Simple K, ECCS-Morris is the pair with the best agreement (0,947 – good). On the other hand, Crapo-Morris has the weakest agreement (0,624 - substantial), as shown in Table 4.

K Linear Weighting gives a similar order of agreement, except for the pairs Knudson-Morris and Crapo-Knudson, which swap positions.

Reference Equations / Simple K / Agreement / K Linear Weighting / Agreement
ECCS - Morris / 0,947 / good / 0,951 / good
ECCS - Knudson / 0,807 / good / 0,916 / good
Knudson - Morris / 0,774 / substantial / 0,855 / good
Crapo - Knudson / 0,752 / substantial / 0,864 / good
Crapo - ECCS / 0,659 / substantial / 0,846 / good
Crapo - Morris / 0,624 / substantial / 0,796 / substantial

Table 4 - Results of Simple Kappa and Kappa Linear Weighting

Morris-ECCS is the pair with the highest percentage of agreement (99,1%), which corresponds to 233 patients with the same classification. The pair with the fewest concordant classifications is Crapo-Morris (92,8%, 218 patients), as we can see in Table 5.

Table 5 - Agreement in the classification of the severity of the obstructive defect

Reference Equations / Concordant classifications / Discordant classifications
Patients / % / Patients / %
Crapo - Morris / 218 / 92,8% / 17 / 7,2%
Crapo - ECCS / 219 / 93,2% / 16 / 6,8%
Knudson - Crapo / 223 / 94,9% / 12 / 5,1%
Knudson - Morris / 226 / 96,2% / 9 / 3,8%
Knudson - ECCS / 227 / 96,6% / 8 / 3,4%
Morris - ECCS / 233 / 99,1% / 2 / 0,9%
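The percentages in Table 5 follow directly from the patient counts and the sample size of 235. As a quick check, a short Python sketch (with hypothetical names, and counts transcribed from the table) recomputes them:

```python
N_PATIENTS = 235

# Concordant classifications per pair, transcribed from Table 5.
CONCORDANT = {
    "Crapo - Morris": 218,
    "Crapo - ECCS": 219,
    "Knudson - Crapo": 223,
    "Knudson - Morris": 226,
    "Knudson - ECCS": 227,
    "Morris - ECCS": 233,
}


def agreement_percentages(concordant, n=N_PATIENTS):
    """Return {pair: (% concordant, % discordant)}, rounded to one decimal."""
    return {
        pair: (round(100 * count / n, 1), round(100 * (n - count) / n, 1))
        for pair, count in concordant.items()
    }


for pair, (agree, disagree) in agreement_percentages(CONCORDANT).items():
    print(pair, agree, disagree)
```

For example, 218 concordant patients out of 235 give 92,8% agreement and 7,2% discordance for the Crapo-Morris pair.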

Graph 1 shows the level of agreement between the means of FEV1 values for these equations within age groups and with the sample separated by gender.

FEV1 values in men are higher than in women for all the equations. For both genders, the highest FEV1 value was the one obtained with the Crapo equation, as shown in Graph 1.

DISCUSSION

After analyzing the differences between four of the most commonly used reference equations, we determined that the choice of equation can cause misdiagnosis. For instance, comparing the FEV1 values obtained with the Morris and Crapo equations, we found a discordance of 7,2%, which means that 17 of the 235 patients would be classified differently (Table 5). In Table 3, we found a difference of 4,3 percentage points between the percentages of low FEV1 for Morris and Crapo, which also shows that some patients would be wrongly classified as healthy or obstructed.

An article by Collen and colleagues (Collen, J. et al, 2008) studied the effect of using NHANES III reference values (the equation recommended for people between 8 and 80 years old in the United States). For this, Collen et al compared NHANES III with the Crapo et al (Crapo), Knudson et al (Knudson), and Morris et al (Morris) reference formulas in non-Hispanic white patients. They conducted a cross-sectional study of patients undergoing spirometry and measured FEV1 and FVC for each one. These values were interpreted in comparison with the ones obtained using the reported equations. Subsequently, patients were classified as normal, restricted, obstructed, or mixed, based on the American Thoracic Society (ATS)/European Respiratory Society (ERS) guidelines. They found greater discordance when the results from the Knudson and Morris equations were compared to those of NHANES III (45,5% and 35,3%, respectively). Relative to NHANES III, the prediction equations by Knudson, Crapo, and Morris tend to overclassify obstruction and underclassify restriction.

Another recent article by Collen et al (Collen, J. et al, 2010), very similar to the previous study, analyzed discordance in spirometry by comparing three commonly used reference equations (Crapo, Morris and Knudson) to NHANES III in African-American patients. According to the article, discordance in interpretation was greatest when results from the Morris and Knudson equations were compared to NHANES III. There was a tendency for Knudson and Morris to underclassify restriction, and for Crapo and Morris to overclassify obstruction.

As in our study, these authors compared older reference formulas (such as Crapo, Morris and Knudson) with a more recent one; they used NHANES III, while we used ECCS. There are many similarities regarding the type of study, data collection methods, variables measured, classification of patients and statistical analysis. While they studied the differences between the formulas in a particular ethnicity, our study was performed in an asthmatic population. All the results showed significant discrepancies between reference formulas, which changed the classification of patients.

One of the limitations of this work is that data collection was limited to a single department of one specific hospital, where severe cases are more prevalent. Therefore, the data presented is confined to a group that might not represent the asthmatic population and that may include pathologies and conditions in a more severe state. Also, demographic and social factors may influence outcomes.

Furthermore, the database includes a higher number of females than males, which may compromise the final conclusions. This implies that the results, mainly the ones related to males, should not be generalized, due to the small number analyzed. One way to overcome this problem might be to perform a new study only in men. Moreover, more recent equations could have been used in order to achieve more reliable and accurate results, adapted to the current asthmatic population.

As we expected, there are substantial differences among the reference formulas analyzed, especially in men. These results are significant not only statistically but also in clinical practice. Furthermore, even differences that are small in statistical terms (for example, the ones observed in the concordance tests) may have significant clinical consequences. Therefore, depending on the equation used, one patient may receive several diagnoses and subsequently be submitted to different therapies. This can have serious repercussions: while a healthy individual may be treated unnecessarily, an ill one may be left without therapy. This reflects the relevance of the issue and emphasizes the need for change: formulas must be restructured and correctly applied to each patient's specific characteristics.

To sum up, the equations must be applied to the population for which they were developed, choosing the right one for each patient, in order to allow a more accurate clinical approach.

REFERENCES

American Thoracic Society. Lung Function Testing: Selection of references values and Interpretative Strategies. Am Rev Respir Dis 1991; 144: 1202-1218

Arabalibeik H, Khomami MH, Agin K, Setayeshi S. Classification of restrictive and obstructive pulmonary diseases using spirometry data. Stud Health Technol Inform 2009; 142: 25

Collen J, Greenburg D, Holley A, King CS, Hnatiuk O. Discordance in Spirometric Interpretations using three commonly used reference equations vs National Health and Nutrition Examination Study III. Chest 2008; 134: 1009-1014

Collen J, Greenburg D, King C et al. Racial discordance in spirometry comparing four commonly used reference equations to the National Health and Nutrition Examination Study III. Respiratory Medicine 2010; 104: 705-11

Crapo RO, Morris AH, Gardner RM. Reference spirometric values using techniques and equipment that meet ATS recommendations. Am Rev Respir Dis 1981; 123: 659-664

Enright PL. Testing your lungs: spirometry [Internet]; [Cited 15 October 2009], Available from:

Kerstjens HA, Rijcken B, Schouten JP, Postma DS. Decline of FEV1 by age and smoking status: facts, figures, and fallacies. Thorax 1997; 52: 820-7

Knudson RJ, Lebowitz MD, Holberg CJ, Burrows B. Changes in the normal maximal expiratory flow-volume curve with growth and aging. Am Rev Respir Dis 1983; 127: 725-734

Marek W, Marek E, Mückenhoff K, et al. Lung function in the elderly: do we need new reference values? Pneumologie 2009; 63: 235-43

Memon MA, Sandila MP, Ahmed ST. Spirometric reference values in healthy, non-smoking, urban Pakistani population. J Pak Med Assoc 2007; 57: 193-5

Miller MR, Hankinson J, Brusasco V, et al. Standardisation of Spirometry. Eur Respir J 2005; 26: 319-38

Miller MR, Pedersen OF, Pellegrino R, Brusasco V. Debating the definition of airflow obstruction: time to move on? Eur Respir J 2009; 34: 527-8