Simulation Study of Instrumental Variable Approaches with an Application to a Study of the Antidiabetic Effect of Bezafibrate
Bing Cai, PhD
Sean Hennessy, PharmD, PhD
James H. Flory, MD, MSCE
Daohang Sha, PhD
Thomas R. Ten Have, PhD, MPH*
Dylan S. Small, PhD
Bing Cai is a director in the Epidemiology Department of Pfizer Inc., 500 Arcola Road, Collegeville, PA 19421 (E-mail: ); Sean Hennessy is Associate Professor and Thomas Ten Have was Professor in the Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Blockley Hall, Philadelphia, PA 19104-6021 (E-mail: ); James Flory is a Fellow in the Division of Endocrinology, Diabetes, and Metabolism in the Department of Medicine at New York-Presbyterian Hospital/Weill Cornell Medical Center, New York Presbyterian Hospital, New York, NY 10021; Daohang Sha is a staff member of the Biostatistics Analysis Center, University of Pennsylvania School of Medicine, Blockley Hall, Philadelphia, PA 19104; Dylan Small is Associate Professor, Department of Statistics, Wharton School, 464 JMHH/6340, University of Pennsylvania, Philadelphia, PA 19104 (E-mail: ).
*deceased
Abstract
Purpose: We studied the application of the generalized structural mean model (GSMM) of instrumental variable (IV) methods in estimating treatment odds ratios for binary outcomes in pharmacoepidemiologic studies and evaluated the bias of GSMM compared to other IV methods.
Methods: Because of the bias of standard IV methods, including two-stage predictor substitution (2SPS) and two-stage residual inclusion (2SRI) with binary outcomes, we implemented another IV approach based on the GSMM of Vansteelandt and Goetghebeur. We performed simulations under the principal stratification setting and evaluated whether GSMM provides approximately unbiased estimates of the causal odds ratio, and compared its bias and mean squared error to that of 2SPS and 2SRI. We then applied different IV methods to a study comparing bezafibrate vs. other fibrates on the risk of diabetes.
Results: Our simulations showed that, unlike the standard logistic, 2SPS, and 2SRI procedures, our implementation of GSMM provides an approximately unbiased estimate of the causal odds ratio even under unmeasured confounding. However, for the effect of bezafibrate vs. other fibrates on the risk of diabetes, the GSMM and two-stage approaches yielded similarly attenuated and statistically non-significant odds ratio estimates. The attenuation of the odds ratio by the two-stage and GSMM IV approaches suggests unmeasured confounding, although violations of the IV assumptions or differences in the parameters estimated could be playing a role.
Conclusion: The GSMM IV approach provides approximately unbiased adjustment for unmeasured confounding on binary outcomes when a valid IV is available.
1. Introduction
Instrumental variable (IV) methods are increasingly used to adjust for unmeasured confounding in pharmacoepidemiology.1-8 An IV is a variable that is associated with the treatment but is independent of unmeasured confounders and has no direct effect on the outcome.9 In addition, an IV is often assumed to satisfy a monotonicity condition, i.e., that subjects’ treatment level increases monotonically with the level of the IV.9 IVs satisfying the monotonicity condition enable identification of the local average treatment effect (LATE), i.e., the treatment effect among compliers. Under linear models, two two-stage IV approaches, two-stage predictor substitution (2SPS) and two-stage residual inclusion (2SRI), have been shown to yield unbiased estimators of the LATE.9 In 2SPS, a first-stage model is fit to predict the treatment based on the instrument and covariates, and a second-stage model is fit for the mean of the outcome as a function of the predicted value of the treatment from the first stage and the covariates; the estimated LATE is the coefficient on the predicted treatment in the second-stage model.10 In 2SRI, the first stage is the same as for 2SPS, but the second-stage model is for the mean of the outcome as a function of the residual from the first-stage regression, the actual value of the treatment, and the covariates; the estimated LATE is the coefficient on the actual treatment.3,11 For continuous outcomes, the 2SPS and 2SRI IV estimators are equivalent.
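The equivalence of 2SPS and 2SRI for a continuous outcome under linear models can be checked numerically. The following is a minimal sketch, not the authors' implementation: it uses simulated data with no covariates, a hypothetical helper `ols`, and arbitrary illustrative coefficients, with `R` the instrument, `Z` the treatment, `U` an unmeasured confounder, and `Y` a continuous outcome.

```python
import random

def ols(columns, y):
    """Return OLS coefficients [intercept, b1, b2, ...] for y on the given columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [x - f * w for x, w in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Illustrative data-generating process (all numbers are assumptions).
rng = random.Random(0)
n = 2000
R = [float(rng.randint(0, 1)) for _ in range(n)]   # binary instrument
U = [rng.gauss(0, 1) for _ in range(n)]            # unmeasured confounder
Z = [1.0 if 0.8 * r + u + rng.gauss(0, 1) > 0.5 else 0.0 for r, u in zip(R, U)]
Y = [1.0 * z + u + rng.gauss(0, 1) for z, u in zip(Z, U)]

# First stage (shared by both approaches): linear model of Z on R.
a0, a1 = ols([R], Z)
Z_hat = [a0 + a1 * r for r in R]
resid = [z - zh for z, zh in zip(Z, Z_hat)]

# 2SPS: regress Y on the predicted treatment.
b_2sps = ols([Z_hat], Y)[1]
# 2SRI: regress Y on the actual treatment plus the first-stage residual.
b_2sri = ols([Z, resid], Y)[1]

print(b_2sps, b_2sri)  # the two coefficients agree up to rounding error
```

The agreement follows because the first-stage residual is orthogonal to the intercept and to the predicted treatment, so adding it to the second stage does not change the coefficient of interest. With a logistic second stage this orthogonality argument no longer applies.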
For binary outcomes, the 2SPS and 2SRI IV odds ratio estimators are not equivalent, and both have been shown to be biased for the LATE when there is strong confounding.12 An alternative to the biased two-stage IV estimators of the LATE odds ratio is the generalized structural mean model (GSMM), which Vansteelandt and Goetghebeur have shown yields a consistent estimator under certain conditions.1, 2 Here we implemented the GSMM approach such that it allows both levels of a dichotomous IV to have access to the treatment, as is the case in most observational studies, and we evaluated the bias of the method with a simulation study.
In addition, we compare different estimators in a study of the effect of bezafibrate on the risk of diabetes. Bezafibrate is a lipid-lowering medication that is widely prescribed in the United Kingdom. Recent published results suggest that the protective effect of bezafibrate against the onset of diabetes is superior to that of other fibrates.13-17 Flory et al. published a retrospective cohort study using the General Practice Research Database (GPRD) to compare the risk of diabetes between users of bezafibrate versus other fibrates, and showed lower incidence of diabetes in bezafibrate users.18 This analysis controlled for measured confounders but not unmeasured confounders. Inspection of baseline data in the publication by Flory et al. showed statistically significant differences in the prescribing patterns for different fibrates, most notably higher rates of baseline diabetes in patients receiving fenofibrate compared to patients receiving bezafibrate. Although the clinical properties and indications for the fibrates are very similar to one another, this observation suggested that providers might still have subtle prescribing preferences that could introduce unmeasured confounding. To control for this potential unmeasured confounding, we present IV-based results, using the previous prescription from the same practice as the IV; this preference-based IV has been used by other pharmacoepidemiologic studies.6,19-21
2. Methods
2.1 Notation and Assumptions
The treatment (bezafibrate) variable is $Z$: $Z = 1$ denotes that a patient received the treatment of interest (bezafibrate), and $Z = 0$ means a patient received an alternative treatment (other fibrate). The IV variable is $R$: $R = 1$ corresponds to a patient exhibiting level 1 of the IV that predisposes to bezafibrate treatment (practice’s previous fibrate prescription was for bezafibrate) and $R = 0$ to a patient exhibiting level 0 of the IV that predisposes to a non-bezafibrate fibrate (practice’s previous fibrate prescription was for a different fibrate). The vector of observed baseline confounders will be denoted by $X$.
For the outcome variables, we define $Y$ as the observed outcome: $Y = 1$ if the patient exhibits the outcome (incident diabetes), and $Y = 0$ if the patient does not exhibit the outcome (no diabetes). The potential outcome $Y^{1}$ is the outcome a patient would exhibit if she/he were to receive bezafibrate; $Y^{0}$ is the outcome if the patient were to receive a different fibrate. Under the assumptions below, $Y = Y^{1}$ if the patient actually receives bezafibrate, and $Y = Y^{0}$ if the patient receives a different fibrate.
The potential treatment received $Z^{1}$ ($Z^{0}$) is the treatment level a patient would receive if predisposed (not predisposed) to take bezafibrate by the IV. Subjects can be classified into four compliance classes: (1) “always takers” of bezafibrate, $Z^{1} = Z^{0} = 1$, who would take bezafibrate regardless of the previous prescription; (2) “compliers” with the suggestion of the IV, $Z^{1} = 1, Z^{0} = 0$, who would always take the fibrate previously prescribed by that practice; (3) “defiers,” $Z^{1} = 0, Z^{0} = 1$, who would take the opposite of what the previous prescription was; and (4) “never takers” of bezafibrate, $Z^{1} = Z^{0} = 0$.
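The four compliance classes amount to a lookup on the pair of potential treatments. A minimal sketch follows; the function name `compliance_class` and the argument names `z1` and `z0` (the treatment a patient would receive under IV level 1 and level 0, respectively) are illustrative choices rather than notation fixed by the paper.

```python
def compliance_class(z1, z0):
    """Map potential treatments (under IV level 1, under IV level 0) to a class."""
    classes = {
        (1, 1): "always-taker",  # bezafibrate whatever the IV level
        (1, 0): "complier",      # follows the practice's previous prescription
        (0, 1): "defier",        # does the opposite; excluded by monotonicity
        (0, 0): "never-taker",   # a different fibrate whatever the IV level
    }
    return classes[(z1, z0)]

def satisfies_monotonicity(pairs):
    """True when no (z1, z0) pair in the population is a defier."""
    return all(compliance_class(z1, z0) != "defier" for z1, z0 in pairs)
```

In real data only one of the two potential treatments is observed per patient, so class membership is generally not known for individuals; the lookup is useful mainly when simulating data.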
The assumptions of the IV approaches pertain to the relationships among the IV, treatment, and outcome. We make the same five assumptions about an IV as Angrist, Imbens, and Rubin9: 1) stable unit treatment value assumption (SUTVA); 2) random assignment; 3) exclusion restriction, which means the IV affects the outcome only through the treatment and has no direct effect; 4) non-zero average causal effect of the IV on treatment; and 5) monotonicity, which means that there are no defiers, as defined above. These assumptions enable identification of the LATE odds ratio, i.e., the odds of the outcome under treatment relative to the odds under control among the compliers.
2.2 Models
Details of the estimation procedure for the GSMM and the two-stage IV procedures are in Appendix I and II. We focus here on the GSMM. The GSMM seeks to estimate
$$\operatorname{logit} P(Y = 1 \mid Z = z, R = r, X) - \operatorname{logit} P(Y^{0} = 1 \mid Z = z, R = r, X) = \psi z, \qquad (1)$$
where $\psi$ is the causal log odds ratio for the effect of treatment on the risk of the outcome (bezafibrate effect on diabetes risk) among those who received treatment and have IV level $r$. The causal log odds ratio in equation (1) is assumed to not be modified by the IV $R$; consequently $\psi$ is the treatment on treated (TOT) causal log odds ratio, measuring the effect of treatment for those who received treatment. Under the assumptions described in Section 2.1 and the assumption that the causal log odds ratio is not modified by the IV, $\psi$ is equal to the causal log odds ratio for the effect of treatment for the compliers.22
Because the nonlinearity of the logistic function distorts integration of the probability model under (1) with respect to treatment, $Z$, which is necessary for estimation, we need to specify an “association” model for each level of the IV. Such a model is not causal, because no parameter corresponds to a causal treatment effect. The association model is specified separately for each level of the IV as a standard logistic model for the log odds ratio of treatment on outcome, adjusting for observed covariates and the IV:
$$\operatorname{logit} P(Y = 1 \mid Z = z, R = 1, X) = \beta_{10} + \beta_{11} z + \beta_{12}^{\top} X, \qquad (2)$$
$$\operatorname{logit} P(Y = 1 \mid Z = z, R = 0, X) = \beta_{00} + \beta_{01} z + \beta_{02}^{\top} X, \qquad (3)$$
where the first subscript for the $\beta$ log odds ratio parameters indicates the level of the IV within which the logistic models in (2) and (3) are applied. For a clinical trial with controls not having access to treatment, equation (2) is the only association model, because in the control arm, $Z = 0$ for all patients, so that one cannot apply the association model in equation (3). However, for observational studies, $Z$ can be either 0 or 1 for either level of the IV, so we specify the second association model in (3) for the second level of the IV. Appendix 1 of the supplementary materials explains how the structural model and the association model are combined to estimate the causal odds ratio $\exp(\psi)$.
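To make the combination of the structural and association models concrete, here is a minimal sketch of the idea in the simplest case with no covariates. It is not the authors' SAS macro. It assumes the association models are saturated, so that they reduce to the cell probabilities P(Y=1 | Z=z, R=r), and it solves for the causal log odds ratio `psi` by requiring that the implied treatment-free risk be the same at both IV levels, which is what random assignment of the IV implies. The names `gsmm_psi`, `p_y`, and `p_z`, and the bisection bracket, are hypothetical.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def gsmm_psi(p_y, p_z, lo=-10.0, hi=10.0, tol=1e-10):
    """Solve a covariate-free GSMM estimating equation for psi by bisection.

    p_y[r][z]: P(Y=1 | Z=z, R=r), taken from the association models
    p_z[r]:    P(Z=1 | R=r)
    """
    def treatment_free_risk(r, psi):
        # Remove the treatment effect from the treated on the logit scale,
        # then average over treatment received within IV level r.
        treated = expit(logit(p_y[r][1]) - psi) if p_z[r] > 0 else 0.0
        return p_z[r] * treated + (1.0 - p_z[r]) * p_y[r][0]

    def h(psi):
        # Zero when the treatment-free risk does not depend on the IV.
        return treatment_free_risk(1, psi) - treatment_free_risk(0, psi)

    if h(lo) * h(hi) > 0:
        raise ValueError("no sign change in the bracket; widen lo/hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if h(lo) * h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Illustrative population with no always-takers (all numbers are assumptions):
# half compliers (treatment-free risk 0.30) and half never-takers (risk 0.20),
# with a true causal odds ratio of 0.5 among compliers.
p_y = {1: {1: 3 / 17, 0: 0.20}, 0: {1: None, 0: 0.25}}
p_z = {1: 0.5, 0: 0.0}
psi_hat = gsmm_psi(p_y, p_z)
print(math.exp(psi_hat))  # recovers the odds ratio of 0.5 up to the tolerance
```

In practice the cell probabilities would be replaced by fitted values from logistic association models that include covariates, and the estimating equation may have no root in the chosen bracket, which is one way non-convergence can arise.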
SAS macros for implementing the GSMM approach are provided in the Appendix.
3. Simulations
The simulations for evaluating the GSMM, two-stage, and standard logistic approaches require a fully specified (parametric) model to generate the potential outcomes for Y and Z. We simulate data from the compliance class model9 to serve as the true model, as it is fully parametric and also accommodates the different assumptions that lead to different causal effects (e.g., LATE effect). Appendix III describes the parameters of the compliance class model used to simulate the data. The parameters that we vary in the simulation study include (1) the amount of unmeasured confounding, which is measured by the difference on the logit scale in the probability that the outcome is 1 under no treatment between compliers and never-takers; (2) how frequent the outcome is; and (3) the strength of the instrument, which is measured by the proportion of compliers. We then analyzed each simulated dataset with the GSMM, two-stage, and standard logistic approaches. Our goal is to compare the bias, mean squared error, and confidence interval coverage of the GSMM estimator with those of the two-stage estimators.
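A data-generating step of this kind might look like the following minimal sketch for the setting without always-takers. It is not the parameterization of Appendix III: the function name `simulate`, its arguments, and the convention that `delta` is the complier-minus-never-taker difference in treatment-free log odds are all assumptions made for illustration.

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def simulate(n, p_complier, base_logit, delta, psi, seed=0):
    """Draw (R, Z, Y) triples from a simple compliance class model.

    p_complier: proportion of compliers (instrument strength); the rest are
                never-takers, so there are no always-takers or defiers
    base_logit: treatment-free log odds of the outcome for never-takers
    delta:      extra treatment-free log odds for compliers (confounding)
    psi:        causal log odds ratio of treatment among compliers
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        complier = rng.random() < p_complier
        r = rng.randint(0, 1)                    # randomly assigned binary IV
        z = 1 if (complier and r == 1) else 0    # only encouraged compliers are treated
        log_odds = base_logit + (delta if complier else 0.0) + psi * z
        y = 1 if rng.random() < expit(log_odds) else 0
        data.append((r, z, y))
    return data

# Example: 3,000 subjects, 50% compliers, true odds ratio 0.5.
sample = simulate(3000, 0.5, base_logit=-1.0, delta=1.0, psi=math.log(0.5))
```

Because class membership drives both treatment received and treatment-free risk, a nonzero `delta` induces confounding that a standard logistic regression of Y on Z cannot remove.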
4. Results
4.1 Simulation results
Table 1 presents simulation results for a sample size of 10,000 without always-takers. For a frequent outcome (true risk of outcome is 0.30), the percent absolute bias for the GSMM was very small (less than 0.35%) with no consistent increase with the magnitude of unmeasured confounding. For a rare outcome (true risk of outcome is 0.03), the percent absolute bias was larger than for the common outcome setting, ranging between 1.50% and 11.00%, with an increase in positive bias with the magnitude of confounding. When there was a large amount of unmeasured confounding that biased the standard logistic estimator upwards (a difference of 2 on the logit scale in the probability that the outcome is 1 under no treatment between compliers and never-takers), the estimation routine did not converge for 36% of the simulations, which were discarded from the results. The 95% confidence interval coverage was close to the nominal level for all cases in Table 1.
Table 2 presents the results of the simulations of sample size 10,000 that include always-takers. Similar to the non-always-taker case in Table 1, the bias of the treatment effect estimator was also very small (less than 0.26%) for the frequent outcome. As with the non-always-takers case, the bias for the rare outcome was larger than for the frequent outcome, but it was still very small (less than 4%). Unlike the situation without always-takers, the magnitude of bias did not increase with the magnitude of confounding when always-takers were present. In these simulations, the 95% CI coverage was also close to 95%.
Table 3 compares the bias, variance, and mean squared error (MSE) of the 2SPS, 2SRI, and standard logistic regressions with those of the GSMM method without always-takers. In these simulations the sample size is N=3,000, and the proportion of compliers (strength of the instrument) is 0.3, 0.5, or 0.7. For the GSMM approach, the estimated bias ranged up to 2%. In contrast, the 2SPS and 2SRI were more biased, both ranging above 50%. For these two approaches, bias increased with increased confounding and a weaker instrument (fewer compliers).
Comparing the variance of the three approaches, we can see that the GSMM had the smallest and 2SRI the greatest variance. The GSMM also had the smallest MSE among the three approaches. The results are mixed when comparing the MSE between the 2SRI and 2SPS procedures.
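The performance measures compared here can be computed from the per-replicate estimates as in the following minimal sketch. The function name `summarize` is hypothetical, and it is an assumption that percent bias is expressed relative to the true parameter value and that the variance uses the divisor n, which makes the MSE decompose exactly into variance plus squared bias.

```python
def summarize(estimates, truth):
    """Percent absolute bias, variance, and MSE of simulation estimates.

    estimates: one estimate per simulated dataset (e.g., log odds ratios)
    truth:     the true parameter value, assumed nonzero
    """
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - truth
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return {
        "percent_abs_bias": 100.0 * abs(bias) / abs(truth),
        "variance": variance,
        "mse": mse,  # equals variance + bias**2 with the divisor-n variance
    }

result = summarize([1.0, 2.0, 3.0], truth=1.5)
```

An estimator with smaller variance can still have larger MSE if its bias is large, which is why the three quantities are reported side by side.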
4.2 Bezafibrate Data Analysis
Following the methods of Flory et al.,18 we defined treatment as the initial fibrate treatment. We defined the IV as the prior prescription from the same practice as the patient. If a patient was the first one in the practice to be prescribed a fibrate, there was no IV defined for this patient, and so that patient was excluded. Using this subset of the data, we followed Flory et al. by performing analyses with and without adjusting for the following covariates: calendar year, age, sex, smoking status, BMI, hypertension, history of myocardial infarction (MI), history of stroke, use of potentially protective drugs (angiotensin-converting enzyme inhibitors), and common potentially diabetogenic drugs (beta blockers, thiazide diuretics, corticosteroids).15
The assessment of the association between the IV and treatment is presented in Table 4. When the prior fibrate prescription from the same practice was bezafibrate, 79.4% of the GPRD patients actually had a bezafibrate prescription. In comparison, when the prior prescription from the same practice was a different fibrate, only 60.7% of patients had a bezafibrate prescription. The odds ratio (OR) is 2.49 with 95% confidence interval (2.31-2.69), indicating a statistically significant association between the IV and treatment.
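The reported odds ratio can be approximately recovered from the two percentages alone, as the following minimal sketch shows. The helper names are hypothetical. The Wald (Woolf) interval function is included only to illustrate how a confidence interval is typically obtained from a 2x2 table; the paper does not state which interval method was used, and the cell counts needed to reproduce it are in Table 4, not here.

```python
import math

def odds_ratio_from_props(p1, p0):
    """Odds ratio comparing two proportions p1 and p0."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

def woolf_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald (Woolf) interval for the 2x2 table [[a, b], [c, d]]."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)

# Proportions treated with bezafibrate at IV level 1 and IV level 0.
or_iv = odds_ratio_from_props(0.794, 0.607)
print(round(or_iv, 2))  # about 2.5; the reported 2.49 reflects unrounded counts
```

The small discrepancy with 2.49 arises because the percentages in the text are rounded to one decimal place.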
We now consider the association between bezafibrate and diabetes. First, the unadjusted odds ratio of bezafibrate vs. other fibrates for the outcome of diabetes in the subset in which an IV could be defined was 0.67 (95% CI 0.53-0.85), indicating a strong inverse association. In Table 5, we compare the different estimates of the effect of bezafibrate on diabetes, with and without adjustment for covariates. Under standard logistic regression (corresponding to the above odds ratio based on actual counts), the odds ratio (95% confidence interval) was 0.67 (0.53, 0.85) with or without covariates (the estimates differ only in the third decimal place). In contrast, the IV-based approaches yielded odds ratios that were not statistically different from one, which was due to both attenuated odds ratios and wider confidence intervals. Specifically, the 2SPS, 2SRI, and GSMM approaches yielded very similar estimates without covariates: 0.87 (0.22, 3.41), 0.90 (0.23, 3.53), and 0.82 (0.50, 1.35), respectively. After adjusting for covariates, the GSMM odds ratio of 1.50 (0.72, 3.10) differed substantially from the 2SPS and 2SRI odds ratios of 0.77 (0.19, 3.16) and 0.78 (0.19, 3.21), respectively. In all cases, however, the IV odds ratios were not statistically different from one. We note that a crucial assumption for the GSMM approach is that each of the separate association models relating outcome to treatment at each level of the IV is specified correctly. None of the covariates that were originally selected by Flory et al. and used in the analysis were associated with the outcome conditional on the level of the IV. Consequently, it is reasonable to use the GSMM estimate that does not adjust for these covariates. Finally, as shown in Table 5, the GSMM standard error was smaller than the standard errors of the 2SPS and 2SRI approaches, as reflected in the narrower confidence intervals of the GSMM odds ratios. This was consistent with our simulation results.
5. Discussion
Under the IV assumptions, the IV approach estimates a causal odds ratio of treatment received (e.g., receiving a prescription for that treatment) among compliers (those who would take the treatment only under encouragement to do so by the IV). A standard logistic regression adjusting for observed potential confounders shows a statistically significant inverse association between bezafibrate use and risk of diabetes. However, the different IV approaches yielded statistically non-significant associations due both to attenuated odds ratios and wider confidence intervals.
In spite of the consistency among the results of the IV approaches without adjusting for covariates, our simulations suggest that the GSMM approach is less biased and has narrower confidence intervals than the more standard two-stage IV approaches in the logistic regression context. Our extension of the GSMM approach from its original randomized context (where the control group does not have access to treatment) to the current observational context (where patients with both levels of the IV have access to treatment) performed well in the simulations. We have more confidence in the results of the GSMM approach than in those of the two-stage 2SPS and 2SRI approaches, which have been shown to be biased.3,12
One does have to be careful with the GSMM approach, given its reliance on the assumption of correct association models for the treatment-outcome association within each level of the IV. Another difficulty with the GSMM approach is that in simulations in which a large amount of unmeasured confounding biased the standard logistic estimator upwards, the estimation routine sometimes did not converge.