Electronic supplementary material

This electronic supplementary material complements the information in the journal paper. The supplement provides additional methodological details (inclusion criteria, search strategies, coding details and statistical analysis), as well as findings regarding intervention components, fixed-effect analyses and single-group moderator analyses. It also offers further discussion, in particular of the methodological implications of the research.

Subjects and methods

Small-sample studies were included because it is important to summarise across the broad scope of investigations. Many such studies lacked the statistical power to detect treatment effects that may have been present. Small studies may also report on subjects who are difficult to recruit or on innovative interventions [1]. We weighted studies so that those with larger samples had proportionally more impact on ES analyses. Both published and unpublished studies were analysed [1], a strategy that allowed us to include a wider variety of interventions. Meta-analyses that include only published studies are likely to overestimate the magnitude of the true population effect, because the single biggest difference between published and unpublished research is the statistical significance of the results [1].
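The weighting principle can be illustrated with a minimal inverse-variance sketch (hypothetical values; not the SAS/IML programs used in the actual analyses): each study's weight is the reciprocal of its sampling variance, so larger samples pull the pooled ES proportionally harder.

```python
import numpy as np

# Hypothetical ESs (d) and their sampling variances; variance shrinks as
# sample size grows, so larger studies receive proportionally more weight.
d = np.array([0.25, 0.40, 0.10])
var = np.array([0.04, 0.01, 0.09])

w = 1.0 / var                       # inverse-variance weights
d_bar = np.sum(w * d) / np.sum(w)   # weighted mean ES
se = np.sqrt(1.0 / np.sum(w))       # SE of the weighted mean
print(d_bar, d_bar - 1.96 * se, d_bar + 1.96 * se)  # estimate and 95% CI
```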

We included studies without control groups if they contained the pre-intervention and post-intervention data needed to extract an ES. Withholding treatment may be unethical in studies with patient populations when treatments are thought to be more effective than control conditions [2]. We conducted separate analyses for two-group and single-group comparisons. When multiple outcome assessments were available, we coded the most distal data collection point, because enduring changes are more likely to improve health outcomes.

Search strategies

An experienced reference librarian conducted computerised searches in MEDLINE, PsycINFO, HealthSTAR, the Combined Health Information Database, SPORTDiscus, EMBASE, the Cochrane Controlled Trials Register, Dissertation Abstracts International, the Nursing and Allied Health Database, the Educational Resources Information Center and the Database of Abstracts of Reviews of Effectiveness. Broad search terms ensured a comprehensive search. Ancestry searches of eligible studies and of previous narrative reviews and quantitative synthesis papers were conducted. We searched the National Institutes of Health Computer Retrieval of Information on Scientific Projects database of funded studies back to 1973. Computerised database searches were conducted on all authors of retrieved studies that met the inclusion criteria and on principal investigators of funded studies. Internet searches yielded only studies we had already retrieved through other mechanisms [3]. Pre-selected words in titles and abstracts were used to identify an extensive set of potential studies from among those retrieved through the previously described strategies.

Coding

The coding frame was developed from suggestions by research synthesis experts, elements coded in published reviews of health behaviours, characteristics of interventions suggested by experts, components of interventions reported in the empirical literature, and findings from preliminary studies. Before adopting an explicit description for any intervention component, we carefully pilot-tested descriptions to be certain they fully characterised the interventions. Details about intervention components are found in ESM Table 1. We coded pre- and post-intervention means and SDs of treatment and comparison groups when the data were available. If means and SDs were not available, we coded other statistics that could be converted to the d index of ES (e.g. t statistics). Multiple research reports of the same study were examined if necessary to obtain the required data.
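For example, a reported two-group t statistic can be converted to d with the standard conversion d = t·√(1/n1 + 1/n2). The sketch below (hypothetical values; not necessarily the exact routine used in this synthesis) illustrates the calculation.

```python
import math

def d_from_t(t: float, n1: int, n2: int) -> float:
    # Standard two-group conversion: d = t * sqrt(1/n1 + 1/n2)
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

print(d_from_t(2.1, 40, 38))  # hypothetical t statistic and group sizes
```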

Data extraction was not blinded, since evidence indicates that this procedure does not decrease bias [4]. All derived study codes were duplicate-entered in Access (Microsoft, Redmond, WA, USA) to ensure accuracy. To be certain that we only analysed independent samples, we cross-checked each report's author list against all other reports to locate research reports that might contain overlapping samples. Senior authors were contacted to clarify the uniqueness of samples when necessary.

Analyses

We calculated a standardised mean difference (d) as each comparison's ES and adjusted it for small-sample bias [5,6]. Normal-theory SEs were used to construct 95% CIs for the common (fixed-effects) or mean (random-effects) ES and to test whether the ES was 0. A conventional heterogeneity statistic (Q) was used to assess between-study homogeneity. Q is distributed approximately as χ² on k − 1 df, where k is the number of observed ESs.
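As a concrete illustration of these quantities (hypothetical data; not the authors' implementation), the sketch below applies the approximate Hedges and Olkin small-sample correction and computes Q.

```python
import numpy as np

# Hypothetical uncorrected ESs, their df, and sampling variances.
d_raw = np.array([0.40, 0.18, 0.55, 0.25])
df = np.array([38, 120, 25, 60])
var = np.array([0.05, 0.02, 0.08, 0.03])

# Small-sample bias correction (Hedges & Olkin): g = J(df) * d,
# with J(df) approximately 1 - 3 / (4*df - 1).
d = (1.0 - 3.0 / (4.0 * df - 1.0)) * d_raw

w = 1.0 / var
d_bar = np.sum(w * d) / np.sum(w)   # common (fixed-effects) ES
Q = np.sum(w * (d - d_bar) ** 2)    # compared with chi-squared on k - 1 df
print(d_bar, Q, len(d) - 1)
```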

Single-group ESs were analysed separately from two-group studies [7]. Studies with two treatment groups and one control group were included in the meta-analysis by accounting for dependence because of a shared control group. This involved generalised least-squares [8] for fixed-effects analyses and, for random-effects analyses, a two-stage approach whereby each study’s dependent ESs were combined into a single independent ES and then submitted to standard univariate random-effects analysis.
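A generic GLS sketch of the shared-control adjustment is shown below (hypothetical ESs and covariance matrix; the off-diagonal covariance, which in the actual analyses follows Gleser and Olkin [8], arises because both treatment arms are compared with the same control group).

```python
import numpy as np

# Two treatment-vs-control ESs from one study that share a control group,
# so they covary; Sigma is their (assumed known) covariance matrix.
d = np.array([0.30, 0.45])
Sigma = np.array([[0.040, 0.015],
                  [0.015, 0.050]])

ones = np.ones_like(d)
Sinv = np.linalg.inv(Sigma)
d_star = (ones @ Sinv @ d) / (ones @ Sinv @ ones)  # GLS composite ES
var_star = 1.0 / (ones @ Sinv @ ones)              # its sampling variance
print(d_star, var_star)
```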

Funnel plots were examined to detect possible publication bias [9]. Sampling error should decrease as sample size increases. Symmetry of observed ds about the same overall average, regardless of sample size, is typical when no publication bias exists. The expected funnel or horseshoe shape in the absence of publication bias occurs because the varied ES values from small studies are dispersed at one end, whereas ESs from larger-sample studies tend to cluster closer together at the other end. Sparseness of small or negative ESs among smaller studies may indicate publication bias, because studies with larger ESs are more readily published.
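A modified funnel plot of this kind can be sketched as follows (simulated, bias-free data for illustration only): with no publication bias, the points scatter symmetrically about the mean ES, fanning out as sampling variance grows.

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulate unbiased ESs around a true mean of 0.29; spread widens with
# sampling variance, producing the expected symmetric funnel shape.
rng = np.random.default_rng(0)
var = rng.uniform(0.01, 0.15, 60)
d = 0.29 + rng.normal(0.0, np.sqrt(var))

plt.scatter(d, var)
plt.axvline(d.mean())
plt.gca().invert_yaxis()   # precise (large-sample) studies at the top
plt.xlabel('Effect size (d)')
plt.ylabel('Sampling variance')
plt.show()
```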

Potential outliers were identified by examining ESs both graphically and statistically. We omitted one ES at a time and checked for large externally standardised residuals or substantially reduced measures of heterogeneity [5]. One independent-group comparison [10], four treatment-group pre–post comparisons [11–14], and three control-group pre–post comparisons [15–17] were excluded as outliers.
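The leave-one-out screen can be sketched as follows (hypothetical values, with one deliberately extreme ES): each ES is omitted in turn and the resulting reduction in Q is inspected.

```python
import numpy as np

def q_stat(d, var):
    # Fixed-effects heterogeneity statistic for a set of ESs.
    w = 1.0 / var
    d_bar = np.sum(w * d) / np.sum(w)
    return np.sum(w * (d - d_bar) ** 2)

d = np.array([0.30, 0.25, 1.60, 0.35, 0.28])  # third ES is extreme
var = np.full_like(d, 0.04)

q_full = q_stat(d, var)
for i in range(len(d)):
    mask = np.arange(len(d)) != i
    # A large drop in Q when ES i is omitted flags a possible outlier.
    print(i, q_full - q_stat(d[mask], var[mask]))
```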

Heterogeneity issues

We used both fixed- and random-effects models for the overall analyses. The fixed-effects model assumes that the ESs from primary studies estimate a fixed population value [18]. An ES can be described as fixed when the only random influence on it is subject-level sampling error. The random-effects model, the basis for the results described in the paper, probably more appropriately captures the additional sources of variance in diverse intervention trials to improve diabetes self-management. The random- and fixed-effects results are identical when the between-studies variance component is estimated as 0.

We emphasised random-effects analyses because we anticipated considerable primary study heterogeneity. Statistical and clinical heterogeneity is an ongoing challenge in behavioural research, where syntheses bring together studies that are both clinically and methodologically diverse [19,20]. Considerable variation in ES distributions is especially common in educational and behavioural interventions [21]. Even meta-analyses of well-defined behavioural interventions, where specific activities are prescribed components of the intervention, tend to find heterogeneity [22]. Differences in inclusion criteria, disease variations, differences in study execution, dose variations, range of follow-up and study quality all contribute to heterogeneity [19,23,24]. In fact, heterogeneity is so common in some types of quantitative syntheses that meta-analysts have argued that it should be viewed as the expectation rather than the exception [25].

Interventions to change diabetes self-management behaviours are not standardised. Although some diabetes educator groups have outlined key topics of diabetes education, neither the content depth and breadth within the topics nor the manner of educational delivery has been standardised [26–29]. Thus, heterogeneity is expected, should be measured and reported, and justifies the exploration of possible sources of heterogeneity in moderator analyses [30]. Recent Cochrane reviews of behavioural interventions have acknowledged heterogeneity, used random-effects models to allow for and quantify heterogeneity, and used subgroup analyses to explore heterogeneity [20,22,31,32].

We planned four strategies to address heterogeneity. First, for the random-effects analyses we reported not only the location parameter (mean ES), but also a variability parameter representing the spread of ESs about the location parameter, to characterise the extent of heterogeneity [23]. Quantifying the residual heterogeneity with this variance component is an important component of our analysis.

Second, analyses were conducted using a random-effects model, which explicitly allows for ES heterogeneity. The random-effects analysis is a heterogeneous model, which assumes that true population ESs vary among the universe of study situations [23]. Random-effects analyses take into account heterogeneity when it is not fully explained by moderator analyses [30]. In the presence of unexplained heterogeneity, the random-effects analysis tends to yield larger standard errors, wider confidence intervals and less significant tests than the fixed-effects model [30]. Whereas fixed-effects analyses entail the pooling of results across studies to estimate a common ES, random-effects analyses do not pool results but rather estimate the mean and variance of ESs across studies. The random-effects model admits that the true effect may vary among studies as a result of variations in interventions, designs and sampled populations.
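One common way to estimate the between-studies variance component and the random-effects mean is the DerSimonian–Laird method-of-moments approach, sketched below with hypothetical data (the published analyses may have used a different estimator).

```python
import numpy as np

# Hypothetical ESs and their sampling variances.
d = np.array([0.10, 0.45, 0.30, 0.60, 0.05])
var = np.array([0.03, 0.02, 0.05, 0.04, 0.06])

w = 1.0 / var
d_fix = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_fix) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (Q - (len(d) - 1)) / c)   # between-studies variance, >= 0

w_re = 1.0 / (var + tau2)                 # weights add the variance component
d_re = np.sum(w_re * d) / np.sum(w_re)    # random-effects mean ES
se_re = np.sqrt(1.0 / np.sum(w_re))       # larger than the fixed-effects SE
print(tau2, d_re, se_re)
```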

Third, we explored potential study-level moderators as an approach to understanding potential sources of heterogeneity [19,31,33]. Identifying sources of variation contributes to the meaningful interpretation of existing findings, as well as to the design of future studies [23,25]. Fourth, we interpreted our findings in light of existing heterogeneity. Testing for the presence of heterogeneity is less valuable than interpreting the extent to which heterogeneity affects meta-analysis conclusions [19].

Moderator analyses

Moderators for the exploratory analyses were selected based on the frequency of available information in existing reports. Continuous moderators were analysed using a meta-analytic analogue of regression when at least six studies reported data for the moderator; dichotomous moderators were analysed using a meta-analytic analogue of ANOVA when there were at least three studies at each moderator level [34,35]. Moderator analyses compare the amount of ES variability among levels of a study-level moderator with the amount of variability in observed ESs that would be expected from subject-level (and, for mixed-effects analyses, between-studies) sampling error alone; a fixed-effects sketch of the ANOVA analogue follows this paragraph. Although univariate moderator analyses were used instead of the more sophisticated multivariate mixed-effects approach preferred for non-independent ESs [36], the two or fewer multiple-treatment pairs were not expected to unduly distort estimates and tests. Associations between moderator variables were examined for all two-group studies (weighted and unweighted results available from V. S. Conn). Analyses were conducted using programs written in the interactive matrix language of Statistical Analysis Software (SAS/IML; SAS Institute, Cary, NC, USA).
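The sketch below illustrates the ANOVA analogue with hypothetical data: ESs are pooled within each level of a dichotomous moderator and the between-levels statistic Q_between is referred to a chi-squared distribution.

```python
import numpy as np

def pooled(d, var):
    # Inverse-variance pooled mean and its sampling variance.
    w = 1.0 / var
    return np.sum(w * d) / np.sum(w), 1.0 / np.sum(w)

d = np.array([0.15, 0.20, 0.10, 0.50, 0.55, 0.45])
var = np.full(6, 0.04)
group = np.array([0, 0, 0, 1, 1, 1])   # moderator level for each ES

grand_mean, _ = pooled(d, var)
q_between = 0.0
for g in (0, 1):
    mask = group == g
    mean_g, var_g = pooled(d[mask], var[mask])
    q_between += (mean_g - grand_mean) ** 2 / var_g
print(q_between)   # compared with chi-squared on (number of levels - 1) df
```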

Results

The studies included in the meta-analysis are listed after the supplementary material references. Sixteen reports contained two comparisons (two treatment groups compared with a control group, or two treatment-group outcomes compared with baseline values) [13,37–51]. Four reports contained three comparisons [52–55]. One report contained four comparisons [56] and one paper contained six comparisons [57]. Modified funnel plots of ES by sampling variance did not show obvious publication bias. Plots are available from V. S. Conn. Intervention attributes appear in ESM Table 1.

The fixed-effect model results are provided in ESM Table 2; the random-effects model results are presented in Table 2 in the published paper. The fixed- and random-effects analyses yielded similar point estimates. To enhance interpretability, the mean ESs were transformed to the HbA1c metric using results from appropriate reference groups pooled across available studies. For two-group comparisons, the mean effect in terms of the pooled treatment and control SD (1.54%) is an HbA1c raw mean difference of 0.45% (equivalent to 0.29 × 1.54%); relative to the pooled mean of 7.83% for control subjects, this further translates into an HbA1c mean of 7.38% (equivalent to 7.83% − 0.45%) for treatment subjects. For treatment single-group pre–post comparisons, the HbA1c raw mean difference is 0.56% (assuming a pre–post correlation of ρ12 = 0.80); relative to the pooled mean baseline HbA1c of 8.03%, this further translates into a mean outcome HbA1c of 7.46%.
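Written out, the two-group back-transformation is simply a rescaling of the standardised mean difference by the pooled SD of the outcome:

\[
\Delta\mathrm{HbA_{1c}} = \bar{d} \times SD_{\mathrm{pooled}} = 0.29 \times 1.54\% \approx 0.45\%,
\qquad
\mathrm{HbA_{1c}^{treatment}} = 7.83\% - 0.45\% = 7.38\%.
\]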

Moderator analyses using fixed-effects models were conducted for both two-group and single-group comparisons. Results from two-group moderator analyses are presented in ESM Tables 3 and 4. Moderator analyses of single-group treatment comparisons are presented in ESM Tables 5 and 6.

Discussion

Substantial primary study heterogeneity was consistently documented with Q values. In fact, if we assume that true ESs are normally distributed, then the estimated two-group ES (0.29) and the corresponding root variance component (0.231) imply that approximately 10% of interventions yielded a negative true ES (i.e. degraded metabolic control). The wide spread of intervention effects around the typical value should motivate further primary research comparing interventions and comparing study design features to further understand the observed variability.
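This roughly 10% figure follows directly from the normal assumption; a one-line check using the values quoted above:

```python
from scipy.stats import norm

# P(true ES < 0) for a normal distribution with mean 0.29 and SD 0.231.
print(norm.cdf(-0.29 / 0.231))  # approximately 0.105
```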

Randomised controlled trials are the gold standard for scientific evidence. Primary research that randomly assigned subjects reported much larger ESs than two-group studies that did not randomise subjects. Previous meta-analyses have reported mixed findings when comparing randomised and non-randomised studies. Random assignment of subjects may be a proxy for other aspects of methodological quality, such as treatment integrity, that affect outcomes [58]. Previous meta-analyses of the diabetes literature [59–61] and of other scientific areas [62–64] have reported inconsistent patterns linking study quality or design with ES estimates. These findings do not support the common supposition that the random assignment of subjects prevents bias in favour of treatment groups. We included single-group pre–post comparisons to complement the information provided by two-group controlled trials. Previous meta-analyses have found inconsistent differences between single-group and two-group ES estimates [60].

Funded studies reported significantly larger ESs than unfunded studies. Funding may allow stronger interventions, larger doses or more potent behaviour-change strategies, which may be more effective at improving metabolic control [65]. It is possible that funded studies are more susceptible to publication bias, which could account for their larger ESs. Although funded studies often have larger samples, this alone would not affect ES estimates. The absence of a difference in ES between industry-funded primary studies and other research is plausible in behavioural research. Differences in findings between industry-funded and government-funded research are probably more common in pharmaceutical or other for-profit product research [66–69].

Too few studies focused on ethnic minorities (e.g. persons of African descent residing in Europe or North America) to permit ethnicity moderator analyses. Studies with subjects from ethnic minorities may be less successful in improving glycaemic control and quality of life [70]. Few studies have tested culturally sensitive interventions for subjects from minority groups [12,53]. There is an urgent need for more intervention studies that include these vulnerable populations.

Earlier meta-analyses of the effects of interventions on metabolic control reported mean difference ESs of 0.26–0.84 [61,71–73]. The more comprehensive search completed for this meta-analysis may have resulted in more studies with smaller ESs because studies with larger ESs are more easily retrieved [1]. It is also possible that our review included studies of samples with better initial metabolic control than other reviews, thus limiting potential improvements. More primary studies with multiple outcome assessments are needed to examine interventions that most effectively initiate better metabolic control, as well as those most likely to result in long-term improvements.

By focusing on metabolic outcomes, our project did not address other important outcomes, including quality of life and actual changes in exercise. We did not assess changes in knowledge, because they are poorly correlated with health outcomes among people with diabetes [74]. We will examine other criterion variables in future syntheses.

The meta-analysis was limited by the availability of primary studies and by the information that the research reports provided. Inadequate descriptions of interventions continue to plague the literature. The number and nature of the extant primary studies preclude definitive answers to many important questions. This is especially problematic for making decisions about replication studies, incorporating interventions into practice, and synthesising findings [12]. Further, incomplete reporting of primary studies presents challenges for synthesis. Primary studies generally contain too little information to evaluate key aspects of interventions that may strongly affect outcomes, including detailed information about the content of interventions, the amount of time allocated to intervention components, the quality of intervention delivery, the content validity of the intervention, setting features and interventionist characteristics [65]. These shortcomings limit the ability of researchers to interpret individual primary studies and to synthesise findings through meta-analysis.

References

1. Conn V, Valentine J, Cooper H, Rantz M (2003) Grey literature in meta-analyses. Nurs Res 52:256–261

2. Brown SA, Upchurch S, Anding R, Winter M, Ramirez G (1996) Promoting weight loss in type II diabetes. Diabetes Care 19:613–624

3. Conn V, Isaramalai S, Rath S, Jantarakupt P, Wadhawan R, Dash Y (2003) Beyond MEDLINE for literature searches. J Nurs Scholarsh 35:177–182

4. Berlin JA (1997) Does blinding of readers affect the results of meta-analyses? University of Pennsylvania Meta-analysis Blinding Study Group. Lancet 350:185–186

5. Hedges L, Olkin I (1985) Statistical methods for meta-analysis. Academic Press, Orlando

6. Morris SB (2000) Distribution of the standardized mean change effect size for meta-analysis on repeated measures. Br J Math Stat Psychol 53:17–29

7. Morris SB, DeShon RP (2002) Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychol Methods 7:105–125

8. Gleser LJ, Olkin I (1994) Stochastically dependent effect sizes. In: Cooper H, Hedges L (eds) The handbook of research synthesis. Russell Sage Foundation, New York, pp 339–355

9. Vevea JL, Hedges LV (1995) A general linear model for estimating effect size in the presence of publication bias. Psychometrika 60:419–435