GRADE and the Guideline Development Process

The GRADE approach is part of the overall guideline development process. The figure makes clear that, before the GRADE system becomes relevant, the problem to be addressed by the guideline needs to be well defined. This process includes delineation of scope. For example, will the guideline seek to offer recommendations on diagnostic strategy, or will it be restricted to issues of therapy and prevention? A second step in the process is to define the specific questions for which recommendations are to be offered. This is best done using the PICO format (Patients, Interventions, Comparisons, Outcomes). Putting the target questions into a PICO format facilitates the assessment of the directness of evidence, to the benefit of both developers and users.

The GRADE approach suggests that, before grading the quality of evidence and the strength of each recommendation, guideline developers should first identify a recent, well-conducted systematic review of the evidence that answers the relevant clinical question. If none is available, they should conduct one or use another transparent process to identify relevant evidence.

Once the evidence has been synthesised, it should be summarised in a GRADE evidence profile. Evidence profiles allow panel members to base their judgments on a common, concise summary of the evidence. GRADE evidence profiles include information about the effects of the intervention and the quality of evidence for each of the outcomes that are deemed critical or important for decision making.

Rating the quality of evidence relevant to each outcome

In the GRADE system, the quality of supporting evidence is classified into four categories. The suggested terms are “high”, “moderate”, “low” and “very low”, but some organizations prefer to use symbols or letters to express the ranking of the evidence:

High: Further research is very unlikely to change certainty regarding the estimate of effect.
Moderate: Further research is likely to change certainty regarding the estimate of effect.
Low: Further research is very likely to change certainty regarding the estimate of effect.
Very low: Any estimate of effect is very uncertain.

When assessing the quality of evidence, GRADE considers the following factors:

1) study design and rigour of its execution

2) the extent to which available evidence can be directly applied to the target population, interventions, comparisons and outcomes

3) consistency of the results

4) precision of the results

5) likelihood of publication bias

6) magnitude of the effect

7) demonstration of a dose-effect relationship

8) the likely direction of impact of all plausible confounding factors on the observed effect

The GRADE system uses a point scoring method to derive an assessment of the quality of evidence pertaining to each target outcome. The following summarizes this approach:

Study design / Initial quality of a body of evidence
Randomised trials / High
Observational studies / Low

Lower if:
Risk of bias: -1 Serious, -2 Very serious
Inconsistency: -1 Serious, -2 Very serious
Indirectness: -1 Serious, -2 Very serious
Imprecision: -1 Serious, -2 Very serious
Publication bias: -1 Likely, -2 Very likely

Higher if:
Large effect: +1 Large, +2 Very large
Dose response: +1 Evidence of a gradient
All plausible residual confounding: +1 Would reduce a demonstrated effect, +1 Would suggest a spurious effect if no effect was observed

Quality of a body of evidence:
High (four plus)
Moderate (three plus)
Low (two plus)
Very low (one plus)
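The point scoring summarized above can be sketched in a few lines of Python. This is only an illustration, not an official GRADE tool: the names `Quality`, `rate_quality` and the factor keys are hypothetical, clamping the result to the range very low to high is an assumption, and the two residual-confounding entries are treated as a single +1 factor. In practice each point reflects a panel judgement rather than a mechanical calculation.

```python
from enum import IntEnum


class Quality(IntEnum):
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


# Starting levels from the summary: randomised trials start high,
# observational studies start low.
INITIAL_QUALITY = {
    "randomised": Quality.HIGH,
    "observational": Quality.LOW,
}

# Maximum points per factor, as listed in the summary.
MAX_DOWNGRADE = {
    "risk_of_bias": 2,
    "inconsistency": 2,
    "indirectness": 2,
    "imprecision": 2,
    "publication_bias": 2,
}
# Assumption: residual confounding counts as a single +1 factor.
MAX_UPGRADE = {"large_effect": 2, "dose_response": 1, "residual_confounding": 1}


def _total(points, limits):
    """Validate the points given for each factor and return their sum."""
    total = 0
    for factor, value in points.items():
        if factor not in limits:
            raise ValueError(f"unknown factor: {factor}")
        if not 0 <= value <= limits[factor]:
            raise ValueError(f"{factor} must be between 0 and {limits[factor]}")
        total += value
    return total


def rate_quality(design, downgrades=None, upgrades=None):
    """Return the quality level for one outcome's body of evidence."""
    score = int(INITIAL_QUALITY[design])
    score -= _total(downgrades or {}, MAX_DOWNGRADE)
    score += _total(upgrades or {}, MAX_UPGRADE)
    # Assumption: the result is clamped to the four defined categories.
    return Quality(max(1, min(4, score)))
```

For example, `rate_quality("randomised", downgrades={"risk_of_bias": 1, "imprecision": 1})` starts at high and loses two points, giving `Quality.LOW`.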

The application of the above criteria involves the exercise of judgement regarding their importance in the specific context. Assessment of ‘indirectness’ of evidence is potentially complex. A good initial approach is to compare the ‘PICO’ elements within the specific questions as formulated by the guideline developers to those embedded within the corresponding research evidence relevant to each outcome. To the extent that differences between the two sets of PICO characteristics would be likely to alter the observed effect, an issue of indirectness has been detected.
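The PICO comparison described above can be illustrated with a small Python sketch. The `PICO` class, the `pico_differences` helper and the example values are all hypothetical. The comparison is a plain text match, so it only flags elements that differ; whether a flagged difference would be likely to alter the observed effect remains a matter of judgement.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PICO:
    patients: str
    interventions: str
    comparisons: str
    outcomes: str


def pico_differences(question, evidence):
    """Return the PICO elements on which the question and the evidence differ."""
    differences = {}
    for field in fields(PICO):
        asked = getattr(question, field.name)
        studied = getattr(evidence, field.name)
        # Crude comparison: ignore case and surrounding whitespace only.
        if asked.strip().lower() != studied.strip().lower():
            differences[field.name] = (asked, studied)
    return differences


# Hypothetical example: the guideline asks about mortality,
# but the available studies measured a surrogate outcome.
question = PICO("adults with type 2 diabetes", "drug A", "placebo", "mortality")
evidence = PICO("Adults with type 2 diabetes", "drug A", "placebo", "HbA1c")
flagged = pico_differences(question, evidence)
```

Here `flagged` contains only the `outcomes` element, which is the candidate for an indirectness judgement.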

Grading the strength of recommendations

Guideline Panels use the information from the Evidence Profiles to develop recommendations. Panels explicitly consider:

quality of evidence
balance of benefits and harms/burdens
distribution of values and preferences
resource implications.

The overall quality of evidence is determined by the lowest quality of evidence among the critical outcomes. The term “values and preferences” refers to the relative worth or importance of a health state or of the consequences of a decision to follow a particular course of action (benefits, harms, burdens, treatment and resources). Individuals usually assign less value to, and have less preference for, more impaired health states (e.g. death or impaired social functioning and work productivity) compared to other health states (e.g. full health or having very mild symptoms that do not interfere with daily life).
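The rule for overall quality stated above amounts to taking the minimum rating across the critical outcomes. A minimal Python sketch, in which the function name and the example outcomes are hypothetical:

```python
# Quality levels ordered from worst to best.
LEVELS = ["very low", "low", "moderate", "high"]


def overall_quality(outcome_ratings, critical_outcomes):
    """Return the lowest quality rating among the critical outcomes."""
    ratings = [outcome_ratings[name] for name in critical_outcomes]
    if not ratings:
        raise ValueError("at least one critical outcome is required")
    return min(ratings, key=LEVELS.index)


# Hypothetical ratings; only the first two outcomes are deemed critical.
ratings = {
    "mortality": "moderate",
    "major bleeding": "low",
    "length of stay": "very low",
}
overall = overall_quality(ratings, ["mortality", "major bleeding"])
```

In this example `overall` is `"low"`: the very low rating for length of stay does not count because that outcome is not treated as critical.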

Based on the above factors, recommendations are classified as either “strong” or “conditional/weak”. The strength of a recommendation depends on the balance between all desirable and all undesirable effects of an intervention (i.e. net clinical benefit), the quality of available evidence, values and preferences, and resource utilization (cost and others). In general, the higher the quality of the supporting evidence, the more likely it is for the recommendation to be strong. Conversely, if the quality is low or very low, a conditional/weak recommendation is more likely. Strong recommendations based on low or very low quality evidence are possible, in particular if they are made against the use of new technologies that are poorly investigated.

Strong recommendations may also be expressed as “we recommend” and conditional recommendations as “we suggest”. Statements about the underlying values and preferences, as well as the remarks, are integral parts of the recommendations. They facilitate accurate interpretation of the recommendations and should not be omitted.

Interpretation of “strong” and “conditional/weak” recommendations

Implications / Strong recommendation / Conditional/Weak recommendation
For patients / Most individuals in this situation would want the recommended course of action and only a small proportion would not. Formal decision aids are not likely to be needed to help individuals make decisions consistent with their values and preferences. / The majority of individuals in this situation would want the suggested course of action, but many would not.
For clinicians / Most individuals should receive the intervention. Adherence to this recommendation according to the guideline could be used as a quality criterion or performance indicator. / Recognize that different choices will be appropriate for individual patients, and that you must help each patient arrive at a management decision consistent with his or her values and preferences. Decision aids may be useful in helping individuals make decisions consistent with their values and preferences.
For policy makers / The recommendation can be adapted as policy in most situations. / The policy will require substantial debate and the involvement of various stakeholders before adaptation.