Original Article
AHRQ Series Paper 5: Grading the strength of a body of evidence when comparing medical interventions—Agency for Healthcare Research and Quality and the Effective Health-Care Program
Introduction
Comparative effectiveness reviews (CERs), like systematic reviews in general, are essential tools for summarizing information to help make well-informed decisions about health care options [1]. CERs explicitly compare two or more screening or diagnostic strategies or therapeutic interventions. The Evidence-based Practice Center (EPC) program, supported by the U.S. Agency for Healthcare Research and Quality (AHRQ), produces substantial numbers of evidence reports and CERs. These reports are designed to accurately and transparently summarize a body of literature with the goal of helping clinicians, policymakers, and patients make well-informed decisions about health care. Reviews should provide clear judgments about the strength of the evidence that underlies conclusions to enable decision makers to use them effectively [2].
In 2007, AHRQ supported a cross-EPC set of work groups to develop guidance on major elements of designing, conducting, and reporting CERs [3]. This paper reports the outcomes of the EPC work group on grading strength of evidence. We briefly explore the rationale for grading strength of evidence, define the domains of concern for evidence strength, and describe our recommended grading system for such reviews. Our main objective was to give guidance to EPCs for grading strength of evidence in CERs, but this guidance may also apply to other systematic reviews.
The EPCs prepare reports that are used by a variety of decision makers, but the EPCs do not themselves develop recommendations. Therefore, the goal of our evidence rating system was to facilitate use of the reports by decision makers who may have differing perspectives. This separation of the raters of the strength of evidence from the decision makers led to some differences in the system we propose relative to other rating systems that are designed to be used directly by decision makers.
The EPC approach is based in large measure on the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) working group approach [4], [5], [6]. We briefly discuss the differences in emphasis between the two systems, and note that EPC and GRADE experts will explore ways to harmonize the two methods and to offer reviewers and decision makers a coordinated model for grading strength of evidence. This paper presents the approach that EPCs are expected to implement for CERs in the meantime.
Strength of evidence: rationale
Assessment of the strength of a body of evidence is widely accepted among organizations that make practice guidelines or coverage decisions and among experts who develop systematic reviews. A growing number of organizations adopt systematic approaches to making these judgments. A wide variety of grading systems is available for this purpose [7], and different organizations may weigh features, or domains, of a body of
Strength of evidence: domains
The EPC approach to grading evidence begins with assessments of a set of agreed-upon domains pertaining to entire bodies of evidence about major outcomes (benefits and harms) and comparisons—i.e., outcomes and comparisons that are most important to decision makers in clinical practice and health policy. A determination of which outcomes and comparisons the EPCs consider important enough to warrant formal grading of the strength of the evidence will depend on the key questions, the clinical or
Four strengths of evidence levels
The overall grade for strength of evidence reflects a global assessment that takes the required domains directly into account and, as needed, incorporates judgments about the additional domains as well. For each comparison of interest, EPCs should rate strength of evidence for each major benefit (e.g., positive impact on health outcomes such as physical function or quality of life, or effects on laboratory measures or other surrogate variables) and for each major harm (ranging from rare,
Reporting strength of evidence
As noted above, CERs should present information about all comparisons of interest for the outcomes that are most important to patients and other decision makers. Thus, strength of evidence should relate to those important outcomes. Complete and perfect information is rarely available. For some treatments, data may be lacking about one or more of the outcomes. In other cases, the available evidence comes from studies that have important flaws, is imprecise, or is not applicable to some
Discussion
The EPC approach to rating the strength of evidence draws heavily on the international GRADE system; both conceptually and substantively, it is similar to GRADE. Our recommendations address specific circumstances of the EPC program, which differ from those of some groups that use GRADE. The EPC program produces systematic reviews, but it is not involved directly in development of recommendations or guidelines. Rather, EPC reports are used by a spectrum of government agencies, professional
Acknowledgments
This research was funded through contracts from the AHRQ to the following EPCs: ECRI Institute (290-02-0019); Johns Hopkins University (290-02-0018); Oregon Health & Science University (290-02-0009); RTI International (290-02-0016); and Stanford University (290-02-0017). The opinions expressed here are those of the authors and do not necessarily represent the views of the AHRQ, the Department of Health and Human Services, or the Department of Veterans Affairs. The authors thank Valerie King,
References (15)
- et al. AHRQ Series Paper 2: Principles for Developing Guidance: AHRQ and the Effective Health-Care Program. J Clin Epidemiol (2010)
- et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med (2001)
- et al. Discrepancies among megatrials. J Clin Epidemiol (2000)
- et al. A simple and valid tool distinguished efficacy from effectiveness studies. J Clin Epidemiol (2006)
- Using evidence reports: progress and challenges in evidence-based decision making. Health Aff (Millwood) (2005)
- et al. Better information for better health care: the Evidence-based Practice Center program and the Agency for Healthcare Research and Quality. Ann Intern Med (2005)
- et al. Systems for grading the quality of evidence and the strength of recommendations I: critical appraisal of existing approaches. The GRADE Working Group. BMC Health Serv Res (2004)