Original article
Effect of applying different “levels of evidence” criteria on conclusions of Cochrane reviews of interventions for low back pain
Introduction
Low back pain is a large problem in most industrialised countries. Although the condition is associated with enormous social and economic costs there is not yet agreement on the most effective way to manage the condition. One factor contributing to this uncertainty is the contradiction in results from studies that have evaluated interventions for low back pain.
Many commentators argue that the strongest evidence for a therapy is provided by a systematic review of all relevant randomised controlled trials (RCTs) [1]. Systematic reviews provide a method to efficiently synthesize the results of individual studies. A single review may synthesize the results of many RCTs and reduce these down to a series of brief summary statements on the efficacy of the various therapies. For example, a review by Van Tulder et al. [2] reduced the findings of 12 RCTs to a single summary statement: “There is strong evidence that exercise therapy is not more effective for acute low back pain than inactive or other active treatments with which it has been compared” (p. 2784).
This qualitative approach to synthesis, sometimes called the “levels of evidence” approach, has been used in a large number of systematic reviews of therapy. Typically, in the conclusion of these reviews, one of four levels of evidence (“strong,” “moderate,” “limited or conflicting,” and “no evidence”) is used to describe the strength of evidence for a therapy. These brief summaries are appealing because they provide a conclusion that is easy to recall. Researchers have justified the levels of evidence approach on the grounds that it provides a conclusion that incorporates both the outcomes and quality of included studies [3]. The approach has also been advocated as a method to pool results of individual RCTs when heterogeneity precludes quantitative pooling [4].
Although the levels of evidence approach to synthesis has become popular in systematic reviews, the criteria used to judge the level of evidence have not yet been standardized. Different authors of systematic reviews employ different criteria [2, 3, 5], and the same author may use different criteria in different studies [2, 6]. Differences in criteria usually involve methodologic decisions such as whether or not to include nonrandomized trials [5], or how to weight low-quality studies [2, 3]. However, although the rules for pooling vary across studies, very similar narrative descriptors are used to describe the results of pooling.
This lack of standardization in levels of evidence pooling rules means that there is the potential for different conclusions to be drawn from the same set of RCTs. If different pooling rules produced different conclusions on the efficacy of a therapy, serious questions would be raised about the validity of the pooling rules. If two systems produce markedly different conclusions from the same RCTs, both systems cannot be valid.
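The point can be illustrated with a toy sketch. Both rule sets below are hypothetical and deliberately simplified; they are not the criteria used in any of the cited reviews, and the `Trial` fields are assumptions made for the example. The two rules differ only in how they treat low-quality trials, which is one of the methodologic differences mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    high_quality: bool        # hypothetical quality flag
    favours_treatment: bool   # hypothetical, simplified trial outcome


def rule_a(trials):
    """Hypothetical rule A: low-quality trials are ignored entirely."""
    hq = [t for t in trials if t.high_quality]
    if not hq:
        return "no evidence"
    positives = sum(t.favours_treatment for t in hq)
    if positives not in (0, len(hq)):      # high-quality trials disagree
        return "limited or conflicting"
    return "strong" if len(hq) >= 2 else "moderate"


def rule_b(trials):
    """Hypothetical rule B: every trial counts toward consistency."""
    if not trials:
        return "no evidence"
    positives = sum(t.favours_treatment for t in trials)
    if positives not in (0, len(trials)):  # any disagreement at all
        return "limited or conflicting"
    n_hq = sum(t.high_quality for t in trials)
    if n_hq >= 2:
        return "strong"
    return "moderate" if n_hq == 1 else "limited or conflicting"


# Illustrative trial set (invented): two high-quality trials favouring
# treatment and one low-quality trial that does not.
trials = [Trial(True, True), Trial(True, True), Trial(False, False)]
print(rule_a(trials), "|", rule_b(trials))
```

Under these assumed rules the same three trials are summarized as "strong" by rule A and as "limited or conflicting" by rule B, because only rule B lets the dissenting low-quality trial affect the judgment.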
A number of systematic reviews from the Cochrane Collaboration Back Review Group have used a levels of evidence approach to pooling. There is a belief that Cochrane reviews are of higher methodologic quality than non-Cochrane reviews, and there are some data to support this belief [7]. The objective of the present study was to examine the consistency of conclusions of Cochrane systematic reviews of conservative treatment of low back pain when different levels of evidence criteria are applied.
Data extraction
Cochrane reviews of conservative treatment of low back pain (LBP) published between January 2000 and May 2001 were retrieved from the Cochrane database. Reviews were included in the study if the author had used a “levels of evidence” approach to pool results of individual RCTs.
We then inspected each of these Cochrane reviews and identified each treatment comparison for which pooling had been undertaken and recorded the conclusion on efficacy. For example, Van Tulder et al. [8] concluded that, for the
Results
Six Cochrane systematic reviews fulfilled the inclusion criteria. These systematic reviews included the following conservative interventions for LBP: exercise [2], acupuncture [4], back school [13], lumbar supports [6], cognitive behavioral therapy [8], and massage [3]. From these six reviews, a total of 60 treatment comparisons were identified.
The generalized kappa coefficient for the agreement between the four sets of levels of evidence criteria was 0.33 (95% CI = 0.28–0.38) indicating
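For readers unfamiliar with the statistic, the sketch below shows one common way to compute a kappa coefficient for more than two raters. It assumes the multi-rater formulation usually attributed to Fleiss; this excerpt does not state which formula the authors used. Here each "rater" is one set of levels of evidence criteria and each "item" is one treatment comparison. The function name and the toy data are illustrative and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Multi-rater kappa (Fleiss formulation, assumed here).

    ratings: list of items; each item is a list of category labels,
    one label per rater, with the same number of raters for every item.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    counts = [Counter(item) for item in ratings]
    categories = {label for item in ratings for label in item}

    # Observed agreement: share of rater pairs that agree, averaged over items.
    pairs = n_raters * (n_raters - 1)
    observed = sum(
        (sum(c[cat] ** 2 for cat in categories) - n_raters) / pairs
        for c in counts
    ) / n_items

    # Chance agreement from the overall proportion of ratings per category.
    total = n_items * n_raters
    expected = sum(
        (sum(c[cat] for c in counts) / total) ** 2 for cat in categories
    )

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Toy data (not from the study): three comparisons, each labelled by
# four hypothetical sets of criteria.
toy = [
    ["strong", "strong", "moderate", "strong"],
    ["limited", "no evidence", "limited", "moderate"],
    ["no evidence", "no evidence", "no evidence", "no evidence"],
]
print(round(fleiss_kappa(toy), 2))
```

A value of 1 means the criteria sets always assign the same level, 0 means agreement no better than chance, and negative values mean agreement worse than chance.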
Discussion
The results of the present study show that the overall agreement between the four sets of levels of evidence criteria was fair. Correcting the kappa values for attenuation due to imperfect reliability improved our estimates of agreement between schemes, but not nearly enough for clinicians and policy makers to have confidence that the systems measure level of evidence reliably. This means that there is potential for markedly different conclusions from systematic reviews if different sets of
Conclusion
Different levels of evidence criteria produce different conclusions on treatment efficacy. We advise readers to be cautious when interpreting conclusions of systematic reviews that use a levels of evidence approach.
References (15)
- et al. Ultrasound therapy for musculoskeletal disorders: a systematic review. Pain (1999)
- How to use the evidence: assessment and application of scientific evidence (2000)
- et al. Exercise therapy for low back pain: a systematic review within the framework of the Cochrane Collaboration Back Review Group. Spine (2000)
- et al. Massage for low back pain (Cochrane review). Issue 4 (2000)
- et al. Acupuncture for low back pain (Cochrane review). Issue 3 (2000)
- et al. A systematic review of controlled clinical trials on the prevention of back pain in industry. Occup Environ Med (1997)
- et al. Lumbar supports for prevention and treatment of low back pain (Cochrane review). Issue 3 (2000)