Original Article
A simple and valid tool distinguished efficacy from effectiveness studies

https://doi.org/10.1016/j.jclinepi.2006.01.011

Abstract

Objective

To propose and test a simple instrument based on seven criteria of study design to distinguish effectiveness (pragmatic) from efficacy (explanatory) trials.

Study Design

No validated definition of effectiveness studies currently exists. We asked the directors of 12 Evidence-based Practice Centers (EPCs) to select six studies each: four that they considered examples of effectiveness trials and two that they considered efficacy studies. We then applied our proposed criteria to the selected studies, treating the directors' classifications as if they were a gold standard, to test construct validity.

Results

Because our rationale was to identify effectiveness studies reliably with minimal false positives (i.e., with high specificity), a cutoff of six criteria produced the most desirable balance between sensitivity and specificity, yielding a specificity of 0.83 and a sensitivity of 0.72.

Conclusion

When applied in a standardized manner, our proposed criteria provide a valid and simple tool to distinguish effectiveness from efficacy studies. The applicability of systematic reviews can improve when analysts place greater emphasis on the generalizability of included studies. Clinicians can also use our criteria to assess the external validity of individual studies, given an appropriate population of interest.

Introduction

Randomized controlled trials (RCTs) are the gold standard for evaluating the effects of treatments. To be clinically meaningful, results must be relevant to specific patient populations in specific settings [1]. Multiple factors determine the external validity (i.e., generalizability or applicability) of RCTs: patient characteristics, the condition under investigation, drug regimens, costs, compliance, comorbidities, and concomitant treatments. For practical reasons, trials cannot always take these factors fully into consideration (e.g., costs, poor compliance). In addition, certain aspects of study design (eligibility criteria, study duration, mode of intervention, outcomes, adverse events assessment, or type of statistical analysis) greatly influence the degree of generalizability, given an appropriate population of interest.

Clinicians and policymakers often distinguish between the efficacy and the effectiveness of an intervention. Efficacy trials (explanatory trials) determine whether an intervention produces the expected result under ideal circumstances. Effectiveness trials (pragmatic trials) measure the degree of beneficial effect under “real world” clinical settings [2]. Hence, hypotheses and study designs of an effectiveness trial are formulated based on conditions of routine clinical practice and on outcomes essential for clinical decisions.

Efficacy and effectiveness exist on a continuum. Generalizability depends largely on the viewpoint of the observer and the condition under investigation. Baseline patient characteristics (e.g., gender, age, severity of the disease, racial groups) are primary factors in generalizability; thus, depending on the population of interest, generalizability of the same study can range from low to high. Geographic settings (urban vs. rural) and health care systems can also be significant factors [1], although geography may have less influence on generalizability of drug trials than trials of other interventions (e.g., screening programs, behavioral therapy).

Ensuring generalizability may compromise internal validity. Under everyday clinical conditions, factors such as patient or doctor preferences [3], [4] or patient–doctor relationships [5], [6] can influence response and compliance. Random allocation, allocation concealment, and blinding negate these factors, increasing internal validity on the one hand and decreasing external validity on the other. Therefore, to some extent, the operational definition of an “effectiveness trial” delineates the necessary trade-offs with internal validity. An ideal definition would strike this balance at a point at which satisfactory internal validity accompanies a high degree of generalizability.

Systematic reviews, including meta-analyses, have become an increasingly important source of information for clinical practice. If well conducted, they synthesize large amounts of information and provide estimated effect sizes that have greater precision and generalizability than individual studies [7]. Distinguishing between efficacy and effectiveness contributes an important aspect to analyzing any body of clinical evidence. Furthermore, greater emphasis on effectiveness studies may lead to changes in presentation in systematic reviews and policy initiatives.

In this article, we propose and test seven hallmarks of study design to create a tool that can help researchers and those producing systematic reviews, as well as clinicians who are interested in the generalizability of study results, to distinguish more readily and more consistently between efficacy and effectiveness studies.


Methods

Based on clinical and methodological considerations and the published literature, we selected seven domains of study design that, in our view, demonstrably influence the generalizability of trial results (Table 1). We searched MEDLINE® to identify published literature on instruments to distinguish effectiveness from efficacy studies. We found various definitions of effectiveness studies [8], [9], [10], [11], [12] but no validated rating instruments. Additional searches on Web sites of the US
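
To make the rating procedure concrete, the sketch below implements the basic decision rule described above: rate a study on seven design criteria and classify it as an effectiveness study if it meets at least a chosen cutoff (six in our validation). This is a minimal illustration only; the criterion labels are hypothetical placeholders, not the exact wording of the instrument in Table 1.

# Minimal sketch of the seven-criteria decision rule (Python).
# The criterion labels below are hypothetical placeholders, not the
# instrument's exact wording.

CRITERIA = [
    "setting_reflects_routine_practice",
    "eligibility_criteria_not_overly_restrictive",
    "health_outcomes_rather_than_surrogates",
    "clinically_relevant_duration_and_regimen",
    "adverse_events_assessed",
    "adequate_sample_size_for_important_differences",
    "intention_to_treat_analysis",
]

def classify_study(ratings, cutoff=6):
    """Classify a trial as an effectiveness study if it meets >= cutoff criteria."""
    met = sum(1 for c in CRITERIA if ratings.get(c, False))
    return "effectiveness" if met >= cutoff else "efficacy"

# Hypothetical example: a trial meeting six of the seven criteria
example = {c: True for c in CRITERIA}
example["adequate_sample_size_for_important_differences"] = False
print(classify_study(example))  # -> effectiveness

Varying the cutoff trades sensitivity against specificity; this is how the operating characteristics reported in the Results were derived.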

Results

The EPC directors identified 26 studies. Of these, 6 were intended to illustrate efficacy trials and 20, effectiveness studies. We excluded two studies from the latter group because they did not meet eligibility criteria [24], [25]: one was an observational follow-up of three RCTs [24] and the other a pooled analysis of clinical trials [25]. Of the remaining 24 studies (Table 2, alphabetical by author), 22 were RCTs, 3 with an open-label design [26], [27], [28]; the other 2 were a
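
As a check on the arithmetic behind the reported operating characteristics, the sketch below reconstructs the two-by-two table implied at a cutoff of six. With 18 effectiveness and 6 efficacy studies in the validation set, a sensitivity of 0.72 and a specificity of 0.83 correspond to 13 of 18 and 5 of 6 correct classifications, respectively; these counts are inferred from the reported proportions rather than stated directly in the article.

# Reconstruction of the implied classification counts at a cutoff of six.
# The counts are inferred from the reported sensitivity (0.72), specificity
# (0.83), and study totals (18 effectiveness, 6 efficacy); they are an
# assumption, not figures quoted from the article.

true_positives = 13   # effectiveness studies correctly classified
false_negatives = 5   # effectiveness studies missed
true_negatives = 5    # efficacy studies correctly classified
false_positives = 1   # efficacy studies misclassified as effectiveness

sensitivity = true_positives / (true_positives + false_negatives)   # 13/18
specificity = true_negatives / (true_negatives + false_positives)   # 5/6

print(f"sensitivity = {sensitivity:.2f}")  # 0.72
print(f"specificity = {specificity:.2f}")  # 0.83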

Discussion

Our objective was to identify drug effectiveness studies reliably based on seven proposed criteria. We focused on studies of medications because they are common, but many of the same principles can be applied to other types of interventions. Because we attribute greater value to effectiveness studies than to efficacy studies, the specificity of this process had to be high. That is, we wanted to ensure that efficacy studies are not falsely rated as effectiveness studies. Erring on the side of

Acknowledgments

We gratefully acknowledge the directors of the United States and Canadian EPCs who selected the sample studies and provided insightful comments on the study protocol. We also thank Jean Slutsky and Kenny Fink from the Agency for Healthcare Research and Quality (AHRQ) for their support of this methods project.

Disclaimer: The authors of this report are responsible for its content. Statements in the report should not be construed as endorsement by the AHRQ or the US Department of Health and Human Services of a

References (57)

  • J. Benson et al. Patients' decisions about whether or not to take antihypertensive drugs: qualitative study. BMJ (2002).
  • D.A. Redelmeier et al. Understanding patients' decisions. Cognitive and emotional perspectives. JAMA (1993).
  • K.B. Thomas. General practice consultations: is there any point in being positive? Br Med J (Clin Res Ed) (1987).
  • Z. Di Blasi et al. Influence of context effects on health outcomes: a systematic review. Lancet (2001).
  • C.D. Mulrow. Rationale for systematic reviews. BMJ (1994).
  • R.H. Brook et al. Efficacy, effectiveness, variations, and quality. Boundary-crossing research. Med Care (1985).
  • K. Hoagwood et al. Introduction to the special section: efficacy and effectiveness in studies of child and adolescent psychotherapy. J Consult Clin Psychol (1995).
  • R.M. Califf et al. Principles from clinical trials relevant to clinical practice: Part I. Circulation (2002).
  • Anonymous. Undertaking systematic reviews of research on effectiveness: CRD's guidance for those carrying out or commissioning reviews. CRD Report Number 4 (2nd edition) (2001).
  • C.P. Gross et al. Reporting the recruitment process in clinical trials: who are these patients and how did they get there? Ann Intern Med (2002).
  • T.R. Fleming et al. Surrogate end points in clinical trials: are we being misled? Ann Intern Med (1996).
  • J.A. Coutts et al. Measuring compliance with inhaled medication in asthma. Arch Dis Child (1992).
  • N.A. Gibson et al. Compliance with inhaled asthma medication in preschool children. Thorax (1995).
  • J.S. Kelloway et al. Comparison of patients' compliance with prescribed oral and inhaled asthma medications. Arch Intern Med (1994).
  • G.R. Norman et al. Interpretation of changes in health-related quality of life: the remarkable universality of half a standard deviation. Med Care (2003).
  • S. Hollis et al. What is meant by intention to treat analysis? Survey of published randomised controlled trials. BMJ (1999).
  • S. West et al. Systems to rate the strength of scientific evidence. Evid Rep Technol Assess (Summ) (2002).
  • D.S. Geldmacher et al. Donepezil is associated with delayed nursing home placement in patients with Alzheimer's disease. J Am Geriatr Soc (2003).

Funding for this research was provided to the RTI-UNC EPC through a contract from the AHRQ to RTI International (contract number 290-02-0016). The funding source had no involvement in the design and conduct of the study or in the analysis and interpretation of the data.
