Broadening the view of evidence-based medicine
- Correspondence to: Dr D M Berwick, Institute for Healthcare Improvement, 20 University Road, 7th Floor, Cambridge, MA 02138, USA
New methods of learning, new guidelines for publication
Scholars in the last half of the 20th century forged our modern commitment to evidence in evaluating clinical practices. They were courageous people, iconoclasts for their time, insisting that the scientific method was a necessary and plausible tool for judging the value of what we did for and to patients. Scientific evaluation of clinical practice was necessary, they argued, because unguided human observers are frail meters of truth: too prone to see what they expect to see, too likely to confuse effort with results or to attribute outcomes to visible causes rather than hidden ones, too trusting in small numbers and local opinion. Only formal scientific designs and strong statistical methods, they claimed, can protect the human mind from its own biases and adjust for hidden uncontrolled influences, sorting signals from noise. Scientific evaluation of practice is plausible, they argued, because the hypothetico-deductive method and proper statistical theory can be applied, with only modest adjustments, to the world of clinical process, just as they can be in a laboratory. And they taught us how to do that.
Their arguments were not welcomed at first. Today they are heroes, honored in the history of clinical science, but Archie Cochrane,1 Alvin Feinstein,2 Frederick Mosteller,3 Tom Chalmers,4 David Sackett,5 and others had first to play the role of outsiders, essentially pestering the center of health care to get serious about evaluating its work. They had thick skins, these critics, because they were—and had to be—change agents.
But they had evidence for their assertions, and they systematically accumulated more as they built their case. The risks of unguided impression were well documented in studies of the emergence and persistence of clinical practices that proved, once studied, to be of little or no value. Gastric freezing,6 radical mastectomy,7 theophylline for asthma,8 and dozens of other common practices wilted under the microscope of properly designed clinical trials, proving no better than simpler practices or out-and-out harmful. Beliefs and evidence simply did not always correspond.
A normative framework emerged for judging the value of evidence, a heraldry made clear in works such as the monumental volumes on effective care in pregnancy and childbirth compiled by Iain Chalmers and his colleagues,9 the reports of the Canadian Task Force on preventive medicine,10 Mosteller’s magisterial Institute of Medicine report on assessing medical technologies,11 Feinstein’s texts,2 Sackett’s series in the Canadian Medical Association Journal,12 and in the United States Task Force on Clinical Preventive Services.13 The Crown Prince of methods was the randomized, double blind, prospective, controlled clinical trial—the “RCT”—which stood second to no other method in protecting the scientist and the reader against bias, confounding, and other generators of false conclusion. Below the RCT stood methods of less nobility, graded in their evidence value from the properly designed cohort and case-control studies of epidemiology to the lowly case series, the suspect expert opinion, and the bestial anecdote. Systematic reviews that allow concatenation of results across multiple studies of varying design and quality, and meta-analysis that uses high powered statistics to combine quantitative findings from across multiple comparison studies, have further ramped up the power of controlled protocol-driven evidence.14
The leaders of evaluative clinical science fostered their young, and a generation of new scholars emerged in healthcare academia, founding their careers on evaluation of practice and on the progression of methods for evaluation. One of the most impressive success stories of 20th century medicine was how these people and these views—the entire field of “clinical epidemiology” and “evaluative clinical sciences”—not only survived but thrived, and eventually placed its leaders—scholars of the caliber of John Eisenberg, Christine Cassel, Harold Sox, and many others—in positions of the highest influence in departments of medicine, journal editorships, and professional societies, honoring work that a few decades before would not even have been understood to be about health care.
The benefits of evidence-based medicine, thus defined, have been immense. Patients today can count on a growing proportion of the tests, diagnostic processes, surgical procedures, and other costs and risks in care to have been subjected to proper systematic evaluation. The very definition of “quality” in health care has now come to incorporate the use of scientific evidence in practice; that is what the Institute of Medicine meant in its call for improvement of “effectiveness” as a key aim for improving care.15 Gaps between science and practice remain wide, but we seem increasingly committed to closing them. That is good.
But, we now have a problem: we have overshot the mark. We have transformed the commitment to “evidence-based medicine” of a particular sort into an intellectual hegemony that can cost us dearly if we do not take stock and modify it. And because peer reviewed publication is the sine qua non of scientific discovery, it is arguably true that hegemony is exercised by the filter imposed by the publication process. The failure of the publication filter to accommodate the kind of discovery that drives most improvement in health care—and the failure of those working in healthcare improvement to reconfigure the filter appropriately—is the message of the paper on publication guidelines by Davidoff and Batalden in this issue of QSHC.16 This paper is important, not only because it addresses the narrower issue of publication standards but also because it provides important support for an epistemology of a new and broadened understanding of the evidence needed for the improvement of care.
The argument for that epistemology is not a simple one, but its intuitive force is somewhat easier to uncover with a simple question: “How much of the knowledge that you use in your successful negotiation of daily life did you acquire from formal scientific investigation—yours or someone else’s?”
Did you learn Spanish by conducting experiments? Did you master your bicycle or your skis using randomized trials? Are you a better parent because you did a laboratory study of parenting? Of course not. And yet, do you doubt what you have learned?
Broadly framed, much of human learning relies wisely on effective approaches to problem solving, learning, growth, and development that are different from the types of formal science so well explicated and defended by the scions of evidence-based medicine. Although they are far from RCTs in design, some of those approaches offer good defences against misinterpretation, bias, and confounding. In the world of clinical care, especially in the quest for improvement of clinical processes, is it plausible that those approaches—the ones we use in everyday life—might have value too, used well and consciously, to help us learn?
The answer is “Yes”. And yet, the very success of the movement toward formal scientific methods that has matured into the modern commitment to evidence-based medicine now creates a wall that excludes too much of the knowledge and practice that can be harvested from experience, itself, reflected upon. The iconoclasts of the past now have power, and they can define who will be seen as iconoclasts of the present.
There is a way out. It involves curiosity. The methods of observation and reflection on the basis of which most human learning occurs and, frankly, on the basis of which many modern industries and enterprises are building their futures, are systematic, theoretically grounded, often quantitative, and powerful. They do not include RCTs, but they honor RCTs in their proper place. They perhaps deserve some honor in return, or at least the serious open minded scrutiny that marks true scholarship.
My close friend and mentor Tom Nolan PhD uses a felicitous term to denote these methods of learning: “pragmatic science”.17 Here are a few of the elements of the methods of pragmatic science:
- tracking effects over time, especially with graphs (rather than summarizing with statistics that do not retain the information involved in sequences);
- using local knowledge, the knowledge of local workers, in measurement (rather than relegating measurement to people least familiar with the subject matter and work);
- integrating detailed process knowledge into the work of interpretation (inviting observers to comment on what they notice rather than “blinding” them to protect them against what they know);
- using small samples and short experimental cycles to learn quickly (rather than overpowering studies and delaying new theories with samples larger than needed at the time); and
- employing powerful multifactorial designs (rather than univariate ones when the better questions for the time are formative, not summative).
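The first element in this list, keeping the time order of the data rather than collapsing it into a summary statistic, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the weekly counts are invented, and the helper `longest_run_one_side` is a hypothetical name for a simple run-chart style check, not a method described in this editorial. The two series contain exactly the same values, so their means are identical; only the order of the observations shows that one of them changed level partway through.

```python
from statistics import mean, median


def longest_run_one_side(values, center):
    """Return the longest run of consecutive points on one side of center.

    Points exactly on the center line are skipped: they neither extend
    nor break a run.
    """
    longest = 0
    current = 0
    last_side = 0
    for v in values:
        if v == center:
            continue
        side = 1 if v > center else -1
        current = current + 1 if side == last_side else 1
        last_side = side
        longest = max(longest, current)
    return longest


# Hypothetical weekly counts of some adverse event (invented data).
sustained_shift = [8, 9, 7, 8, 9, 8, 4, 5, 3, 4, 5, 4]  # level drops after week 6
same_values_mixed = [8, 4, 9, 5, 7, 3, 8, 4, 9, 5, 8, 4]  # same values, no lasting change

for name, series in [("sustained shift", sustained_shift),
                     ("same values, mixed", same_values_mixed)]:
    center = median(series)
    print(f"{name}: mean={mean(series):.2f}, "
          f"longest run on one side of median={longest_run_one_side(series, center)}")
```

Both series report the same mean, yet the first has a long unbroken run on each side of its median while the second alternates from week to week. Plotting the values in order on a graph would make the same difference visible at a glance.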
Pragmatic science of this type is alive and well. It thrives in the halls of continual improvement of care now engaging the energies of thousands of healthcare leaders worldwide. It thrives in brilliant texts by theoreticians who have been teachers in sectors of the economy other than health care.18,19 But, to our great expense, it remains largely trapped on the far side of a publication wall well guarded by academicians who may, I think, have overlearned the crucial lessons of the courageous clinical methodologists of the past few decades. Today’s evaluation methodologists guard not only the portals of our journals, but also our curricula and the minds of our young professionals. Health care has much to gain if those portals now open again to a new wave of disciplined methods of learning from reflective practice, and disciplined methods of sharing the learning through transparent, accurate, and complete published reports—such as the use of publication guidelines—as explained and defended here by Davidoff and Batalden. Health care is too important and too fragile to deny it the benefits of disseminating the hard won fruits of systemic learning, however this learning takes place.