
Toward stronger evidence on quality improvement. Draft publication guidelines: the beginning of a consensus project
F Davidoff,1 P Batalden2

1 Institute for Healthcare Improvement, 143 Garden Street, Wethersfield, CT 06109, USA
2 Health Care Improvement Leadership Development, Dartmouth Medical School, 7251 Strasenburgh Hall, Hanover, NH 03755, USA

Correspondence to: Dr F Davidoff, 143 Garden Street, Wethersfield, CT 06109, USA


In contrast with the primary goals of science, which are to discover and disseminate new knowledge, the primary goal of improvement is to change performance. Unfortunately, scholarly accounts of the methods, experiences, and results of most medical quality improvement work are not published, either in print or electronic form. In our view this failure to publish is a serious deficiency: it limits the available evidence on efficacy, prevents critical scrutiny, deprives staff of the opportunity and incentive to clarify thinking, slows dissemination of established improvements, inhibits discovery of innovations, and compromises the ethical obligation to return valuable information to the public. The reasons for this failure are many: the competing service responsibilities of, and lack of academic rewards for, improvement staff; editors’ and peer reviewers’ unfamiliarity with improvement goals and methods; and the lack of publication guidelines appropriate for rigorous, scholarly improvement work. We propose here a draft set of guidelines designed to help with writing, reviewing, editing, interpreting, and using such reports. We envisage this draft as the starting point for collaborative development of more definitive guidelines. We suggest that medical quality improvement will not reach its full potential unless accurate and transparent reports of improvement work are published frequently and widely.

  • quality improvement
  • evidence
  • publication
  • guidelines


A great deal of creative effort currently goes into making medical care safer and more reliable. The results of that improvement work are sometimes shared informally among those in the field and are occasionally published, primarily in the administrative and management literature and in a small number of journals devoted to the subject. Although much of that work is both useful and rigorous, most of it is unfortunately never made publicly available in either print or electronic form, and the reports that do appear vary considerably in accuracy, completeness, and transparency. The lack of such reports is particularly noticeable in the clinical literature, which is regrettable since, despite the crucial role they play, clinicians are notably reluctant to become involved in improvement efforts.

We have suggested elsewhere that both scientific discovery and experiential learning are required for improvement in medical care to flourish (F Davidoff, P Batalden, submitted for publication). The principal goal of science is to discover and disseminate new knowledge, a process that can be briefly summarized as “Plan-Do-Study-Publish” (T Nolan, personal communication). In contrast, the principal goal of experiential learning is to enhance performance. As a consequence, neither the action cycle used to characterize informal experiential learning (“Experience-Question-Conceptualize-Retry”1) nor the formal version of that cycle (“Plan-Do-Study-Act”) that is now a central component of medical quality improvement2 includes a “Publish” step. As discussed below, however, we suggest that, just as the growth of scientific knowledge would be unthinkable without publication, the improvement process will not realize its full potential unless the experiential learning that makes up much of quality improvement is also widely shared through publication.


The canon of science absolutely requires that scientific work must be captured in written or graphic form. Indeed, the biologist E O Wilson has gone so far as to state that “One of the strictures of the scientific ethos is that a discovery does not exist until it is safely reviewed and in print”.3 It is true that publication in science does have an unfortunate tendency to be overvalued (the “publish or perish” phenomenon), since the principal instrument of social control in the scientific community is the exchange of information for professional and social status, funding, and power.4 Nonetheless, publication remains essential in science for many reasons:

  • Most importantly, publication is essential because of the central role it plays in both disproof and corroboration (lack of disproof), which lie at the heart of the logic of science. The philosopher Karl Popper puts it this way: “Those among us who are unwilling to expose their ideas to the hazard of refutation do not take part in the scientific game”.5 Only full and open publication provides the kind of access and transparency needed for the exercise of that logic.

  • Without the cumulative “collective memory” available in the published record, new findings cannot be interpreted in the context of prior work, which inevitably distorts their meaning.6,7

  • The lack of a published record also permits scarce resources to be wasted on work that has already been done.

  • It is arguably unethical to consume time, effort, and money in research (and, in the case of clinical research, expose participants to inconvenience, cost, and risk) and then not return some benefit to the public by sharing the knowledge gained from that research.8 Failure to publish clinically relevant data has recently even become a legal issue, as it is now being prosecuted in court as a form of fraud.9

Failure to publish new knowledge about improvement work also has a number of serious consequences:

  • Perhaps most importantly, the lack of well organized complete reports of quality improvement work makes it difficult to establish repeatability, the sine qua non of evidence regarding the efficacy of experiential learning. This lack is particularly frustrating for those interested in aggregating multiple published reports of similar improvement efforts in various ways, including study data banks10–12 and qualitative systematic reviews,13 which can strengthen causal inferences about efficacy.14,15

  • The dearth of published reports means that much quality improvement work is not open to serious critical public scrutiny and hence accountability, since peer review, editorial input and comment, letters from readers, and general debate about the specifics of improvement projects are prevented from taking place.

  • Without the expectation that they should publish their work, those involved in improvement lack the incentive and the opportunity that writing up their results provides to clarify their thinking, verify their observations, and justify their inferences.

  • Failure to publish improvement experiences, including negative results, slows the dissemination of known effective innovations and wastes the time, effort, and money that others spend independently rediscovering those same innovations—and making the same mistakes.

  • Failure to publish slows the development of improvement science, since dissemination of information about one innovation sparks others.16,17

  • As is true for scientific research, quality improvement uses public resources and exposes participants to inconvenience, cost, and risk. Failure to share publicly the results of improvement efforts, in return for those contributions, can therefore be challenged on ethical grounds.


Unfortunately, the strengthening of quality improvement evidence through publication has recently become entangled in the following bureaucratic paradox, with potentially serious consequences. The Common Rule that governs the conduct of federally supported human subject research in the US defines research as a “systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge”.18 Since the most widely used and respected criterion for the generalizability of knowledge is whether it has been published, quality improvement projects that have been (or even may be) published are now being considered “research” under the Common Rule definition. To complicate matters, since quality improvement in medicine virtually always involves human participants, quality improvement work that is published is now frequently considered to be a form of human subject research.19 Framed in those terms, virtually all quality improvement immediately becomes subject to the regulatory mechanisms that govern clinical research—most importantly, protection of human participants through ethics committee or Institutional Review Board (IRB) review.

Protection of participants in quality improvement is, of course, essential. But a conceptual shift of medical quality improvement from an intrinsic professional responsibility20,21 to a research activity seems both illogical and counterproductive on several grounds. First, the transitivity of the Common Rule’s logic is itself open to question. Although all research strives to be generalizable (hence publishable), it does not automatically follow that all investigations that are published (hence generalizable) are research—for example, case reports and case series, reviews, analytical studies, commentary and opinion pieces, many of which contain important, original, and generalizable knowledge, are not generally considered “research,” or at least not pre-planned “systematic investigations” in the usual sense. In fact, the Belmont Report itself, on which the Common Rule is based, recognizes that valuable generalizable knowledge can and does flow from the experience of health care delivery per se, not just from research; in its words:

Even when a procedure applied in practice may benefit some other person, it remains an intervention designed to enhance the well-being of a particular individual or groups of individuals; thus, it is a practice and need not be reviewed as research.22

Thus, since quality improvement is fundamentally “a procedure applied in practice”, designed to enhance the well-being of particular individuals or groups rather than to produce generalizable knowledge, there is no reason for it to be considered a priori as “research”. Of course, such work can and should be seen as research if the initial improvement plan contains formal elements designed specifically to generate new generalizable knowledge, over and above its intended immediate benefit to its local participants.

Secondly, when protection of participants is at stake, it does not matter whether quality improvement is characterized as “research” or as some other kind of activity (such as experiential learning); protection is necessary in any case. Indeed, it can be argued, ironically, that patients need more protection in medical care systems that are not actively engaged in quality improvement than in systems that are. Finally, most IRBs are overburdened, understaffed, and underfunded;23 formal IRB review is generally slow and cumbersome;23 IRB judgments are often inconsistent;23,24 and most IRBs have little familiarity with the nature and methodologies of quality improvement. Requiring all quality improvement efforts to undergo such review could therefore have the paradoxical and damaging result of actually discouraging improvements in care.

For all of these reasons, we suggest that no one should be deterred from working to improve the care of individual patients and groups simply because of concerns that the results of that work may ultimately turn out to warrant publication. Similarly, no one who has already done improvement work should be reluctant to publish their results if they recognize, after the fact, that what they have learned is generalizable. The critical issue here is not whether they are doing research; it is whether staff engaged in improvement have taken the appropriate steps to protect those people who participate in their efforts to improve care.


If publication of quality improvement evidence is so crucial, what explains the “publication gap”? To some extent it has to do with the nature of the people who do the work. Most are busy “front line” healthcare professionals—managers, administrators, planners, clinicians—with heavy competing service responsibilities. Many are neither oriented to, nor experienced in, academic work, including writing and publication; they also generally do not work in academic environments where they would “perish” if they failed to publish. And writing itself, particularly writing well, is hard.25 (A widely quoted saying among writers is: “Writing is easy. Just sit down at your desk and open a vein.”)

But other powerful intellectual and cultural forces are also at work here. Editors, peer reviewers, and the academic medical community generally control both biomedical and clinical publication, and all of those stakeholders are deeply immersed in the culture of scientific discovery whose primary purpose is generation of new knowledge. As a consequence, they may be unfamiliar with the goals and methodologies of experiential learning, which is the principal approach used in solving the complex non-linear problems of quality improvement. The editors of, and reviewers for, biomedical journals may therefore have difficulty recognizing the nature, importance, or even the existence of many of those problems, which consequently can interfere with publication of reports of quality improvement work, even when that work is clinically important and methodologically sound. Moreover, until recently, little guidance has been available for authors, editors, and reviewers on how best to write, review, and edit complete and precise accounts of quality improvement.


With these considerations in mind, we offer here a draft set of guidelines in the form of a checklist, designed to increase the completeness, accuracy, and transparency of original reports of quality improvement work (table 1). The guidelines proposed here are intended primarily to support publication of the strongest and most definitive evidence on quality improvement in the permanent peer reviewed journal literature. They may also be useful in preparing reports of quality improvement work, much of it still preliminary or in progress, that are presented in the many important but more transient venues used for disseminating that information such as meetings, white papers, and media reports.

Table 1

 Draft proposed guidelines for stronger quality improvement evidence*

These proposed guidelines build on an earlier and more limited set of publication standards.26 In our view, those earlier guidelines are most appropriate for reporting on small, relatively informal improvement projects or “quality improvement reports”—the equivalent, perhaps, of clinical case reports. In such reports the primary focus is on the specific clinical or delivery system problem rather than on quality improvement methods. The guidelines proposed here may be somewhat more appropriate for publications whose primary purpose is to demonstrate the efficacy of quality improvement methods. In our view, however, all original applied quality improvement work involves both real problems to be solved and new and better ways of solving them; we therefore see the distinction between “quality improvement reports” and reports of quality improvement methods as a matter of emphasis rather than mutual exclusivity. We believe the guidelines proposed here should, in fact, be applicable to any well thought out improvement project, large or small, but particularly to complex, formal, planned interventions.

These guidelines have been developed with informal input from people with extensive experience in quality improvement, medical ethics, clinical research, and medical editing, and have been modified in response to feedback from people who have used them in writing and critiquing reports of quality improvement projects. They generally conform to the principles used in creating guidelines for reporting randomized clinical trials,27 studies that use other designs,28–33 and studies in specific content areas.34,35 Importantly, the consistency and completeness of reporting have been found to improve in journals that have endorsed and used such guidelines.36

The guidelines in table 1 differ in a number of significant ways from the earlier set.26 For example, the new draft guidelines are organized according to the IMRaD (Introduction, Methods, Results, and Discussion) format. The earlier guidelines explicitly rejected the IMRaD format on the grounds that, unlike the invariant protocols of clinical research, the initial plans used in improvement projects are frequently (and intentionally) altered during the course of the projects, which was seen as making them intrinsically incompatible with that format.26 In our view, the IMRaD structure is generic, reflecting the flow of thinking that underlies all learning and discovery, which is why it is widely used to report study designs of all types. More specifically, we would argue that, far from violating the logic of discovery, the shifts in improvement plans are, in fact, among the more important outcomes of the experiential learning (improvement) process, and therefore fit comfortably into the Results section of the IMRaD structure.

Secondly, the number of items in these draft guidelines has been expanded from eight in the earlier set26 to 16, to accommodate several important additional topics including: prior information available on the problem area; failures, risks or harms encountered; assessment of the project’s limitations; evaluation of the project’s internal and external validity; and specific plans for assessing maintenance of the improvement. Although 16 items is a substantial number, it should be manageable. Authors, editors, and peer reviewers have not had difficulty using 20–25 items in other publication guidelines.


We view the draft reporting guidelines offered here as a reflection of the rapidly developing science of quality improvement, and hence as a further step in the evolution of publication standards, not as a finished product. It would be helpful if readers would send the editors of this journal their comments and criticisms regarding this version of the guidelines. Feedback on their strengths and limitations from people who use them to write and critique reports of quality improvement would be particularly valuable.

We also propose that, at the earliest opportunity, a group representing the many stakeholders in quality improvement—clinicians, administrators, health services researchers, social scientists, editors, ethicists, statisticians, patients, and others—should be convened to assume stewardship of the guidelines. Diffusion of an innovation such as reporting guidelines is a complex and intensely social process.37,38 Drawing on the experience gained in creating other publication guidelines27,28,30 and similar standards documents,39 this group would undertake a systematic critique of the completeness, clarity, and appropriateness of specific guideline items. It should also consider a number of general questions including:

  • How good is the evidence supporting the inclusion of items in the guidelines?

  • Is more such evidence needed and, if so, what studies would be most likely to produce it?

  • Would it be useful to develop other related guidelines, including possible variants or extensions of this set?

  • How can the guidelines be distributed and endorsed for maximum effectiveness?

We recognize that the use of reporting guidelines is not justified unless they improve the quality of reporting. We therefore urge that, once a more definitive version of the guidelines has been formulated, their impact on quality improvement reports be formally and carefully assessed. For example, editors and peer reviewers could be asked for subjective judgments of their value in making editorial judgments; authors could judge their value in organizing and writing reports; and readers could judge their value in understanding and applying published papers. The impact of the guidelines could be assessed objectively and quantitatively by comparing papers prepared with and without them with regard to completeness, accuracy, and precision of reporting,40 and the suitability of such papers for inclusion in systematic reviews.

Finally, we recognize that, by itself, accurate and transparent reporting is of little value if the work being reported is fundamentally flawed. On the other hand, we would argue that reporting standards can actually play a role in improving improvement work. Thus, although the primary purpose of such guidelines is to improve the reporting rather than the planning and conduct of improvement work, they could very well have a positive secondary (or “backwash”) effect on the quality of the work itself by providing an explicit consistent framework for its design and execution. In that sense, the guidelines could serve an important educational purpose, particularly in conjunction with an “elaboration and explanation” document such as the one created to support the CONSORT guidelines.41 Systematic consideration of specific background elements and criteria related to each guideline item, such as those listed in table 2, is an example of such an educational function.

Table 2

 Examples of elements and criteria to be considered in reporting guideline items


In contrast to the integral role that publication plays in scientific discovery, publication in medical quality improvement has unfortunately had only a limited role to date. This lack of published reports has arguably deprived the healthcare system of rigorous scholarly evidence on improvement work and, hence, has slowed improvement of the improvement process.

The improvement-publication gap in part reflects the reality that most people who do the work of quality improvement are more interested in actually improving care than in writing about what they do. But widespread misunderstandings about the nature of experiential learning and experimental discovery—the perceptions, for example, that experiential learning deals with problems that are intractably “messy;” that the evidence for the efficacy of most experiential learning is intrinsically weak; and that “applied” disciplines are of less intellectual and social value than “pure” ones—also appear to play important roles in discouraging publication, particularly in the clinical literature. Moreover, because the current editors and peer reviewers of biomedical journals are relatively unfamiliar with the elements that are most worthwhile in making improvements, they have not encouraged authors to use an inclusive typology of those elements, thus possibly contributing to the lack of progress toward a “science” of improvement.

We therefore strongly encourage the widest possible reporting of quality improvement work in print and electronic form. In support of that goal, we urge the further development, adoption, and widespread use of publication standards, such as those proposed here, that contain a systematic and comprehensive set of the elements of medical quality improvement.


Donald Berwick, Ann Davidoff, Joanne Lynn, Lloyd Provost, Jane Roessner, Mark Rzeszotarski, Mark Splaine, and David Stevens all contributed importantly to the development of ideas in this paper.



  • Financial sponsors: None

  • Competing interests: None
