Root-cause analysis: swatting at mosquitoes versus draining the swamp
Patricia Trbovich1,2, Kaveh G Shojania1,3,4

  1. Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada
  2. North York General Hospital, Toronto, Ontario, Canada
  3. Department of Medicine, Sunnybrook Health Sciences Centre and the University of Toronto, Toronto, Ontario, Canada
  4. University of Toronto Centre for Quality Improvement and Patient Safety, Toronto, Ontario, Canada

Correspondence to Dr Patricia Trbovich, 155 College suite 425, Toronto, ON, Canada M5T 3M6; patricia.trbovich{at}

Many healthcare systems recommend root-cause analysis (RCA) as a key method for investigating critical incidents and developing recommendations for preventing future events. In practice, however, RCAs vary widely in terms of their conduct and the utility of the recommendations they produce.1 ,2 RCAs often fail to explore deep system problems that contributed to safety events3 due to the limited methods used, constrained time and meagre financial/human resources to conduct RCAs.4 Furthermore, healthcare organisations often lack the mandate and authority required to develop and implement sophisticated and effective corrective actions.4 Consequently, corrective actions primarily aim at changing human behaviour rather than system-based changes.5 ,6

In this issue of BMJ Quality and Safety, Kellogg et al7 confirm these concerns about RCAs. Reviewing 302 RCAs conducted over an 8-year period at a US academic medical centre, the authors report the most common solution types as training, process change and policy reinforcement. Serious events (eg, retained surgical sponges) recurred repeatedly despite conducting RCAs. These findings highlight the long overdue need to enhance the effectiveness of RCAs.

Swatting mosquitoes versus draining the swamp

James Reason (of the Swiss Cheese Model8) once characterised the goal of error investigations as draining the swamp not swatting mosquitoes.8 Critical incidents arise from the interplay between active failures (eg, not double checking for allergies before administering a medication) and latent conditions9 (eg, workload for the nurse and reliance on human memory for a critical safeguard when electronic systems with built-in reminders exist). Returning to Reason's analogy, we do not want to spend our time and expend our resources swatting at the mosquitoes of ‘not double checking’. Rather, we want to drain the swamp of the many latent conditions that make not double checking more likely to occur. Too often, RCA teams focus on the first causal factor identified (eg, staff violation of the allergy-checking policy) rather than considering such factors holistically as parts of a sociotechnical system (ie, interactions between people and technology embedded in an organisational structure).

Further investigation needed between RCAs and recommended corrective actions

RCA represents a hypothesis-generating approach. The investigative team develops a detailed chronology of events that informs the identification of active errors and latent conditions that likely contributed to the incident. There are two senses in which RCAs are hypothesis generating. First, RCAs often expose system problems for which the existing literature does not provide a clear solution. For example, following the RCA of an incident involving medications electronically ordered for the wrong patient, a hospital might consider a forcing function that mandates re-entry of patient identifiers before allowing order entry.10 It might also consider placing the patient’s photograph on the order entry screen.11 When a problem has two relatively recently identified solutions evaluated only in limited settings, we can hypothesise that implementing one of the interventions locally will reduce further wrong patient orders. But we cannot regard such a corrective action with so much confidence that it requires no evaluation. Moreover, for some RCAs, the existing literature does not furnish any clearly effective interventions to address latent conditions identified (eg, to substantially improve organisational culture, address diverse types of communication failures, effectively reduce fragmentation of care, etc). In such situations, we must regard any proposed corrective action as speculative—a hypothesis that requires testing.

A second sense in which RCAs are hypothesis generating stems from the inherent limitations of any analysis based on a single case. We do not know to what extent the putative contributing factors in fact caused the incident. Hindsight bias can make staff perceptions of the event unreliable.12 We also may not have succeeded in identifying all relevant factors. The meetings and interviews that constitute the core methods of traditional RCA cannot easily identify or adequately characterise certain types of problems.13 For example, meetings and interviews often fail to reveal pre-existing behaviour-shaping factors, such as task complexity, problematic workflow or flawed equipment design, that enabled the error to occur. Missing such information, we will often fail to create a complete enough picture of the deep system-based causes of events to inform effective system redesign.

To foster more complete pictures of contributing factors (and thus enhance the effectiveness of RCAs), we need to harness pragmatic observational14 and simulation15 techniques to help identify system-based causes of events or organisational structures that inhibit desired behaviours by individuals. For example, staff commonly identify communication and workflow problems in interviews. Before suggesting a solution for either category, we can better characterise these problems using direct observation of care. When staff identify possible equipment design issues, simulation can confirm them and provide deeper insights into equipment deficiencies.

Jumping to corrective actions on the basis of a single case analysed using a single method (staff recall of the event during interviews and meetings) can hardly be regarded as a robust strategy for improvement. Participants may overestimate the importance of some factors, and we may miss other important contributing factors altogether, implement changes that do not address the causal factors as intended or even introduce new risks. RCA teams should focus on generating hypotheses and use diverse methods to broaden the scope of investigation. The key to identifying effective corrective actions lies in aligning corrective actions to causal factors.13

Aligning corrective actions to causal factors

Kellogg et al7 report policy reinforcement among the most prevalent corrective actions stemming from RCAs. For example, after investigating a case in which a surgical sponge was left inside the patient, the RCA team concluded that the organisation's policy for counting equipment was effective and that human error was to blame. The ‘corrective action’ thus consisted of simply reiterating the current policy. This represents a common pitfall of RCAs: investigators complete their analysis after identifying a human error (eg, breached policy) rather than digging deeper into system problems.5 ,6 Interventions necessitating substantial financial resources are rarely considered. Rather, investigators resort to corrective actions that involve person-based solutions (ie, imposing actions on the individuals), focused on what is possible as opposed to what is needed.13 ,16 Successful identification of corrective actions often depends on acknowledging that clinicians rarely intentionally violate policies. Lack of compliance usually reflects impractical policies in the context of poorly designed systems.13 ,17 The RCA team must conduct in-depth investigations (again, using broader methods than just staff interviews) to identify underlying cognitive, task, environmental, workflow, organisational or other system factors that contributed to policy noncompliance.

Returning to the example highlighted by Kellogg et al,7 the multiple cases of retained surgical sponges over the 8-year period raise the question: is counting not being performed well or does counting just not work well? Disguised observation can address the first option. If staff clearly do not make much effort to count equipment, then the RCA should determine why staff simply go through the motions of counting and develop interventions that prompt more earnest execution of the current policy.18 But, if disguised observation suggests no dereliction of the counting policy, then the policy probably does not work. The hospital needs to consider investing in a new solution, such as a more intensive, multifaceted monitoring strategy19 or a technology solution such as radiofrequency identification20 or other expensive but potentially worthwhile solutions.18

Policies that seem workable can be misapplied, violated or simply lack triggers to prompt their use.21 Implementing a new policy requires a baseline assessment to identify the gap between recommended and current practice, the barriers to change and the practical actions required to implement the change. Ideally, corrective actions should make the ‘right thing to do the easy thing to do’.

A commonly depicted hierarchy for corrective actions (figure 1) ranks person-based corrective actions (eg, remedial training, policy/procedure reinforcement, use of warnings) as less effective than system-level changes (eg, automating a safety check, forcing functions, changing culture).22 ,23 Greater effectiveness, however, comes at the cost of greater effort. The frequent choice of education and policies as corrective actions following RCAs reflects their greater ease compared with automation and forcing functions. Not only are such system-level solutions costly and labour intensive to pursue, they also require vigilant monitoring for implementation problems and new hazards intrinsic to the technology.24 Yet, they are far more likely to succeed in the end than are education and policies, whether developing new ones or reinforcing existing ones.

Figure 1

The hierarchy of effectiveness depicts the various strategies for modifying behaviour ranked by their effectiveness. This framework for ranking types of corrective actions deems person-based approaches to behaviour change (that rely on individual attention and vigilance) as weaker than ones targeted at the system level. As indicated in the figure, however, stronger corrective actions come at the cost of increased effort. Culture change and forcing functions will have greater and longer-lasting effects than education and new policies, but they also require much greater effort to achieve. Figure modified from graphics produced by the Institute for Safe Medication Practices and the UK National Patient Safety Agency.

Can we expect hospitals to ‘cure’ system problems?

In the rest of biomedical science, we do not expect individual hospitals to discover effective new treatments as part of routine operating activities. Academic medical centres have researchers who may conduct research to develop new therapies, but they do so with dedicated funding from government, industry and/or philanthropy. Some academic centres do conduct research on how to address some of the recurring quality/safety system problems in healthcare. However, this happens with far less funding and to a minuscule degree compared with the traditional biomedical research enterprise. And, most hospitals have no research capabilities (in biomedicine or patient safety). So, the reality consists of hospitals trying to ‘cure’ these systems problems on the basis of a multidisciplinary group of volunteer clinicians meeting a few times after an RCA, often with little relevant expertise. That hospitals mostly choose to resolve RCAs merely by promoting a policy or recommending more education and training7 should thus not come as a surprise.

Many challenges with RCAs in healthcare reflect superficial application of the intended method. But even well-conducted RCAs can produce frustrating outcomes. The whole point of RCAs lies in surfacing deep system problems, yet these are precisely the problems that are most difficult to solve. It sounds fine to say that we should drain the swamp of latent conditions and not swat at the mosquitoes of superficial active errors. Yet, draining swamps is hard and costly work, hence the relatively few examples of actual swamp draining to rid communities of mosquitoes.

Without substantial investments—for instance, to fund teams with expertise in human factors and safety science to assist hospitals with RCAs or possibly an independent investigative body analogous to national transportation agencies25—it seems unrealistic to expect problems with RCA to go away. In the absence of greater investment in and support for RCAs, continued swatting at mosquitoes with education, reminders and new policies may well be all we can expect.



  • Contributors PT and KGS contributed to the conception of the paper; they critically read and modified subsequent drafts and approved the final version. They are both editors at BMJ Quality and Safety.

  • Competing interests None declared.

  • Provenance and peer review Commissioned; internally peer reviewed.