Measuring safety of healthcare: an exercise in futility?
  1. Khara Sauro1,
  2. William A Ghali2,
  3. Henry Thomas Stelfox3
  1. 1 Departments of Community Health Sciences, Surgery & Oncology, the O'Brien Institute for Public Health & the Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Alberta, Canada
  2. 2 Departments of Community Health Sciences & Medicine, and the O'Brien Institute for Public Health, University of Calgary, Calgary, Alberta, Canada
  3. 3 Departments of Critical Care Medicine & Community Health Sciences, and the O'Brien Institute for Public Health, University of Calgary, Calgary, Alberta, Canada
  1. Correspondence to Dr Khara Sauro, University of Calgary, Calgary, AB T2N 1N4, Canada; kmsauro{at}

Insanity—doing the same thing over and over again, and expecting different results.1

Anyone who has received or delivered care understands that it is vulnerable to failure. After each failure, the narrative is familiar and often recited: continuous quality improvement is invoked, and performance measurement is touted as a core strategy. Yet does any of this make a difference?

Almost 20 years ago the Institute of Medicine published ‘To Err is Human’, a widely cited report that highlighted the all-too-frequent occurrence of adverse events, negative and unintended consequences of healthcare.2 An estimated 1 in 10 hospital admissions results in an adverse event3 and 98 000 deaths occur per year as a consequence of adverse events.2 4 In addition to the human cost, adverse events burden the healthcare system—they increase hospital length of stay by an average of 10 days and cost in excess of $414 million per year.5 It is hard to know just how safe care is. Measuring safety is imperfect and there is little evidence that it makes care safer. But we have an ethical imperative to do no harm, which requires us to understand how safe care actually is. Measurement is therefore needed, because after all, we cannot fix or improve what we do not measure.

Despite several commentaries discussing the advantages of existing methods to measure adverse events, controversy about the best method remains.6–16 Many resources have been devoted to determining the most valid method for detecting adverse events, and even more resources have gone towards implementing these measurement approaches within organisations. Discouragingly, however, these approaches have done little to improve the safety of care.17 18 In contrast to previous discussions, we submit that the volume and complexity of patient–healthcare system interactions necessitate the development of new methods for detecting adverse events that are more efficient yet still accurate, so that healthcare systems are not paralysed by the resources that detection requires. Healthcare systems can then devote more resources towards evidence-informed strategies to improve safety.

Studying adverse event rates is essential for improvement—or is it?

There has been an exponential rise over time in the annual number of publications on adverse events in healthcare.19 One might then logically assume that all of those publications, and the new knowledge embedded therein, have led to safer care. Right? Discouragingly, this does not appear to be the case. The number of peer-reviewed publications exploring adverse events has increased rapidly, yet rates of adverse events have remained stubbornly flat for nearly four decades.11 12 16 18 20

This raises the question: What is the return on investment for studying patient safety? Are the collective efforts of scientists and healthcare leaders in the domain of patient safety amounting to little more than Brownian motion?

Is it simply that we are trying to measure the unmeasurable?

One of the fundamental problems is that adverse event detection and measurement is inherently difficult and there is no fail-safe method to capture adverse events. In fact, different methods used to detect adverse events can paint very different pictures of patient safety.10 21 For example, a study comparing adverse events detected through patient interview and chart review found considerable differences between the two methods; 11% of serious preventable adverse events were detected by chart review and an additional 23% of serious preventable adverse events were detected by interviewing patients after discharge.22 If all methods of detecting adverse events are flawed, which approach do we adopt? Available methods include:

  1. Voluntary reporting is often used and can be done in multiple ways. Incident reporting, an approach adopted from the aviation industry, was one of the first strategies implemented to address the issue of patient safety and has been the most widely integrated method of detecting and measuring adverse events.23 Alternative voluntary reporting approaches include patient complaints, adverse event reporting systems and mortality and morbidity rounds.24–26 Voluntary reporting is appealing because it can occur in real time, reducing recall bias, and the resources for reporting are minimal. However, it consistently underestimates adverse events,7 8 21 27 28 likely because busy front-line clinicians need to find the time and the courage to report.

  2. Manual review of medical records is the most commonly used method. Prospective reviews appear to be more valid than retrospective reviews when classifying preventability of adverse events, but otherwise both approaches are similarly accurate.29 The main challenge is that reviewing medical records is laborious, raising questions about feasibility, sustainability and opportunity costs.

  3. Prospective surveillance using multimodal approaches can include triggers, nurse observers, chart review and incident reporting.30 This method has been found to identify more adverse events. Further, the detection of adverse events is close to real time, providing an opportunity to mitigate adverse events in a timely manner. As with manual review of medical records, however, this method is resource intensive.

  4. Mining large, routinely collected datasets (administrative data) is promoted as an approach to improving the safety of healthcare. Algorithms derived from International Classification of Diseases (ICD) codes can be used to conduct standardised and efficient queries of population-based data to identify rare adverse events and associations with patient, clinician and system-level factors to target quality improvement interventions.31–33 Uncertainty about how well routinely collected big data reflect actual clinical events has raised questions about their validity, perhaps preventing widespread adoption as a viable method for detecting adverse events.34 35 Moreover, limitations in data detail hinder assessments of preventability, severity and temporality. While the 10th revision of ICD has not meaningfully improved accuracy for detecting adverse events relative to ICD-9,36–38 the increased coding specifically related to adverse events in the newly released ICD-11 is promising and may overcome some of these limitations.39 40

  5. Natural language processing of electronic medical records is the newest measurement frontier and leverages the strengths of medical records and big data. Early studies are promising and suggest that these methods perform well compared with manual chart review, but time will tell whether humans are still required to interpret complex documentation of patient care.41–43

This list of methods is not exhaustive; there are also several less frequently employed methods of measuring adverse events, such as: patient-reported methods (interviews, surveys and voice response system),22 26 44 45 ethnographic evaluations of patient care46 47 and malpractice claims21 that may also be useful in the quest to optimise detection of adverse events.
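As a rough illustration of the administrative-data approach in item 4, the sketch below flags admissions whose coded diagnoses fall within a chosen set of ICD code prefixes. It is a minimal example under stated assumptions, not a validated detection algorithm: the ICD-10 block T80 to T88 (complications of surgical and medical care, not elsewhere classified) is used purely as a stand-in code set, and the function names, admission identifiers and records are all hypothetical.

```python
# Illustrative only: the ICD-10 block T80-T88 stands in for a validated
# adverse event code set. A real algorithm would need validation against
# a reference standard such as chart review.
ILLUSTRATIVE_PREFIXES = tuple(f"T{n}" for n in range(80, 89))


def flag_admission(diagnosis_codes):
    """Return the codes on one admission that match the illustrative prefixes."""
    # Normalise formatting so "t88.7" and "T887" are treated alike.
    normalised = [c.strip().upper().replace(".", "") for c in diagnosis_codes]
    return [c for c in normalised if c.startswith(ILLUSTRATIVE_PREFIXES)]


def flag_admissions(admissions):
    """Map admission id -> matching codes, keeping only flagged admissions."""
    flagged = {}
    for admission_id, codes in admissions.items():
        hits = flag_admission(codes)
        if hits:
            flagged[admission_id] = hits
    return flagged


# Hypothetical records, made up for the example (not real patient data).
admissions = {
    "A001": ["I21.0", "T81.4"],
    "A002": ["J18.9"],
    "A003": ["t88.7", "E11.9"],
}
print(flag_admissions(admissions))
```

A query of this kind is cheap to run across a whole population, which is the appeal described above, but it says nothing on its own about preventability, severity or timing.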

There is another important consideration in relation to measuring adverse events using the various methods discussed: Are we measuring to detect adverse events, or are we measuring to improve safety? The goal should be to improve the safety of care.

Detecting and measuring adverse events—necessary but not sufficient

Evidence of the effect of measuring adverse events on the safety of care is scant, but not entirely discouraging. A systematic review suggests measurement can have positive effects, but with a notable caveat.48 Measurement alone is minimally effective,25 but measuring the quality and safety of care and reporting data back to clinicians in microsystems is more impactful.48 Better still are strategies that pair measurement with intervention.49 Measuring and reporting adverse events is necessary, but not sufficient to improve the safety of care, a core finding that may explain the persistently high adverse event rates over time.11 17 18 Ultimately, what likely matters more is what is done alongside measurement to actually intervene on quality of care. There are many examples of strategies that have improved the safety of care. For example, the rate of catheter-associated bloodstream infections was reduced by 66% in the Keystone ICU project50 and implementation of surgical safety checklists has reduced complications and mortality after surgery.51 These examples show the importance of using valid measures to identify adverse events, mapping adverse events to failures in care processes and then testing interventions to improve safety.

The fundamental and existential question

If measuring adverse events is imperfect and we have only modest evidence to suggest it improves the safety of care, is it time to raise the white flag? Is all the time, money and effort spent on measuring safety futile?

In response, we assert that abandoning adverse event measurement is not the answer. Rather, because adverse event measurement to date has done little to make care safer, we need new approaches. The growing volume and complexity of patient interactions within the healthcare system means that these approaches need to be efficient and accurate so that healthcare resources can be allocated to initiatives to improve the safety of care. Combining methodologies can leverage the strengths of different approaches to improve the accuracy of adverse event detection.30 Similarly, efficiency may be improved by pairing sensitive data mining methods with subsequent manual chart review of flagged records. Adverse event measurement needs to be incorporated as a key component of learning healthcare systems rather than a tokenistic performance metric. As adverse events are detected they need to be fed into continuous quality improvement activities.7 9 23 25 52 Morbidity and mortality rounds are one example of how adverse event detection can directly feed into safety improvement, although this approach tends to focus on physician and technical factors with less emphasis on system-level factors.25 53 Finally, there is a need for ongoing research to iteratively refine efficient methodologies for improved adverse event measurement, whether this be through new electronic medical record natural language processing and data mining methods, or enhanced ontologies for disease classification (eg, ICD-11 and the International Classification of Health Interventions) and medical terminology (eg, SNOMED).54 55
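The idea of pairing a sensitive screen with manual review of flagged records can be sketched as a small calculation. The example below is only a sketch under stated assumptions: it supposes a validation sample in which every admission has also been chart reviewed, so the screen can be compared with that reference standard. The function name, identifiers and numbers are hypothetical and do not come from any study cited here.

```python
def screen_performance(flagged_ids, reference_positive_ids, all_ids):
    """Compare a screen's flags with a chart-review reference standard.

    Returns sensitivity, positive predictive value (PPV) and the fraction
    of admissions that would go forward to manual review.
    """
    flagged = set(flagged_ids)
    reference = set(reference_positive_ids)
    total = len(set(all_ids))

    true_pos = len(flagged & reference)   # flagged and confirmed on review
    false_neg = len(reference - flagged)  # adverse events the screen missed
    false_pos = len(flagged - reference)  # flagged but not confirmed

    detected = true_pos + false_neg
    sensitivity = true_pos / detected if detected else None
    ppv = true_pos / (true_pos + false_pos) if flagged else None
    review_fraction = len(flagged) / total if total else None
    return {
        "sensitivity": sensitivity,
        "ppv": ppv,
        "review_fraction": review_fraction,
    }


# Hypothetical validation sample of 10 admissions (made-up identifiers).
all_ids = range(1, 11)
flagged_ids = [1, 2, 3, 4]        # flagged by the automated screen
reference_positive_ids = [2, 3, 5]  # adverse events found by chart review

result = screen_performance(flagged_ids, reference_positive_ids, all_ids)
print(result)
```

The trade-off is visible in the three numbers together: a screen is only worth pairing with chart review if it misses few adverse events (high sensitivity) while sending a small enough fraction of records to reviewers to free up resources for improvement work.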

We assert that these efforts should continue because absence of proof of benefit from adverse event detection and reporting does not equate to proof of absence of benefit. Despite the famous quote on insanity, it is not insane to keep trying. There is merit to soldiering on in our attempts to produce evidence and data to inform our pursuit of safer care for all.



  • Twitter @kharasauro

  • Contributors All authors contributed to the conception, design, drafting and editing of the manuscript and approved the version of the manuscript being submitted.

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Competing interests None declared.

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; internally peer reviewed.

  • Data availability statement There are no data in this work.
