The Harvard Medical Practice Study brought the issue of patient safety into the public eye and demonstrated that patients are often harmed by the care they receive.1 It used retrospective chart review to identify adverse events. Since its publication in 1991, considerable effort has gone into improving the methods for understanding the prevalence of harm in hospitals. These efforts have led to a deeper understanding of the relative strengths and weaknesses of the tools we currently have for adverse event identification. Still, most organisations do not have robust approaches for routinely tracking all types of harm. Other efforts have sought to assess safety not just in hospitals but across national health systems, both at a single point in time and as trends over time.
Developing better approaches for measuring safety routinely is critical if we are to understand how many patients are being harmed, what the primary causes are and whether care is getting safer or less safe. However, it is also work that needs to be contextualised and the limitations of our tools must be appreciated.2 3
The Irish National Adverse Event Study 2 (INAES-2) is presented in this issue.4 In this study, Connolly and colleagues used retrospective chart review to find adverse events at eight Irish hospitals in 2015 and compared these with previously reported data from 2009. Retrospective chart review was the first method used in this space5 6 and is still a mainstay for national studies assessing rates of adverse events,7–12 although approaches using claims data are also widely used; these are much less expensive but also much less sensitive.13 The original approach relied exclusively on information gathered from retrospective review of randomly selected medical records, but it has since been bolstered by the creation of standardised triggers,14 and by more rigorous methods for chart review, which make it both more sensitive for finding adverse events and more reliable. Despite this, retrospective chart review has many limitations, most notably variable agreement between abstractors and its reliance on the completeness of documentation in medical charts.15
The issue of reliance on documentation is especially important. Well-conceived critiques have raised concern about underdocumentation of errors that occur in hospitals. Others have raised concern that the findings from longitudinal studies looking at trends may be confounded by improved documentation, resulting in an overestimation of the true (comparative) incidence of events. These are both legitimate concerns. The INAES-2 study, as in prior similar work looking at multi-institution adverse event rates over time,16 17 showed an increase in events over time but no change in preventable harm. We are left not knowing if this represents a change in safety or a change in documentation.
These concerns have led other investigators to develop adverse event identification approaches that enable more real-time identification, leveraging a broader set of data for the interpretation of the preventability and impact of these events.18 19 Prospective event identification, or the near real-time application of triggers, can also incorporate the perspectives of staff in the clinical environment around the time of the event to provide additional insights. Even with this more comprehensive, contemporaneous collection of data, however, agreement between reviewers remains variable.20–22
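Agreement between reviewers is commonly summarised with a chance-corrected statistic such as Cohen's kappa. The following is a minimal sketch, not the method of any study cited here. The function name `cohens_kappa` and the two lists of reviewer judgements are hypothetical and exist only for illustration.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two reviewers rating the same charts.

    Each list holds one categorical judgement per chart
    (here 1 = adverse event present, 0 = absent).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both reviewers must rate the same non-empty set of charts")

    n = len(ratings_a)
    # Proportion of charts on which the two reviewers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each reviewer's marginal rates.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1:
        # Both reviewers used a single identical category for every chart.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical judgements for ten charts (illustrative data only).
reviewer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
reviewer_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the reviewers agree on 8 of 10 charts, but half of that agreement would be expected by chance, so kappa is about 0.6. Raw percentage agreement alone would overstate how reliable the review is.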
Looking to spontaneous reporting from front-line staff, rather than retrospectively or prospectively monitoring for triggers, is another method that has been proposed as a mechanism for identifying the prevalence of adverse events over time. Similar to documentation, however, concerns exist about the under-reporting of events by front-line staff in safety reporting systems.23 24 Moreover, spontaneous reporting routinely underestimates the incidence of adverse events for some types of events by a factor of 20.25
The inverse is also likely true: advances in safety culture may increase reporting without any change in the frequency of actual events. Indeed, in the INAES-2 study, the researchers found that although safety reports increased threefold, adverse event rates did not change. This highlights the challenge of using safety reports alone as a proxy for adverse events. Instead, the insights from safety reporting may hold promise for other uses in the safety space, such as providing a signal for the degree of staff engagement in safety, enabling the identification of near misses and facilitating the identification of significant events that require root cause analysis.
Because of the variability that exists in the methods mentioned, many investigators have attempted to find more reliable ways to identify adverse events. Several studies have employed reimbursement codes (in the USA, International Classification of Diseases Ninth Revision codes) as a mechanism to screen for adverse events.26–28 These systems, which aim to identify complications of medical care by looking for codes that are highly associated with adverse events, have largely been shown to be ineffective.29 30 This is likely to be multifactorial: an inability to identify which conditions predated the current healthcare encounter, a lack of incentives to use coding to identify adverse events and the limited ability of codes to accurately capture the full clinical picture all contribute to their limited efficacy.31
Other approaches have leveraged information systems to screen for adverse events, which is almost certainly how this will be done in the future.32 This works better for some categories of events than for others. Identification is relatively straightforward for some events, for example, acute kidney injury, for which there is a biomarker to track (a rise in creatinine) that routinely appears when the event is present. It also works well for events such as falls, which are almost always documented in electronic health record (EHR) systems. However, the identification of newly altered mental status, for example, is much more challenging. Commercial products that sift through data from the EHR are available to find adverse events for inpatients, while adverse event detection is much less advanced in the ambulatory setting, even though EHR use is widespread in developed countries. Among the main types of inpatient adverse events, hospital-acquired infections, adverse drug events and falls can readily be detected, while the situation is more complex for deep venous thromboses/pulmonary emboli, surgical injuries, specific types of pressure ulcers and missed diagnoses.32 Novel approaches that are highly effective for identifying wrong patient errors have also been developed, such as ‘retract and reorder’ detection.33 This has led to interventions such as showing the photograph of a patient to the ordering clinician, which reduced the likelihood of a wrong patient order by 43% in one study.34 Still, most organisations do not have a robust sense of how often their patients experience adverse events across the spectrum of care.
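To make the idea of ‘retract and reorder’ detection concrete, the sketch below flags an order that is placed for one patient, retracted shortly afterwards by the same clinician, and then placed again by that clinician for a different patient. It is illustrative only, not the published or any vendor's implementation. The `OrderEvent` record, its field names, the function name and the 10-minute windows are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class OrderEvent:
    """Hypothetical order-log record; field names are assumptions."""
    clinician_id: str
    patient_id: str
    order_code: str
    action: str  # "place" or "retract"
    time: datetime


# Assumed time window for both the retraction and the reorder.
WINDOW = timedelta(minutes=10)


def find_retract_and_reorder(events, window=WINDOW):
    """Return (placed, retracted, reordered) triples that fit the pattern."""
    ordered = sorted(events, key=lambda e: e.time)
    placed = [e for e in ordered if e.action == "place"]
    retracted = [e for e in ordered if e.action == "retract"]

    flagged = []
    for p in placed:
        # Step 1: the same clinician retracts the same order on the same
        # patient within the window after placing it.
        r = next((r for r in retracted
                  if r.clinician_id == p.clinician_id
                  and r.patient_id == p.patient_id
                  and r.order_code == p.order_code
                  and p.time <= r.time <= p.time + window), None)
        if r is None:
            continue
        # Step 2: the same clinician places the same order on a DIFFERENT
        # patient within the window after the retraction.
        n = next((n for n in placed
                  if n.clinician_id == p.clinician_id
                  and n.order_code == p.order_code
                  and n.patient_id != p.patient_id
                  and r.time <= n.time <= r.time + window), None)
        if n is not None:
            flagged.append((p, r, n))
    return flagged


# Illustrative, made-up order log.
t0 = datetime(2021, 1, 1, 9, 0)
example_events = [
    OrderEvent("dr1", "patient_A", "heparin", "place", t0),
    OrderEvent("dr1", "patient_A", "heparin", "retract", t0 + timedelta(minutes=3)),
    OrderEvent("dr1", "patient_B", "heparin", "place", t0 + timedelta(minutes=5)),
    OrderEvent("dr2", "patient_C", "insulin", "place", t0 + timedelta(hours=1)),
]
flagged = find_retract_and_reorder(example_events)
for placed_evt, _, reorder_evt in flagged:
    print(placed_evt.patient_id, "->", reorder_evt.patient_id)
```

A screen of this kind only identifies candidate events. Whether a flagged sequence reflects a true wrong patient error would still need confirmation, for example by asking the ordering clinician.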
The challenge of adverse event identification is compounded by the need to understand not only a single moment in time but also, as the authors of the INAES-2 study aim to do, trends over time. This will be essential as we continue to mobilise large efforts to improve safety and as these compete with other priorities. As with all work in quality, having robust metrics is vital. In safety, however, we have in many ways been ‘flying blind’, initiating large-scale efforts to decrease the rate of adverse events without having reliable ways to measure their prevalence over time.
It is important to emphasise that this lack of insight into performance is not equally distributed across all categories of adverse events.3 In fact, as proposed recently by Shojania and Marang-van de Mheen, the incidence of adverse events may be best understood as a composite measure, with all of the limitations that come with looking at a measure made up of many component parts.35 When it is broken apart, what we come to understand is that some of our mechanisms for identifying certain types of events are likely much more reliable than others. In the USA, for example, where the Agency for Healthcare Research and Quality has leveraged standardised methods for collecting and reporting national performance on a set of specific healthcare-associated infections, we have much better insight into performance over time for such infections than we do, for instance, for diagnostic error.
Lastly, the challenge of interpreting national adverse event data over time is complicated by the nuances associated with the interfaces between politics and science. In our personal experience, we have encountered challenges reporting results of safety studies that are tied to ministries of health.36 Related to the INAES-2 study specifically, Ireland has a long history of sensationalised media coverage of data pointing to opportunities for improved care, further complicating researchers’ ability to conduct this work free of influence.37
Ultimately, the work presented by Connolly and colleagues is critically important, and we suggest that all health systems should be monitoring adverse event rates over time. The mechanisms for doing this, though, should rapidly evolve. With hospitals increasingly leveraging EHRs, data being collected in more uniform ways and advances in natural language processing and artificial intelligence, a future in which we have reliable measures of adverse events that are stable over time is likely within our reach. To get from here to there, three things are essential: an ongoing investment in research with evaluation, including work that leverages artificial intelligence and natural language processing; a commitment to transparent data reporting; and collaboration between organisations and governments focused on this work.38 If we can achieve this, we could reasonably expect a future in which we have access to publicly available, meaningful data on how many people are being harmed, and in what context, which could in turn transform safety.
Twitter @AaronsonMD, @dbatessafety
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Commissioned; internally peer reviewed.