Background Medical errors are endemic in healthcare. Patient safety reporting systems (PSRSs) have been developed and implemented to identify and reduce medical errors. Although they have succeeded in identifying errors (over 1 million reports in the NHS), there are limited methods by which to analyse this large number of events.
Methods Adapting the safety theory of risk resiliency, the authors developed the Harm Susceptibility Model (HSM) as a method of quantifying the variation in risk of harm within an organisation and the Harm Susceptibility Ratio (HSR) as a statistic to compare and rank harm across trusts or work areas. The HSM was applied to data from 20 trusts reporting events to the National Reporting and Learning System (NRLS) between 2004 and 2006.
Findings A total of 104 674 incident reports from 12 distinct work areas were analysed. Fifty-five per cent of the variation in harm was attributed to differences among trusts, suggesting that HSR would best be used within trusts. Within a specific trust, the HSR ranged from 0.25 to 4.30, with the pharmacy having the highest HSR (4.30, 1.89 to 9.68). The A&E, therapy department and radiology had the highest probability of a high HSR across the majority of trusts.
Interpretation The HSM can be used to analyse a large number of incident reports from PSRSs. It provides a quantifiable way for organisations to identify areas where defences against errors are weak and prioritise limited resources directed at improving patient safety.
- Patient safety
- incident reporting
- national reporting and learning system
- multilevel analysis
Medical errors are common, costly and often lethal. In the UK, approximately one in 10 hospitalised patients experience a medical error.1 In the USA, medical errors account for up to 98 000 preventable patient deaths per year.2 In response to this alarming problem, the Institute of Medicine recommended using patient safety reporting systems (PSRS) to identify and learn from medical errors.2
PSRS are valuable surveillance tools to identify and help mitigate hazards.3 These systems are typically used to study individual adverse events (generally those that result in significant harm or death) in detail, identify causes and contributing factors, and sometimes design interventions to prevent recurrence. This approach may be beneficial at a local level, within a hospital or healthcare system, where events are relatively infrequent.4
However, applying this approach to an organisation whose PSRS may contain thousands or millions of events is unrealistic.5 Furthermore, such analyses, which typically occur at the local level, do not take advantage of the information contained in a collection of adverse events across several hospitals. Methods are needed to analyse large and complex quantities of information to help health systems and government agencies target their limited resources on the key risks, and design interventions to reduce harm.
Several countries currently have national reporting systems, and others, including the USA, are implementing them. Unlike other industries that defend against infrequent catastrophic failures (eg, airline, nuclear), healthcare needs methods to prioritise frequent though less catastrophic failures. We propose a quantitative model, the Harm Susceptibility Model (HSM), for the analysis of large PSRSs that adapts the conceptual theory of risk resiliency developed by Carl Macrae for aviation.6 The HSM provides (1) a summary of the variation in risk within the organisation in order to prioritise resources at the organisation or local level and (2) a statistic, the Harm Susceptibility Ratio (HSR) that can be used to identify and rank work areas, medical specialities or locations in terms of risk. In this paper we (1) review the Macrae risk resiliency concept for aviation, (2) describe how we translate this concept to the HSM, (3) apply the HSM to a large sample of errors from the UK's National Reporting and Learning System (NRLS) and (4) offer a discussion of the proposed HSM.
Risk resiliency concept
The HSM builds upon the airline industry's concept of safety, in which errors are viewed as a feature of daily operations. Equipment will inevitably fail, and individuals will make mistakes. ‘The only means of guaranteeing absolute safety…is to keep the aircraft locked in a hangar.’7 Since this is impossible in aviation and in medicine, an alternative approach is to focus on the safety of the organisational processes involved in daily operations. Under this concept of safety, Macrae introduced ‘risk resiliency,’ or the ability of an organisation ‘to protect operations from the potential of minor mishaps, fluctuations, or anomalies, developing into major organisational breakdowns.’7 Organisations build defences to capture errors and prevent them from leading to harm. Where defences are weak or non-existent may be the locus for safety improvement efforts. In other words, safety is achieved not by reducing the number of errors but by increasing an organisation's ability to catch, correct and prevent errors from harming patients.
To move in the direction of building defences, Macrae provided a qualitative classification of risk resiliency to rank relative levels of safety into the following:
Acceptable Risk Resiliency: Effective defences and controls are in place to routinely catch and correct errors, ensuring that there is only limited susceptibility to harm from any errors that occur. This is as close an approximation of safety as can be achieved.
Reduced Risk Resiliency: Errors may be contained, but defences and controls for dealing with errors are not as effective or extensive as they could be. Patients on occasion are susceptible to harm resulting from these mishaps.
Degraded Risk Resiliency: There are few or no systematic defences or controls to stop an error from escalating into a harmful event. This is a serious situation where there is clear potential for the occurrence of a major accident or serious harm.
Healthcare needs systems that capture and correct errors, and ensure they do not lead to preventable morbidity, mortality, emotional distress and costs. The concept of risk resiliency could be applied in healthcare to help organisations focus their patient safety improvement efforts on systems that are not effectively trapping errors. The HSM was developed for this reason.
Harm Susceptibility Model
We use the degree of harm reported within PSRS incidents to measure resiliency within the organisation. We developed a statistical model that allows for (1) identifying the level (eg, organisation or hospitals) where it is most appropriate to evaluate the resiliency of the organisation and (2) calculating a statistic, the HSR that can be used to rank and compare resiliency within the appropriate organisational level. See the data analysis section and appendix for full details of the statistical model.
The statistical model estimates and compares the potential sources of variation in resiliency or the degree of harm within the organisation. For instance, suppose it is of interest to assess the degree of harm across work areas; then the approach must account for variation in the degree of harm reported (1) across hospitals (averaged over work areas), (2) across work areas (averaged over hospitals) and (3) across work areas within a hospital. If the variance in the degree of harm is greater across hospitals than across work areas (case 1), then it is most appropriate to rank hospitals in terms of their degree of harm. If variance in the degree of harm is greater across work areas than across hospitals (case 2), then it is appropriate to pool information across hospitals and to rank work areas in terms of their degree of harm. If variance in the degree of harm is greatest across work areas for any given hospital (case 3), then the work areas rank differently in their degree of harm within any given hospital, and it is appropriate to rank work areas within each hospital of interest.
Having identified the appropriate organisational level for analysis, the HSR can be used to measure and rank the degree of harm or resiliency across units (eg, hospitals or work areas). A unit's HSR is defined as the odds of harm reported (ie, the proportion of harmful versus non-harmful events) compared with the average odds of harm for all the units being compared. For example, if the odds of harmful events in a specific work area are the same as the average odds across all work areas, then the HSR will be equal to 1.0. If harm were more common, the HSR would be greater than 1.0 (higher odds of harm for that area).
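The HSR definition can be illustrated with a minimal sketch that works directly from raw counts. This is not the authors' estimation procedure, which is model-based; the function names and the example counts are hypothetical, and taking the reference odds as the geometric mean of the unit odds (that is, averaging on the log-odds scale) is an assumption made here for illustration.

```python
import math

def odds(harmful, total):
    """Odds of harm: harmful reports divided by non-harmful reports."""
    return harmful / (total - harmful)

def harm_susceptibility_ratios(counts):
    """counts maps unit name -> (harmful reports, total reports).

    Returns unit name -> HSR, the unit's odds of harm divided by the
    average odds over all units being compared.
    """
    unit_odds = {unit: odds(h, n) for unit, (h, n) in counts.items()}
    # Assumption: the "average odds" is taken on the log scale
    # (geometric mean of the unit odds).
    mean_log_odds = sum(math.log(o) for o in unit_odds.values()) / len(unit_odds)
    reference_odds = math.exp(mean_log_odds)
    return {unit: o / reference_odds for unit, o in unit_odds.items()}

# Hypothetical counts for three work areas in one trust.
example = {"pharmacy": (60, 100), "ward": (30, 100), "radiology": (30, 100)}
hsr = harm_susceptibility_ratios(example)
```

In this made-up example the pharmacy has an HSR above 1.0 and the other two areas fall below 1.0; units with identical odds of harm would all have an HSR of exactly 1.0.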
National Reporting and Learning System (NRLS)
The NRLS is a web-based PSRS managed by the National Patient Safety Agency (NPSA) and used throughout the UK. Currently, it is the largest repository of patient safety incidents reported by front-line care givers in a single country, and contains important information for the UK and the world. Incidents are reported to the NRLS through local risk-management systems from individual trusts (a set of hospitals). Each report contains information about the incident, including date, time and work area of occurrence, medical specialty attributed and whether or not the incident resulted in harm. Detailed technical reports of the data elements and methods of reporting to the NRLS are available at http://www.npsa.nhs.uk.
HSM applied to the NRLS data
We reviewed the data from 2004 to 2006 and identified 20 trusts that were consistent reporters (more than 1000 incident reports/year). We analysed the most recent year of incident reports (2006) and focused on fitting the HSM to assess harm across trusts and work areas. We excluded reports from work areas with <100 reports in 2006 to minimise the introduction of additional uncertainty due to small sample sizes, and reports from ‘Hospital grounds (outside)’ or ‘Hospital buildings (inside)’ to focus on patient care areas. Work areas included were pharmacy, radiology, anaesthetic room, Accident and Emergency (A&E), intensive care unit (ICU), operating theatre, ward (unit), (physical) therapy department, recovery room, laboratory, unknown and other. Harm was self-reported using the following categories: minimal harm, patient required extra observation or minor treatment; short-term harm, patient required further treatment or procedure; severe harm, permanent or long-term harm or death from the incident.
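The exclusion rules described above could be sketched as follows. The record layout (dictionaries with `trust` and `work_area` keys) is hypothetical, and counting the 100-report threshold per work area within a trust is an assumption; the text does not say whether the threshold was applied per trust or across all trusts.

```python
from collections import Counter

# Locations excluded so that the analysis focuses on patient care areas.
EXCLUDED_AREAS = {"Hospital grounds (outside)", "Hospital buildings (inside)"}
MIN_AREA_REPORTS = 100  # work areas with fewer reports are dropped

def select_reports(reports):
    """Keep reports from patient care areas with enough reports to analyse.

    reports: list of dicts with hypothetical keys 'trust' and 'work_area'.
    """
    kept = [r for r in reports if r["work_area"] not in EXCLUDED_AREAS]
    # Assumption: the threshold applies to each (trust, work area) pair.
    pair_counts = Counter((r["trust"], r["work_area"]) for r in kept)
    return [
        r for r in kept
        if pair_counts[(r["trust"], r["work_area"])] >= MIN_AREA_REPORTS
    ]

# Hypothetical data: one trust with three reported locations.
reports = (
    [{"trust": "T1", "work_area": "ward"}] * 120
    + [{"trust": "T1", "work_area": "pharmacy"}] * 50
    + [{"trust": "T1", "work_area": "Hospital grounds (outside)"}] * 150
)
selected = select_reports(reports)
```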
For the HSM, we used a Bayesian hierarchical model (see the appendix for technical details). We utilised the Gaussian approximation to the log odds of a harmful report and assumed that the three sources of variation were independent and normally distributed random effects. Ninety-five per cent CIs for the HSR were used to assess significance; CIs that excluded 1 and p values <0.05 were deemed significant. The statistical software R (Vienna, Austria) and WinBUGS (Cambridge, UK) were used to complete the analysis.
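As a rough illustration of what a Gaussian approximation to the log odds involves, the sketch below computes the observed log odds for one trust and work area together with the usual large-sample variance, 1/harmful + 1/non-harmful. The function names and counts are hypothetical, and the authors' Bayesian model may treat this step differently. The second helper encodes the stated significance rule that an interval for the HSR must exclude 1.

```python
import math

def empirical_log_odds(harmful, total):
    """Return the observed log odds of harm and its approximate variance."""
    non_harmful = total - harmful
    if harmful <= 0 or non_harmful <= 0:
        raise ValueError("both harmful and non-harmful counts must be positive")
    log_odds = math.log(harmful / non_harmful)
    # Standard large-sample variance of an empirical log odds.
    variance = 1.0 / harmful + 1.0 / non_harmful
    return log_odds, variance

def interval_excludes_one(lower, upper):
    """True when an interval for a ratio lies entirely above or below 1."""
    return lower > 1.0 or upper < 1.0

# Hypothetical cell: 30 harmful reports out of 100.
log_odds, variance = empirical_log_odds(harmful=30, total=100)
```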
A total of 104 674 incident reports from 20 trusts and 12 distinct work areas were analysed.
Table 1 summarises the number of incident reports by trust (ignoring work area). The probability of a harmful incident report within a trust ranged from 16% to 63%. Figure 1 displays the distribution of the odds of harm across work areas within each trust. The median odds of harm across work areas within a trust are represented by the dark vertical bars. Trusts 15, 16 and 17 had the highest median odds of harm relative to the other trusts. However, this ranking does not account for the three sources of variation described above. Fifty-five per cent of the risk of harm was attributed to differences among trusts, 15% was attributed to differences across work areas, and 30% was attributed to differences across work areas within a trust.
Figure 2 displays the estimated HSRs and the 95% CI for each trust. The HSR for each trust is the ratio of the odds of harm for that trust (averaged across work areas) to the overall odds of harm (averaged across trusts and work areas).
The estimated overall odds of harm was 0.43 (95% CI 0.27 to 0.70), which translates to an estimated overall probability of harm of 0.30 (0.21 to 0.41). Trusts 1, 7, 16 and 17 each had statistically higher odds of harm than the overall odds of harm. For trust 1, the estimated HSR was 2.40 (95% CI 1.30 to 4.30), so that the odds of harm for trust 1 (averaged over work areas) were 2.40 times greater than the overall average odds of harm. For trusts 16 and 17, the odds of harm were 2.87 (1.56 to 5.55) and 6.29 (3.28 to 12.54) times the overall odds of harm, respectively. Trusts 3, 4, 8, 10, 15 and 19 each had statistically lower odds of harm than the overall average.
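The conversion from odds to probability used above follows from the definition of odds, and the reported figures can be checked directly:

$$
p = \frac{\text{odds}}{1 + \text{odds}}, \qquad
\frac{0.43}{1.43} \approx 0.30, \qquad
\frac{0.27}{1.27} \approx 0.21, \qquad
\frac{0.70}{1.70} \approx 0.41 .
$$

As a rough point-estimate calculation only (the model's own summaries may differ), an HSR of 2.40 for trust 1 would correspond to odds of about $2.40 \times 0.43 \approx 1.03$, that is, a probability of harm of about $1.03 / 2.03 \approx 0.51$.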
There was variation in the HSR rankings of work areas within a given trust (30% of the total variance). To demonstrate the use of the HSR for ranking work areas within a specific trust, we selected trust 14 that had a trust level HSR approximately equal to 1, and trust 17 that had the largest trust level HSR. Figure 3 displays the estimated HSR for each work area in these trusts.
In trust 14, all work areas had similar odds of harm relative to the average odds of harm within this trust. In trust 17, the incidents attributable to the operating theatre (HSR 2.30, 1.04 to 5.08), A&E department (4.01, 1.58 to 11.63) and pharmacy (4.30, 1.89 to 9.68) had a higher odds of harm relative to the average odds of harm for this trust.
The ICU, A&E, therapy department, and radiology were assessed to determine the probability of a high HSR across trusts. The therapy department and radiology were more likely to have a higher-than-average HSR (top five worst work areas) for roughly 60% of the trusts (11 of 18 therapy departments, and 11 of 19 radiology areas reported patient safety incidents). The A&E department and the ICU were more likely to have a higher-than-average HSR in 55% (11 of 20) and 42% (eight of 19) trusts, respectively. For trusts 1, 5 and 6, A&E, the therapy department and radiology were all likely to have a high HSR.
We introduced a statistical model called the HSM to help analyse large numbers of medical errors from a national PSRS. We found in the NRLS that the majority of variability in harm occurred among trusts (55%), with differences in odds of harm across work areas within a trust accounting for 30%. For this analysis, the HSR was extremely useful to identify trusts with unusually high or low odds of harm. In addition, we demonstrated that the HSR can identify work areas within a trust that may have limited risk resiliency. For example, in trust 17, the pharmacy, accident and emergency department, and the operating theatre exhibited high HSRs. Upon completion of this type of analysis, subsequent review and linkage of trust level culture and quality data should be done to validate the results and to help the organisation and trusts prioritise the patient safety improvement efforts.
Our approach builds on the conceptual framework of risk resiliency used in aviation8 by converting it into a statistical model of susceptibility to harm. This approach contrasts sharply with current views of healthcare in which error-free performance is expected and sometimes demanded. Because healthcare still fails to acknowledge that ‘to err is human,’ it has, to a large extent, failed to redesign systems to prevent the errors that will inevitably occur and harm patients.
The HSM overcomes some of the challenges associated with assessing risk levels and prioritising resources using PSRS. Typically, quantitative methods for assessing the risk of safety incidents have focused on outcome measures, such as event severity and frequency, because they are easily calculated.9 10 Outcome-based measures, however, do not provide a particularly sensitive indication of the underlying resilience in a specific work area. To measure resilience, the ability to catch and contain errors, we must assess the underlying organisational processes. To date, this has been done with safety experts through a detailed and time-consuming qualitative analysis of individual incidents.11 Such analyses are impractical for organisations with large repositories of reported events, such as the NPSA.
The HSM surmounts these problems by relating the probability of near-miss events to the probability of harmful events. It does not depend on detailed causal analysis or extensive incident-specific data. The HSR offers a simple quantifiable measure of a hospital or work area's ability to catch and defend against errors, provides a relative ranking of risk resilience and is broadly applicable across a range of reporting systems. Researchers can apply this concept to any variable in a PSRS, such as event type or contributing factor. They can discover where resilience in a work process is low, or unravel why a work area has a high HSR. Reliable comparison of the relative safety of work areas is essential for focussing attention and resources on the most pressing patient safety issues.
On a local level, individual trusts could use the HSR to identify work areas with limited ability to prevent errors from leading to patient harm. In trust 17, for example, the pharmacy had a high HSR, which suggests low risk resiliency in this area. On a national level, government agencies could evaluate the HSR distribution among trusts, and focus improvement efforts on trusts with high HSR. In our analysis, trust 17 had the highest HSR among all trusts compared with the overall odds of harm. Resources may be directed at an in-depth analysis to study the risk resiliency of this trust.
Particular work areas or adverse event types with uniformly high HSRs across trusts could be the focus of national improvement efforts or policy. The agency could create a panel of clinician and patient safety experts, and potentially partner with professional societies to make national recommendations to improve safety. This panel could review events in detail, design interventions to defend against these errors, pilot test and evaluate the effectiveness of the interventions in a handful of trusts, and then implement the interventions and evaluate the effects nationally. To date, no country has implemented such a robust system of learning from mistakes. If such a model were successfully applied, the NPSA could be the world's leader in using PSRS to mitigate hazards to patients.
There are several potential limitations to the HSM. Foremost is that the variation in reported harm may be affected by reporting biases (eg, differences among event types, work areas, or trusts, in reporting harmful versus non-harmful errors), bias as to which incidents get labelled as harmful (ie, true versus reported harm), culture of error reporting and trust characteristics (academic versus non-academic, urban versus rural, high-volume versus low-volume, general versus specialised). Unfortunately, we do not know the magnitude or direction of these biases, which likely vary across event types, work areas and trusts. These biases are likely significant in current PSRSs.
Second, we chose an arbitrary cut-off of harm versus no harm in determining the HSR. Harm occurs across a continuum, and its categorisation is generally a subjective practice. Therefore, we chose no harm as the closest objective approximation to risk resilience. This choice treats all harmful events with equal weight and limits our ability to detect events where a risk-resilient system mitigated but did not prevent harm. In alternative applications of the HSM, the HSR could be calculated using a variety of harm level cut-offs.
Third, the HSM makes two important assumptions about harm: (1) all incidents contain the potential for harm, and (2) all no-harm incidents were captured through an active safety intervention. These assumptions may not be true in some instances. For example, some events are inherently more likely to be harmful (eg, patient falls), whereas other events may be harm-free for fortuitous reasons (eg, type O blood inadvertently transfused to a blood type A patient) as opposed to a risk resilient infrastructure. There is a large difference between no-harm due to some form of systematic safety intervention (implying acceptable risk resiliency) and no-harm due to mere chance or luck on the day (implying completely degraded risk resiliency: the ‘as bad as it gets but somehow we didn't hurt anyone’ situation). The current HSM cannot distinguish between these two scenarios, but it represents the best approximation that can currently be achieved given the data available. Until we have more sensitive assessment of events at the point of reporting (for instance, assessing the potential as well as actual consequences of an incident), we will be limited in our ability to perform more sensitive risk-analysis methods.
We do not propose that the HSRs be used as absolute truths but rather that they be used in conjunction with other sources of data including local level quality and claims data. However, as PSRSs mature, and definitions of harm and reporting become standardised, the HSM will become an invaluable tool in identifying hazards to patients.
Another limitation is our underdeveloped taxonomy for classification of medical errors. This limited taxonomy is likely biased towards a higher HSR for event types that are inherently harmful by classification. For example, patient falls are an error event type. However, the fall may not be the actual error. The error likely occurred before the fall (eg, bed guard rails were not raised, bed pan was out of patient's reach, medication-induced patient imbalance). The fall was the outcome of an error. This limitation applies to any PSRS analysis.
Finally, we only developed and pilot-tested the HSM. Further efforts are needed to determine whether organisations, such as the NPSA, and individual trusts can use the HSM to prioritise patient safety efforts, and whether use of this model is associated with improvements in patient safety. This could be a research focus for the NHS.
In summary, current methods of analysing individual events from PSRS are resource-intensive, tend to focus on individual events rather than utilising the totality of available data and fail to identify areas with poor resiliency against risk. The HSM is a novel model that can help organisations, trusts and countries identify where defences against errors are weakest. Once identified, a more detailed investigation could be triggered and safety improvement efforts implemented on a local or national scale. If coupled with efforts to mitigate risks, this approach can help organisations such as the NPSA focus scarce resources to improve patient safety effectively and efficiently.
We thank CG Holzmueller for her assistance in editing this paper; she was not compensated for this work beyond her normal salary as the medical writer/editor in the Department of Anesthesiology and Critical Care Medicine at the Johns Hopkins University School of Medicine.
Building upon the conceptual model of Risk Resiliency,6 we develop a statistical approach in which errors that result in patient harm are viewed as an indicator of a poorly developed organisational structure or processes for trapping minor and unavoidable errors, which therefore allow these errors to reach the patient. Errors that do not result in harm are viewed as part of the intrinsic fluctuation of complex systems and are an indicator of a developed organisational structure for trapping errors and preventing patient harm. The goal of the statistical model is to estimate the risk of harm associated with the reported medical errors and to assess their variability across trusts and across work areas.
Let Yij be the number of harmful incident reports from trust i and work area j, and nij be the total number of incident reports from trust i and work area j. Let pij be the probability that a harmful incident has occurred within trust i and work area j. We constructed the following model for the log odds of a harmful incident:

$$
\log\!\left(\frac{p_{ij}}{1-p_{ij}}\right) = \theta + \alpha_i + \beta_j + \varepsilon_{ij} \qquad (1)
$$

where θ is the average log odds of harm across all trusts and work areas (overall average), θ+αi denotes the log odds of harm for trust i on average across work areas, and θ+βj denotes the log odds of harm for work area j on average across trusts. For any given trust (i), we also allow the log-odds of harm to vary across work areas (j): the parameter εij represents the deviation of the log odds of harm for work area j within trust i with respect to the trust average. Finally, we assume that trust effects (αi), work area effects (βj) and the work area effect within each trust (εij) are independent random effects, that is

$$
\alpha_i \sim N(0, \gamma^2), \qquad \beta_j \sim N(0, \sigma^2), \qquad \varepsilon_{ij} \sim N(0, \eta^2).
$$
In summary, we partition the total variation in the log odds of harm into three parts: (1) differences across trusts (γ²), (2) differences across work areas (σ²) and (3) differences across work areas within a given trust (η²). The estimation of these three variance components will identify the organisational level (eg, trust or work area) where the information can be reasonably pooled and the work areas where most harm occurs can be identified. For instance, if γ² is large relative to σ² and η² (case 1), then the heterogeneity in the log odds of harm across trusts is much greater than what we observe among work areas on average across trusts and among work areas within a given trust. This indicates that we should focus on the national level and rank trusts with respect to their risk of harm. Alternatively, if σ² is large relative to γ² and η² (case 2), then the heterogeneity in the log odds of harm across work areas is much greater than what we observe among trusts on average across work areas and among work areas within a given trust. In this case, we may pool information across trusts and rank work areas with respect to their risk of harm on average across all trusts. The utility of the structure of this model is the ability to borrow information from all trusts and work areas to improve the precision of the estimates of the odds of harm, especially for trusts or work areas where there may be only a few incident reports.
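The comparison of the three variance components can be sketched as a simple share calculation. The function names are hypothetical, the input values are illustrative numbers chosen to reproduce the 55%, 15% and 30% split reported in the findings, and picking the single largest component is a simplification of the judgement described above.

```python
def variance_shares(gamma2, sigma2, eta2):
    """Share of total log-odds variance attributed to each source."""
    total = gamma2 + sigma2 + eta2
    return {
        "across trusts": gamma2 / total,
        "across work areas": sigma2 / total,
        "across work areas within a trust": eta2 / total,
    }

def dominant_level(shares):
    """Name of the source with the largest share (a simplified rule)."""
    return max(shares, key=shares.get)

# Illustrative variance components, not the fitted estimates.
shares = variance_shares(gamma2=0.55, sigma2=0.15, eta2=0.30)
level = dominant_level(shares)
```

With these inputs the dominant source is the variation across trusts, which corresponds to case 1.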
Assume the appropriate analysis should be conducted at the trust level (case 1). For trust i, we define the Harm Susceptibility Ratio (HSR) for trust i as HSR(i)=Ai/B, where Ai is defined as the odds of harm for trust i averaged over work areas, and B denotes the odds of harm on average across trusts and work areas.
Assuming that the majority of the heterogeneity in the odds of harm is attributable to differences across work areas (case 2), we define the HSR for work area j as HSR(j)=Aj/B, where Aj is defined as the odds of harm for work area j averaged over all trusts, and B denotes the odds of harm on average across trusts and work areas.
In either case, at the local level a trust may evaluate the degree of harm across work areas by calculating the HSR for work area j within trust i as HSR(i,j)=Aij/Bi, where Aij is defined as the odds of harm for work area j within trust i, and Bi denotes the odds of harm for trust i averaged across work areas within trust i.
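Under one reading of these definitions, the three ratios reduce to simple functions of the random effects. Assume the log odds decompose additively as θ+αi+βj+εij, as the parameter definitions in this appendix suggest, and assume each averaged odds is obtained by exponentiating the corresponding average log odds. Neither assumption is confirmed by the text, so this is a sketch only:

$$
\mathrm{HSR}(i) = \frac{e^{\theta+\alpha_i}}{e^{\theta}} = e^{\alpha_i}, \qquad
\mathrm{HSR}(j) = \frac{e^{\theta+\beta_j}}{e^{\theta}} = e^{\beta_j}, \qquad
\mathrm{HSR}(i,j) = \frac{e^{\theta+\alpha_i+\beta_j+\varepsilon_{ij}}}{e^{\theta+\alpha_i}} = e^{\beta_j+\varepsilon_{ij}} .
$$

On this reading, a unit whose effect is zero has an HSR of exactly 1.0, and positive effects give ratios greater than 1.0.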
Funding This research was fully funded by the National Health Service, National Patient Safety Agency in the UK.
Competing interests CAG has received honoraria from hospitals, healthcare affiliates and government agencies to speak on topics related to quality and patient safety, and has received support from a grant from the National Health Service National Patient Safety Agency. PJP has received honoraria from hospitals and hospital associations to speak on quality and patient safety, and received a grant from the National Health Service National Patient Safety Agency, and the WHO to study and improve quality of care. The following authors report no conflicts of interest: JCP, EC, FD, AS, CM, MF and KC.
Ethics approval Ethics approval was provided by the Johns Hopkins School of Medicine.
Provenance and peer review Commissioned; externally peer reviewed.