Objectives: To estimate the extent, nature and consequences of adverse events in a large National Health Service (NHS) hospital, and to evaluate the reliability of a two-stage casenote review method in identifying adverse events.
Design: A two-stage structured retrospective patient casenote review.
Setting: A large NHS hospital in England.
Population: A random sample of 1006 hospital admissions between January and May 2004: surgery (n = 311), general medicine (n = 251), elderly (n = 184), orthopaedics (n = 131), urology (n = 61) and three other specialties (n = 68).
Main outcome measures: Proportion of admissions with adverse events, the proportion of preventable adverse events, and the types and consequences of adverse events.
Results: 8.7% (n = 87) of the 1006 admissions had at least one adverse event (95% CI 7.0% to 10.4%), of which 31% (n = 27) were preventable. 15% of adverse events led to impairment or disability which lasted more than 6 months and another 10% contributed to patient death. Adverse events led to a mean increased length of stay of 8 days (95% CI 6.5 to 9). The sensitivity of the screening criteria in identifying adverse events was 92% (95% CI 87% to 96%) and the specificity was 62% (95% CI 53% to 71%). Inter-rater reliability for determination of adverse events was good (κ = 0.64), but for the assessment of preventability it was only moderate (κ = 0.44).
Conclusion: This study confirms that adverse events are a common, serious and potentially preventable source of harm to patients in NHS hospitals. The accuracy and reliability of a structured two-stage casenote review in identifying adverse events in the UK was confirmed.
Studies across the world have shown that between 3% and 17% of hospital admissions result in an adverse event (defined as any unintended event caused at least partly by healthcare and which resulted in harm), and that between 28% and 75% of them are preventable.1–8 Only one study, conducted in 1998, has estimated adverse events and their preventability in the UK.1 The research reported here updates the findings of the UK study on the scale, nature and consequences of adverse events and also addresses several other important issues not previously researched in the UK, such as the accuracy of the screening tool and the inter-rater reliability of casenote review instruments.
Setting and sampling
We carried out the present study in a large National Health Service (NHS) hospital trust in England between January and June 2005. A sampling frame of all admissions lasting more than 24 h between January and May 2004 in eight specialties was obtained from the hospital information system. A sample of 1000 admissions was calculated to be sufficient to estimate the prevalence of adverse events with a 95% CI of ±2%.9 We selected a stratified random sample of 1050 admissions from the eight specialties: surgery, urology, orthopaedics, general medicine, medicine for the elderly, oncology, ENT and ophthalmology (table 1).
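The sample size reasoning can be illustrated with the usual normal-approximation formula for a proportion, n = z²p(1−p)/d². This is only a sketch and not necessarily the calculation the authors performed (their method is given in reference 9). It assumes an anticipated prevalence of 10.8%, the rate reported by the previous UK study, and the function name is hypothetical.

```python
import math

def sample_size_for_proportion(expected_p, margin, z=1.96):
    """Smallest n for which the normal-approximation CI half-width is <= margin."""
    return math.ceil(z ** 2 * expected_p * (1 - expected_p) / margin ** 2)

# Assumption: anticipated prevalence of 10.8% (previous UK study), margin of +/-2%.
n_required = sample_size_for_proportion(0.108, 0.02)
print(n_required)
```

Under that assumed prevalence the requirement comes out somewhat below 1000, which is consistent with the stated target of 1000 admissions.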
Process of medical record review
In stage 1 of the casenote review, five trained nurses screened the patient records, using a tool (review form (RF) 1) consisting of a list of 18 explicit criteria (table 1). Identification of one or more positive criteria was used as an indicator of a potential adverse event and these medical records underwent further scrutiny (stage 2). To assess inter-rater reliability, a 10% sample of these admissions was independently reviewed by another reviewer using the same tool (fig 1, b). In addition, 20% of admissions where no positive criteria were identified were fully reviewed by medical staff to detect false negatives and calculate the sensitivity of the screening tool in identifying adverse events (fig 1, e).
In stage 2 of the casenote review, three trained hospital doctors reviewed the records found to have at least one positive criterion in stage 1 (fig 1, c). The doctors used a structured review form (RF2) to judge whether an adverse event had occurred, and to assess the type, preventability and consequences of the adverse event. An adverse event was considered to have occurred if the reviewer was confident that:
there was an unintended event;
the event resulted in patient harm (prolongation of hospital stay, disability at discharge and/or extra cost of treatment);
it was caused at least partly by healthcare rather than by disease process alone.
We used a six-point scale to assess the likelihood of a causation link between the event and injury. A similar six-point scale was used to assess the likelihood of preventability. The preventability of an adverse event was assessed on the basis of the standard of care expected from an average practitioner in that area.1 To check the inter-rater reliability, 90 medical records were independently reviewed by another doctor (fig 1, d).
We obtained ethical and research governance approval. All data extracted were anonymised and kept confidential.
We calculated the proportion of admissions with adverse events and preventable adverse events, and the sensitivity and specificity of the screening criteria in identifying adverse events, along with their 95% confidence intervals. The number and proportion of adverse events which led to an increased length of hospital stay or a subsequent hospital admission and the number of extra bed days resulting from each adverse event were also calculated. The Cohen κ coefficient was used to assess the inter-rater reliability.10 Multivariate logistic regression was used to assess the relationship between the presence of positive screening criteria and occurrence of an adverse event.
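As an illustration of the Cohen κ coefficient for a yes/no judgement made by two reviewers on the same records, a minimal sketch follows. The function name and the counts are hypothetical and are not study data; the study's own 2×2 tables are not reported here.

```python
def cohen_kappa_2x2(both_yes, r1_yes_r2_no, r1_no_r2_yes, both_no):
    """Cohen's kappa for two raters making a yes/no judgement on the same records."""
    n = both_yes + r1_yes_r2_no + r1_no_r2_yes + both_no
    observed = (both_yes + both_no) / n
    r1_yes = both_yes + r1_yes_r2_no
    r2_yes = both_yes + r1_no_r2_yes
    # Agreement expected by chance, from each rater's marginal totals.
    expected = (r1_yes * r2_yes + (n - r1_yes) * (n - r2_yes)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts for 50 double-reviewed records (not study data).
kappa = cohen_kappa_2x2(20, 5, 10, 15)
print(round(kappa, 2))
```

In this made-up table the raters agree on 70% of records, but half of that agreement would be expected by chance, so κ is well below the raw agreement. The same gap explains why an observed agreement above 80% can correspond to only moderate κ.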
The sensitivity of the estimated rate and preventability of adverse events to variation in the raters’ confidence in the likelihood of causality and preventability was explored by comparing the results obtained when using a likelihood score of ⩾2 (any evidence for preventability or management causation) and of ⩾4 (likelihood of ⩾50%).
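The threshold comparison described above amounts to counting the same set of reviewer scores against two cut-offs. The sketch below uses invented scores on the six-point scale purely to show the mechanics; the variable and function names are hypothetical.

```python
def count_at_threshold(scores, threshold):
    """Number of events whose six-point likelihood score is at or above the threshold."""
    return sum(1 for score in scores if score >= threshold)

# Hypothetical causation scores (1-6) for ten candidate events; not study data.
causation_scores = [1, 2, 2, 3, 4, 4, 5, 5, 6, 6]

lenient = count_at_threshold(causation_scores, 2)  # any evidence of causation
strict = count_at_threshold(causation_scores, 4)   # likelihood of 50% or more
print(lenient, strict)
```

Because every event counted at the strict cut-off is also counted at the lenient one, the lenient estimate can never be lower than the strict estimate.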
Prevalence of adverse events
Full data were extracted for 1006 admissions (fig 1, a). Using the lower threshold of confidence in causality (any evidence for management causation),3,4 we found that 110/1006 (10.9%; 95% CI 9% to 12.8%) admissions had at least one adverse event (total = 136 adverse events). Using the higher threshold of confidence in causality, as in the previous UK study (likelihood score of ⩾4),1 87 (8.6%; 95% CI 6.9% to 10.3%) admissions had at least one adverse event (total = 107 adverse events) (table 2). The agreement between doctors on the presence of adverse events was 86% (κ = 0.64). The clinical categories of adverse events are presented in table 3.
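The prevalence estimates and their intervals can be approximated from the reported counts. The method used for the published intervals is not stated, so the normal-approximation (Wald) interval below is an assumption, and the last decimal place may differ slightly from the published figures. The function name is hypothetical.

```python
import math

def wald_ci(events, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = events / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Counts reported in the text: 110 and 87 admissions out of 1006.
for label, events in [("score >= 2", 110), ("score >= 4", 87)]:
    p, lower, upper = wald_ci(events, 1006)
    print(f"{label}: {p:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```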
Box 1: Examples of preventable adverse events
Oesophageal dilation wrongly using an 18 mm balloon (rather than 12 mm), leading to muscle rupture and bleeding
Avoidable delay in diagnosis of malignant condition
High-risk patient with no prophylaxis. Developed deep vein thrombosis and pulmonary embolism.
Diathermy burn to skin during elective cholecystectomy.
Bleeding from penis after urinary catheter removed without first deflating the balloon.
Surgical team took over patient care with avoidable delay, did not arrange review after admission, did not call the consultant, inadequate treatment; patient died.
Common bile duct was perforated during endoscopic retrograde cholangiopancreatography, required urgent open operation with pancreaticoduodenectomy. Patient developed postoperative complications and died.
Injury to the penile urethra during operation requiring repair and leading to readmission and another operation.
Intravenous pyelogram performed on a patient with renal impairment, worsening the condition (technical error).
Inadequate postoperative monitoring, leading to hypovolaemia, collapse and renal failure.
Spleen was torn during nephrectomy resulting in loss of 6 l blood, removal of spleen, blood transfusion, and antibiotic prophylaxis for life.
Patient required high dose of opioid throughout the admission. Still taking high dose of opioid on discharge. Drug continued after discharge with high dose. Repeated by general practitioner. Became addicted.
Consequences of adverse events
Increased length of stay
In 85% of the 136 adverse events (88% of the 107 adverse events when the likelihood score was ⩾4), the event increased the length of hospital stay or led to a subsequent hospital stay. Adverse events were responsible for a total of 896 extra bed days, ranging from 0 to 45 (0–31) days. On average, adverse events prolonged the length of hospital stay by 6.5 (6.0) days per adverse event (SE = 0.6 (0.58)); and by 8.0 (7.4) days per patient who experienced an adverse event (SE = 0.78 (0.72)) (table 3).
Of the 136 adverse events, 56% (57% of the 107 adverse events when the likelihood score was ⩾4) caused impairment or disability which resolved within a month; 17% (18%) led to a more serious impairment or disability which resolved within 6 months; 4% (4%) led to impairment or disability which resolved between 6 and 12 months; 11% (11%) led to permanent disability; and 10% (9%) contributed to patient death. In 2% of adverse events the reviewers could not assess the consequences.
Preventability of adverse events
The estimated proportion of preventable adverse events varied depending on the degree of confidence expressed by the clinical reviewers. When using a likelihood of more than 50% for causation link and preventability (likelihood score ⩾4), 29/107 (27%) adverse events were considered preventable (31% of the 87 admissions). However, using a lower confidence threshold (any evidence for causation link and preventability, likelihood score ⩾2), 69/136 (51%) adverse events were considered preventable (55% of the 110 admissions). Table 2 shows the number and percentage of preventable adverse events when different thresholds were used for identifying adverse events and assessing the preventability. Some examples of preventable adverse events are shown in box 1. The agreement between doctors on presence of a preventable adverse event was 83% (κ = 0.44).
Accuracy and inter-rater reliability of screening criteria (RF1)
In 44.5% (448) of the 1006 admissions at least one of the 18 selection criteria was recorded as positive (fig 1, b) (95% CI 41.4% to 47.6%, table 1). The sensitivity of the screening tool in identifying adverse events (based on the 10% sample of notes with no positive criteria) was 92% (95% CI 87% to 96%) and the specificity (based on all the records with an RF2) was 62% (95% CI 53% to 71%). Criteria 2, 3, 4, 5, 6, 7, 10, 12, 17 and 18 in table 1 were all independently associated with a significantly increased risk of the occurrence of an adverse event (table 1).
The RF1 screening form was independently completed by two nurse reviewers for 107 records. The agreement for presence of positive criteria was 84% (κ = 0.68). The κ for each individual criterion is shown in table 1.
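Sensitivity and specificity of the screening criteria are ratios taken from a 2×2 table of screening result against full review. The cell counts are not reported in the text, so the counts below are hypothetical and were chosen only so that the ratios match the reported 92% and 62%. The function name is also hypothetical.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity and specificity of a screening test against a reference standard."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical counts (not study data): screening result vs. full doctor review.
sens, spec = sensitivity_specificity(true_pos=46, false_neg=4, true_neg=62, false_pos=38)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```

A high sensitivity with a modest specificity is the expected profile for a first-stage screen: few adverse events are missed, at the cost of sending many records without adverse events on to stage 2.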
What is already known on this topic
Adverse events are common in inpatients and a considerable proportion are preventable and lead to serious patient harm.
Little work has been done on the extent and nature of adverse events in the UK.
The accuracy of the screening tools and the reliability of the review forms in identifying adverse events have not been evaluated in the UK.
What this study adds
Our study confirms the findings of previous studies, particularly the previous UK study, on the rate, preventability and consequences of adverse events.
The accuracy of the 18-item screening tool and the inter-rater reliability of the review forms were good in the UK context.
Inter-rater reliability of assessment of preventability of adverse events is moderate and needs more research.
Despite several limitations, casenote review is suitable for identifying the scale and nature of adverse events and monitoring safety improvement programmes.
We found that between 8.6% and 11% of hospital admissions were associated with adverse events depending on the degree of reviewer’s confidence. This is comparable with rates found in studies using similar casenote review methods in the UK (10.8%)1 and internationally (7.5% to 12.5%).4–7 One US study reported a lower estimate (3.7%).2 That study used a causation threshold of ⩾4 (likelihood of adverse event >50%) to identify adverse events. It did not include relevant adverse events which were detected after the index admission.2 With a similar high threshold, and by excluding adverse events detected after discharge, our study showed a rate of 7.2%. An Australian study found a higher rate of 16.6%.3 It used a low threshold for causation and the rate included those adverse events which occurred before the index admission but which manifested or were detected in the index admission.3
The reliability of instruments for identifying adverse events depends on the quality of rater training and ongoing monitoring.11 The reviewers in this study were specifically trained and we found a reasonably good agreement between doctors on presence of adverse events (κ = 0.64). In previous research this has ranged from moderate (0.4–0.6)2–6,12,13 to good (>0.6).14–16
The proportion of adverse events estimated to be preventable depends on the degree of certainty expressed by the clinical reviewers. Our estimates are consistent with several other studies which used similar methods (28% to 75%).1–6 Davis et al,4 Forster et al5 and Baker et al,6 using a likelihood score of ⩾4 found that around 37% of adverse events were preventable compared with 27% in our study. Like other studies we found that operative adverse events were more common but less preventable, and diagnostic adverse events were less common but more preventable.1
McDonald et al suggested that the proportion of preventable adverse events, particularly preventable deaths, has generally been overestimated because of inadequate consideration of other factors such as the severity and complexity of patient disease.17 Brennan has suggested that in a reasonable proportion of deaths associated with medical error, death would have occurred even in the absence of error.18 Hayward and Hofer found that “many preventable deaths occurred at the end of life or in critically ill patients in whom death was the most likely outcome either during the hospitalisation or in the coming months, regardless of the care received”.19
There was less agreement in assessing preventability of adverse events, similar to several previous studies (κ<0.5).2,3,13,19 However, others have reported a higher agreement: Bates et al (κ = 0.7)20 and Thomas et al (κ = 0.8),8 although surprisingly Thomas et al8 reported a low agreement (κ = 0.4) for identifying adverse events. The difference in reported agreement might be because of differences in the methods and criteria used for assessing preventability or the way the criteria were applied. In Bates et al20 and Thomas et al8 the investigators used summaries of adverse events to assess preventability, whereas in our study the record reviewers reviewed the whole medical record to assess preventability. Bates et al20 may have reported a higher inter-rater agreement also because: (1) they focused on one type of adverse event (drug events); (2) they used a combination of three methods (reporting, speaking with nurses and pharmacists, and record review) to collect data; and (3) they used a four-point rather than a six-point scale to assess preventability, so reducing the spread of scores. Further research is needed to assess how the use of summaries or the full medical record, and these other factors, affect the degree of agreement in assessing preventability of adverse events.
Accuracy and reliability of the screening instrument (RF1)
Using the results of the full casenote review as a gold standard,2,3,21 the sensitivity of the 18-item screening criteria was 92%, which is in the range of estimates from similar studies from the USA (84%)12 and Australia (97.6%).3 The agreement between nurses on presence of screening criteria was good (κ = 0.68) and similar to the Australian study (0.67).3
Our study was carried out in one UK hospital, which may limit the generalisability of its findings. Nevertheless the rate and type of adverse events we found were comparable to a similar study in the UK1 and ones elsewhere.4–7
Medical record review seems to be an efficient and reliable method to provide data to inform and monitor safety improvement programmes, and to do so more accurately than current systems for routine incident reporting.22 However, since quality improvement focuses more on those adverse events which are preventable, more research is needed to better conceptualise and reliably assess preventability. Research needs to focus more on how the data collected on adverse events can be fed back effectively and used to inform the design and evaluation of strategies to reduce the scale and consequences of adverse events. Interventions to promote safety, nested alongside longitudinal series of casenote reviews using this same method, may provide useful information for improving strategies.
In the light of the findings from this study and the previous UK study it is now clear that around 8–10% of patients in NHS hospitals may experience some kind of adverse event, of which between 30% and 55% are to some extent preventable. Structured retrospective casenote review was found to be a reliable method to identify adverse events in a large NHS hospital, but the reliability of assessing the preventability of adverse events is typically poor and needs further research. Despite several limitations of casenote review, until a cheaper and more reliable method is designed it should be used for estimating the rate, preventability and consequences of adverse events, and for monitoring safety improvement strategies.
We thank Professor Alan Maynard, Dr Mike White and Dr Michael Porte for their support and advice. We also thank for their advice Professor Richard Lilford, Professor Charles Vincent, Professor Graham Neale, Dr Maria Woloshynowych, Dr Ian Woods, Dr Ann McEvoy, Dr Donald Richardson, Dr Glen Miller, Dr David Worth and Ms Caroline Mosely. We are also grateful to Lorraine Wright for screening medical records in stage 1.
ABS designed and managed the project, wrote the research proposal, collected and analysed data and wrote the final report. TAS supervised the project, commented on the protocol, data collection and analysis, was responsible for the quality control and assisted on the report. AC and AST piloted the instruments and process, assisted on stage 2 casenote review, provided advice and commented on the report. WG, CG, ER and YD screened the medical records and collected data in stage 1 and discussed the findings. ABS is the guarantor.
Ethics approval: reference number: 04/Q1108/7.
Funding: ABS was supported by a scholarship from the Iranian Ministry of Health and carried out this work while studying for a PhD at the University of York.