Abstract
Background Diagnostic errors (DxEs) are an understudied source of patient harm in children rarely captured in current adverse event reporting systems. Applying electronic triggers (e-triggers) to electronic health records shows promise in identifying DxEs but has not been used in the emergency department (ED) setting.
Objectives To assess the performance of an e-trigger and subsequent manual screening for identifying probable DxEs among children with unplanned admission following a prior ED visit and to compare performance to existing incident reporting systems.
Design/methods Retrospective single-centre cohort study of children ages 0–22 admitted within 14 days of a previous ED visit between 1 January 2018 and 31 December 2019. Subjects were identified by e-trigger, screened to identify cases where index visit and hospital discharge diagnoses were potentially related but pathophysiologically distinct, and then these screened-in cases were reviewed for DxE using the SaferDx Instrument. Cases of DxE identified by e-trigger were cross-referenced against existing institutional incident reporting systems.
Results An e-trigger identified 1915 unplanned admissions (7.7% of 24 849 total admissions) with a preceding index visit. 453 (23.7%) were screened in and underwent review using SaferDx. 92 cases were classified as likely DxEs, representing 0.4% of all hospital admissions, 4.8% among those selected by e-trigger and 20.3% among those screened in for review. Half of cases were reviewed by two reviewers using SaferDx with substantial inter-rater reliability (Cohen’s κ=0.65 (95% CI 0.54 to 0.75)). Six (6.5%) cases had been reported elsewhere: two to the hospital’s incident reporting system and five to the ED case review team (one reported to both).
Conclusion An e-trigger coupled with manual screening enriched a cohort of patients at risk for DxEs. Fewer than 10% of DxEs were identified through existing surveillance systems, suggesting that they miss a large proportion of DxEs. Further study is required to identify specific clinical presentations at risk of DxEs.
- diagnostic errors
- adverse events, epidemiology and detection
- emergency department
- incident reporting
- paediatrics
Data availability statement
All data relevant to the study are included in the article or uploaded as supplemental information. The data for this study are not available publicly as they are owned by Children’s Hospital Colorado, contain sensitive patient information, and could be used for medico-legal litigation.
Introduction
In 2015, Improving Diagnosis in Healthcare called attention to diagnostic errors (DxEs) and urged healthcare institutions to ‘monitor the diagnostic process and identify, learn from and reduce harm from diagnostic errors and near misses’.1 Yet DxEs, especially in paediatrics, remain understudied. Paediatricians report frequently committing DxEs, with half of errors resulting in patient harm.2 Breakdowns in the diagnostic process lead to significantly more harm when compared with other adverse events in ambulatory settings.3 DxEs comprise up to one-third of paediatric malpractice lawsuits.4 While DxEs have significant clinical, financial and legal repercussions, the true incidence of DxEs remains an imperfect estimate.5 Accurate calculations of the incidence of DxEs and their contribution to patient harm are necessary to improve safety.
One challenging aspect of studying DxEs is the resource requirement to accurately identify them.6–8 DxEs are contextual in nature, existing between the lines of a patient’s clinical presentation, diagnostic evaluation and response to treatment. Unlike medical errors related to discrete events or risk factors (eg, central lines, falls, surgery), DxEs can affect any patient in any setting, as the exposure of interest is the diagnostic encounter itself. DxEs are infrequently reported in traditional adverse event surveillance systems,9 especially those that do not result in significant morbidity and mortality.10 Indeed, conclusions on DxE drawn from other sources of data, such as malpractice insurers,11 are inherently limited, as not all errors result in easily measurable or overt consequences. Current frameworks for identifying DxE are inefficient and incomplete.
However, with electronic health records (EHR), computerised tools may provide a remedy. Davalos et al described using automated electronic triggers (e-triggers) to create enriched EHR datasets for which structured chart review is manageable.12 e-Trigger tools may also capture DxEs that do not result in significant harm. Consensus panels concluded that unscheduled return visits resulting in admission represent a high priority for DxE detection via e-triggers.13 The aim of this study was to use an e-trigger in combination with detailed chart review to identify a subset of paediatric patients presenting to the emergency department or urgent care centre (ED/UC) who experienced a likely DxE, defined as missed opportunities to make a correct or timely diagnosis with the information available.14 Additionally, we sought to identify demographic and clinical factors associated with DxEs.
Methods
Patient data from 1 January 2018 through 31 December 2019 were collected from a single university-affiliated paediatric Level 1 trauma and tertiary care centre (TCC) ED and five satellite ED/UC locations. Two satellite locations are free-standing TCC-affiliated UC facilities staffed by general paediatricians, physician assistants and nurse practitioners. Patients presenting to UC requiring emergency care or admission are transferred to the TCC. Two ED sites operate in hospitals owned and staffed by the TCC with provider coverage similar to the TCC. The third ED operates in a separately owned community hospital employing staff from the TCC. All three satellite EDs admit patients locally but admissions are limited to low acuity medical/surgical conditions (eg, bronchiolitis, cellulitis, appendicitis). Patients with high acuity, complex conditions or possible need for intensive care or subspecialty consultation are transferred to the TCC. Across all sites, there are approximately 165 000 patient ED/UC visits per year with a 7%–8% admission rate. Patients aged 0–22 years old admitted to the TCC within 14 days of a previous ED/UC (index) visit during the study period were eligible; this combination of visits defined the episode of care (EOC). Patients admitted to psychiatric or eating disorder units through the ED were excluded, as the diagnostic process for these patients involved providers independent of the ED/UC. Patients admitted to satellite inpatient services were excluded. Preparatory work for this research demonstrated that most of these patients were admitted for progression of diseases correctly diagnosed at the index encounter. Index visits where no diagnosis was made (eg, medical screening examinations, patient left before evaluation) were excluded. Patients admitted for planned procedures treating conditions identified at the index visit were also excluded. The local institutional review board deemed this study exempt from review.
The process described by Murphy et al guided e-trigger development.15 A clinical research informaticist designed the e-trigger algorithm to identify patients with qualifying EOCs using discrete demographic, time stamp and clinical location variables available in the EHR. An extract-transform-load (ETL) procedure then populated a REDCap16 database with those data elements. For EOCs with more than one ED/UC visit within the 14-day window, the algorithm extracted data from the visit most temporally proximate to the admission. Extracted data included patient demographic information, date/time stamps for both encounters, triage acuity and encounter diagnoses from index ED encounter notes, hospital discharge summaries and hospital billing information. The e-trigger and ETL procedure were iteratively refined using data from May 2018 to ensure that all eligible EOCs met inclusion criteria, did not meet exclusion criteria, and that no EOC was missed nor duplicated. As final confirmation that the e-trigger criteria and screening process appropriately identified cases of interest, all intensive care unit (ICU) admissions meeting e-trigger criteria were assessed for accurate performance regarding automated diagnosis capture, relevant EOC dates/times and application of screening definitions.
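The trigger logic described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the study's algorithm: the `Visit` and `Admission` records, their field names and the `find_index_visit` helper are hypothetical, the list of visits is assumed to contain only prior ED/UC encounters that ended in discharge, and the age and unit-based exclusions are omitted for brevity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

WINDOW = timedelta(days=14)  # look-back window stated in the Methods


@dataclass
class Visit:
    patient_id: str
    arrival: datetime   # ED/UC arrival time stamp (hypothetical field)
    diagnosis: str      # empty string if no diagnosis was made


@dataclass
class Admission:
    patient_id: str
    admitted: datetime
    planned: bool = False  # planned procedures are excluded


def find_index_visit(adm: Admission, visits: list[Visit]) -> Optional[Visit]:
    """Return the qualifying prior ED/UC visit nearest in time to the admission."""
    if adm.planned:
        return None
    candidates = [
        v for v in visits
        if v.patient_id == adm.patient_id
        and v.diagnosis  # index visits without a diagnosis are excluded
        and timedelta(0) < adm.admitted - v.arrival <= WINDOW
    ]
    # With several visits in the window, keep the one closest to the admission.
    return max(candidates, key=lambda v: v.arrival, default=None)
```

An admission for which `find_index_visit` returns a visit would form one EOC; an admission with no qualifying visit would not be triggered.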
EOCs identified by the e-trigger (triggered cases) were then screened by a paediatric emergency physician (JAG) or ED nurse (FD) to determine if the EOC represented a possible missed diagnostic opportunity. Patients were screened out if the diagnoses at the index visit and hospital discharge: (1) represented unrelated conditions (eg, ankle fracture followed by asthma exacerbation); (2) represented the same condition (eg, recurrent episodes of diabetic ketoacidosis in a patient with known type 1 diabetes); or (3) indicated progression of a correctly diagnosed condition (eg, development of hypoxaemia in an infant with bronchiolitis). Cases were screened in for detailed review if the diagnoses at the index visit and subsequent admission’s discharge summary differed but were potentially related (eg, shared symptoms, similarly involved organ systems). Prior to data collection, these categories were iteratively defined by the screeners using the first 75 patients meeting e-trigger criteria and experiencing ICU admission until agreement and inter-rater reliability (Cohen’s κ) were consistently >90% and >0.8, respectively. Additionally, patients diagnosed with any of the following at hospital discharge were screened in by default, regardless of index visit diagnosis: acute intracranial pathology (ie, tumour, abscess, haemorrhage, meningitis, encephalitis), appendicitis, physical abuse, sepsis or testicular/ovarian torsion. These diagnoses represent conditions that are at high risk of malpractice litigation,17 that carry risk of significant morbidity and mortality with delayed diagnosis, or for which independent hospital initiatives already exist to improve diagnosis. Both screeners reviewed a random sample of 20% of all triggered cases (n=383) with 93% agreement and substantial inter-rater reliability on the decision to screen in or out (Cohen’s κ=0.82 (95% CI 0.73 to 0.91)).
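For readers less familiar with the agreement statistics reported here, the following minimal sketch shows how per cent agreement and Cohen’s κ can be computed from two screeners’ paired decisions. The `cohens_kappa` function name and the "in"/"out" labels are illustrative assumptions; the study’s analyses were run in SAS, and no study data are reproduced.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two raters' paired labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions, summed over labels.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / n**2
    if expected == 1.0:
        return observed, 1.0  # both raters used one identical label throughout
    return observed, (observed - expected) / (1 - expected)
```

Because κ discounts the agreement expected by chance, it is always lower than raw per cent agreement when both raters mostly choose the same common category, which is why the two figures are reported together.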
For EOCs selected for detailed review, index visit characteristics were manually abstracted. These included change of shift handoffs; use of consultants; categories of diagnostic studies obtained; documentation of a differential diagnosis; and whether the correct diagnosis was considered at the index visit.
The primary outcome of a missed diagnostic opportunity was evaluated using the Revised SaferDx instrument, a 13-item tool that assesses each phase of the diagnostic process and identifies opportunities where an earlier or correct diagnosis could have been made.14 Each SaferDx item receives an ordinal scale value from 1 to 7, with higher numbers indicating that component of the diagnostic process likely contributed to a DxE. The final item summarises the overall likelihood of a DxE for the EOC. Any case with a score of 5–7 on the final summary item indicated a probable DxE.
All chart reviews were performed by a physician (DL, JAG). To ensure consistency in reviews, inter-rater reliability was calculated on a subset of cases randomly selected using a random number generator applied on a monthly basis. The expected agreement between reviewers is likely to be higher than 50% under the null hypothesis (H0) when following clinical standards of care. Assuming a marginal prevalence of DxE of 15% and chance agreement (H0) of 75%, achieving a statistically significant increase in agreement at α=0.05 required a sample size of 225 charts to confirm the alternative hypothesis (H1) for inter-rater reliability.18 EOCs for which the two reviewers disagreed were reviewed by a third paediatric emergency medicine physician (AW). Reviewers recused themselves from reviewing EOCs when they were part of the index encounter treatment team.
Medical record numbers and dates of service for cases of likely diagnostic errors were cross-referenced against existing departmental and hospital incident reporting systems. These included the ED case review committee, the hospital quality and safety reporting system, and grievances submitted to the patient and family liaison office. Only cases reported for concerns of diagnostic accuracy were counted.
The data were summarised using standard descriptive statistics: frequencies and proportions for categorical variables and median and IQR for non-normally distributed continuous variables. Demographics were compared between patients who met e-trigger criteria and those who did not. Demographics, initial care events and outcomes were compared between patients with and without a likely DxE. These bivariate comparisons were made using Pearson’s χ2, Fisher’s exact and Wilcoxon-Mann-Whitney tests. Multivariable logistic regression was performed on the patients selected by the e-trigger to identify covariates that were independently associated with a likely DxE. Covariates from the bivariate analysis with a p value less than 0.2 were entered into the multivariable logistic regression and remained in the model (p<0.1) using backward selection. All analyses were performed using SAS, V.9.4 (SAS Institute).
Results
During the 2-year study period, there were 313 760 ED/UC visits and 24 849 admissions from the ED. Of these admissions, 1915 (7.7%) met e-trigger criteria occurring within 14 days of a prior ED/UC visit. Triggered cases screened as ineligible for detailed review (n=1462) were classified as progression of illness (35.8%), having the same condition at both visits (35.2%), or having separate, unrelated conditions (29.1%). These patients were younger, more often insured by Medicaid, had slightly more ED visits and hospitalisations in the prior 6 months, and exhibited longer times between index and admission encounters. Additionally, they were similar with respect to race, ethnicity, preferred language, index visit arrival time, index visit location (ED vs UC) and ICU admissions (table 1).
Of the 453 cases that underwent detailed chart review using SaferDx, we classified 92 (20.3%) as likely DxEs. The two primary reviewers (DL, JAG) reviewed 229 out of 453 of the same cases (50.6%), with 85.2% agreement regarding the presence of DxE with Cohen’s κ=0.65 (95% CI 0.54 to 0.75). The proportion of EOCs with likely DxEs among all patients experiencing a hospital admission during the 2-year study period was at least 92 out of 24 849 or 0.4% (95% CI 0.29% to 0.45%). In contrast, the proportion of DxEs in the e-trigger enriched sample was 92 out of 1915 or 4.8% (95% CI 3.9% to 5.8%). In three cases (0.7%) where primary reviewers disagreed, the third reviewer was unable to break the tie; these were excluded from further analysis. Among cases where no DxE existed (n=358), reviewers classified 63.2% as progression of illness, 22.6% as unrelated conditions and 11.2% as the same diagnosis.
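The three proportions quoted for the 92 likely DxEs follow directly from the counts given in the text, as the short sketch below shows. The interval method is an assumption: the article does not state how its 95% CIs were derived, so the Wilson score interval used here may differ slightly from the published limits.

```python
from math import sqrt


def wilson_ci(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (approximately 95% by default)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


likely_dxe = 92  # likely DxEs reported in the Results
denominators = {
    "all admissions": 24849,
    "e-trigger positive": 1915,
    "screened in": 453,
}

for label, n in denominators.items():
    low, high = wilson_ci(likely_dxe, n)
    print(f"{label}: {100 * likely_dxe / n:.1f}% "
          f"(Wilson 95% CI {100 * low:.2f}% to {100 * high:.2f}%)")
```

The same numerator over progressively smaller denominators is what produces the enrichment from 0.4% to 4.8% to 20.3%.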
High-risk conditions that were automatically screened in comprised 6.2% (118/1915) of all triggered cases. These 118 cases included sepsis (n=47, 39.8%), acute intracranial pathology (n=32, 27.1%), appendicitis (n=31, 26.3%) and physical abuse (n=8, 6.8%); there were no cases of testicular/ovarian torsion. No difference existed in the proportion of likely DxEs between high-risk conditions and those selected by the screeners (16.1% vs 22.0%; p=0.17). Subjects experiencing a likely DxE were older and received a non-emergent Emergency Severity Index (ESI; level 3–5) triage acuity more frequently (table 2). No significant differences existed with respect to race, ethnicity, preferred language, index visit time of arrival, encounter location (ED vs UC or tertiary care site vs satellite hospital), insurance provider or number of ED visits or hospitalisations in the prior 6 months.
Care events during the index encounter were similar for both groups with respect to provider handoffs, subspecialty consultation (including number and mode of consult), documentation of a differential diagnosis and test acquisition. Documentation included the correct final diagnosis for approximately 55% in both groups (table 3). Median ED/UC length of stay was 220 min (IQR 150–332) for those experiencing a likely missed diagnostic opportunity compared with 188 min (IQR 129–297; p=0.03) for those who did not.
Patients experiencing likely DxEs returned a median of approximately 1 day sooner than those who did not (2.3 days (IQR 1.6–3.7) vs 3.2 days (IQR 1.8–6.5); p=0.002). Both groups exhibited similar proportions of ICU admission, length of stay and death (table 4).
In multiple logistic regression analysis, the odds of experiencing a likely DxE were 2.22 times as high in patients 12 years and older (adjusted OR (aOR) 2.22 (95% CI 1.15 to 4.30)) compared with patients under 2 years old. No difference existed in the odds of experiencing a likely DxE for children 2–5 years old or 6–<12 years old (aOR 1.21 (95% CI 0.60 to 2.45) and 1.40 (95% CI 0.69 to 2.85), respectively) compared with patients under 2 years. More inpatient admissions in the 6 months preceding the index visit were associated with decreased odds of experiencing a likely DxE (aOR 0.67 (95% CI 0.45 to 0.98)). Lower ESI triage acuity levels (level 3–5 vs level 1–2) had a significant association with identifying a likely DxE (aOR 2.15 (95% CI 1.01 to 4.60)) (table 5).
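As a reading aid, an adjusted OR and its CI can be mapped back to the log-odds scale on which logistic regression coefficients are estimated. The sketch below assumes a symmetric Wald-type interval on that scale, which the article does not state explicitly; the `log_odds_from_report` helper is hypothetical.

```python
from math import exp, log


def log_odds_from_report(or_point: float, ci_low: float, ci_high: float,
                         z: float = 1.96) -> tuple[float, float]:
    """Approximate the coefficient and standard error behind a reported OR and 95% CI."""
    beta = log(or_point)                         # coefficient on the log-odds scale
    se = (log(ci_high) - log(ci_low)) / (2 * z)  # CI width on the log scale is 2*z*SE
    return beta, se


# Reported aOR for patients 12 years and older vs under 2 years
beta, se = log_odds_from_report(2.22, 1.15, 4.30)
print(f"beta={beta:.3f}, SE={se:.3f}")
print(f"rebuilt CI: {exp(beta - 1.96 * se):.2f} to {exp(beta + 1.96 * se):.2f}")
```

If the rebuilt limits sit close to the published ones, the Wald assumption is at least consistent with the reported figures; it does not confirm how the interval was actually computed.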
Six (6.5%) cases were reported to existing reporting systems: five to the ED case review committee and two to the quality and safety reporting system. One case was identified in both. No grievances submitted to the patient and family liaison office involved diagnostic concerns.
The most commonly repeated DxEs included three conditions automatically included for SaferDx review. Fourteen cases involved intracranial pathology (15.2%). Eight were intracranial masses or cavernous sinus thrombus; six were episodes of meningitis. During review, distinct patterns of diagnostic miscues emerged for these two subsets of intracranial pathology. Six cases involved sepsis (6.5%). Five cases were appendicitis (5.4%). Musculoskeletal infections represented the most common diagnoses we did not automatically include for review (10.8%; seven osteomyelitis, six septic arthritis, one pyomyositis). No other missed diagnosis occurred more than two times. Table 6 provides narrative summaries of the diagnostic missteps in these conditions.
Discussion
Application of an e-trigger for unplanned admissions within 14 days of an index ED/UC visit followed by clinician screening enriched a sample of EOCs at increased risk of likely DxEs. The proportion of admitted patients with a likely error increased from 0.4% among all admissions to 4.8% among EOCs meeting e-trigger criteria. This increased to 20.3% when the e-trigger was coupled with a highly reliable screening process for discordant diagnoses at the encounters comprising the EOC. This represents a substantial increase in detection from that expected from a random search among all admissions.
Proposed triggers to identify quality of care concerns during ED encounters have included unscheduled returns within specific timeframes, most commonly 72 hours. However, unscheduled 72-hour returns are not associated with a demonstrably increased risk of hospital admission or mortality, suggesting that return within 72 hours alone fails to capture quality of care concerns.19 Refining such triggers by coupling a return visit to an unplanned admission may enhance detection of suboptimal care. Abualenain et al demonstrated that 5.2% of patients admitted within 72 hours of an index ED visit experienced deviations from standard of care at the index encounter.20 More recently, triggers have been adapted to assess how these deviations are related to the diagnostic process. For example, care deviations occurred in 2.9% of patients admitted within 72 hours of an index ED visit, and 96% of these deviations specifically involved a DxE.21 An e-trigger combining an unexpected admission with a clinic visit in the preceding 14 days increased the proportion of cases involving a DxE approximately 10-fold from 2.1% for control cases to 20.9% meeting trigger criteria.8 Among children admitted to an ICU within 14 days of any outpatient/ED encounter, 2.4% experienced a DxE.12 DxEs occurred in 4.8% of children meeting e-trigger criteria in our study, and the majority of them returned within 72 hours. These findings are consistent with past literature and suggest that an e-trigger can filter EOCs to a subset that is more manageable for the detailed review necessary to identify diagnostic concerns that may otherwise go unrecognised in the absence of significant morbidity or mortality.
To further improve the yield of EOCs at increased risk for a DxE, others have examined linking specific final diagnoses to related chief complaints or diagnostic test orders at a prior encounter.22 23 These investigations demonstrated diagnostic delays, one type of DxE, in 23%–73% of cases. Although the present study focused broadly on discordant but potentially related diagnoses at the index encounter and hospital discharge, a brief yet highly reliable screening process further improved the yield of cases at risk for DxE. Rather than reviewing 1915 cases meeting the e-trigger criteria, attention focused on the 453 cases screened in for detailed review using the SaferDx, reducing reviewer workload by over 75%. When comparing cases screened in by clinician review to specific diagnoses screened in by default, no difference existed in DxE frequency, signalling that both processes hold value for improving the yield of EOCs at risk for DxE.
Existing institutional medical error reporting systems identified only six of the likely DxEs found with our method, consistent with prior observations that traditional adverse event surveillance systems underestimate DxE frequency.5 Unit-based review committees and morbidity and mortality (M&M) conferences may more thoroughly investigate DxEs, but are inherently limited. M&M conferences often only include errors that result in significant morbidity or mortality, are pathologically interesting or are referred from another department.24 Passive surveillance typical of such systems further limits their utility. Many providers are reluctant to disclose a DxE because it may bring attention to individual mistakes or deficiencies or have perceived implications for future litigation.25 Additionally, similar to M&M conferences, reporting events requires that DxEs be apparent to providers, something that is challenging in the ED, where patients are rarely treated by the same clinician on a subsequent encounter. The active surveillance for identifying possible DxEs described in this study overcomes passive surveillance limitations including lack of visibility to clinicians, reluctance to report and reporting only those EOCs that involve the most significant patient harm. Systematic surveillance is more likely to reveal recurring factors or patient presentations that place patients at risk for a likely DxE, consistent with creating a learning health system as recommended in Improving Diagnosis in Healthcare. The narrative summaries of the most frequent delayed diagnoses in our cohort suggest that faulty reasoning in the diagnostic process occurs repeatedly. Patient safety officers could conceivably use this e-trigger to provide more immediate and individualised clinician feedback as well as unit-based education to iteratively improve diagnostic accuracy.
Patient factors associated with experiencing a likely DxE in univariate analysis included older age and lower triage acuity at the index encounter. Four additional demographic variables neared significance: sex, preferred language, insurer and number of hospitalisations in the prior 6 months. After multiple logistic regression, increased age and lower triage acuity were independently associated with increased risk of an error, while more previous hospitalisations were associated with a decreased risk. The association with age mirrors the observation that patients who met e-trigger criteria were also significantly younger than those who did not. We suspect this finding results from two situations. First, young infants presenting with infectious respiratory symptoms who returned with escalation of symptoms (eg, hypoxaemia) or a known complication of a viral illness (eg, pneumonia) comprised a significant subset of our cohort. Second, diagnostic accuracy is more difficult in younger, preverbal children who cannot describe their symptoms sufficiently. Over time, the condition becomes more discernible and a specific diagnosis apparent. In contrast, patients with multiple prior hospitalisations likely represent a population with complex medical histories. Our experience suggests that these patients undergo more extensive initial diagnostic evaluations that may be protective against DxEs. Considering the extent of diagnostic evaluation, the association between lower acuity and likely DxE is noteworthy. While ESI acuity level has been shown to correlate well with variables such as ED length of stay and resource utilisation,26 it has not previously been described as contributing to DxEs. Clinically, higher acuity patients frequently appear more ill and are therefore likely to receive more thorough diagnostic evaluations, reducing the likelihood of a DxE at the index encounter.
Care at the index encounter did not differ between groups with respect to ED handoffs, consultations, acquisition of tests and differential diagnosis generation. This is noteworthy as these clinical factors have been previously posited as high-risk opportunities for medical errors. Handoffs are well known across all clinical contexts to be a substantial source of communication errors.27 28 Recent efforts to standardise handoffs have been shown to decrease patient harm.29 Use of consultants and failure of test acquisition and follow-up have been described as risk factors contributing to DxEs.30 Additionally, in a national survey study of self-reported DxEs, hypothesis generation and a delay in considering the correct diagnosis were the largest contributors to errors related to clinical assessment.31 Length of hospitalisation has been previously demonstrated as an indicator of DxE in studies using e-triggers.32 In the present study, no difference existed in outcomes including length of hospitalisation, ICU admission or death. The comparable outcomes between those with and without likely DxE provide indirect evidence that other factors such as failure to elicit key historical information or examination findings, failure to order the necessary tests, and misinterpretation of results may be drivers of DxE suggesting additional avenues of investigation.
We identified the following limitations. We analysed data from a single paediatric hospital system, thus limiting the generalisability of our findings. However, the high patient volume and relatively large catchment area served by our institution provide a sample of patients and diseases unlikely to be unique among paediatric emergency and urgent care visits. As we only studied care delivered by paediatric specialists, it is likely that our findings still underestimate the frequency of DxEs occurring in children receiving care at community EDs and UCs, where providers may have less experience with paediatric medicine. Because patients discharged at the subsequent ED/UC encounter and those not meeting e-trigger criteria did not undergo detailed chart review, we cannot calculate the sensitivity, specificity or negative predictive value of the screening process. Nonetheless, the significant increase in proportion of cases identified through the e-trigger and subsequent screening highlights the utility of such a process for increasing case identification. EOCs demonstrating a likely DxE were not reviewed by a committee of ED/UC providers as recommended by the creators of the SaferDx when a care concern is identified. However, the substantial inter-rater reliability demonstrated between primary reviewers suggests a larger committee may have identified similar concerns.
Conclusion
An e-trigger coupled with screening for discordant diagnoses among children with unanticipated admission within 14 days of an ED/UC visit enriched a sample of patients at risk for a DxE from 0.4% among all admissions to 20.3% of admissions selected by our process. Fewer than 10% of EOCs involving a likely DxE were detected by existing reporting systems at the same institution. These results indicate that paediatric patients experiencing a DxE in the ED with a preceding encounter within 14 days represent a distinct population that is currently overlooked by present patient harm surveillance systems. No specific care events predicted an increased risk of DxE, suggesting further study of these EOCs is necessary to identify phases in the diagnostic process and specific clinical presentations that increase the risk of committing a DxE. Finally, more could be learnt by comparing patients who do not meet trigger criteria (eg, not admitted) or were screened out to identify additional opportunities to reduce harm from DxE.
Ethics statements
Patient consent for publication
Ethics approval
Colorado Multiple Institutional Review Board (20-0798). This project was deemed exempt from review.
Footnotes
Contributors JAG conceived and designed the study, obtained IRB approval, assisted with data acquisition and results interpretation, and critically revised the manuscript. DL assisted with data acquisition and results interpretation, drafted the initial and revised manuscript and is guarantor for this work. JL aided in study design, conducted the statistical analysis, drafted the statistics portion of the Methods and critically revised the manuscript. AW assisted with data acquisition and critically revised the manuscript. FD assisted with study design, provided critical oversight in the conduct of the study, aided with data acquisition and critically revised the manuscript.
Funding This study was funded by Colorado Clinical and Translational Sciences Institute (CTSA Grant UL1 TR002535).
Competing interests None declared.
Provenance and peer review Not commissioned; externally peer reviewed.