Background A project sponsored by the University HealthSystem Consortium has addressed the inaccuracy and high variability across institutions in the use of the failure to rescue (FTR) quality indicator defined by the Agency for Healthcare Research and Quality (AHRQ). Results indicated that, of the complications identified by the quality indicator, 29.5% were pre-existing upon hospital admission.
Objective The purpose of our study was to investigate the possible bias to FTR measures by including cases of complications that were pre-existing at admission.
Methods Hospital discharges between 1 January 1996 and 30 September 2007 were retrospectively gathered from administrative databases. FTR rates were calculated using the definitions outlined by the AHRQ and the National Quality Forum (NQF). Using present on admission coding, FTR rates were then recalculated separately for pre-existing and acquired cases.
Results Using the AHRQ definition, the overall FTR rate was 11.60%. The FTR rate for patients with pre-existing complications was 8.85%, whereas patients with complications acquired during hospitalisation had an FTR rate of 18.46% (p<0.001). The NQF FTR rate was 9.93%. Pre-existing and acquired FTR rates using the NQF measure were 9.42% and 12.77%, respectively (p<0.001).
Conclusions Current definitions of FTR measures meant to identify inhospital complications appear biased by the inclusion of problems at admission. Furthermore, many patients with these complications are excluded from the algorithms. When taking into account the timing of the “complications”, these measures can be useful for internal quality control. However, it should be stressed that the usefulness of the measures to compare institutions will be dependent on coding practices of institutions. Validation using chart review may be required.
A project sponsored by the University HealthSystem Consortium has addressed the inaccuracy and high variability across institutions concerning the failure to rescue (FTR) quality measure defined by the Agency for Healthcare Research and Quality (AHRQ). The AHRQ-defined measure is intended for hospitals to utilise administrative data for efforts in quality monitoring and improvement. Horwitz et al1 used chart review as their gold standard to compare with administrative records. The results showed that 29.5% of the FTR-qualifying complications were found to be present upon admission. By definition, these cases should not be counted in the AHRQ FTR measure.
The FTR concept was originally developed by Silber et al2 in a study looking at hospital death rates following specified surgical complications. The AHRQ used a definition that focuses on patients having developed at least one of six complications: acute renal failure, deep vein thrombosis and/or pulmonary embolism, pneumonia, sepsis, shock and/or cardiac arrest and gastrointestinal bleeding and/or ulcer. The National Quality Forum (NQF) also uses its own definition of FTR as one of its nursing-sensitive quality measures. Although the AHRQ and NQF measures are both intended to identify death rates of patients having complications during hospitalisation, there are two major differences: (1) the NQF measure does not include acute renal failure as a qualifying complication; (2) the NQF measure is restricted to surgical cases. In March 2008, the AHRQ adopted the NQF measure, changing the denominator to include only surgical patients, dropping renal failure and relabelling the measure as “Death among Surgical Inpatients with Serious Treatable Complications” (http://www.qualityindicators.ahrq.gov/psi_download.htm).
With rising healthcare costs and the constant demand for high quality care, utilising pay-for-performance as a means of improving patient care is drawing increased interest.3–6 Many measures designed for monitoring the quality of care for institutions, such as the AHRQ and NQF FTR measures, use hospital administrative data. These data are a low-cost option for identifying possible areas in need of improvement. However, using pay-for-performance based on administrative data could result in rewarding the wrong medical institutions if the data are inaccurate or if measures are not correctly specified. The literature has noted many limitations in using administrative data for comparisons between institutions, namely the accuracy and consistency of diagnosis codes,7–10 and the mixture of comorbidities and complications as secondary diagnoses.11–13 In the case of FTR, if the measure were ever to be linked to pay-for-performance, it would be critical that the measure identify the proper patients.
Our goals in this study were to determine (1) the accuracy of the FTR algorithms (extent of pre-existing cases being included (false positives) and acquired cases being excluded (false negatives)) and (2) whether these inaccuracies would bias the FTR measures. We hypothesised that any bias would be most problematic among medical patients, only included in the AHRQ measure.1 14 This would further support previous findings of the inaccuracy of the AHRQ measure.1
We considered all hospital discharges occurring at Mayo Clinic, Rochester, between 1 January 1996 and 30 September 2007 in our analyses. Data after this timeframe use the new Medicare Severity Diagnosis-Related Group system of classification and would not be compatible with the FTR definitions used in this study. All medical and surgical inpatients were eligible for inclusion in the AHRQ FTR population, whereas only surgical inpatients having their principal procedure within 2 days of admission were considered in the NQF population. In accordance with Minnesota statutes, all patients who denied access to their records for research were excluded. This study was approved by the Mayo Institutional Review Board.
All data used in this study were retrospectively gathered from administrative databases of patient discharges. Variables for analysis included principal and secondary diagnoses, principal and secondary procedures, diagnostic related group and other descriptors of the hospitalisation. Among these descriptors are indicators pertaining to the timing of each secondary diagnosis. These indicators were collected by data abstractors whose methods have been shown to be reliable in determining the timing of secondary diagnoses.15 Before 2007, our definition for coding present on admission was similar to current national standards except that pre-existing conditions discovered after admission were coded as acquired. Less than 0.1% of relevant complications for both FTR measures were coded with uncertain timing.
The algorithm used for identifying the patients in the AHRQ FTR measure was downloaded from its website. Version 2.1 Revision 1 of its Patient Safety Indicators SAS programmes was used. No downloadable algorithm is supplied by NQF for identifying FTR patients. Their inclusion and exclusion criteria were used to design an algorithm to calculate their measure.16
The unaltered algorithms produced FTR populations for both the AHRQ and NQF measures. A new set of FTR patients for the AHRQ and NQF measures was created by identifying all patients with a qualifying complication coded as a secondary diagnosis that was acquired during hospitalisation while ignoring all exclusions designed to minimise pre-existing problems. These two new groups of FTR patients serve as our gold standards.
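The gold standard selection described above can be illustrated with a minimal sketch. It is not the AHRQ SAS programme or the authors' code: the record layout, the complication labels and the function names are hypothetical, and the real measures rely on ICD-9-CM code lists and further criteria (for example, the surgical restriction of the NQF measure) that are not reproduced here. The sketch assumes only what the text states, namely that a discharge enters the gold standard when it has at least one qualifying complication coded as a secondary diagnosis and flagged as acquired during hospitalisation, and that the FTR rate is the proportion of deaths in that group.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical labels for the six AHRQ complications; the real algorithms
# identify them through ICD-9-CM diagnosis codes.
QUALIFYING = {
    "acute_renal_failure", "dvt_pe", "pneumonia",
    "sepsis", "shock_cardiac_arrest", "gi_bleed_ulcer",
}

@dataclass
class SecondaryDx:
    complication: str  # hypothetical label, see QUALIFYING
    acquired: bool     # True if the timing indicator says "acquired in hospital"

@dataclass
class Discharge:
    died: bool
    secondary: List[SecondaryDx] = field(default_factory=list)

def in_gold_standard(d: Discharge) -> bool:
    """At least one qualifying secondary diagnosis acquired during the stay."""
    return any(dx.complication in QUALIFYING and dx.acquired for dx in d.secondary)

def ftr_rate(discharges: List[Discharge]) -> Optional[float]:
    """Deaths divided by patients in the gold standard group (None if empty)."""
    cohort = [d for d in discharges if in_gold_standard(d)]
    if not cohort:
        return None
    return sum(d.died for d in cohort) / len(cohort)

# Illustrative records only, not study data.
example = [
    Discharge(True, [SecondaryDx("sepsis", True)]),      # acquired, died
    Discharge(False, [SecondaryDx("pneumonia", True)]),  # acquired, survived
    Discharge(True, [SecondaryDx("sepsis", False)]),     # pre-existing, not counted
    Discharge(False, []),                                # no complication
]
```

With these four illustrative records, two discharges qualify and one of them died, so `ftr_rate(example)` is 0.5; the pre-existing sepsis case is left out regardless of outcome.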
The AHRQ FTR populations were stratified by the six specific complications. Using the χ2 test, we compared FTR rates for pre-existing complications identified by the AHRQ algorithm with those for acquired complications identified by our gold standard. The AHRQ FTR rates could not be directly compared statistically with the gold standard FTR rates because the two populations overlap. The AHRQ FTR populations were also divided by surgical/medical status based on the surgery definition of the NQF algorithm. Analyses comparing pre-existing and acquired status by surgical/medical status were performed.
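For readers who want to see the form of this comparison, the following is a minimal sketch of a Pearson χ2 test on a 2×2 table (died or survived, by pre-existing or acquired group). The function name and the counts in the usage line are hypothetical and are not study data. The text does not state whether a continuity correction was applied, so none is used here. For one degree of freedom the p value equals erfc(√(χ2/2)), which keeps the sketch within the standard library.

```python
import math

def chi_square_2x2(deaths_a: int, n_a: int, deaths_b: int, n_b: int):
    """Pearson chi-square (no continuity correction) comparing two death rates.

    Group A has deaths_a deaths among n_a patients, group B deaths_b among n_b.
    Returns (chi2, p). Assumes every expected cell count is greater than zero.
    """
    observed = [
        [deaths_a, n_a - deaths_a],
        [deaths_b, n_b - deaths_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [deaths_a + deaths_b, (n_a - deaths_a) + (n_b - deaths_b)]
    total = n_a + n_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical counts for illustration: 30/100 deaths versus 10/100 deaths.
chi2, p = chi_square_2x2(30, 100, 10, 100)
```

In this hypothetical example the statistic is 12.5, which corresponds to p<0.001.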
The same calculations and statistical tests were performed with the NQF measure population. Furthermore, the overall FTR rates for the unaltered and gold standard FTR populations for both measures were plotted over time.
Patient groups identified by all algorithms are depicted in fig 1. Using the unaltered AHRQ algorithm, a total of 24 633 discharges were identified during the study timeframe as FTR cases. Applying the previously mentioned alterations to the algorithm resulted in a gold standard FTR population of 16 972. A total of 8106 of the FTR patients were identified by both approaches. Figure 1 shows that the AHRQ algorithm identified 16 527 (67%) patients with pre-existing conditions while excluding 8866 patients having acquired conditions. The unaltered NQF algorithm produced an FTR population of 7217. The altered algorithm identified a population of 5707. A total of 4107 FTR patients were identified by both approaches. Figure 1 shows that the NQF algorithm identified 3110 (43%) patients with pre-existing conditions while excluding 1600 patients having acquired conditions.
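The counts in this paragraph follow from simple set arithmetic on the two populations and their overlap, as the sketch below shows. The function and key names are illustrative; the six input numbers are those reported above.

```python
def overlap_summary(n_algorithm: int, n_gold: int, n_both: int) -> dict:
    """Derive misclassification counts from two populations and their overlap.

    n_algorithm: patients flagged by the unaltered algorithm
    n_gold:      patients in the gold standard (acquired complications)
    n_both:      patients identified by both approaches
    """
    pre_existing_included = n_algorithm - n_both  # flagged but not acquired
    acquired_excluded = n_gold - n_both           # acquired but not flagged
    return {
        "pre_existing_included": pre_existing_included,
        "acquired_excluded": acquired_excluded,
        "share_pre_existing": pre_existing_included / n_algorithm,
        "share_acquired_missed": acquired_excluded / n_gold,
    }

ahrq = overlap_summary(24633, 16972, 8106)
nqf = overlap_summary(7217, 5707, 4107)

for name, s in (("AHRQ", ahrq), ("NQF", nqf)):
    print(
        f"{name}: {s['pre_existing_included']} pre-existing included "
        f"({s['share_pre_existing']:.0%}), "
        f"{s['acquired_excluded']} acquired excluded "
        f"({s['share_acquired_missed']:.0%} of gold standard)"
    )
```

The same arithmetic gives the share of gold standard patients missed by each unaltered algorithm: about 52% for the AHRQ measure and about 28% for the NQF measure.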
The total FTR rates for the AHRQ measure and the rates for the specific complications are given in table 1, stratified by acquired and pre-existing status and followed by the FTR rates produced by the unaltered algorithm. The corresponding p values for the comparison of acquired with pre-existing rates are provided in the table. The FTR rates among acquired cases for the overall measure and for five of the complications were significantly higher than the rates for pre-existing cases.
A total of 22 001 medical cases (9316 acquired and 12 685 pre-existing) were identified by the AHRQ algorithms. The significance levels by complication for the medical patients were unchanged from the group as a whole (results not shown). There were, in general, slightly higher rates among acquired cases and slightly lower rates among pre-existing problems compared with the full AHRQ population.
The remaining 11 498 AHRQ cases were classified as surgical patients (7656 were acquired cases and 3842 were pre-existing). The same general trend of significantly higher rates in the acquired patients was found among the surgical patients. However, the p-values for renal failure, pneumonia and GI-ulcer were less significant, between 0.01 and 0.05. In general, among surgical patients, the differences in FTR rates between acquired and pre-existing complications were smaller than among medical patients.
The corresponding FTR rates for the NQF populations are shown in table 2. As with the AHRQ measure, the overall FTR rate was significantly different between pre-existing and acquired complications. Similar to the AHRQ results, the trend was for the acquired cases to have significantly higher rates. However, no significant difference was observed among postoperative pneumonia patients.
Figure 2 presents the gold standard FTR rate and unaltered FTR rate for both the AHRQ and NQF measures by calendar year. There appear to be substantial improvements in both measures over time with greater gains among hospital associated cases in the NQF measure.
Our results showed both substantial misidentification of patients with serious treatable complications and significant differences in FTR rates between pre-existing and acquired cases for both the AHRQ and NQF measures. More than half of AHRQ-identified patients had complications labelled as present on admission, whereas about half of the patients whose complications developed in the hospital were excluded by the algorithm. These findings imply that the unaltered measures underestimate FTR rates. This bias could affect the FTR measures enough to make the rates, and any conclusions based on the unaltered algorithms, inaccurate. When focusing on surgical cases in the AHRQ population, the differences were not as large, mirroring the NQF results.
Our results support previous findings that the AHRQ FTR measure does not identify the proper cases.1 One reassuring finding is that focusing on surgical cases lessens the difference between the acquired and pre-existing rates. We also saw substantial improvements over time in our practice with both the AHRQ and NQF gold standard rates. Although our practice could have improved, these “trends” could also reflect the effects of ICD-9 coding changes across years.
Recently, the validity of both the AHRQ and NQF measures has been questioned. Using Medicare Provider Analysis and Review data, a study was conducted comparing the original FTR definition with the AHRQ and NQF FTR definitions.17 It was found that these two definitions omitted more than 40% of all deaths and were less correlated with the adjusted mortality rank of hospitals. However, it has been noted that, even though all three definitions look at “failure to rescue”, the AHRQ and NQF definitions and the original FTR definition have fundamental differences in determining which patients are considered eligible for the denominator of the respective FTR rates.18 Because the original FTR measure includes all patients who died, even those without a complication, a direct comparison of the AHRQ and NQF measures with the original measure may not be appropriate.
The results of our study show that using either unaltered FTR measure could be misleading. Even if used internally to identify possible cases of quality failure, attempts to improve the quality of care may be hindered by including pre-existing complications and unnecessarily excluding cases of interest. Root causes behind preventable deaths among acquired versus pre-existing complications are likely to be different. In this particular case, the FTR rate using either measure was lowered by including pre-existing cases. This could lead to inappropriately shifting priorities in addressing improvement efforts. Further research would be warranted to determine if other institutions have similar findings. In a previous study, we reported higher rates of hospital acquired conditions among hospital transfers and among physician-referred versus self-referred or primary care patients.19 In the same data, these referred patients also had more pre-existing comorbidities than local, primary care patients. We suspect any bias may be more prevalent in institutions seeing a large number of referrals and/or providing specialty care.
AHRQ has begun to incorporate the NQF algorithms into the most recent AHRQ release (V.3.2, March 2008). Furthermore, the newest release is reported to have the capability to incorporate present on admission coding.
There are several limitations to this study that must be noted. No cases were verified by chart review. We are relying on the accuracy of the timing indicators, which have been shown to be adequate in previous studies.13 15 Also, since our data were gathered across several years, inconsistent coding of administrative data could lead to inaccuracies in both the diagnosis codes and the timing indicators of these codes. Furthermore, we did not apply any risk adjustment approach to the analysis.
Results are based on only one version of the AHRQ algorithm. Additional published changes to the algorithms were not used for this study. More recent algorithms exclude patients younger than 18 years. These patients were used in the analyses but likely had little impact as they totalled only 1494 cases over the study period. Finally, this study was based on a single academic centre with a high proportion of referrals. Patient populations of healthcare institutions having few referrals may not have as high a proportion of pre-existing conditions as were found in this study.
Current definitions of FTR measures meant to identify inhospital complications appear biased by the inclusion of problems at admission. Furthermore, many patients with these complications are excluded from the algorithms. When taking into account the timing of the “complications”, these measures can be useful for internal quality control. However, we stress that the usefulness of the measures to compare institutions will be dependent on coding practices of institutions. Validation using chart review may be required.
The authors acknowledge Sara Hobbs Kohrt for her help in manuscript preparation and Leora Horwitz, MD, MHS, for insightful comments and suggestions.
Competing interests None.