
Ranking hospitals: do we gain reliability by using composite rather than individual indicators?
Stefanie N Hofstede,1 Iris E Ceyisakar,2 Hester F Lingsma,2 Dionne S Kringos,3 Perla J Marang-van de Mheen1

  1. Department of Biomedical Data Sciences, Medical Decision Making, Leiden University Medical Centre, Leiden, The Netherlands
  2. Department of Public Health, Erasmus MC, Rotterdam, The Netherlands
  3. Department of Public Health, Academic Medical Centre, Amsterdam, The Netherlands

Correspondence to Dr Perla J Marang-van de Mheen, Department of Biomedical Data Sciences, J10-S, Leiden University Medical Centre, Leiden 2300 RC, The Netherlands; p.j.marang{at}lumc.nl

Abstract

Background Despite widespread use of quality indicators, it remains unclear to what extent they can reliably distinguish hospitals on true differences in performance. Rankability measures what part of variation in performance reflects ‘true’ hospital differences in outcomes versus random noise.

Objective This study sought to assess whether combining data into composites or including data from multiple years improves the reliability of ranking quality indicators for hospital care.

Methods Using the Dutch National Medical Registration (2007–2012) for stroke, colorectal carcinoma, heart failure, acute myocardial infarction and total hip arthroplasty (THA)/total knee arthroplasty (TKA) in osteoarthritis (OA), we calculated the rankability for in-hospital mortality, 30-day acute readmission and prolonged length of stay (LOS) for single years and 3-year periods, and for a dichotomous and an ordinal composite measure in which mortality, readmission and prolonged LOS were combined. Rankability, defined as (between-hospital variation/(between-hospital + within-hospital variation))×100%, is classified as low (<50%), moderate (50%–75%) and high (>75%).

Results Admissions from 555 053 patients treated in 95 hospitals were included. The rankability for mortality was generally low or moderate, varying from less than 1% for patients with OA undergoing THA/TKA in 2011 to 71% for stroke in 2010. Rankability for acute readmission was low, except for acute myocardial infarction in 2009 (51%) and 2012 (62%). Rankability for prolonged LOS was at least moderate. Combining multiple years improved rankability but still remained low in eight cases for both mortality and acute readmission. Combining the individual indicators into the dichotomous composite, all diagnoses had at least moderate rankability (range: 51%–96%). For the ordinal composite, only heart failure had low rankability (46% in 2008) (range: 46%–95%).

Conclusion Combining data from multiple years or combining multiple indicators into a composite results in more reliable ranking of hospitals, particularly compared with mortality and acute readmission in single years, thereby improving the ability to detect true hospital differences. The composite measures provide more information and more reliable rankings than combining multiple years of individual indicators.

  • quality improvement methodologies
  • quality measurement
  • continuous quality improvement
  • healthcare quality improvement
  • mortality (standardized mortality ratios)


Introduction

Commonly used and routinely collected quality indicators for hospital care include in-hospital mortality, 30-day acute readmissions and long length of stay (LOS). These indicators may provide information for healthcare providers and hospital managers to improve quality of care, for patients to choose between hospitals, for healthcare insurers to purchase health services and for policy makers to monitor the performance of the healthcare system. There is, however, ongoing debate not only on whether single indicators adequately reflect quality of care but also on whether they truly enable us to discriminate between hospitals in terms of their performance, that is, the reliability of hospital rankings.1–4 The reliability of ranking hospital performance can be assessed by determining the rankability of indicators. Previous research showed that the rankability of individual indicators differs2 4 and that rankability is lower when estimates are imprecise, which is often the case when outcomes have few events in some patient groups, for example, mortality after hip or knee replacement.5 We therefore hypothesise that increasing the number of events per hospital, by combining data, may result in more reliable hospital rankings.

Increasing the number of events to be included in quality measurements can be done in various ways. The first is to combine data from multiple years. However, even if this increases the reliability of hospital performance ranking, the information may reflect treatment outcomes that have improved over time and no longer reflect current practice. Furthermore, it does not provide sufficient information for professionals trying to improve the quality of hospital care, since short-term results of quality improvements will not be visible. Combining different indicators may be another solution, with the additional benefit that more information is captured and thus a more complete picture of quality of care is provided, because indicators may be interrelated.6 Over the years, different initiatives have been taken to combine indicators and thereby provide a more complete view of hospital performance, but these often focused on a specific condition.7–10 A commonly used combined indicator in the Netherlands is the ‘textbook outcome’, a dichotomous outcome representing the proportion of patients for whom all desired short-term outcomes on the different indicators are realised.11 12 For instance, using the indicators in-hospital mortality, 30-day acute readmissions and long LOS, a ‘textbook outcome’ means the patient is discharged alive, with no long LOS and no readmission. Such a ‘textbook outcome’ may be easier for patients to interpret than a single outcome indicator, but because different adverse outcomes are lumped together through the dichotomisation, it may not provide sufficient information to guide quality improvements in hospitals. Ordering the various combinations and creating an ordinal composite measure may be a better suited alternative for quality improvement purposes in hospitals.

Even though different initiatives have been undertaken to combine data, it is unknown whether this actually results in better reliability of ranking hospital performance than individual indicators. Therefore, this study aims to assess whether combining data into composites or including data from multiple years improves the reliability of ranking quality indicators for hospital care.

Methods

Study population

We used routinely collected administrative admission data from the Dutch National Medical Registration (LMR) from 2007 to 2012, retrieved from Statistics Netherlands,13 as more recent data were not publicly available due to the conversion from International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) to International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM). These data capture all hospital patients rather than only specific patient groups. The LMR contains administrative data on approximately 88% of hospital admissions in 2007, declining to 76% in 2012.14 This includes patient-specific data such as patient characteristics, as well as medical data such as diagnosis, surgical procedures and hospital stay. Multiple admissions of one patient in different hospitals are identified using an anonymous unique patient identifier.15 Each patient admission is assigned one primary diagnosis code according to ICD-9-CM, based on the discharge letter and other available information in the patient's medical records, and secondary diagnosis codes if applicable.6 16 We included clinical admissions with at least a primary diagnosis code to enable identification of specific patient groups (defined by the Clinical Classifications Software (CCS)). We excluded admissions from hospitals with incomplete follow-up, defined as months without any coded admissions or admissions from hospitals without a previous month of measurements, because these missing data made it impossible to assess readmission within 30 days. We selected patient groups with typically different LOS, readmission and mortality patterns to ensure sufficient variation and thereby enable generalisation to other diagnoses.6 These patient groups are: stroke (CCS 109, high mortality and long LOS), colorectal carcinoma (CCS 14 and 15, long LOS), heart failure (HF) (CCS 108, high readmission), acute myocardial infarction (AMI) (CCS 100, high mortality) and hip and knee replacements (THA/TKA) in patients with osteoarthritis (OA) (CCS 203, high readmission). THA/TKA procedures from 2012 were excluded since 45% of them were missing.

Definitions

We studied the following indicators for each year (2007–2012) and 3-year periods (2007–2009 and 2010–2012):

  • In-hospital mortality: defined as death in hospital during the index admission.

  • Acute readmission: an emergency readmission within 30 days after discharge.

  • Long LOS: defined as a LOS in the top 25% of LOS for the specific diagnosis (CCS group) or procedure group (for THA/TKA).

  • The textbook outcome: defined as a patient discharged alive, with no long LOS and no acute readmission.

  • The ordinal composite measure (Textbook Outcome Plus (TOP)), defined as (from best to worst):

  1. Alive, no long LOS and no acute readmission.

  2. Alive, long LOS and no acute readmission.

  3. Alive, no long LOS and acute readmission.

  4. Alive, long LOS and acute readmission.

  5. In-hospital mortality.

The ordering of this ordinal measure was described in a previous study17 and is based on patients' views reported in the existing literature, where patients considered complications after discharge (often resulting in readmissions) to reflect worse quality of care than complications during admission (resulting in longer LOS).18 The measure was presented at a meeting to about 100 quality of care experts (including physicians and CEOs) from different countries involved in the Global Comparators Project and was considered adequate. A minimal sketch of how these definitions translate into admission-level variables is shown below.
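To make these definitions concrete, the following sketch derives the long LOS flag, the dichotomous ‘textbook outcome’ and the ordinal TOP category from admission-level data. The column names and toy values are illustrative assumptions for this sketch, not the actual LMR fields; the 75th percentile cut-off follows the ‘top 25%’ definition above.

```python
import pandas as pd

# Hypothetical admission-level records; field names are illustrative, not the LMR's.
admissions = pd.DataFrame({
    "hospital":  ["A", "A", "B", "B", "C"],
    "ccs_group": [109, 109, 109, 109, 109],  # e.g. stroke
    "los_days":  [4, 21, 6, 9, 12],
    "died":      [0, 0, 0, 1, 0],            # in-hospital mortality
    "readmit30": [0, 1, 0, 0, 0],            # acute readmission within 30 days
})

# Long LOS: stay above the 75th percentile of LOS within the diagnosis (CCS) group.
p75 = admissions.groupby("ccs_group")["los_days"].transform(lambda s: s.quantile(0.75))
admissions["long_los"] = (admissions["los_days"] > p75).astype(int)

# Dichotomous 'textbook outcome': discharged alive, no long LOS and no acute readmission.
admissions["textbook"] = (
    (admissions["died"] == 0)
    & (admissions["long_los"] == 0)
    & (admissions["readmit30"] == 0)
).astype(int)

# Ordinal composite (Textbook Outcome Plus), ordered from best (1) to worst (5).
def top_category(row):
    if row["died"]:
        return 5  # in-hospital mortality
    if row["long_los"] and row["readmit30"]:
        return 4  # alive, long LOS and acute readmission
    if row["readmit30"]:
        return 3  # alive, no long LOS, acute readmission
    if row["long_los"]:
        return 2  # alive, long LOS, no acute readmission
    return 1      # alive, no long LOS, no acute readmission

admissions["top"] = admissions.apply(top_category, axis=1)
print(admissions[["hospital", "long_los", "textbook", "top"]])
```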

Statistical analysis

Rankability

We chose rankability as a summary measure of the reliability of ranking hospitals, as it is the most frequently used in the Netherlands, has been validated in previous research2 19–22 and is similar to the methods used by Dimick et al.23 Rankability is a measure of the signal-to-noise ratio: the signal is the true differences between hospitals and the noise is the imprecision induced by small numbers (eg, low hospital volume).4 The intraclass correlation coefficient (ICC) is a similar measure of discriminative power, as it reflects the proportion of the total variance that can be attributed to, for example, between-hospital differences using multilevel modelling.24 Rankability is expressed as a percentage; for example, a rankability of 70% means that 70% of the variation is explained by ‘true’ hospital differences, while 30% is noise. Rankability is computed from two components: the within-hospital variation and the between-hospital variation. The within-hospital variation was estimated using a fixed effects logistic regression model (individual indicators and ‘textbook outcome’) and a fixed effects ordinal logistic regression model for the ordinal composite measure, including hospitals and case-mix variables as fixed factors. The median squared SE of the coefficient for the hospital variable was used to estimate the within-hospital variation.4 Hospital volume is thus reflected in the precision of the hospital coefficient, so that low hospital volume results in less precision (ie, larger within-hospital variation), which makes it harder to detect between-hospital differences (ie, lower reliability of ranking). The between-hospital variation was estimated using the heterogeneity from a random effects logistic regression model (individual indicators and ‘textbook outcome’) and a random effects ordinal logistic regression model for the ordinal composite measure, in which hospitals were included as a random factor and case-mix variables as fixed factors. The rankability was calculated using the following formula:

rankability = τ² / (τ² + σ²) × 100%

where τ² is the between-hospital variance (the heterogeneity from the random effects model) and σ² is the within-hospital variance (the median squared SE of the hospital coefficients from the fixed effects model).

We classified rankability as low (<50%), moderate (50%–75%) or high (>75%).4 This classification was used to identify where an increase in reliability is most relevant: an increase is particularly needed for indicators with low reliability of ranking, because in these circumstances we are less likely to detect any true hospital differences due to the relatively high amount of noise.
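As a minimal sketch of this calculation (not the Stata code used in the study), the function below combines a given between-hospital variance with the median squared SE of the hospital coefficients and applies the classification above. The numeric inputs are invented for illustration and are not estimates from the study.

```python
import numpy as np

def rankability(tau2: float, hospital_coef_se) -> float:
    """Rankability (%) = between-hospital variance / (between + within variance) x 100.

    tau2             -- between-hospital variance (heterogeneity) from the random effects model.
    hospital_coef_se -- standard errors of the hospital coefficients from the fixed effects
                        model; their median squared value estimates the within-hospital variance.
    """
    sigma2 = float(np.median(np.asarray(hospital_coef_se))) ** 2
    return 100.0 * tau2 / (tau2 + sigma2)

def classify(r: float) -> str:
    """Classify rankability as low (<50%), moderate (50%-75%) or high (>75%)."""
    if r < 50:
        return "low"
    return "moderate" if r <= 75 else "high"

# Invented example values (not estimates from the study):
r = rankability(tau2=0.06, hospital_coef_se=[0.22, 0.25, 0.30, 0.19])
print(f"rankability = {r:.0f}% ({classify(r)})")
```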

Case-mix adjustment

The following patient-level variables were included to adjust hospital outcomes for differences in case-mix: 5-year age groups, sex, socioeconomic status based on the postal area of the patient's address (six categories), year of admission, diagnosis or procedure group (for THA/TKA), method of admission (acute/not acute), transfer from another hospital, urgent admission in the previous month (yes/no) and the Charlson comorbidity score to correct for the severity of relevant comorbidities based on the secondary diagnosis codes.25 Age groups with fewer than 10 events were iteratively combined with the immediately older group. Statistical interactions between age and Charlson comorbidity score, and between method of admission and transfer, were included based on previous findings.26
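The iterative merging of sparse age groups can be sketched as follows. This is an illustration rather than the code used in the study, and the variable names (age_group coded as the numeric lower bound of each 5-year band, event as a 0/1 outcome) are assumptions for the sketch.

```python
import pandas as pd

def collapse_small_age_groups(df: pd.DataFrame, min_events: int = 10) -> pd.DataFrame:
    """Iteratively merge age groups with fewer than `min_events` events into the
    immediately older group, as described for the case-mix adjustment."""
    df = df.copy()
    groups = sorted(df["age_group"].unique())  # youngest to oldest
    i = 0
    while i < len(groups) - 1:
        if df.loc[df["age_group"] == groups[i], "event"].sum() < min_events:
            # merge this sparse group into the next (older) group and re-check that group
            df.loc[df["age_group"] == groups[i], "age_group"] = groups[i + 1]
            groups.pop(i)
        else:
            i += 1
    return df

# Toy usage: the 40-44 band has only 2 events, so it is merged into the 45-49 band.
toy = pd.DataFrame({
    "age_group": [40] * 5 + [45] * 30 + [50] * 40,
    "event":     [1, 1, 0, 0, 0] + [1] * 12 + [0] * 18 + [1] * 15 + [0] * 25,
})
print(collapse_small_age_groups(toy)["age_group"].value_counts())
```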

All analyses were performed using the statistical software package Stata (V.14).

Results

Population

Admissions from 555 053 patients treated in 95 Dutch hospitals were included. Table 1 shows the median estimates for each indicator and condition per year, as well as the range in the medians across the different years. Most patients were admitted for a THA/TKA due to OA, with a median of 376 admissions per hospital per year, but these admissions were reported by the smallest number of hospitals (median of 61 hospitals across years). Patients who had HF were treated in the largest number of hospitals (median of 82.5 hospitals across years), with a median of 253.5 patients per hospital per year. The highest median in-hospital mortality was found for stroke (14.9%) and the lowest for THA/TKA (0%). Acute readmissions occurred most often in patients who had HF (median 12.9%) and least often in patients with OA undergoing a THA/TKA (median 3.5%).

Table 1

Indicators at a hospital level, median and median range of the 1-year periods

Rankability of individual indicators for each year

Figure 1A–E shows the rankability of the individual indicators for stroke, colorectal carcinoma, HF, AMI and THA/TKA for single years (in total 29 year–diagnosis combinations tested). The rankability for in-hospital mortality varied from less than 1% for patients with OA undergoing THA/TKA in 2011 to 71% for stroke in 2010. The rankability for mortality was low for 20 (69%) of the tested combinations and moderate for 9 (31%) combinations (table 2). The highest rankability for acute readmission was found for AMI in 2012 (62%); except for AMI in 2009 (51%) and 2012 (62%), the rankability for acute readmission was low. For long LOS, the rankability was moderate in 17 of the tested combinations, with the lowest rankability of 59% for colorectal carcinoma in 2010, and high in the remaining 12 combinations, with the highest rankability of 97% in 2008 for patients with OA undergoing THA/TKA. Given the frequent low rankability for mortality and acute readmission (in 69% and 93% of the tested combinations, respectively), most gain in reliability is to be expected for these indicators.

Table 2

Rankability of indicators classified as low (<50%), moderate (50%–75%) or high (>75%)

Figure 1

Rankability of indicators over 1-year and 3-year periods. LOS, length of stay; OA, osteoarthritis.

Combining data in 3-year periods

Combining single years into 3-year periods resulted in higher rankabilities compared with individual years for all indicators, except for THA/TKA in patients with OA (figure 1A–E). This results in fewer combinations with a low rankability for the 3-year periods. Of the 20 combinations with low rankability for mortality, 8 (40%) remained low when combining data into 3-year periods, 11 (55%) became moderate and 1 (5%) became high. Of the nine combinations with moderate rankability, four (44%) remained moderate when combining data and five (56%) became high. For acute readmission, with 27 combinations having low rankability, 19 (70%) became moderate when combining data into 3-year periods. The two combinations with moderate rankability remained moderate when data were combined. For long LOS, combining multiple years resulted in a high rankability for all 29 combinations, whereas the rankability was moderate in 17 and high in 12 combinations for the single years. Thus, combining data into 3-year periods improves rankability for mortality and acute readmission and even results in a high rankability for long LOS in all 29 combinations, but rankability remains low in eight combinations for mortality and eight for acute readmission.

Combining data into composite measures

Over the years 2007–2012, 57.6% of the patients who had stroke, 65.1% of the patients with colorectal carcinoma, 56.5% of the patients who had HF, 59.8% of the patients with AMI and 72.4% of the patients with OA undergoing THA/TKA had a textbook outcome. The rankability of the ‘textbook outcome’ was moderate for 15 (52%) and high for 14 (48%) of the tested year–diagnosis combinations (table 2). The lowest rankability was 51% for colorectal carcinoma in 2010 and the highest was 96% for patients with OA undergoing THA/TKA in 2008 (figure 1B,E). From the 20 year–diagnosis combinations with low rankability for mortality, combining data into the ‘textbook outcome’ improved the rankability to moderate in 13 (65%) combinations and to high in 7 (35%). From the nine combinations with moderate rankability, six (67%) improved to high and three (33%) remained moderate. From the 27 combinations with low rankability for acute readmission, 15 (56%) improved to moderate and 12 (44%) to high when combining data into the textbook outcome. Of the two combinations with moderate rankability, one improved to high and one remained moderate. From the 17 combinations with moderate rankability for long LOS, combining data into the ‘textbook outcome’ improved the rankability to high in 4 (24%) combinations, while 13 (76%) remained moderate. From the 12 combinations with high rankability, 9 (75%) remained high but 3 (25%) decreased to moderate. Therefore, combining data into the ‘textbook outcome’ improves reliability in all years with low rankability for either mortality or acute readmission. Compared with long LOS, rankability decreased in some cases but was still at least moderate.

For the ordinal composite measure, the rankability was mostly moderate (66%) or even high (31%) (table 2). The lowest rankability was 46% for HF in 2008 and the highest was 95% for patients with OA undergoing THA/TKA in 2007 (figure 1C,E). Looking at single years, the rankability of the composite measure improved in all combinations compared with the single indicators. From the 20 year–diagnosis combinations with low rankability for mortality, 14 (70%) improved to moderate and 5 (25%) to high, with only 1 combination remaining low. From the nine combinations with moderate rankability, four (44%) improved to high and five (56%) remained moderate. For acute readmission, the picture was even more pronounced: from the 27 year–diagnosis combinations with low rankability for acute readmission, 17 (63%) improved to moderate when combining data into the ordinal composite, 9 (33%) improved to high and 1 remained low. The two combinations with moderate rankability remained moderate. Less was gained compared with long LOS: from the 17 year–diagnosis combinations with moderate rankability for long LOS, 4 (24%) improved to high, 12 (71%) remained moderate and 1 decreased to low rankability. From the 12 combinations with high rankability, 5 (42%) remained high but 7 (58%) decreased to moderate. Thus, combining data into the ordinal composite measure improves reliability in most years when compared with mortality and acute readmission, although one combination remained with low rankability. Compared with long LOS, rankability remains the same or improves, resulting in at least moderate rankability, except for one combination where rankability decreased to low.

Within-hospital and between-hospital variations

To understand why rankability increased or decreased, we studied the components of the calculation. The rankability increases either if the within-hospital variation becomes smaller (eg, by increasing the number of events) or if the between-hospital variation becomes larger. Figure 2A–C shows the between-hospital variation (Tau, x-axis) against the rankability (y-axis). The different lines represent the median SE, used to calculate the within-hospital variance (σ²). By combining indicators, we thus aim to move towards a line with a smaller SE and upwards to increase rankability, assuming the between-hospital variance stays the same. The question, however, is whether the between-hospital variation stays the same when indicators are combined or whether it averages out because hospitals have relatively good scores on one indicator and worse scores on another.
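For illustration, with invented numbers rather than estimates from this study: if the between-hospital variance is τ²=0.04 and the median SE of the hospital coefficients is 0.20 (so σ²=0.04), rankability is 0.04/(0.04+0.04)=50%. If combining indicators halves the within-hospital variance to σ²=0.02 while τ² is unchanged, rankability rises to 0.04/0.06≈67%; but if the between-hospital variance also shrinks to 0.02, rankability stays at 0.02/0.04=50%.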

Figure 2

Gain in rankability using the ordinal composite measure versus an individual indicator (for 1-year periods, 2007–2012). Colour: red: stroke; blue: colorectal carcinoma; orange: heart failure; purple: acute myocardial infarction; green: THA/TKA in OA. 

Figure 2A shows that for mortality the intended direction is achieved, as we move upwards going from + to □ for the different conditions in different colours. The composite measure has lower within-hospital variation (smaller median SE) than in-hospital mortality. The between-hospital variation of the composite measure is also slightly lower than that for mortality, as we move slightly to the left (except for THA/TKA). Since the within-hospital variation of the composite measure decreases more than the between-hospital variation, this results in a higher rankability for the composite measure than for in-hospital mortality for all conditions and each single year. The large improvement in rankability for THA/TKA is caused by the combination of a large increase in between-hospital variation together with a decrease in within-hospital variation. Looking at acute readmission (figure 2B), we see a similar picture. For long LOS (figure 2C), all the symbols are much closer together and already have relatively low within-hospital variation (small median SE). The within-hospital variation remains approximately the same, while the between-hospital variation decreases when data are combined into the ordinal composite, indicating more uniform outcomes for the composite measure across hospitals. This results in a lower rankability for the composite measure than for long LOS in 8 (28%) of the 29 year–diagnosis combinations.

Discussion

This study aimed to assess whether increasing the number of events per hospital, by combining data into composites or including data from multiple years, improves the reliability of hospital rankings (rankability). We found that the rankability of mortality and acute readmission was mostly low. Combining multiple years generally improves rankability because of the higher number of events, but rankability remains low for both mortality and acute readmission in eight cases where these outcomes are infrequent. Combining data into the ‘textbook outcome’ improves rankability, except in comparison with long LOS, where the within-hospital variation is already relatively low because of the higher number of events and the between-hospital variation decreased when combining outcomes. Similarly, combining data into the ordinal composite measure improves rankability, but less so if the within-hospital variation is already small and the between-hospital variation is reduced by the combination of indicators. Given that rankability still remained low for some conditions when combining multiple years for mortality and acute readmission, combining data into composite measures seems the better solution to improve the reliability of hospital rankings, as composites can be calculated for single years and therefore provide more actionable indicators as well as a more complete picture of quality of care.

The choice of which composite measure to use may depend on the purpose and the end-users. Evidence suggests that patients are more likely to use information on differences in quality of care when it is presented as a summary measure, such as the textbook outcome.27 28 The rankability of the ‘textbook outcome’ was moderate or high for all conditions and single years, it is easy to interpret, and an event-free hospital admission is what patients aim for. However, for hospital professionals or insurers, it does not provide sufficient information to guide quality improvements, as it does not show which of the outcomes should be improved. The ordinal composite measure does provide this information, combined with moderate or high rankability for most conditions and single years, and is therefore better suited for quality improvement in hospitals or for insurers.

Comparison with previous studies in the literature

Our results are consistent with previous research showing that composite measures are more informative than existing individual quality indicators.8 9 29 The present study adds that the ordinal composite measure both combines indicators and orders outcomes. This may affect hospital comparisons, as the different combinations are now separated and ordered, whereas with the ‘textbook outcome’ they are lumped together and weighted equally, so it remains unknown on which of the adverse outcomes a hospital performs worse. Including mortality in the composite measure is also important, as it accounts for potential survivor bias,30 given that an individual who dies can never be readmitted. Survivor bias may arise when hospitals are compared on readmission rates or long LOS without considering differences in mortality. A previous study found that hospital performance on readmissions significantly differed from hospital performance on a composite metric based on readmissions and mortality.31 In our study, we used a more extended composite measure also including long LOS, and likewise found that the rankability of the composite measure improved compared with a single readmission indicator, but added that for indicators such as long LOS this is not necessarily the case. In addition, we provided more insight into the reasons for improved rankability by showing the gain (or lack thereof) in within-hospital and between-hospital variation, and by showing that it is often not valid to assume that the between-hospital variation remains the same when combining data into composites. Furthermore, we showed that low rankability occurs less frequently when data are combined into composite measures than when multiple years are combined.

Other studies focused on individual indicators and showed that the rankability of individual indicators differs, especially since it also depends on case-mix correction.2 4 21 For example, van Dishoeck et al 21 found a rankability of 80% for surgical-site infection (SSI) after colonic resection but 0% for caesarean section. Rankability was 8% for all operations combined, as the differences in SSI rates were mainly explained by case mix. Furthermore, Henneman et al 2 found a rankability of 38% for mortality after colorectal surgery in the period 2009–2011. We found a rankability of 30% (2009), 28% (2010) and 21% (2011) for mortality in patients with colorectal carcinoma, but showed that when years are combined the rankability increases, to 51% (2007–2009) and 49% (2010–2012). The rankability for colorectal carcinoma increases far more when indicators are combined (51%–68% across single years for the ‘textbook outcome’ and 50%–67% for the ordinal composite measure). Another study found a rankability of 58% for in-hospital mortality for AMI and 51% for readmission after HF after correction for age in 2007.4 We found a rankability of 50% for in-hospital mortality and 41% for acute readmission after HF, which is probably lower because we used more variables for case-mix correction. Another possibility to improve the reliability of ranking is to cluster hospitals, as a previous study found that clustering intensive care units increased the rankability.32 Although this results in a higher rankability, the question is how it can be used to track changes in performance over time for individual hospitals, or by patients to choose a particular hospital (based on its performance rather than that of a cluster).

Strengths and weaknesses

An adequate sample size is necessary to obtain a reliable ranking.1 A strength of this study is that the chosen indicators are routinely collected, so that a large sample size was available, including almost all hospitals in the Netherlands. A limitation is that only data from 2007 to 2012 were available, as more recent data were not publicly available due to the conversion from ICD-9-CM to ICD-10-CM. We analysed multiple 1-year periods to determine whether rankability was stable over the years and did not find large differences between years; we therefore think that our results are generalisable to more recent years. Furthermore, our ability to adjust for case-mix was limited because we used administrative data, while previous studies showed that the rankability of indicators depends on case-mix correction.2 4 21 Had more detailed case-mix variables been available in the data, these might have explained additional differences between hospitals, so the between-hospital variance, and thereby the rankability, may have been overestimated in the present study. In addition, we were not able to distinguish different types of hospitals (eg, academic or public) since we used anonymous data (at both the patient and hospital levels). This may increase between-hospital differences and result in a higher rankability if the within-hospital variation remains stable. However, since we compared the rankability of individual and combined indicators, the improvement in rankability is likely to be less affected by both these case-mix adjustment issues, because the lack of adjustment applies to both individual and combined indicators. Our data did not include information on mortality after discharge, so the results of our study reflect only a selection of mortality cases. Future studies should therefore also include postdischarge mortality to examine whether this affects the rankability, since different mortality time frames may result in different judgements regarding the performance of hospitals.33

Conclusion

We showed that combining data overall improves the rankability of hospital performance, particularly for mortality and acute readmission, because the within-hospital variation decreases. Combining data into composite measures may be a better solution than combining multiple years to improve rankability: it gives a more complete picture of quality of care, represents current or recent practice and is thus more actionable for quality improvement, and is also less likely to result in low rankability of hospital performance. Whereas the ‘textbook outcome’ may have the best rankability, the ordinal composite measure may be more actionable for hospitals trying to improve, given that this measure enables them to distinguish different adverse outcomes and target specific combinations of outcomes that are lumped into one category in the textbook outcome.


Footnotes

  • Contributors PJM-vdM designed the study. SNH carried out the study and wrote the article. PJM-vdM supervised the study and the writing of the manuscript. All authors have critically read and modified both the study protocol and previous drafts of the manuscript and have approved the final version.

  • Funding This study was funded by ZonMw (10.13039/501100001826), grant number 516022513.

  • Competing interests None declared.

  • Patient consent Not required.

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement We used routinely collected administrative admission data of the Dutch National Medical Registration (LMR) from 2007 to 2012 retrieved from Statistics Netherlands. To use these data, please contact Statistics Netherlands.
