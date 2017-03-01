Key messages

Better process-based performance measures do not always correlate with better outcomes. For instance, hospitals that reduce door-to-balloon (D2B) time do not necessarily reduce 30-day mortality for patients with acute myocardial infarction (MI) treated with primary percutaneous coronary intervention.

This disconnect may represent an ‘ecological fallacy’. For individual patients, a shorter D2B time reduces the risk of death. However, hospitals that have reduced their D2B time also tend to treat more complex patients with a higher risk of death, hence the apparent failure to translate improved processes into better outcomes.

A further problem relates to the denominators of eligible patients for a given process measure. Denominators for the relevant patient population may differ between hospitals when more complex patients are included, particularly in hospitals that exclude more patients. This difference, rather than reduced quality of care, may explain the worse performance on process measures for acute MI patients in hospitals excluding more patients.

Variations in case-mix between hospitals may affect interpretation of process-based quality measures, not just patient outcomes.

Efforts to improve quality of care in hospitals often start by comparing clinical processes between hospitals. However, earlier studies have suggested that better performance in process measures believed to be clinically meaningful may not always be linked to improved patient outcomes. At times, this unexpected finding has led to enormous confusion among quality experts and clinicians. In this issue, Bruckel et al1 use data on patients with acute myocardial infarction to tackle a potentially key aspect of this paradox by focusing on the ‘denominator problem’: the observation that large numbers of patient exclusions from many process measures may erode the ability to judge hospitals on the quality of care delivered.

Some allowance for exclusions is widely seen as necessary to ensure enough homogeneity in patients to allow for meaningful comparisons between hospitals. However, the use of exclusions may also raise concerns for process measures. For example, a large number of exclusions may prevent a comprehensive understanding of quality of care in important patient subgroups not captured by the measure. In addition, a novel and less obvious finding by Bruckel et al was that higher rates of exclusions generally correlated with worse performance for the process measures, suggesting that hospitals with many exclusions may have important gaps in quality. Taking these points together, the authors recommend public reporting of the number of and reasons for exclusions as a means to better facilitate comparisons across hospitals.
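To make the reporting recommendation concrete, the following minimal sketch shows a process-measure score reported side by side with the exclusion rate. The two hospitals, their patient counts and the `HospitalMeasure` class are hypothetical and are not taken from Bruckel et al; the sketch only illustrates how the denominator shrinks as exclusions grow.

```python
from dataclasses import dataclass


@dataclass
class HospitalMeasure:
    """Hypothetical record for one hospital and one process measure."""
    name: str
    patients: int      # all patients with the condition
    excluded: int      # patients excluded from the measure
    met_measure: int   # included patients who received the process

    @property
    def denominator(self) -> int:
        # Patients who remain eligible after exclusions
        return self.patients - self.excluded

    @property
    def performance(self) -> float:
        # Share of eligible patients who received the process
        return self.met_measure / self.denominator

    @property
    def exclusion_rate(self) -> float:
        # Share of all patients removed from the measure
        return self.excluded / self.patients


# Invented numbers, for illustration only
hospitals = [
    HospitalMeasure("Hospital A", patients=200, excluded=20, met_measure=171),
    HospitalMeasure("Hospital B", patients=200, excluded=80, met_measure=102),
]

for h in hospitals:
    print(f"{h.name}: performance {h.performance:.0%} "
          f"on {h.denominator} eligible patients, "
          f"exclusion rate {h.exclusion_rate:.0%}")
```

Reporting the exclusion rate next to the score does not by itself say whether exclusions reflect gaming or a more complex case-mix; it only makes the size of the unmeasured group visible.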

We largely agree. These findings may supply the ‘missing link’ that explains why hospital performance in process measures is not always correlated with patient outcomes: hospitals with large numbers of exclusions may be worse performers, possibly attempting to ‘game’ the system. However, there is another key factor that may need to be considered and that could lead to the opposite conclusion: the ecological fallacy. Several examples, both outside and inside healthcare, show how the wrong conclusion can be drawn by assuming that a relationship observed at the hospital (ie, group) level also holds at the individual level. Indeed, it may be that hospitals with more exclusions are providing better care.

The ecological fallacy

The ecological fallacy refers to an erroneous inference about individuals on the basis of findings for the group to which those individuals belong.2 The term was first coined by Selvin in 1958, but earlier papers had already described the phenomenon with the earliest example by Émile Durkheim in 1897, who found that suicide rates in 19th-century Europe were higher in provinces that were heavily Protestant and concluded that stronger social control among Catholics resulted in lower suicide rates.3 However, as pointed out by Morgenstern, none of the regions were entirely Protestant or Catholic, so it may in fact have been the Catholics living in a predominantly Protestant area who were committing suicide.4 It seems quite plausible that members from a minority may have been more likely to commit suicide. So Morgenstern pointed to the possibility of the ecological fallacy, but others have questioned the presence of the ecological fallacy in Durkheim's work.5

Another compelling example was described in 1950 by Robinson.6 Using data from 48 US states, he showed that states with a higher proportion of immigrants also had higher literacy rates (correlation of 0.53). However, at the individual level, the correlation was inverted with immigrants being less literate than native citizens (correlation of −0.11). So the aggregated state-level correlation gave the incorrect inference of the correlation for the individuals in those states. It was caused by the fact that immigrants tended to settle in states where the native population was relatively more literate, thereby reversing the association.

Similarly, a New England Journal of Medicine paper reported a strong (r=0.79) correlation between countries' annual per capita chocolate consumption and the number of Nobel laureates per 10 million persons.7 Perhaps to no surprise, Switzerland was the top performer in both chocolate consumption and Nobel laureates.
Interpreting the slope of the fitted regression line, it was estimated that about 0.4 kg of chocolate per capita per year was needed to increase the number of Nobel laureates in a given country by 1, which would amount to 125 million kg per year for the USA. However, before accepting that this would be a causal relationship that also exists at the individual level and grossly stimulating chocolate consumption, we have to consider whether the aggregate consumption is in fact a good predictor of the individual chocolate consumption by the Nobel laureates (besides possible confounders such as differences in socio-economic status between countries).8 Based on these (aggregate) data, we cannot exclude the possibility that the ecological fallacy is operating and that within each country academics in fact have the lowest chocolate consumption, thus an inverse association compared with that observed on the aggregate country level.

Figure 1 illustrates how the ecological fallacy works in these examples, with the hypothetical data points showing the negative association within each country or state, and the fitted line illustrating how this might give a strong positive association across countries or states when only considering the average (aggregate) exposure and ignoring the (individual) distribution.

Figure 1 Example of ecological fallacy with hypothetical data. The graph shows the relationship between a hypothetical outcome and exposure of interest in five countries (each cluster of data points corresponds to a different country). Within each country, higher exposure is associated with worse outcomes, hence the decrease from left to right within each cluster. However, using group-level data, the opposite picture emerges (represented by the fitted black line). Countries with greater average exposure also have better average outcomes.
The ecological fallacy in this example is thus quite striking as the group-level data (country averages) suggest that more exposure produces better outcomes, whereas within each country, patients with greater exposure tend to have worse outcomes.
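
The pattern in Figure 1 can be reproduced with a small simulation. The sketch below is only an illustration under invented assumptions: five groups of 30 individuals, a within-group slope of −1.5, and group averages that rise together. None of these numbers come from the figure or from the cited studies.

```python
import random
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical data: five groups ("countries") of 30 individuals each
groups = []
for g in range(5):
    centre_x = 10 + 10 * g   # average exposure rises from group to group
    centre_y = 20 + 8 * g    # average outcome rises with it
    xs = [centre_x + random.uniform(-3, 3) for _ in range(30)]
    # Within a group, each extra unit of exposure lowers the outcome by 1.5
    ys = [centre_y - 1.5 * (x - centre_x) + random.gauss(0, 0.5) for x in xs]
    groups.append((xs, ys))

# Individual-level analysis: one slope per group
within_slopes = [slope(xs, ys) for xs, ys in groups]

# Group-level (ecological) analysis: slope through the five group averages
between_slope = slope([mean(xs) for xs, _ in groups],
                      [mean(ys) for _, ys in groups])

print("within-group slopes:", [round(s, 2) for s in within_slopes])
print("between-group slope:", round(between_slope, 2))
```

Every within-group slope is negative while the slope through the group averages is positive, which is the reversal the figure depicts.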

Relevance of ecological fallacy for quality improvement

How is the ecological fallacy relevant for quality improvement? As mentioned above, we often hope to learn from variation between outcomes of hospitals at the aggregate level in relation to differences at the individual level in patients and clinical processes. Given what we know about the ecological fallacy, it is important to realise that the outcomes and processes observed for a hospital do not necessarily occur in the same individual patients. Among the reasons for different relationships at different levels of analysis are loss of information within higher levels of analysis (aggregation bias), different confounders at different levels of analysis and effect modification.9

For example, a hospital may have a higher mortality rate in patients with ST-segment elevation myocardial infarction (STEMI) undergoing percutaneous coronary intervention (PCI) compared with other hospitals as well as a higher door-to-balloon (D2B) time. This does not necessarily mean that the patients dying were those with higher D2B times. Such a conclusion can only be drawn when analysing the individual patient-level data, as illustrated recently.

D2B time is known to predict survival in patients with STEMI undergoing primary PCI,10–12 and this relationship is considered causal on the basis of animal13 and observational studies.14 As a result, guidelines in many countries, for example, the USA,15 require a D2B time <90 min for all patients undergoing primary PCI to ensure good quality of care. However, the cardiovascular community was recently alarmed by reports that contemporary decreases in annual D2B times have not been associated with lower mortality over time in patients undergoing primary PCI,16,17 which raised uncertainty about whether quality initiatives were directed at the right processes.
Nallamothu et al18 have cautioned against inferring from these (aggregate-level) results that a decrease in D2B time would not improve outcomes for individual patients, which would be another example of the ecological fallacy. They distinguished the patient-level relationship between D2B time and mortality from the (aggregate-level) secular trends and showed a consistent relationship between D2B time and mortality in all years, although the relationship had become steeper over time. As a result, annual mortality at the aggregate level remained the same or even increased over time. Explanations for the secular trends towards higher mortality in the population undergoing primary PCI include the expanded use of primary PCI over time in more complex patients with STEMI who would not have reached the cardiac catheterisation laboratory in earlier years.
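
A toy calculation can show how both observations hold at once. The sketch below assumes a simple linear risk model and three invented yearly cohorts; the D2B times, the baseline risk and the per-minute risk increments are hypothetical and are not estimates from Nallamothu et al. In each year a shorter D2B time means a lower risk for the individual patient, yet the steeper risk gradient in later, more complex cohorts cancels the benefit of shorter average D2B times at the aggregate level.

```python
from statistics import mean

BASE_RISK = 0.01  # hypothetical risk of death at a D2B time of zero minutes

# Hypothetical cohorts for three successive years (all values invented):
# D2B times fall over time while the per-minute risk gradient steepens,
# standing in for a more complex case-mix in later years.
cohorts = {
    1: {"d2b_minutes": [60, 80, 90, 100, 120], "risk_per_minute": 0.00040},
    2: {"d2b_minutes": [50, 65, 75, 85, 100], "risk_per_minute": 0.00048},
    3: {"d2b_minutes": [40, 50, 60, 70, 80], "risk_per_minute": 0.00060},
}


def patient_risk(d2b_minutes, risk_per_minute):
    """Linear toy model: each extra minute of delay adds a fixed amount of risk."""
    return BASE_RISK + risk_per_minute * d2b_minutes


yearly = {}
for year, cohort in cohorts.items():
    risks = [patient_risk(t, cohort["risk_per_minute"])
             for t in cohort["d2b_minutes"]]
    yearly[year] = {
        "mean_d2b": mean(cohort["d2b_minutes"]),  # aggregate process measure
        "mortality": mean(risks),                 # aggregate expected outcome
        "risks": risks,                           # patient-level risks
    }

for year, row in yearly.items():
    print(f"year {year}: mean D2B {row['mean_d2b']:.0f} min, "
          f"expected mortality {row['mortality']:.3f}")
```

Comparing only the yearly averages would suggest that shortening D2B time achieves nothing, although within every cohort the patient with the shortest delay has the lowest risk.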