Risk-adjustment schemes are used to monitor hospital performance, on the assumption that excess mortality not explained by case mix is largely attributable to suboptimal care. We have developed a model to estimate the proportion of the variation in standardised mortality ratios (SMRs) that can be accounted for by variation in preventable mortality. The model was populated with values from the literature to estimate a predictive value of the SMR in this context—specifically the proportion of those hospitals with SMRs among the highest 2.5% that fall among the worst 2.5% for preventable mortality. The extent to which SMRs reflect preventable mortality rates is highly sensitive to the proportion of deaths that are preventable. If 6% of hospital deaths are preventable (as suggested by the literature), the predictive value of the SMR can be no greater than 9%. This value could rise to 30%, if 15% of deaths are preventable. The model offers a ‘reality check’ for case mix adjustment schemes designed to isolate the preventable component of any outcome rate.

Hospital mortality rates are widely used as a measure of quality in developed countries. The Health Care Financing Administration (HCFA) released risk-adjusted mortality data, in the form of standardised mortality ratios (SMRs), on all Medicare patients admitted to hospitals in the USA in 1986.

A review of the relationship between mortality and quality of care found no empirical studies that directly report the relationship between SMRs and preventable mortality. We have therefore developed a model of this relationship.

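The model, though generic to any outcome, is explicated with respect to hospital mortality. For each hospital we assume that the rate of in-hospital mortality (M) can be partitioned into two components: the rate of natural (unavoidable) death (U) and the rate of preventable death (V), so that M = U + V.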

The proportion of the variance in SMRs attributable to preventable mortality—and the correlation between these quantities—depends on the contribution of preventable mortality to overall hospital mortality rates, and on the performance of the risk-adjustment scheme in eliminating variation due to differences in case mix.

The critical quantities are:

ξ: the average proportion of deaths that are preventable (the ‘preventability index’);

c_{V}: the coefficient of variation (defined as SD÷mean) of the preventable mortality rate across hospitals;

c_{M}: the coefficient of variation of the total in-hospital mortality rate;

R^{2}: the proportion of the variance in total mortality rates explained by the risk-adjustment process;

Q: the correlation coefficient between the hospital SMR and the preventable mortality rate.

The performance of the SMR as a proxy for preventable mortality is governed by the numerical value of Q, and Q^{2} can be interpreted as the proportion of the variation in SMRs attributable to preventable mortality. In the online appendix an upper bound for Q is derived under two assumptions. The first of these deals with the possibility that a high rate of natural (unavoidable) death (U) in a hospital might go hand in hand with a high rate of preventable death (V). In practice, the presence of such positive correlation is entirely plausible since patients at high intrinsic mortality risk are also those for whom medical error is likely to have the most catastrophic consequences. As stated, the assumption (A1) implies that all such correlation between U and V can be accounted for in terms of case mix factors that reflect that intrinsic risk. The second assumption is concerned with the variation in mortality rates among hospitals with identical case mixes. The simplest version of the assumption (A2) says that such variation, as measured by the statistical variance, is the same whatever the case mix happens to be. As demonstrated in the online supplementary appendix, this leads unequivocally to the bound on Q used throughout this paper.

It follows that Q^{2} will not exceed:
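A form of this bound that is consistent with the quantities defined above, and with the numerical values quoted later in the paper, can be sketched as follows. It is a reconstruction under stated assumptions rather than a transcription of the appendix result, and the symbol \mu_{M} (the mean in-hospital mortality rate across hospitals) is introduced here only for the sketch.

$$
Q^{2} \;\le\; \frac{\operatorname{Var}(V)}{(1 - R^{2})\,\operatorname{Var}(M)}
\;=\; \frac{(c_{V}\,\xi\,\mu_{M})^{2}}{(1 - R^{2})\,(c_{M}\,\mu_{M})^{2}}
\;=\; \frac{\xi^{2} c_{V}^{2}}{(1 - R^{2})\, c_{M}^{2}} \qquad (1)
$$

The reasoning behind the sketch is that the mean preventable mortality rate is \xi\mu_{M}, so its variance is (c_{V}\xi\mu_{M})^{2}, while the variance left in total mortality after risk adjustment is (1 - R^{2})(c_{M}\mu_{M})^{2}. The ratio is the largest share of the residual variation that preventable mortality could account for, and it is capped at 1.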

Assumption A2 may be a sensible first approximation, but it is open to question as an exact description of reality. Put simply, there is more scope for variability in rates at case mixes for which the mean rate is high than at case mixes for which it is low. For this reason an alternative assumption (A2′) has been entertained, which posits a proportional relationship between the variance and the square of the case mix-specific mean. (This could arise if case mix differences affect the relative mortality risk between hospitals.) Under A2′ there may be a modest inflation in the bound for Q, leading to an estimated increase of up to 5% (or 10% for Q^{2}) in the base case described below (see online supplementary appendix). Such increases are not large enough to disturb the general conclusions of the paper.

The coefficient of variation of the overall mortality rate (c_{M}) for 143 Acute Hospital Trusts in England in 2007/8 was 0.19; a base case value for c_{M} of 0.2 has been assumed.

In the SHMI scheme proposed for the NHS, the proportion of the variance explained by risk adjustment has been estimated as 81%; a base case value of R^{2}=0.8 has therefore been assumed.

There appears to be no published study describing the variation in preventable death rates across hospitals. However, the variance of the between-hospital component of preventable adverse events is given as 0.15 by Zegers et al, leading to a base case value for c_{V} of 0.4 which, as it happens, is exactly twice the base case for c_{M}.

Studies of hospital deaths describe the proportion of deaths that may have been caused by clinical error (often at low probability),

The expression (1) imposes a severe constraint on the correlation between hospital SMRs and preventable mortality rates when the parameter values described above are used. For example, if 6% of deaths are preventable (as estimated by Hayward and Hofer), and base case assumptions are made (c_{M}=0.2, c_{V}=0.4, R^{2}=0.8), it follows that Q^{2} cannot exceed:
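The arithmetic can be checked with a short sketch. It assumes that expression (1) takes the form Q^{2} ≤ ξ^{2}c_{V}^{2} / ((1 − R^{2})c_{M}^{2}), a form inferred from the definitions and numerical values in this paper rather than quoted from the appendix. The function name `q2_upper_bound` is hypothetical.

```python
def q2_upper_bound(xi, c_v, c_m, r2):
    """Upper bound on Q^2 under the ASSUMED form of expression (1).

    xi  : preventability index (average proportion of deaths preventable)
    c_v : coefficient of variation of the preventable mortality rate
    c_m : coefficient of variation of the total in-hospital mortality rate
    r2  : proportion of variance in total mortality explained by risk adjustment
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    bound = (xi * c_v) ** 2 / ((1.0 - r2) * c_m ** 2)
    return min(bound, 1.0)  # a squared correlation cannot exceed 1


# Base case: 6% of deaths preventable, c_M = 0.2, c_V = 0.4, R^2 = 0.8
print(round(q2_upper_bound(0.06, 0.4, 0.2, 0.8), 3))  # 0.072

# Same base case but with 15% of deaths preventable
print(round(q2_upper_bound(0.15, 0.4, 0.2, 0.8), 3))  # 0.45
```

Under this assumed form the base case bound is about 0.072, in line with the statement below that Q^{2}<0.08.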

The point is reinforced if the SMR is treated as a formal diagnostic test for high rates of preventable mortality. Suppose that a warning is triggered if the SMR for a hospital places it among the worst 2.5% of all hospitals. This criterion corresponds to a ‘2-sigma’ action limit for the SMR. The diagnostic performance of this test depends on the value of Q^{2}. The positive predictive value (PPV) of such a warning for identifying a hospital with high preventable mortality will be very low indeed if Q^{2}<0.08, as suggested above. For instance, the PPV for identifying a hospital with a

At higher levels of the preventability index (ξ) the bound on Q^{2} in expression (1) becomes less stringent, and effective monitoring using the SMR will be correspondingly more feasible. The effect is illustrated in

Diagnostic performance of the standardised mortality ratio (2-sigma upper limit) to detect a hospital among the worst 2.5% for preventable deaths. The curve shows the dependency of the upper bound for positive predictive value (PPV) (or true positive rate (TPR)) on the preventability index under a risk-adjustment scheme accounting for 80% of the variation between hospitals. The base case relationship c_{V} = 2c_{M} is assumed.
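A curve of this kind can be approximated with the sketch below. Two assumptions are made that are not stated in this section: that the bound on Q^{2} takes the form ξ^{2}c_{V}^{2} / ((1 − R^{2})c_{M}^{2}), and that the SMR and the preventable mortality rate are jointly normal (on some monotone scale) with correlation Q. The function names are hypothetical, and the integration uses only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def q2_upper_bound(xi, c_v, c_m, r2):
    """ASSUMED form of the bound on Q^2, capped at 1."""
    return min((xi * c_v) ** 2 / ((1.0 - r2) * c_m ** 2), 1.0)


def ppv_upper_bound(q, tail=0.025, steps=4000, span=8.0):
    """P(hospital in worst `tail` for preventable deaths | SMR in worst `tail`).

    ASSUMES bivariate normality with correlation q. The joint upper-tail
    probability is integrated numerically with the midpoint rule.
    """
    if q >= 1.0:
        return 1.0
    z = STD_NORMAL.inv_cdf(1.0 - tail)  # about 1.96 when tail = 0.025
    resid_sd = sqrt(1.0 - q * q)
    width = span / steps
    joint = 0.0
    for i in range(steps):
        x = z + (i + 0.5) * width  # standardised SMR value above the limit
        # Probability that the preventable rate also exceeds its limit,
        # given a standardised SMR of x
        exceed = 1.0 - STD_NORMAL.cdf((z - q * x) / resid_sd)
        joint += STD_NORMAL.pdf(x) * exceed * width
    return joint / tail


# Base case: R^2 = 0.8 and c_V = 2 * c_M (c_M = 0.2, c_V = 0.4)
for xi in (0.03, 0.06, 0.10, 0.15):
    q = sqrt(q2_upper_bound(xi, c_v=0.4, c_m=0.2, r2=0.8))
    print(f"xi={xi:.2f}  Q^2 bound={q * q:.3f}  PPV bound={ppv_upper_bound(q):.3f}")
```

Because the same 2.5% cut-off is applied to both quantities, the PPV and the TPR coincide in this sketch. With no correlation at all (Q=0) the function returns the chance level of 2.5%.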

The argument so far has relied on a base case value for c_{V} (=0.4) grounded in a study of preventable adverse events. Under this condition, preventable mortality rates vary fourfold between the 5th and 95th centiles. Any lower value of c_{V} will tighten the constraint (1) on the correlation Q and thus reduce the scope for the SMR to diagnose poor care. This would be the case if, for example, the coefficient of variation were the same for preventable as for overall mortality. For the SMR to function as an effective proxy for preventable mortality rates, c_{V} would need to be much higher than the base case value. For example, when ξ=0.06, as in Hayward and Hofer, a PPV of 0.3 could be achieved only if c_{V} were increased from 0.4 to 1.0. This would mean that in a group of 20 randomly chosen hospitals the preventable death rate is, on average, more than 15 times higher in the worst hospital compared with the best one.

Three candidate distributions to describe variation among hospitals in rates of preventable mortality. The distributions are scaled to unit median and a log-normal model is assumed. Under the base case (c_{V}=0.4) the hospital in the 95th centile would have about four times the preventable mortality rate of the hospital at the 5th centile. Under the most dispersed distribution (c_{V}=1.0) the ratio between the 5th and 95th centiles (ie, 0.25 and 3.93) is more than 15, an implausibly large range across a random sample of 20 hospitals.
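The centile ratios in this caption can be reproduced from the log-normal model with a short sketch. It uses the standard log-normal relationship between the coefficient of variation and the log-scale SD, σ^{2} = ln(1 + c_{V}^{2}), with the median fixed at 1. The function name `lognormal_centiles` is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def lognormal_centiles(c_v, lower=0.05, upper=0.95):
    """Lower and upper centiles of a unit-median log-normal distribution
    whose coefficient of variation is c_v."""
    sigma = sqrt(log(1.0 + c_v ** 2))  # SD on the log scale
    std_normal = NormalDist()
    low = exp(sigma * std_normal.inv_cdf(lower))
    high = exp(sigma * std_normal.inv_cdf(upper))
    return low, high


for c_v in (0.4, 0.7, 1.0):
    low, high = lognormal_centiles(c_v)
    print(f"c_V={c_v:.1f}  5th={low:.2f}  95th={high:.2f}  ratio={high / low:.1f}")
```

For c_{V}=1.0 this gives centiles of about 0.25 and 3.93, a ratio of more than 15. For c_{V}=0.7 the ratio is about 8, and for c_{V}=0.4 it is roughly 3.6, which the text rounds to fourfold.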

Satisfactory diagnostic performance could perhaps be achieved if both ξ and c_{V} were to exceed their base case values, but by smaller amounts. For example, if ξ is also raised above its base case value, c_{V} need be no more than 0.7 (case (b)) to achieve the same effect on the upper bound for Q^{2} as in case (c). Here preventable mortality rates would still need to differ by an arguably implausible factor of 8.0 between the 95th and 5th centiles.

The results indicate that worthwhile correlations between case-mix adjusted SMRs and rates of preventable mortality are not attainable unless rates of preventable mortality are either (a) much higher than current estimates suggest, or (b) implausibly variable between different hospitals. It can be argued that specificity is not crucial to the performance of SMRs, since they are used as a screen for hospitals requiring further investigation, not as a diagnostic trigger for sanction. However, there is always a trade-off between sensitivity and specificity even for screening tests. In the case of SMRs, high false positives waste resources, stigmatise hospitals, and lead to gaming;

Aside from quality of care, possible sources of variation in hospital SMRs include: differences in discharge policies leading to variations in underlying mortality rates; differences in recording practices for primary diagnoses or comorbidities;

One message from our study is that the diagnostic value of institution-level outcome data is critically dependent on the preventability index (ξ). It also depends on the variation in preventable mortality rates between hospitals, for which we assumed a base case value (c_{V}=0.4) consistent with a fourfold variation in rates across a representative sample of hospitals. However, there appear to be no direct estimates of this quantity in the literature. It may even be that our estimate is too high for a mortality measure in which eclectic performance across individuals and departments is aggregated at the hospital level. In such circumstances variation in the performance of individual clinical units may be diluted when these are combined.

Other things being equal, standardised outcome rates will discriminate well for specific conditions in which preventability rates are high (eg, pressure ulcers, maternal deaths, deaths following elective surgery).

When the preventability index is low, as it is for hospital mortality and many other outcomes in healthcare, it may be necessary to fall back on direct measurement of process and outcome by examining individual cases in detail—for example, by case-note review. Currently, this method is expensive, though it may become easier as sophisticated electronic records become widespread. It is also subject to classification errors (eg, judgment of the preventability of deaths varies by case reviewers). Yet it may remain the only viable option for measuring preventable mortality rates unless there are further improvements in risk-adjustment technology.

Hospital standardised mortality ratios (SMRs) are markers of poor care only to the extent that they correlate with preventable mortality rates.

A mathematical model populated by empirical estimates for critical parameters suggests that this correlation is low.

If preventable deaths make up less than 15% of all deaths, then SMRs are poor diagnostic tests for suboptimal care.

We thank Michael Langman, FRCP, FFPM, FMedSci, Yen-fu Chen, PhD and Semira Manaseki-Holland, PhD (University of Birmingham) for helpful comments.

RJL conceived the idea for the paper and drafted the initial core manuscript; AJG derived the mathematical argument (with input from JW) and drafted the Methods and Results sections; AJG, TPH, JW, PJC, JPN and MAM contributed text and critically reviewed and commented on the document. RJL is the guarantor.

AJG, JW, PJC, RJL acknowledge financial support for the submitted work from the National Institute for Health Research (NIHR) Collaborations for Leadership in Applied Health Research and Care (CLAHRC) for Birmingham and Black Country; the West Midlands Quality Institute; and the EPSRC Multidisciplinary Assessment of Technology Centre for Healthcare (MATCH) programme (EPSRC grant GR/S29874/01).

All authors have completed the unified competing interest form at

The corresponding author (RJL) has been involved in statistical aspects of SMRs and the conduct of individual case note review. This prompted the idea that a mathematical model could be constructed to link preventable deaths revealed through detailed scrutiny of individual cases and overall death rates used to compare hospitals statistically. He discussed this idea with his more algebraically accomplished coauthors, and together the argument was developed.