Simpson's paradox: how performance measurement can fail even with perfect risk adjustment
Perla J Marang-van de Mheen (1), Kaveh G Shojania (2)
  1. Department of Medical Decision Making, Leiden University Medical Centre, Leiden, The Netherlands
  2. University of Toronto Centre for Quality Improvement and Patient Safety, Sunnybrook Health Sciences Centre, Toronto, Canada
Correspondence to Dr Perla J Marang-van de Mheen, Department of Medical Decision Making, Leiden University Medical Centre, PO Box 9600, Leiden 2300 RC, The Netherlands; p.j.marang{at}lumc.nl

Efforts to measure quality using patient outcomes, whether hospital mortality rates or major complication rates for individual surgeons, often become mired in debates over the adequacy of adjustment for case-mix. Some hospitals take care of sicker patients than other hospitals. Some surgeons operate on patients whom other surgeons feel exceed their skill levels. We do not want to penalise hospitals or doctors who accept referrals for more complex patients. Yet we also do not want to miss opportunities for improvement: a particular hospital that cares for sicker patients may still achieve worse outcomes than other hospitals with similar patient populations.

This debate over the adequacy of case-mix adjustment dates back to Florence Nightingale's publication of league tables for mortality in 19th-century English hospitals.1 We have made some progress. Some successes have involved supplementing the diagnostic codes and demographic information available in administrative data with a few key clinical variables.2,3 Particularly notable successes rely entirely on clinical variables collected for the sole purpose of predicting risk: the prognostic scoring systems for critically ill patients, such as the Acute Physiology and Chronic Health Evaluation and the Simplified Acute Physiology Score,4–6 and the National Surgical Quality Improvement Program.7 (Occasionally, research shows that an outcome measure does not require adjustment for case-mix.8)

But what if comparing mortality rates (or other key patient outcomes) were problematic even with perfect case-mix adjustment? For example, suppose a 75-year-old man undergoing cardiac surgery has diabetes, mild kidney failure and a previous stroke, and a 65-year-old woman has hypertension but no previous strokes or kidney problems. Suppose the case-mix adjustment model assigns a risk of death or major complications after surgery of 8% to the 75-year-old man and only 4% to the 65-year-old woman. And, let's say that over time, …
