Public reporting of surgeon outcomes: low numbers of procedures lead to false complacency

doi:10.1016/S0140-6736(13)61491-9

The Lancet

Volume 382, Issue 9905, 16–22 November 2013, Pages 1674-1677

https://doi.org/10.1016/S0140-6736(13)61491-9

Summary

The English National Health Service published outcome information for individual surgeons for ten specialties in June, 2013. We looked at whether individual surgeons do sufficient numbers of procedures to be able to reliably identify those with poor performance. For some specialties, the number of procedures that a surgeon does each year is low and, as a result, the chance of identifying a surgeon with increased mortality rates is also low. Therefore, public reporting of individual surgeons' outcomes could lead to false complacency. We recommend use of outcomes that are fairly frequent, considering the hospital as the unit of reporting when numbers are low, and avoiding interpretation of no evidence of poor performance as evidence of acceptable performance.

Introduction

From the summer of 2013, outcomes of some surgical procedures will be reported for individual surgeons as part of the English National Health Service (NHS) Commissioning Board's new policy.¹ This policy follows the example of the Society for Cardiothoracic Surgery in Great Britain and Ireland (SCTS)² and several US states (eg, New York³), which report mortality for adult cardiac procedures by surgeon. The aim is to allow patients to choose their surgeon and clinicians to improve outcomes of care. However, when overall numbers of specific procedures are low, correct identification of a surgeon with poor performance is challenging, even if mortality is high.⁴ The danger is that low numbers mask poor performance and lead to false complacency.

We examine this issue in relation to reporting of surgical mortality for individual surgeons for adult cardiac surgery, plus key procedures in three other specialties: oesophagectomy or gastrectomy for oesophagogastric cancer; bowel cancer resection; and hip fracture surgery. We address three questions. First, what number of procedures is necessary for reliable detection of poor performance? Second, how many surgeons in each specialty actually do this number of procedures in a period of 1, 3, or 5 years? Third, what is the probability that a surgeon identified as a statistical outlier has truly poor performance? Finally, we offer recommendations about how surgeon performance can be assessed in a meaningful way. We used postoperative mortality as an example to address these questions, because it is the outcome that will be reported for English surgeons this summer.

Section snippets

Number of procedures

The number of adult cardiac surgeries done in NHS hospitals is fairly high: 50% of cardiac surgeons do between 60 and 170 per year.² Many other procedures are done less frequently, which means statistical power is poor and that poorly performing surgeons are unlikely to be correctly identified. In this context, statistical power is the probability that a surgeon with poor performance will be detected as a statistical outlier—ie, as significantly worse than average. For example, 80% power means
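The definition of statistical power above can be made concrete with a minimal sketch. It assumes a one-sided exact binomial test, a hypothetical baseline mortality of 5%, and "poor performance" defined as twice the baseline; these values, the 2.5% significance threshold, and the function names are illustrative assumptions, not the method or figures used in the article.

```python
from math import exp, lgamma, log


def binom_pmf(n, i, p):
    """P(X = i) for X ~ Binomial(n, p), computed in log space for stability."""
    log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
               + i * log(p) + (n - i) * log(1 - p))
    return exp(log_pmf)


def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, i, p) for i in range(k, n + 1))


def critical_count(n, p0, alpha):
    """Smallest number of deaths k for which P(X >= k | p0) <= alpha.

    A surgeon with k or more deaths in n procedures would be flagged
    as a statistical outlier at the (assumed) one-sided level alpha.
    """
    for k in range(n + 2):
        if binom_tail(n, k, p0) <= alpha:
            return k


def power(n, p0, p1, alpha=0.025):
    """Probability that a surgeon whose true mortality is p1 is flagged."""
    k = critical_count(n, p0, alpha)
    return binom_tail(n, k, p1)


if __name__ == "__main__":
    baseline = 0.05        # assumed average mortality (illustrative)
    poor = 2 * baseline    # assumed definition of poor performance
    for n in (20, 50, 100, 200, 500, 1000):
        print(f"n = {n:4d}  power = {power(n, baseline, poor):.2f}")
```

With few procedures the flagging threshold is rarely reached even when true mortality is doubled, which is the mechanism behind the low power described above. Because death counts are discrete, power from an exact test does not rise perfectly smoothly with the number of procedures.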

Proportion of surgeons who do the necessary number of procedures

We estimated the proportion of surgeons who do a sufficient number of procedures to achieve 60%, 70%, and 80% power to detect poor performance (table 2).²,⁵ These proportions are calculated for reporting periods of 1, 3, and 5 years, assuming that the overall rate of mortality remains constant with time. The SCTS reports surgeon-level mortality with 3 years of data.² Its data show that about three-quarters of surgeons do sufficient numbers of cardiac operations to achieve 60% statistical power

Correct identification of poor performance

Not all surgeons identified as statistical outliers will truly have poor performance. The proportion correctly identified as having poor performance is known as the positive predictive value.¹⁰ The number correctly identified depends on the significance threshold, how many procedures a surgeon does, and the prevalence of poor performance. With standard diagnostic reasoning, it can be calculated that, if one in 20 cardiac surgeons truly had poor performance, 63% would be correctly identified on
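The diagnostic reasoning behind the positive predictive value can be sketched as follows. The prevalence of one in 20 comes from the text above; the 80% power and the 2.5% false-positive rate are assumed inputs chosen for illustration, and the function name is hypothetical. The inputs actually used in the article are not shown in this excerpt.

```python
def positive_predictive_value(prevalence, power, alpha):
    """Share of flagged surgeons who truly have poor performance.

    prevalence: proportion of surgeons with truly poor performance
    power:      probability that such a surgeon is flagged as an outlier
    alpha:      probability that an adequately performing surgeon is flagged
    """
    true_positives = prevalence * power
    false_positives = (1 - prevalence) * alpha
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    prevalence = 1 / 20   # from the text: one in 20 surgeons
    alpha = 0.025         # assumed one-sided significance threshold
    for assumed_power in (0.2, 0.4, 0.6, 0.8):
        ppv = positive_predictive_value(prevalence, assumed_power, alpha)
        print(f"power = {assumed_power:.1f}  PPV = {ppv:.2f}")
```

Under these assumed inputs, 80% power gives a positive predictive value of about 0.63, close to the figure quoted above, and the value falls as power drops. This is why low procedure numbers weaken both the detection of poor performance and the trust that can be placed in any outlier flag.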

Improving statistical power

There are options for improvements in statistical power other than the pooling of data over time, but each introduces problems of its own. First, data for different procedures could be pooled. However, when outcomes differ between procedures, this approach could prevent fair comparisons of outcomes. We grouped gastrectomy, which has a mortality of 6·9%, with oesophagectomy, which has a mortality of 5·7%.⁷ Additionally, cardiac procedures were grouped together, combining coronary bypass surgery

Implications of the new policy

Reporting of outcomes for individual surgeons for cardiac surgery in the UK has largely been viewed as a success.¹¹,¹² As we have shown, numbers of cardiac surgeries are sufficient to allow the process of detection to operate with reasonable statistical power. However, we believe that consultant-level reporting could be far less effective for other specialties. The concern about false identification of poor performance has received much attention in view of the stigma attached to poor

Wider issues

Several wider issues have been raised previously about the reporting of surgeon outcomes, mainly related to adequate adjustment for patient case mix, the accuracy with which the responsible surgeon can be identified, and the shared responsibility for the care of patients within teams.¹⁷ Operative mortality, including unavoidable deaths, might not be a good proxy for preventable mortality. Of particular relevance is the mean proportion of deaths that can be prevented: if this proportion is low,

References (21)

EL Hannan et al. The New York State cardiac registries: history, contributions, limitations, and lessons for future efforts to assess and publicly report healthcare outcomes. J Am Coll Cardiol (2012)
R Lilford et al. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet (2004)
DM Shahian et al. Autonomy, beneficence, justice, and the limits of provider profiling. J Am Coll Cardiol (2012)
Everyone counts: planning for patients 2013/14
UK surgeons' results
DG Altman et al. Absence of evidence is not evidence of absence. BMJ (1995)
Hospital episode statistics
National Report 2012
National oesophago-gastric cancer audit
National bowel cancer audit

There are more references available in the full text version of this article.

