In this issue of BMJ Quality & Safety, Yamamoto and colleagues1 describe an innovative national quality improvement intervention to identify and remediate low-performing Japanese cardiac surgery programmes. This is the most recent of numerous quality initiatives by Japanese cardiothoracic surgical leaders,2–11 and they deserve recognition for their ambitious and ongoing efforts. In 2000, emulating the Society of Thoracic Surgeons Database,12 they developed the Japan Cardiovascular Surgery Database (JCVSD), the foundation for all their subsequent quality activities. Because their board certification requires JCVSD participation,4 every board-certified cardiac surgeon, and presumably every cardiothoracic programme, has access to rigorous, nationally benchmarked results. With their newest quality programme, these results are used to target low-performing centres. What can we learn from their experience, and where might there be additional improvement opportunities?
Implementation of the Japanese site visit programme
Selection criteria for the Japanese site visit initiative are based on JCVSD outcomes data and include high-mortality outlier status for isolated coronary artery bypass grafting (CABG), heart valve or thoracic aortic surgical procedures, as well as hospital volumes and the number of procedures for which a hospital is an outlier. During the study period, 10 hospitals (about 1.7% of all Japanese programmes) were visited. We were not provided with the number of initial candidate hospitals that were high-mortality outliers for at least one procedure but were not visited. It is also not clear if selected hospitals were required to participate in the site visit intervention or whether this was voluntary.
Previsit observed mortality rates for site visit hospitals were quite high compared with non-visited hospitals (9.0% vs 2.7% for CABG, 10.7% vs 4.0% for heart valve procedures and 20.7% vs 7.5% for thoracic aortic procedures). They also had, for each of the three procedures, about double the rate of postoperative kidney injury requiring dialysis, which clearly suggests a specific area requiring attention.
Site visits were scheduled for 1 day and were conducted by senior cardiac surgeons who were directors of the Japanese Society for Cardiovascular Surgery. Hospital structures and processes of care were reviewed, but the time allotted appears to have been quite limited (30–60 min per table 1 of the authors’ article1). Most of each site visit was devoted to a comprehensive review of patient deaths, discussion of the findings with staff and development of an action plan whose implementation may not have been mandatory (‘…each hospital made voluntary improvements based on the report’).
Site visit initiative results
CABG outcomes at site visit hospitals improved soon after the quality improvement initiative began, but improvement was delayed for heart valve and thoracic aortic cases, which the authors speculate may reflect the greater complexity of these procedures. Although postintervention (2015–2016) observed mortality rates did ultimately improve for heart valve and aortic procedures, the degree of improvement was quite modest. Observed mortality remained more than twice that of non-visit hospitals (9.6% vs 3.8% for heart valves and 18.8% vs 7.3% for aortic cases). Adjusted ORs (AORs) for site visit programmes quickly improved for CABG and became statistically indistinguishable from AORs for non-site visit programmes. However, valve AORs remained significantly higher than those at non-visit hospitals, and there was virtually no improvement in the AORs for aortic surgery until the very end of the postintervention study period (second half of 2016). Thus, the overall results of this innovative programme were inconsistent, with marked improvements noted only in CABG surgery.
Improving the improvement programme?
The authors used 99% Poisson CIs to identify observed to expected mortality outliers (ie, their entire CI exceeded unity for at least one procedure). These wide CIs obviously selected only the most extreme outliers. I believe these 10 hospitals had mortality rates so excessive that the voluntary site visit intervention was inadequate, and I disagree with the authors’ assertion that programme continuation during the interventions was a positive feature. Rather, voluntary or mandatory stand-down of these programmes should have occurred to protect the public while these programmes were being reviewed and the remediation plans were being implemented.
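To illustrate why a 99% interval flags only the most extreme programmes, the following is a minimal sketch of outlier screening with an exact Poisson CI for the observed to expected (O/E) ratio. The exact (Garwood) interval, the function names and the example counts are assumptions chosen for illustration; the study does not specify which Poisson interval it used.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return min(total, 1.0)

def _bisect(func, target, lo, hi, increasing):
    """Find mu in [lo, hi] with func(mu) == target for a monotone func."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_poisson_ci(observed, level=0.99):
    """Exact (Garwood) CI for a Poisson mean given an observed death count."""
    alpha = 1.0 - level
    hi_bound = 2.0 * observed + 30.0  # generous search ceiling for small counts
    if observed == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= observed | mu) = alpha / 2 (increasing in mu)
        lower = _bisect(lambda mu: 1.0 - poisson_cdf(observed - 1, mu),
                        alpha / 2.0, 0.0, hi_bound, True)
    # Upper limit: P(X <= observed | mu) = alpha / 2 (decreasing in mu)
    upper = _bisect(lambda mu: poisson_cdf(observed, mu),
                    alpha / 2.0, 0.0, hi_bound, False)
    return lower, upper

def is_high_outlier(observed, expected, level=0.99):
    """True when the entire CI for O/E lies above 1."""
    lower, _ = exact_poisson_ci(observed, level)
    return lower / expected > 1.0

# Hypothetical programme: 10 observed deaths against 4.0 expected (O/E = 2.5)
print(is_high_outlier(10, 4.0, level=0.95))
print(is_high_outlier(10, 4.0, level=0.99))
```

In this hypothetical case the programme has 2.5 times the expected mortality, yet only the 95% interval excludes 1; the 99% interval does not, so the programme would escape selection.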
There are also several features of the intervention that could have been improved. First, the site visits appear to have been conducted solely by senior cardiac surgeon directors of their national organisation. For a comprehensive programme review, this is probably inadequate. Many individuals and specialties contribute to the success of a cardiac surgical programme, and ideally, a representative from each would be included in a review team—surgeons, anaesthesiologists, perfusionists, critical care physicians and nurses. At least some of these individuals should have not only content expertise but also formal training in quality and safety principles and performance improvement techniques.
The time allocated for reviewers to learn about the structures and processes of care at each cardiac hospital was inadequate, as these programme characteristics are critical to understanding and addressing low performance. Also, no time seems to have been provided for confidential interviews with members of the team, as well as referring cardiologists if available. In my experience, these individuals have provided invaluable insights into the reasons for suboptimal performance. Overall, an ideal site visit would probably require at least 1.5–2.0 days, depending on the size of the cardiac surgery programme.
While these reviews appropriately concluded with development of an action plan, it appears that the implementation of this plan was voluntary, which may partially explain the inconsistent results. Ideally, a workplan, assigned responsibilities and performance metrics should be developed. Government, relevant licensing bodies, or specialty organisations or boards should require evidence of effective plan implementation before programmes resume normal operations, and results must then be carefully monitored. The persistently elevated mortality rates for valve and aortic surgeries at site visit programmes are especially concerning. Rigorous, case-by-case monitoring should be required, perhaps using cumulative sum (CUSUM) or variable life adjusted display (VLAD) charts. Finally, mandatory public reporting of outcomes should be strongly considered to provide public accountability.
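As a rough sketch of what case-by-case monitoring involves, a VLAD chart plots, after each consecutive operation, the running total of predicted risk minus the observed outcome (1 for a death, 0 for a survivor). The function name and the three example cases below are hypothetical, and the predicted risks are assumed to come from an existing risk model.

```python
from itertools import accumulate

def vlad(predicted_risks, died):
    """Running sum of (predicted risk - observed death) over consecutive cases.

    A rising line means fewer deaths than predicted (net lives gained);
    a falling line means more deaths than predicted.
    """
    if len(predicted_risks) != len(died):
        raise ValueError("one outcome is required per predicted risk")
    return list(accumulate(p - y for p, y in zip(predicted_risks, died)))

# Hypothetical sequence of three cases; the second patient died
print(vlad([0.10, 0.20, 0.05], [0, 1, 0]))
```

Each survivor nudges the line up by that patient's predicted risk and each death drops it by one minus the predicted risk, so a sustained downward drift is visible long before an annual report would show it.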
Though not the subject of the current study, professional societies should commit the resources necessary to conduct broad, inclusive, national quality improvement efforts for all providers, not just the worst performers. Initiatives in northern New England,13 14 Michigan15 16 and Virginia17 18 are excellent paradigms.
Low programmatic volumes: structural vulnerability of Japanese cardiac surgery
The authors fail to mention what I consider to be an important secondary observation in their study: the disproportionately large number of Japanese cardiac surgical programmes for the size of their population and the low procedural volumes per hospital. This is a long-standing, idiosyncratic and problematic structural characteristic of the Japanese system.5 19 It cannot be addressed by targeted, 1-day site visits to only 10 of nearly 600 cardiac surgery centres.
Similar to the detailed investigation of Miyata and colleagues5 in 2008 and my accompanying commentary,19 the current study reaffirms that Japanese cardiac surgery programmes, on average, have a very low volume. Assuming that their 2013–2016 study period encompasses four full years of activity at 590 centres, I calculate their average, annual isolated CABG volumes per hospital as 24.1 cases; average annual valve volumes were 35.8 cases per hospital, and average annual thoracic aortic volumes per hospital were 28.9 cases. Volumes at the high-mortality site visit programmes were generally similar.
Roughly one CABG every 2 weeks is insufficient to achieve optimum results, and these numbers are for hospitals, not surgeons. If there is more than one surgeon at a typical programme, which is likely, then the volume per surgeon is even lower. The volumes for valve and thoracic aortic surgery are slightly higher, but the average total hospital volume for these three major procedure groups, which should encompass the bulk of an adult cardiac practice, may be fewer than 100 cases per year.
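The arithmetic behind these statements can be checked from the per-hospital averages quoted above. This is only a sketch: the variable names are illustrative, and a 52-week year is assumed.

```python
# Average annual cases per hospital, as calculated in the text
annual_cases = {
    "isolated CABG": 24.1,
    "heart valve": 35.8,
    "thoracic aortic": 28.9,
}

# Combined volume for the three major procedure groups
total_per_year = sum(annual_cases.values())

# Average interval between cases, assuming 52 weeks per year
weeks_between_cases = {name: 52 / n for name, n in annual_cases.items()}

print(f"total: {total_per_year:.1f} cases per year")
for name, weeks in weeks_between_cases.items():
    print(f"{name}: one case every {weeks:.1f} weeks")
```

The three averages sum to 88.8 cases per year, and the CABG figure corresponds to about one case every 2.2 weeks.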
Viewed from a slightly different perspective, based on Japanese census statistics20 and assuming 590 cardiac surgery hospitals, Japan has one cardiac surgery centre for every 179 910 adults aged 18 years and older. Based on US Census Bureau population data21 and numbers of cardiac surgery centres, representative US comparators in 2015 included one programme for every 236 801 adults in California, one programme for every 386 419 adults in Massachusetts and one programme for every 407 397 adult citizens in New York state. Thus, on a per capita basis, there are far more programmes in Japan than in representative US states, including several with long-standing public report cards and excellent results.
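For readers who prefer a common scale, the adults-per-programme figures above can be converted to programmes per million adults and expressed relative to Japan. This is a minimal sketch that uses only the numbers quoted in the text; the variable names are illustrative.

```python
# Adults (aged 18 years and older) per cardiac surgery programme, from the text
adults_per_programme = {
    "Japan": 179_910,
    "California": 236_801,
    "Massachusetts": 386_419,
    "New York": 407_397,
}

# Programmes per million adults
programmes_per_million = {
    region: 1_000_000 / adults for region, adults in adults_per_programme.items()
}

# How many times more adults each programme serves compared with Japan
relative_to_japan = {
    region: adults / adults_per_programme["Japan"]
    for region, adults in adults_per_programme.items()
}

for region in adults_per_programme:
    print(f"{region}: {programmes_per_million[region]:.2f} per million adults, "
          f"{relative_to_japan[region]:.2f}x Japan's adults per programme")
```

On these figures, Japan has roughly 5.6 programmes per million adults, and a programme in Massachusetts or New York serves more than twice as many adults as one in Japan.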
Despite the excessive number and low volumes of Japanese cardiac surgical centres, average mortality rates for the non-visit centres are reasonable, and these constitute the majority of programmes. However, we do not have data on the distribution of outcomes across hospitals. Given the low caseloads and random sampling variations, that distribution is likely wide. There are undoubtedly small programmes with low performance on at least one procedure whose wide CIs precluded outlier classification and site visit selection. Nonetheless, many low-volume programmes seem to have acceptable outcomes, and this merits further investigation. There may be organisational approaches to the staffing of cases in Japan (eg, dual attending coverage) or other structures and processes of care (eg, operating only in one hospital, and with one team22 23) that are useful for small cardiac surgical programmes, and these strategies would be important to understand.
Why is low volume a concern?
Low-volume providers merit close oversight for two reasons. First, since the pioneering work of Luft and colleagues24 in 1979, and subsequent hospital25 and surgeon26 studies of Birkmeyer and colleagues, it has been repeatedly demonstrated that a volume–outcome association exists for many procedures and conditions.24–29 In US studies, this association is present but weaker for some mature, relatively frequently performed procedures such as CABG,19 29–32 in contrast to other complex but less common operations, such as thoracic aortic surgery, pancreatectomy or oesophagectomy, where the volume–outcome associations are stronger.
Some surgeons in the USA now advocate for centres of excellence or reference centres for procedures with significant volume–outcome associations, such as mitral valve repair,33 34 and several prominent health policy experts (‘Take the Volume Pledge’35) and the Leapfrog Group have called for minimum procedural volume thresholds for hospitals and surgeons. Although evidence based and well motivated, such approaches are challenging to implement and must consider not only volumes but also the cumulative experience of surgeons and programmes, the impact on coverage schedules and the potential threat of relaxed appropriateness standards to satisfy volume criteria. Further, some patients who are unwilling to travel to regional centres outside their communities may not receive the surgical care they need.36 Finally, while the volume–outcome association exists on average, there are certainly high-performing, low-volume providers, as in the current study.
Notwithstanding the ‘as expected’ results of most Japanese programmes, their mortality rates for some procedures such as CABG seem somewhat high compared with those in the USA and Europe; although this could reflect the inclusion of high-risk, non-primary or emergent cases, this information was not provided. Regardless, could their results be even better if each hospital and surgeon were performing more cases37 38? At the very least, Japanese cardiac surgery leaders should consider whether a strategic consolidation of cardiac centres to fewer, higher volume programmes would be more appropriate.
In addition to the volume–outcome association, there is a second, equally compelling reason why low-volume programmes are challenging from a quality perspective. Small sample sizes and wide CIs make it quite difficult to reliably estimate the quality of the very programmes whose true performance is most likely to be suboptimal and to identify low-performing outliers.19 39–41 Potential approaches to help mitigate this issue include longer periods of provider observation (eg, 3 years of data rather than 1 year) and higher levels of attribution (eg, hospitals or systems rather than individual physicians or surgeons), both of which increase the number of cases and endpoints; broader diagnostic or procedural categories, which have the same effect but at the expense of less clinically coherent procedural groupings; composite measures encompassing multiple quality domains, which effectively increase the number of endpoints; and graphical methods such as funnel plots,42 which explicitly show the increased random variation at low volumes, or CUSUM or VLAD plots,43–45 which allow near-real time monitoring and identification of performance degradation.
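The widening of random variation at low volumes, which a funnel plot makes visible, can be sketched with simple control limits around a benchmark rate. The normal approximation used here is an assumption made for brevity and is crude at very small volumes, where exact binomial limits would be preferable; the function name and the choice of 2.7% as the benchmark (the previsit CABG mortality at non-visited hospitals) are illustrative.

```python
import math

def funnel_limits(benchmark_rate, n_cases, z=1.96):
    """Approximate control limits for an observed mortality rate at a given volume.

    Uses the normal approximation to the binomial; limits are clipped to [0, 1].
    """
    se = math.sqrt(benchmark_rate * (1.0 - benchmark_rate) / n_cases)
    lower = max(0.0, benchmark_rate - z * se)
    upper = min(1.0, benchmark_rate + z * se)
    return lower, upper

# Illustrative volumes: about 1 year, 4 years and a much larger caseload
for n in (24, 96, 400):
    low, high = funnel_limits(0.027, n)
    print(f"n={n}: {low:.3f} to {high:.3f}")
```

At 24 cases the upper limit is about 9%, so a small programme could have more than three times the benchmark mortality and still fall inside the funnel, whereas at 400 cases the upper limit is near 4%.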
Many though not all experts also believe that empirical Bayes shrinkage estimation40 46–54 or hierarchical modelling40 48 50 52 53 55–62 should be used for healthcare performance analyses, because both provide more stable and accurate results for low-volume, small sample size programmes. Although generally preferred by statisticians, these approaches may be less sensitive in detecting outliers, but they are also less likely to falsely label a provider as an outlier.52 53 55 56 59 60 63 64
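The intuition behind shrinkage can be conveyed with a deliberately simplified sketch: a programme's raw mortality rate is pulled towards the overall rate, and the pull is stronger when the programme has few cases. This is a beta-binomial posterior mean with a prior strength that is assumed rather than estimated from the data, so it is not a full empirical Bayes or hierarchical model; the function name, the prior strength of 100 cases and the example counts are all hypothetical.

```python
def shrunken_rate(deaths, cases, overall_rate, prior_cases=100.0):
    """Posterior mean mortality under a Beta prior centred on the overall rate.

    prior_cases is the assumed weight of the prior, expressed as a number of
    pseudo-cases; in a real empirical Bayes analysis it would be estimated.
    """
    return (deaths + prior_cases * overall_rate) / (cases + prior_cases)

# Two hypothetical programmes with the same raw mortality (12.5%)
small = shrunken_rate(deaths=3, cases=24, overall_rate=0.027)
large = shrunken_rate(deaths=30, cases=240, overall_rate=0.027)
print(f"small programme: {small:.3f}, large programme: {large:.3f}")
```

Both programmes have a raw rate of 12.5%, but the small programme's estimate is drawn much closer to the overall rate, which is why shrinkage both stabilises estimates and makes low-volume outliers harder to flag.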
An ambitious start, but …
The Japanese site visit initiative is a bold attempt by professionals to address low performance among members of their own specialty. Conceptually, this is exactly the kind of self-regulation and quality focus that the public should expect from professional leaders in exchange for the privileges and rights which it grants to them.65 Although the Japanese site visit initiative had strategic and operational weaknesses which contributed to its rather modest impact, these are all potentially solvable. Also, the value of public transparency does not yet appear to have been exploited in Japan, and this is an untapped opportunity. In contrast to the authors, I believe that an extremely low-performing programme should not be allowed to continue or resume operations until there is evidence that recommended changes have been successfully implemented.
The national impact of a few hospital site visits is limited, especially when there are other more basic structural issues. I believe there are far too many cardiac surgical programmes in Japan for the size of the population. A strategic, progressive consolidation or regionalisation into fewer and higher volume programmes would be the next evolutionary step for the progressive, highly skilled, Japanese cardiac surgical community.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Patient consent for publication Not required.
Provenance and peer review Commissioned; internally peer reviewed.