Patterns of performance and improvement in US Medicare’s Hospital Star Ratings, 2016–2017
Paula Chatterjee1, Karen Joynt Maddox2

1 General Internal Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
2 Department of Medicine, Cardiovascular Division, Washington University School of Medicine, St. Louis, Missouri, USA

Correspondence to Dr Paula Chatterjee, General Internal Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; pchat@pennmedicine.upenn.edu

Abstract

Background Publicly reported quality data can help consumers make informed choices about where to seek medical care. The Centers for Medicare and Medicaid Services developed a composite Hospital Compare Overall Star Rating for US acute-care hospitals in 2016. However, patterns of performance and improvement have not been previously described.

Objective To characterise high-quality and low-quality hospitals as assessed by Star Ratings.

Design We performed a retrospective cross-sectional study of 3429 US acute-care hospitals assigned Overall Star Ratings in both 2016 and 2017. We used multivariable logistic regression models to identify characteristics associated with receiving 4 or 5 stars.

Results Small hospitals were more likely to receive 4 or 5 stars in 2016 (33% of small hospitals, 26% of medium hospitals and 21% of large hospitals, OR for medium 0.78, p=0.02, and for large, 0.61, p=0.003). Non-profit status (OR 1.37, p=0.01), midwest region (OR=2.30, p<0.001), west region (OR 1.30 in 2016, p=0.06) and system membership (OR 1.33, p=0.003) were associated with higher odds of achieving a higher Star Rating. Hospitals with the most Medicaid patients were markedly less likely to receive 4 or 5 stars (OR for highest quartile=0.32, p<0.001), and hospitals with the highest proportion of Medicare patients were somewhat less likely to do so (OR for highest quartile=0.68, p=0.01). These associations remained largely consistent over the first two years of reporting and were also associated with the highest likelihood of improvement.

Conclusions Small hospitals with fewer Medicaid patients had the highest odds of performing well on Star Ratings. Further monitoring of these trends is needed as patients, clinicians and policymakers strive to use this information to promote high-quality care.

  • quality measurement
  • healthcare quality improvement
  • health policy

Introduction

Publicly reported quality data can help consumers around the world make informed choices about where to seek medical care.1 2 For example, in England the National Health Service Choices public reporting website allows consumers to view hospital performance on a number of key quality indicators including mortality, patient safety and patient experience.3 4 In Germany, hospitals are required to produce annual quality reports covering >400 quality indicators5; certain regions of Italy have well-developed quality reporting programmes6 7; some Canadian provinces publish quality report cards for select surgeries,8 and the Netherlands publicly reports a consumer quality index for each hospital.9 In the USA, the Centers for Medicare and Medicaid Services (CMS) has required public reporting of individual quality measures since 2004. However, in 2015, in an effort to condense these reports into a more consumer-friendly format, CMS released its first ever Hospital Star Rating, which simply assigned each US acute-care hospital a score from 1 star to 5 stars. The first release of the Star Rating only comprised patient experience scores.10 The following year, in July 2016, CMS released the first Overall Hospital Star Ratings, this time consolidating hospital performance on 57 quality measures across seven domains (mortality, safety, readmission, patient experience, effectiveness of care, timeliness of care and efficient use of medical imaging) into a single summary score, again from 1 star to 5 stars.11

The introduction of the Overall Hospital Star Rating was controversial.12 13 Academics and hospital leaders alike argued that the Star Ratings failed to capture some critical elements of quality while overweighting others, possibly leading to a skewed view of hospital performance.13 14 Concerns were also raised that hospitals that serve complex populations, such as teaching hospitals and safety-net hospitals,15 could be disproportionately penalised under Star Ratings due to the clinical severity and socioeconomic vulnerability of the patients they serve. While prior work has examined the Overall Star Ratings and their relationship to the local hospital environment,16 and patterns of performance have been reported in the lay press12 and from private groups,17 18 to our knowledge, there has been no previous peer-reviewed report that compares characteristics of hospitals that have performed well or poorly on the Overall Star Ratings, nor changes in Star Ratings over time.

Therefore, in this study we sought to examine three main questions. First, what are the patterns of performance, both by domain and by measure, by hospitals in each Star Rating group? Second, what types of hospitals scored well and which hospitals scored poorly on Star Ratings in 2016 and 2017? Finally, which hospitals were most likely to improve their Star Ratings from 2016 to 2017?

Methods

Data

We obtained publicly available hospital Star Ratings from Hospital Compare representing the December 2016 release and the December 2017 release (of note, 2018 Star Ratings have not been released as of the writing of this manuscript due to technical issues). Star Ratings comprised measures related to mortality (22% weight in the overall score), safety of care (22%), readmissions (22%), patient experience (22%), effectiveness of care (4%), timeliness of care (4%) and efficient use of medical imaging (4%). We also downloaded performance scores for each available constituent measure within the seven domains; we did not independently calculate hospital performance on these measures. Although not all hospitals report all measures, CMS sets a minimum reporting threshold to ensure that Star Ratings are based on an adequate amount of quality information for a given hospital, and applies a clustering algorithm (k-means clustering) to convert weighted summary scores into Star Ratings. Full details of CMS’s Star Ratings methodology, including links to specifications for each constituent measure, can be found on CMS’s QualityNet website (http://www.qualitynet.org).19
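The two steps described above (a weighted summary score, then k-means clustering into five groups) can be illustrated with a minimal sketch. This is not CMS's implementation: it takes domain scores as given rather than deriving them from latent variable models, it assumes that the weights of unreported domains are redistributed proportionally, and the function names `summary_score` and `kmeans_stars` are hypothetical.

```python
# Domain weights as stated in the text (fractions of the overall score).
DOMAIN_WEIGHTS = {
    "mortality": 0.22,
    "safety": 0.22,
    "readmission": 0.22,
    "patient_experience": 0.22,
    "effectiveness": 0.04,
    "timeliness": 0.04,
    "imaging": 0.04,
}


def summary_score(domain_scores):
    """Weighted mean of the domain scores a hospital reports.

    Assumption: weights of missing domains (absent or None) are
    redistributed proportionally across the reported domains.
    """
    available = {d: s for d, s in domain_scores.items() if s is not None}
    total_weight = sum(DOMAIN_WEIGHTS[d] for d in available)
    if total_weight == 0:
        raise ValueError("no domain scores available")
    return sum(DOMAIN_WEIGHTS[d] * s for d, s in available.items()) / total_weight


def kmeans_stars(scores, k=5, max_iterations=100):
    """Cluster scalar summary scores with plain k-means (Lloyd's algorithm)
    and return a star rating (1..k) for each score, lowest cluster = 1 star."""
    ordered = sorted(scores)
    # Deterministic start: centroids at evenly spaced quantiles.
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]

    def nearest(value):
        return min(range(k), key=lambda c: abs(value - centroids[c]))

    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for value in scores:
            clusters[nearest(value)].append(value)
        # Keep the old centroid if a cluster ends up empty.
        updated = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated

    # Rank clusters by centroid so the lowest-scoring cluster gets 1 star.
    ranked = sorted(range(k), key=lambda c: centroids[c])
    star_of_cluster = {cluster: rank + 1 for rank, cluster in enumerate(ranked)}
    return [star_of_cluster[nearest(value)] for value in scores]
```

Because the cluster boundaries depend on the whole distribution of scores in a given release, a hospital's star rating can change between years even if its own summary score does not.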

In late 2017, CMS made four changes to their methodology: (a) including both non-adaptive and adaptive quadrature in the latent variable model for each domain, to improve the stability of the latent variable models; (b) running multiple iterations of k-means clustering to improve the reliability of this methodology; (c) eliminating winsorisation of hospital summary score outliers to include more hospitals in the Star Ratings and yield a broader distribution; and (d) assigning Star Ratings after, rather than before, removing hospitals that did not meet volume minimums for public reporting.

We limited our sample to hospitals that were assigned Star Ratings in both 2016 and 2017 to improve comparability. We collected data on hospital characteristics of size, profit status, geographic region, teaching status, rural designation and membership in a health system from the American Hospital Association Annual Survey from 2015. We included general medical and surgical acute-care hospitals in our sample. We excluded hospitals that were federally owned as well as children’s hospitals, long-term care facilities, psychiatric hospitals and specialty hospitals. We also excluded hospitals that were located in US territories, outside of the 50 states.

Analysis

First, we calculated performance on each individual measure comprising the Star Ratings by rating group (1–5 stars). We then examined patterns of performance on the overall Star Ratings across hospital characteristics, including hospital size (small, <100 beds; medium, 100–399 beds; large, ≥400 beds), profit status (non-profit, for-profit, public), geographic region (northeast, midwest, south, west), teaching status, rural designation (defined as critical access hospital, sole community provider or rural referral centre), membership in a system (defined as either multihospital systems with two or more hospitals owned or managed by a central organisation, or diversified single hospital systems where single, freestanding hospitals bring into membership three or more healthcare organisations),20 and the proportion of Medicare and Medicaid patients served at each hospital. We analysed hospitals at each Star Rating level (1, 2, 3, 4 and 5) and also in dichotomous groups (1–3 stars vs 4–5 stars) for ease of presentation. We then created multivariable logistic regression models to estimate the odds of a hospital receiving 4 or 5 stars according to its characteristics. Models were first run including each variable separately, and then including all covariates in a single model. We repeated these analyses separately for 2016 and 2017 data. Additionally, we tested for interactions between proportion Medicaid and region as well as teaching status and region, to determine whether the impact of social and medical complexity might vary geographically. These interaction terms were significant, so we conducted stratified analyses for each category. We performed sensitivity analyses in which models were weighted by the number of hospital admissions at each hospital. We also estimated statewide averages of hospital Star Ratings, weighted by the total number of admissions at a given hospital, and created maps to visually depict the weighted average overall hospital Star Rating for each state.
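As an illustration of the admissions-weighted state average, the sketch below computes, for each state, the sum of (Star Rating × admissions) divided by total admissions. The tuple layout and the function name `weighted_state_averages` are assumptions made for this example, not part of the study's code (the analyses were run in Stata).

```python
from collections import defaultdict


def weighted_state_averages(hospitals):
    """Admissions-weighted mean Star Rating per state.

    hospitals: iterable of (state, star_rating, admissions) tuples
    (a hypothetical layout chosen for this sketch).
    """
    weighted_sum = defaultdict(float)
    total_admissions = defaultdict(float)
    for state, stars, admissions in hospitals:
        weighted_sum[state] += stars * admissions
        total_admissions[state] += admissions
    # States with zero total admissions are skipped to avoid dividing by zero.
    return {
        state: weighted_sum[state] / total_admissions[state]
        for state in total_admissions
        if total_admissions[state] > 0
    }
```

Weighting by admissions means that a large 2-star hospital pulls a state's average down more than a small 4-star hospital pulls it up, so the map reflects the rating of the hospital where a typical admission occurs rather than the rating of a typical hospital.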

Finally, we calculated the difference between each hospital’s 2016 and 2017 Star Rating and plotted these differences on a histogram. We then compared the characteristics of hospitals that improved (Star Rating went up by one or more stars) with those of hospitals that worsened (Star Rating went down by one or more stars) using χ2 tests.
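A minimal sketch of this step follows: it tallies the year-to-year change in stars and computes a Pearson χ2 test for a single 2×2 comparison (for example, improved vs worsened by one binary characteristic). The function names are hypothetical, the test shown omits any continuity correction, and the study's own tests may have used tables with more than two categories.

```python
import math
from collections import Counter


def change_histogram(rating_pairs):
    """Count hospitals by change in stars (2017 minus 2016).

    rating_pairs: iterable of (stars_2016, stars_2017) tuples.
    """
    return Counter(s2017 - s2016 for s2016, s2017 in rating_pairs)


def classify_change(stars_2016, stars_2017):
    """Label a hospital as improved, worsened or unchanged."""
    difference = stars_2017 - stars_2016
    if difference > 0:
        return "improved"
    if difference < 0:
        return "worsened"
    return "unchanged"


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function reduces to erfc.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```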

All analyses were performed using Stata V.15. We considered a p value <0.05 to be statistically significant.

Results

Overall performance patterns

We identified 3429 hospitals that were assigned Star Ratings in both 2016 and 2017 and had available hospital characteristics. Among these hospitals, in 2016, 65 hospitals (2%) achieved 5 stars; 903 hospitals (26%) 4 stars; 1696 hospitals (49%) 3 stars; 658 hospitals (19%) 2 stars; and 107 hospitals (3%) 1 star. In 2017, 283 hospitals (8%, figure 1) achieved 5 stars; 1073 hospitals (31%) 4 stars; 1098 hospitals (32%) 3 stars; 723 hospitals (21%) 2 stars; and 252 hospitals (7%) 1 star. There was significant geographic variation across the USA: the weighted average overall Star Rating was higher in midwestern and western states, and lower in the northeastern states (online supplementary appendix figures 1a/b and tables A1/A2).

Figure 1

Distribution of Star Ratings by year.

Performance within each domain and on most measures differed across Star Rating categories (table 1). For example, 30-day mortality for patients with acute myocardial infarction was 13.6% in 1-star hospitals compared with 12.6% in 5-star hospitals (p for trend across groups<0.001), and the standardised infection ratio for central line-associated bloodstream infections was 1.07 in 1-star hospitals (indicating an infection rate 7% higher than expected) compared with 0.69 in 5-star hospitals (indicating an infection rate 31% lower than expected, p for trend across groups<0.001). However, there were a few measures for which performance did not differ meaningfully by Star Rating group, such as the standardised ratio of catheter-associated urinary tract infections, surgical-site infection following colectomy and Clostridium difficile infections. Aspirin for acute myocardial infarction, patients leaving the ED without being seen, patients with imaging within 45 min of suspected stroke presentation, early elective labour induction and receipt of appropriate radiation therapy for cancer metastatic to bone also did not differ significantly across Star Rating groups.
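The standardised infection ratio is the number of observed infections divided by the number expected, so a value of 1.0 means infections occurred exactly as often as predicted. The short sketch below, with a hypothetical helper name, converts a ratio into the plain-language reading used in the text.

```python
def describe_sir(sir):
    """Express a standardised infection ratio (observed / expected)
    as a percentage above or below the expected infection rate."""
    percent = round(abs(sir - 1.0) * 100)
    if percent == 0:
        return "as expected"
    direction = "higher" if sir > 1.0 else "lower"
    return f"{percent}% {direction} than expected"
```

Applied to the values above, `describe_sir(1.07)` gives "7% higher than expected" and `describe_sir(0.69)` gives "31% lower than expected".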

Table 1

Performance on measures and domains by Star Rating, 2016

Star Ratings by hospital characteristics

Although a higher proportion of hospitals overall received 4 or 5 stars in 2017, hospital structural characteristics associated with high performance were largely similar in 2016 and 2017. For example, 33% of small hospitals received 4 or 5 stars in 2016 compared with 26% of medium hospitals and 21% of large hospitals (OR for medium 0.78, p=0.02; OR for large 0.61, p=0.003, tables 2 and 3); 49% of small hospitals received 4 or 5 stars in 2017 compared with 35% of medium hospitals and 32% of large hospitals (OR for medium 0.56, p<0.001; OR for large 0.50, p<0.001). Non-profit status (OR 1.37 in 2016, p=0.01; OR 1.59 in 2017, p<0.001, table 2), midwest region (OR 2.30 in 2016, p<0.001; OR 2.31 in 2017, p<0.001), west region (OR 1.30 in 2016, p=0.06; OR 1.69 in 2017, p<0.001) and system membership (OR 1.33 in 2016, p=0.003; OR 1.33 in 2017, p=0.001) were associated with higher odds of achieving a higher Star Rating. There was no statistically significant difference between urban and rural hospitals’ odds of achieving a high Star Rating. Patterns were similar when examining each star level independently (online supplementary appendix table A3) and when adding interaction terms for region with Medicaid quartile and teaching status (online supplementary appendix table A4).
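To see how an odds ratio relates to the percentages quoted above, the sketch below computes a crude (unadjusted) odds ratio from two proportions. It is illustrative only: the ORs reported in this article come from the authors' regression models, and the input percentages are rounded, so crude values need not match them. The function names are hypothetical.

```python
def odds(proportion):
    """Convert a proportion (0 < p < 1) into odds, p / (1 - p)."""
    if not 0 < proportion < 1:
        raise ValueError("proportion must be strictly between 0 and 1")
    return proportion / (1 - proportion)


def crude_odds_ratio(p_group, p_reference):
    """Unadjusted odds ratio of a group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# 2016 proportions receiving 4 or 5 stars, as quoted in the text,
# with small hospitals as the reference group.
crude_medium = crude_odds_ratio(0.26, 0.33)
crude_large = crude_odds_ratio(0.21, 0.33)
```

With these inputs the crude ratios are roughly 0.71 for medium and 0.54 for large hospitals, compared with the reported ORs of 0.78 and 0.61. An odds ratio below 1 indicates lower odds than the reference group, and it always lies further from 1 than the corresponding ratio of percentages (here 21/33, about 0.64, for large hospitals).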

Table 2

Characteristics of hospitals with top Star Ratings (4 or 5 stars), 2016 and 2017

Table 3

Characteristics of hospitals with worse versus better Star Ratings in 2017 compared with 2016

In terms of patient characteristics, hospitals with the most Medicaid patients were markedly less likely to receive 4 or 5 stars (OR for highest quartile of Medicaid patients=0.32 in 2016, p<0.001; OR 0.33 in 2017, p<0.001), and hospitals with the highest proportion of Medicare patients were somewhat less likely to receive 4 or 5 stars (OR for highest quartile of Medicare patients=0.68 in 2016, p=0.01; OR 0.61 in 2017, p<0.001).

Improvement in Star Ratings over time

There were significant changes in hospital ratings between the 2 years of reporting. While 959 hospitals had an improvement in their Star Rating, 727 hospitals had a lower Star Rating in 2017 than in 2016 (figure 2). Small (31% of hospitals with improvement) and large hospitals (35% of hospitals with improvement) were more likely to see their scores improve than medium hospitals (25% with improvement, OR 0.71 compared with small hospitals, p<0.001, table 3). For-profit hospitals (OR 0.73, p=0.04) were less likely to improve compared with public hospitals. Hospitals in the midwest and west (OR 1.28, p=0.046 and OR 1.47, p=0.004, respectively) were more likely to improve. Improvement differed by patient mix: hospitals with the highest proportion of Medicaid patients were least likely to improve (27% of hospitals with improvement compared with 32% for hospitals with the lowest proportion of Medicaid patients, OR 0.61, p=0.001).

Figure 2

Distribution of change in Star Ratings, 2016–2017.

Discussion

In this national study of hospital Star Ratings, we found that small, non-profit hospitals with a lower proportion of Medicaid patients had the highest odds of performing well. Conversely, large, for-profit hospitals with more Medicaid patients were less likely to perform well on the Star Ratings. These patterns remained largely consistent over the first two years that these ratings have been reported, with a similar geographic distribution. Hospitals with more Medicaid patients were less likely to improve their scores over the 2-year study period.

Medicare’s Star Ratings have been somewhat controversial since their release. In the absence of a ‘gold standard’ for quality against which to compare, it is difficult to say with certainty whether they are an accurate reflection of hospital performance. Advocates have argued that they provide meaningful information to patients and families hoping to seek care at the highest-quality hospital they can. Detractors have argued that the Star Ratings reflect hospitals’ underlying patient populations in terms of complexity and socioeconomic makeup more than delivered care quality. Our findings would suggest that the truth is somewhere in between.

On the vast majority of the quality measures included in the Star Ratings, there was a clear gradient in performance across the Star categories. Consumers using these ratings, or others like them, to choose a hospital might consider these differences to be meaningful and reflective of underlying patterns of care. For example, we found that small hospitals did well on Star Ratings; prior work has shown that small hospitals tend to score very well on certain measures of patient experience and safety.21 22 It is possible that smaller hospitals are more easily able to adopt a patient-centred approach to care: for example, continuity of physician and nursing teams, or a quieter hospital stay may both be more common at smaller facilities, and may engender better care delivery and a more positive patient experience. Understanding more about which factors drive better performance, and ultimately, higher Star Ratings, might help other hospitals understand where improvement could be sought.

On the other hand, it is also possible that the ratings may be misleading because of limitations with the measures that comprise them. Specifically, the ratings patterns may reflect inadequate risk adjustment such that hospitals caring for more complex patients are more likely to score poorly due to patient population rather than actual hospital quality. For example, we did not find that teaching status or large hospital size were associated with higher performance on Star Ratings, despite prior work suggesting that these types of hospitals may provide higher-quality care.23 24 One reason for this may be that the quality measures included in the Star Ratings may not fully account for differences in medical complexity between hospitals. For instance, academic medical centres tend to have more capacity to care for highly complex patients, such as those needing advanced cancer, cardiovascular or neurosurgical protocols or procedures that are not available elsewhere. Current metrics of patient safety and mortality may not adequately capture that type of complexity.25 26 A hospital’s ability to perform percutaneous coronary intervention or coronary artery bypass surgery, or provide advanced imaging or high-acuity specialised intensive care may represent other elements of quality that are worthy of additional attention as consumers seek care. There is no a priori reason that teaching hospitals should provide higher-quality care for basic medical needs, but patients seeking care may want to know whether high-star hospitals have the ability to provide high-acuity care if or when needed.

Similarly, we found that hospitals serving a high proportion of Medicaid patients were markedly less likely to receive a high Star Rating and less likely to improve over time. While Medicaid status is an imperfect proxy for low income, these findings suggest a relationship between the sociodemographic makeup of a hospital and its performance on Star Ratings. This finding is consistent with prior work suggesting a strong relationship between social risk and performance on quality measures in the hospital,25 27 physician28 29 and post-acute facility settings.30 31 Evidence supports the concept that poor quality performance at hospitals serving vulnerable populations is likely a combination of care delivery under hospitals’ control and patient complexity outside their control, and it can be difficult to disentangle the two.32 33 Consequently, whether, and how, to account for these differences when comparing hospital quality is a topic of much current debate and warrants further research. Our findings also suggest that dedicating resources to help safety-net hospitals improve over time may be particularly important.

We found that CMS’s changes to the Star Rating methodology, implemented in part due to concern from multiple stakeholders about the methodology in the initial year of the programme, broadened the distribution of performance over the first two years of the programme, but did not alter the underlying patterns of performance by hospital type. The changes CMS undertook largely comprised minor modifications to the way score cut-offs are calculated19 and did not represent the major reweighting of the measures’ constituent parts that many had hoped. Future changes to Star Ratings are currently being discussed and might do well to be informed by a data-driven approach incorporating patients’ and physicians’ opinions and values.

Our study has some limitations. First, these data only represent two points in time. Monitoring trends in hospital performance will be critical as CMS continues to refine their methodology and report summary performance using Star Ratings. Second, we were unable to account for unmeasured confounders in our models and were limited by available hospital-level characteristics provided in the American Hospital Association’s Annual Survey. Third, because Star Ratings are recalculated and recalibrated at least annually, we cannot be certain whether changes in scores represent true improvement in quality, or simply a different pattern of clustering from year to year. Relatedly, not all hospitals report data on all measures in all years. Since many of the measures included in the Star Ratings are based on more than 1 year of data (eg, mortality and readmissions are both based on 3 years of data), change in hospital performance may take time to show up in the Star Ratings. Fourth, because some hospitals are exempt from receiving Star Ratings (in particular, smaller hospitals), we may not be able to generalise our findings to the entire population of US hospitals. Fifth, because we did not calculate each performance measure ourselves, and thus relied on publicly reported performance, we were unable to determine the impact of particular methodological choices such as fixed vs random effects or risk adjustment for each measure, and instead had to rely on the measure as specified by CMS. Finally, while many of the measures contained in the Star Ratings overlap with publicly reported measures in other countries,4–9 performance rates may not be comparable due to differences in methodology.

In conclusion, we found that small non-profit hospitals with fewer Medicaid patients were most likely to perform well on US Medicare Overall Star Ratings. These patterns were consistent over the first two years of reporting. Ongoing research is needed to identify dimensions of quality that are most useful for patients and physicians, and to translate those dimensions into meaningful products for the public.

References

Footnotes

  • Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

  • Disclaimer The NHLBI played no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; and decision to submit the manuscript for publication.

  • Competing interests KJM was supported by the National Heart, Lung, and Blood Institute under grant K23-HL109177-03.

  • Patient consent Not required.

  • Ethics approval Institutional Review Board at Washington University in St. Louis.

  • Provenance and peer review Not commissioned; externally peer reviewed.
