Continuous quality improvement in statistical code: avoiding errors and improving transparency
  1. Thomas S Valley1,2,3,
  2. Neil Kamdar2,
  3. Wyndy L Wiitala4,
  4. Andrew M Ryan2,5,
  5. Sarah M Seelye4,
  6. Akbar K Waljee2,4,6,
  7. Brahmajee K Nallamothu2,4,7
  1. Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, Michigan, USA
  2. Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, Michigan, USA
  3. Center for Bioethics and Social Sciences in Medicine, University of Michigan, Ann Arbor, Michigan, USA
  4. Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, Michigan, USA
  5. School of Public Health, Department of Health Management and Policy, University of Michigan, Ann Arbor, Michigan, USA
  6. Division of Gastroenterology and Hepatology, University of Michigan, Ann Arbor, Michigan, USA
  7. Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, Michigan, USA
Correspondence to Dr Thomas S Valley, Division of Pulmonary and Critical Care Medicine, University of Michigan, Ann Arbor, Michigan 48105, USA; valleyt{at}


Clear communication of statistical approaches can ensure healthcare research is well understood, reduce major errors and promote the advancement of science. Yet in contrast to the increasing complexity of data and analyses, published methods sections are at times insufficient for describing necessary details. Therefore, ensuring the quality, transparency and reproducibility of statistical approaches in healthcare research is essential.1 2

Such concerns are not just theoretical; they have direct implications for research in the quality and safety field. For example, the Hospital Readmissions Reduction Program was instituted in 2012 by the US Centers for Medicare & Medicaid Services (CMS) and imposed financial penalties on hospitals with high readmission rates. Subsequent studies sought to determine the extent to which this programme was successful in reducing readmissions without promoting unintended consequences, such as increased mortality. Clearly defining the success or failure of this programme is essential, but in 2018 two prominent articles using the same CMS data set presented opposing results.3 4 These conflicts are undoubtedly due to differences in analytical choices, but the specific differences are challenging to reconcile given that the statistical code was unavailable to readers. In another recent example, a major article was retracted after the discovery of a statistical coding error that reversed the categorisation of treatment and control groups.5 This clinical trial examined a support programme for hospitalised patients with chronic obstructive pulmonary disease. It originally reported a lower risk of hospitalisation and emergency department visits, but in actuality demonstrated that the support programme was associated with harm. Both cases demonstrate how better practices for sharing statistical code at the time of publication may improve the quality of research.

While the utility of statistical code sharing may seem self-evident, it occurs less frequently than one would expect.2 We believe a principal barrier to statistical code sharing is …


  • Twitter @tsvalley, @Andy_Ryan_dydx, @bnallamo

  • Disclaimer This manuscript does not necessarily represent the view of the U.S. Government or the Department of Veterans Affairs.

  • Competing interests All authors have completed the ICMJE uniform disclosure form at TSV declares support from the NIH (K23 HL140165).

  • Patient consent for publication Not required.

  • Provenance and peer review Not commissioned; internally peer reviewed.

  • Data availability statement There are no data in this work.