Development and validation of a new ICD-10-based screening colonoscopy overuse measure in a large integrated healthcare system: a retrospective observational study
  1. Megan A Adams (1,2)
  2. Eve A Kerr (1,3)
  3. Jason A Dominitz (4)
  4. Yuqing Gao (1)
  5. Nicholas Yankey (1)
  6. Folasade P May (5)
  7. John Mafi (5)
  8. Sameer D Saini (1,2)

  1. VA Ann Arbor Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, Michigan, USA
  2. Division of Gastroenterology, University of Michigan Medical School, Ann Arbor, Michigan, USA
  3. Division of General Internal Medicine, University of Michigan Health System, Ann Arbor, Michigan, USA
  4. Gastroenterology Section, VA Puget Sound Health Care System Seattle Division, Seattle, Washington, USA
  5. University of California Los Angeles, David Geffen School of Medicine, Los Angeles, California, USA

Correspondence to Dr Megan A Adams, VA Ann Arbor Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, MI 48105, USA; meganada{at}med.umich.edu

Abstract

Background Low-value use of screening colonoscopy is wasteful and potentially harmful to patients. Decreasing low-value colonoscopy prevents procedural complications, saves patient time and reduces patient discomfort, and can improve access by reducing procedural demand. The objective of this study was to develop and validate an electronic measure of screening colonoscopy overuse using International Classification of Diseases, Tenth Edition codes and then apply this measure to estimate facility-level overuse to target quality improvement initiatives to reduce overuse in a large integrated healthcare system.

Methods Retrospective national observational study of US Veterans undergoing screening colonoscopy at 119 Veterans Health Administration (VHA) endoscopy facilities in 2017. A measure of screening colonoscopy overuse was specified by an expert workgroup, and electronic approximation of the measure numerator and denominator was performed (‘electronic measure’). The electronic measure was then validated via manual record review (n=511). Reliability statistics (n=100) were calculated along with diagnostic test characteristics of the electronic measure. The measure was then applied to estimate overall rates of overuse and facility-level variation in overuse among all eligible patients.

Results The electronic measure had high specificity (99%) and moderate sensitivity (46%). Adjusted positive predictive value and negative predictive value were 33% and 95%, respectively. Inter-rater reliability testing revealed near perfect agreement between raters (k=0.81). 269 572 colonoscopies were performed in VHA in 2017 (88 143 classified as screening procedures). Applying the measure to these 88 143 screening colonoscopies, 24.5% were identified as potential overuse. Median facility-level overuse was 22.5%, with substantial variability across facilities (IQR 19.1%–27.0%).

Conclusions An International Classification of Diseases, Tenth Edition based electronic measure of screening colonoscopy overuse has high specificity and improved sensitivity compared with a previous International Classification of Diseases, Ninth Edition based measure. Despite increased focus on reducing low-value care and improving access, a quarter of VHA screening colonoscopies in 2017 were identified as potential low-value procedures, with substantial facility-level variability.

  • Healthcare quality improvement
  • General practice
  • Performance measures

Data availability statement

Data are available upon reasonable request. Members of the scientific community who would like a copy of the final data sets (ie, data sets underlying publication) from this study can request a copy by emailing Jennifer Burns at jennifer.burns@va.gov. They should state their reason for requesting the data and their plans for analysing the data. Final data sets will be copied onto a DVD. The DVD will be sent to the requester via FedEx. Each data set will be accompanied by documentation that lists all variables described in the publication and links them with variable names in the data set. De-identified data will be provided after requesters sign a letter of agreement (LOA) detailing the mechanisms by which the data will be kept secure. The LOA will also state that the recipient will not attempt to identify any individual in the data, will not share the data outside of their research team, and will provide information on any files to be linked to the data. The data set will not include PII and all dates will be changed to integers to allow for calculation of time periods.

WHAT IS ALREADY KNOWN ON THIS TOPIC

  • Low-value use of colonoscopy for screening and other preventive indications is a common and well-documented problem across healthcare systems.

WHAT THIS STUDY ADDS

  • An International Classification of Diseases, Tenth Edition based electronic measure demonstrated high specificity and moderate sensitivity for identifying screening colonoscopy overuse in the Veterans Health Administration (VHA), the largest integrated healthcare system in the USA. Levels of screening colonoscopy overuse in VHA were substantial in 2017, with significant facility-level variability.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

  • Tracking rates of potential screening colonoscopy overuse over time will allow health systems to target quality improvement interventions to low-performing sites to reduce procedural overuse and enhance overall endoscopy access by eliminating unnecessary colonoscopies.

Introduction

Low-value use of colonoscopy for screening and other preventive indications is a common and well-documented problem across healthcare systems.1–3 A recent systematic review found substantial overuse of diagnostic testing across all healthcare settings and highlighted the need for health systems, providers and policy makers to develop and implement effective strategies to curb overuse of low-value diagnostic testing.4 Low-value care has been characterised as services that provide little to no benefit to patients, have potential to cause harm, incur unnecessary costs to patients or waste limited healthcare resources.5–8 Low-value use of colonoscopy (ie, overuse) is not only wasteful and potentially harmful to patients; it can also impede access for patients who need care by unnecessarily increasing endoscopy demand in the setting of limited resources. Low-value care is particularly problematic in large integrated healthcare systems such as the Veterans Health Administration (VHA), where reduced access may lead to longer-than-acceptable wait times and impair VHA’s ability to meet its central mission to ensure that US military veterans are able to access the care they need in a timely manner. Indeed, VHA has previously been scrutinised for prolonged wait times for routine medical care, including elective outpatient procedures such as colonoscopy.9

In 2014, VHA was investigated for prolonged wait times for routine medical care including elective outpatient colonoscopies at the Phoenix Veterans Affairs Health Care System and other sites.9 Prior to 2014, our team developed an International Classification of Diseases, Ninth Edition (ICD-9) based electronic measure to identify potential low-value use of screening colonoscopy in VHA endoscopy facilities.3 However, interim adoption of International Classification of Diseases, Tenth Edition, Clinical Modification (ICD-10-CM; hereinafter, ‘ICD-10’) rendered the ICD-9-based measure unusable to monitor overuse as the health system sought to improve access to care.

In this study, we sought to develop and validate an ICD-10-based electronic measure of screening colonoscopy overuse. We then applied this measure to estimate facility-level variation in colonoscopy overuse in VHA for purposes of targeting quality improvement initiatives to reduce overuse. We hypothesised that there would be substantial potential overuse of screening colonoscopy in VHA given inherent challenges in reducing use of low-value services and the lack of specific programmes or interventions focused on this topic.10 We also hypothesised that our ICD-10-based measure would have similar specificity but enhanced sensitivity compared with the previously developed ICD-9-based measure due to the addition of more specific codes for non-screening indications in ICD-10.

Methods

This was a retrospective observational study using VHA administrative, clinical and laboratory data available in the Corporate Data Warehouse (CDW). Because this work was performed under an operations Memorandum of Understanding with the VA Office of Reporting, Analytics, Performance, Improvement and Deployment, approval by the Ann Arbor VA Institutional Review Board was not required. VHA is the largest integrated healthcare system in the USA. The study population included veterans undergoing a screening colonoscopy at one of 119 VHA endoscopy facilities in 2017. For patients who had two or more colonoscopies within the 2017 study period, we included only the first (index) procedure because our primary focus was on identifying overuse of routine, ambulatory screening colonoscopies. A small proportion of patients undergo multiple colonoscopies in a short period of time, but in most instances those colonoscopies repeated in the short term (ie, <1 year) are appropriate (eg, due to poor bowel preparation on the initial procedure, inadequate sedation during the initial procedure, or the need for surveillance of a large, piecemeal polypectomy site to ensure no residual polyp tissue remains).

The study proceeded in three steps: (1) Measure specification and electronic approximation of the measure using ICD-10-era administrative codes; (2) Measure validation via manual record review using a national random sample of colonoscopy cases, with oversampling to ensure adequate numbers of screening colonoscopies; and (3) Application of the ICD-10-based electronic measure to all eligible patients undergoing colonoscopy to calculate 2017 VHA screening colonoscopy overuse rates, facility-level variation in overuse and potential explanatory factors.

Measure specification and electronic measure construction

Measure specifications were initially defined by an expert workgroup comprising VA experts in colorectal cancer screening and in performance measurement.3 Prior to the initial workgroup meeting, members were provided with relevant literature to review, including US Preventive Services Task Force (USPSTF) guidelines for colorectal cancer screening and VHA Colorectal Cancer Screening Directives in effect at the time of measure development. While there were interim updates to USPSTF guidelines between 2013 and 2017,11 these updates did not impact the measure specifications (a 2021 update to the USPSTF guidelines recommended changing the age of screening initiation from 50 years to 45 years;12 however, this was not the standard of care during our study period). Workgroup members were charged with developing a measure that: (1) Was based on high-quality evidence (ie, the evidence summarised in pre-2021 USPSTF guidelines); (2) Maximised specificity at the expense of sensitivity; and (3) Could be implemented electronically.

The workgroup began by broadly defining overuse as a screening colonoscopy performed at an inappropriately short interval (eg, screening colonoscopy performed 5 years after a prior negative screening colonoscopy in an average-risk patient) or in a patient for whom the benefit of screening is low (eg, screening colonoscopy in an 86-year-old patient). The workgroup was then asked to more clearly specify the data elements that comprised the measure denominator (the eligible population, table 1, Denominator) and the numerator (those meeting the measure, table 1, Numerator). For the numerator, discussion focused on the appropriate cut-offs for age, time interval and life expectancy. The workgroup then classified each item comprising the numerator as ‘probable’ or ‘possible’ overuse based on the strength of the guideline recommendation and the likelihood of misclassification. The workgroup also specified exclusions from the measure denominator, factors indicating that the colonoscopy was not an average-risk screening procedure such as increased risk for colorectal cancer (eg, screening colonoscopy in a patient with a family history of colorectal cancer) or ineligibility for screening (eg, prior total abdominal colectomy).

Table 1

Numerator, denominator and exclusions for screening colonoscopy overuse measure

The workgroup met three times over a 6-month period. Rather than using a formal Delphi process, any potential disagreement was resolved through discussion. Ultimately, consensus was achieved in all elements of the measure specification. The final measure defined by the workgroup identified average-risk screening colonoscopies (comprising the eligible population in the denominator, table 1) that met one or more criteria for probable or possible overuse (comprising the numerator, table 1).

Electronic approximation of measure denominator

The measure denominator (prior to application of exclusions to eliminate procedures with indications other than average-risk screening) consisted of all index colonoscopies performed in fiscal year 2017 (FY2017) in patients who had not had a colonoscopy within the preceding 12 months. To approximate the measure denominator (table 1), we first identified all colonoscopies performed in FY2017 for any indication using Current Procedural Terminology (CPT) and Healthcare Common Procedure Coding System (HCPCS) codes (online supplemental appendix 1 table 5). For patients who had more than one colonoscopy performed in FY2017, only the first (index) procedure was included in the denominator. Likewise, only patients who had no prior colonoscopy performed in the 12 months preceding the index FY2017 colonoscopy were included in the denominator. This is because when a colonoscopy is repeated within a year, there is a high probability that the repeat procedure was done for a reasonable indication such as inadequate bowel preparation on the prior procedure, sedation intolerance leading to an incomplete procedure or failure to complete the prior procedure due to technical difficulty. The process of updating the measure specifications to ICD-10 coding was aided by use of the 3M ICD-10 Code Translation Tool, a proprietary software application that assists in the conversion of ICD-9-based applications to ICD-10.13
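The index-procedure rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the study's own code (the analyses were run in SAS): the record layout, the function name `index_colonoscopies`, the FY2017 window (taken here as the US federal fiscal year, 1 October 2016 to 30 September 2017) and the approximation of 12 months as 365 days are all assumptions introduced for the example.

```python
from datetime import date, timedelta

# Assumed FY2017 window (US federal fiscal year); not stated in the source.
FY_START = date(2016, 10, 1)
FY_END = date(2017, 9, 30)
LOOKBACK = timedelta(days=365)  # "12 months" approximated as 365 days


def index_colonoscopies(procedures):
    """Return {patient_id: index_date} for patients eligible for the denominator.

    procedures: iterable of (patient_id, procedure_date) pairs covering
    colonoscopies of any indication, including dates before the fiscal year.
    """
    by_patient = {}
    for pid, d in procedures:
        by_patient.setdefault(pid, []).append(d)

    index = {}
    for pid, dates in by_patient.items():
        in_fy = sorted(d for d in dates if FY_START <= d <= FY_END)
        if not in_fy:
            continue  # no colonoscopy in the fiscal year
        first = in_fy[0]  # keep only the first (index) procedure
        # Drop the patient if any colonoscopy fell in the preceding 12 months.
        if any(first - LOOKBACK <= d < first for d in dates):
            continue
        index[pid] = first
    return index


# Toy records, not study data.
example = [
    ("A", date(2017, 3, 1)), ("A", date(2017, 8, 1)),   # two in FY2017: keep March
    ("B", date(2016, 8, 15)), ("B", date(2017, 2, 1)),  # prior procedure <12 months: dropped
    ("C", date(2010, 5, 1)), ("C", date(2017, 6, 1)),   # prior procedure long ago: kept
]
result = index_colonoscopies(example)
```

In this toy example, patients A and C contribute one index procedure each, while patient B is dropped because of the colonoscopy in August 2016.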

Denominator exclusions (exclusion of non-average-risk screening colonoscopies)

We then excluded procedures that may have been performed for diagnostic or high-risk screening or surveillance indications, using an approach previously developed and validated by Fisher and colleagues.14 First, we excluded patients who had an ICD-10 code for specific gastrointestinal symptoms or for colorectal neoplasia within 12 months of the FY17 colonoscopy (online supplemental appendix 1 table 6). To further increase the specificity of the electronic measure, we also excluded individuals with ICD-9 and ICD-10 codes indicating high risk for colorectal cancer or prior total abdominal colectomy at any time in the prior 10 years (from FY07 to FY17) (online supplemental appendix 1 tables 7a,b). Specifically, patients were excluded if CPT or ICD-9/ICD-10 codes revealed any of the following diagnoses between FY07 and the qualifying FY17 colonoscopy: (1) Prior colectomy; (2) History of colorectal cancer; (3) History of colon polyps; (4) History of inflammatory bowel disease; or (5) Family history of colorectal cancer. Both ICD-9 and ICD-10 codes were used because VHA (like most US healthcare systems) transitioned between these two coding systems on 1 October 2015. These additional exclusion criteria were selected to ensure that the cohort comprised individuals who were at average (rather than increased) risk of colorectal cancer. Finally, we excluded individuals who underwent their FY17 colonoscopy during a hospitalisation (since such colonoscopies are unlikely to be performed for screening). Thus, the final denominator (after all exclusions) consisted of all average-risk screening colonoscopies performed in FY2017.
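The exclusion logic above can be expressed as a simple filter. The sketch below is illustrative only: the group labels (`gi_symptoms`, `colon_polyps` and so on), the `IndexProcedure` class and both function names are hypothetical stand-ins for the actual ICD-9, ICD-10 and CPT code lists in the supplemental appendix, which are not reproduced here.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the code groups named in the text.
RECENT_EXCLUSIONS = {"gi_symptoms", "colorectal_neoplasia"}  # within 12 months
HISTORY_EXCLUSIONS = {
    "prior_colectomy",
    "colorectal_cancer",
    "colon_polyps",
    "inflammatory_bowel_disease",
    "family_history_crc",
}  # any time from FY07 to the index procedure


@dataclass
class IndexProcedure:
    recent_codes: set = field(default_factory=set)   # groups coded in the prior 12 months
    history_codes: set = field(default_factory=set)  # groups coded since FY07
    inpatient: bool = False                          # performed during a hospitalisation


def exclusion_reasons(p):
    """Return the sorted list of reasons a procedure leaves the denominator."""
    reasons = (p.recent_codes & RECENT_EXCLUSIONS) | (p.history_codes & HISTORY_EXCLUSIONS)
    if p.inpatient:
        reasons = reasons | {"inpatient"}
    return sorted(reasons)


def is_average_risk_screening(p):
    """A procedure stays in the denominator only if no exclusion applies."""
    return not exclusion_reasons(p)


# Toy records, not study data.
kept = IndexProcedure()
dropped = IndexProcedure(history_codes={"colon_polyps"}, inpatient=True)
```

Returning the list of reasons rather than a bare flag makes it easy to tabulate how many procedures each exclusion removes.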

Electronic approximation of measure numerator

Specification of electronic elements comprising the measure numerator (probable and possible overuse, table 1) was more straightforward than for the denominator since these elements were primarily based on factors such as patient age and the time interval between colonoscopies (which are reliably coded in administrative data). To identify faecal occult blood tests (FOBTs), we used Logical Observation Identifiers Names and Codes (online supplemental appendix 1 table 5). To identify patients with life expectancy <6 months, we used structured data from CDW that are used to indicate limited life expectancy for clinical purposes (ie, the CDW Health FactorType domain, which contains information about health factors, severity level, and other indicators of health and includes a forecast of the probable outcome of a disease to flag patients with a life expectancy of <6 months).

Validation of updated ICD-10 measure

Validation sample

The electronic, ICD-10-based measure was validated against the gold standard of manual record review using a national random sample of colonoscopy cases (ie, patients who had a CPT or HCPCS code for a colonoscopy of any type performed in a VHA facility in 2017) (online supplemental appendix 1 table 5) stratified by VA and VA community care. For purposes of the validation, we oversampled screening colonoscopies within each stratum using new ICD-10 code Z12.11 (denoting a screening indication) with the goal of achieving 50% screening procedures in our sample, and 50% non-screening procedures in our sample. If we had randomly sampled 500 colonoscopies (ie, 500 patients who had a colonoscopy procedure code in 2017), approximately 25% of our sample would have been screening procedures (according to 2013 data, only a quarter of all colonoscopies are performed for screening), limiting our ability to conduct a robust validation.3 Thus, by oversampling screening colonoscopies using new ICD-10 code Z12.11 within the larger national random sample of colonoscopy cases (ie, patients who had a CPT or HCPCS code for a colonoscopy of any type performed in a VHA facility in 2017), we were able to ensure that roughly 50% of colonoscopies in our validation sample were screening colonoscopies.
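The oversampling step can be illustrated with a short Python sketch. It assumes a hypothetical list of `(case_id, has_z1211)` pairs and a hypothetical function name; it also omits the additional stratification by VA versus VA community care described above, and the population below is synthetic.

```python
import random


def oversample_screening(cases, n_total, screening_share=0.5, seed=0):
    """Draw n_total cases so that screening_share of them carry Z12.11.

    cases: list of (case_id, has_z1211) pairs.
    Sampling is without replacement within each group.
    """
    rng = random.Random(seed)
    screening = [c for c in cases if c[1]]
    other = [c for c in cases if not c[1]]
    n_screen = round(n_total * screening_share)
    return rng.sample(screening, n_screen) + rng.sample(other, n_total - n_screen)


# Synthetic population in which 25% of colonoscopies carry the screening code.
population = [(i, i % 4 == 0) for i in range(10_000)]
sample = oversample_screening(population, n_total=500)
n_screening_in_sample = sum(1 for _, flag in sample if flag)
```

With a simple random draw of 500 from this population, roughly 125 cases would be screening procedures; the stratified draw fixes that number at 250.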

In parallel with the present study, we also validated the measure in VA community care data (ie, via manual review of non-VA endoscopy reports and other data). VA has recently expanded the ability of VA-enrolled veterans to receive care in non-VA facilities at VA expense.15 Therefore, we initially pulled a sample of 1000 patients—500 who had a colonoscopy in a VA facility and 500 who had a colonoscopy through VA community care per administrative data. While the initial sample size for the VA measure validation was 500, 11 of the patients who were identified as having their colonoscopy through VA community care were found to have received their procedure at a VA facility on further review. Therefore, these 11 patients were included in our validation of the measure using VA data. Thus, the final validation sample for this study was 511. Validation of the new ICD-10 measure included review of fewer charts than the validation of the previous ICD-9-based measure (3000),3 because the ICD-10-based measure was an update of the previous measure and the same core data elements used in the original 2012 abstraction were used in the 2017 abstraction, with minor changes in how reviewers were instructed to document several of these data elements.

Validation protocol

Manual record review was performed by Quality Insights, a professional chart abstraction group that performs large-scale, national chart reviews for the VA performance measurement programme (VA External Peer Review Program, or EPRP) on an ongoing basis. EPRP uses quality control processes to maximise the consistency and completeness of data collected from VA records. These processes include: (1) Internal quality control question and mnemonic level analysis; and (2) Inter-rater reliability (IRR) assessment.

We refined a standardised electronic health record (EHR) abstraction algorithm to identify overuse measure elements in manual record review, using a process similar to that used for the ICD-9-based colorectal cancer screening overuse measure.3 A 78-question abstraction algorithm (online supplemental appendix 2) was developed in collaboration with Quality Insights and EPRP. We first outlined the data elements that would be needed from manual record review to calculate the measure. We then determined which potential data sources could be accessed to retrieve each of these elements (eg, endoscopy report, primary care clinic note, laboratory data). We also developed a lexicon of potential findings in each of these data sources (eg, for colonoscopy indication, findings could include average-risk screening, high-risk screening, surveillance and diagnostic). This process was iterative and collaborative, conducted through a combination of electronic communication and five conference calls (October 2018 to February 2019) between the abstraction leads and study team members with clinical expertise in gastroenterology (SDS, MAA) to enhance its ease of use and reliability. The a priori data elements, data sources and potential findings were combined into a standardised algorithm for record reviewers.

Prior to beginning chart abstraction, Quality Insights staff provided education to the abstractors via webinar and PowerPoint presentation. This included introducing each step of the abstraction algorithm and clarifying relevant medical terminology. Abstractors were blinded to the electronic measure determination. During the abstraction process, a log of questions from the abstractors was compiled, and these questions were reviewed and resolved by members of the study team on an ongoing basis (SDS, MAA). A total of six trained abstractors (three Registered Nurses and three with a Registered Health Information Administrator credential; average of 15.6 years medical record review experience) used the final algorithm/abstraction instrument (online supplemental appendix 2) to review 511 records.

IRR assessment

IRR testing of the final instrument was performed using Cohen’s kappa and Gwet’s AC as measures of agreement. Since the abstraction of all medical records in the validation sample was impractical, IRR testing was performed on a weighted sample of 100 records drawn from the 511-record validation sample (ie, two reviewers independently abstracted data from the same 100 records, and then their results were compared).

These 100 records included 25 non-screening, 25 screening/non-overuse cases and 50 screening/overuse cases, randomly sampled from the full 511-record validation sample. With this sample size, we had >80% power to detect a kappa of 0.61 (moderate agreement) or greater. The two IRR reviewers were Registered Nurses and held the Registered Health Information Administrator credential, with extensive medical record review experience.

Calculation of diagnostic test characteristics

The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the electronic measure were assessed. Because we oversampled for screening procedures in the validation sample using the Z12.11 code (meaning that screening prevalence in the validation sample was higher than in the overall sample by design), direct calculations of PPV and NPV would be inaccurate. We therefore used bootstrapping to calculate the PPV and NPV of the electronic measure in resampled subsets of the 511-record validation sample with the same proportion of screening procedures as in the national/overall sample.
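One way to picture the adjustment is a stratified bootstrap that redraws the validation records so that the share of screening procedures matches a target value. The sketch below is an assumption-laden illustration, not the authors' procedure: the function names, the record layout, the target share of 0.3, the resample size and the toy counts are all hypothetical.

```python
import random
from statistics import mean


def ppv_npv(records):
    """records: (flagged_by_measure, overuse_on_chart_review) boolean pairs."""
    tp = sum(1 for f, t in records if f and t)
    fp = sum(1 for f, t in records if f and not t)
    tn = sum(1 for f, t in records if not f and not t)
    fn = sum(1 for f, t in records if not f and t)
    ppv = tp / (tp + fp) if tp + fp else None
    npv = tn / (tn + fn) if tn + fn else None
    return ppv, npv


def adjusted_ppv_npv(screening, non_screening, target_share, n, reps, seed=0):
    """Average PPV and NPV over resamples with a fixed share of screening records."""
    rng = random.Random(seed)
    n_screen = round(n * target_share)
    ppvs, npvs = [], []
    for _ in range(reps):
        # Sample with replacement within each stratum.
        resample = (rng.choices(screening, k=n_screen)
                    + rng.choices(non_screening, k=n - n_screen))
        ppv, npv = ppv_npv(resample)
        if ppv is not None:
            ppvs.append(ppv)
        if npv is not None:
            npvs.append(npv)
    return mean(ppvs), mean(npvs)


# Toy validation records, not study data.
screening = ([(True, True)] * 20 + [(False, True)] * 25
             + [(True, False)] * 2 + [(False, False)] * 203)
non_screening = [(True, False)] * 3 + [(False, False)] * 247
adj_ppv, adj_npv = adjusted_ppv_npv(screening, non_screening,
                                    target_share=0.3, n=300, reps=500)
```

Because non-screening procedures cannot be true overuse, lowering the screening share towards its real-world value tends to lower PPV and raise NPV relative to the oversampled validation sample.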

Application of measure to assess VHA screening colonoscopy overuse

Following measure validation, we applied the electronic measure to all eligible patients in 2017 to calculate facility-level overuse, and variation in facility-level overuse of screening colonoscopy to identify facilities with high levels of overuse as candidates for closer inquiry and potential targets of quality improvement efforts. Rates of overuse were reported as a composite of ‘possible’ and ‘probable’ overuse as specified by the expert workgroup (table 1).

We also used negative binomial regression to model the proportion of overused screening colonoscopies per facility, with the number of overused screening colonoscopies as the outcome and total number of screening colonoscopies as the offset. Facility-level predictors examined included the following: (1) Proportion of screening-eligible patients up to date for screening per current guidelines as assessed by chart review via VA’s ongoing EPRP; (2) Median number of weeks between a positive FOBT and colonoscopy (‘FOBT wait time’); (3) VA facility complexity score (incorporating factors including patient risk, clinical volume, teaching/research activity and intensive care unit level, rated on a scale from 1a (highest complexity) to 3 (lowest complexity)); (4) Academic affiliation (obtained from the VA Office of Academic Affiliations); (5) Annual colonoscopy volume (2017, extracted from CDW); (6) Proportion of colonoscopies outsourced to non-VA facilities (2017, extracted from CDW); (7) Proportion of colonoscopies performed on Black patients (extracted from CDW); and (8) Proportion of colonoscopies performed on Hispanic or Latino patients (extracted from CDW). Risk ratios (RRs) and 95% CIs were calculated. All data management and analyses were performed using SAS V.9.4 of the SAS Enterprise Guide for Linux (SAS Institute, Cary, North Carolina, USA).
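For readers less familiar with offset models, the regression described above can be written out as follows. The notation is introduced here for illustration and does not appear in the source: Y_i is the number of overused screening colonoscopies at facility i, N_i is the total number of screening colonoscopies at that facility, x_ik are the facility-level predictors and alpha is the dispersion parameter. The quadratic variance form shown is a common negative binomial parameterisation and is an assumption, since the source does not state which one was used.

```latex
\log \mathbb{E}[Y_i] = \log N_i + \beta_0 + \sum_{k} \beta_k x_{ik},
\qquad
\operatorname{Var}(Y_i) = \mu_i + \alpha \mu_i^{2},
\qquad
\mathrm{RR}_k = e^{\beta_k}
```

Because the offset log N_i enters with a fixed coefficient of 1, the model describes the rate Y_i / N_i, and each exponentiated coefficient is read as a ratio of overuse rates per unit change in the corresponding predictor.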

Results

Validation of electronic measure

Based on independent duplicate manual review (n=100), the kappa of the final instrument was 0.81 (95% CI 0.69 to 0.93) and the Gwet’s AC was 0.83 (95% CI 0.72 to 0.94), indicating ‘near perfect’ and ‘very good’ agreement, respectively. Both reviewers identified 35 of 100 cases as screening colonoscopy overuse, and 56 of 100 cases as appropriate colonoscopies. Reviewers disagreed in 9 of 100 cases. In seven of nine cases of disagreement, reviewers disagreed regarding whether the procedure was a screening procedure, rather than on whether that screening procedure was overuse. In the other two cases of disagreement, the reviewers agreed that the procedure was a screening colonoscopy but disagreed regarding whether that screening colonoscopy was appropriate versus overuse.
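The reported kappa can be checked from these counts with a few lines of Python. The source does not say how the nine disagreements were split between the two reviewers, so the 4/5 split below is an assumption; the function name is also introduced only for this example.

```python
def cohens_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two raters and a binary rating (2x2 table)."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n  # share rated "overuse" by reviewer A
    b_yes = (both_yes + b_only) / n  # share rated "overuse" by reviewer B
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# 35 agreed overuse, 56 agreed appropriate, 9 disagreements (split assumed 4/5).
kappa = cohens_kappa(both_yes=35, a_only=4, b_only=5, both_no=56)
print(round(kappa, 2))  # 0.81
```

Observed agreement is 91/100, and chance agreement under this split is about 0.52. Any other split of the nine disagreements also rounds to 0.81, so the check does not depend on the assumed 4/5 division.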

Compared with manual record review (n=511), the ICD-10-based electronic measure had high specificity (99%, 95% CI 98% to 100%) and improved sensitivity (46%, 95% CI 35% to 57%) compared with the prior ICD-9-based measure (which had a sensitivity of 20% and a specificity of 97%).3 The electronic ICD-10-based measure was also accurate in estimating overuse compared with manual record review (19% ICD-10-based measure overuse (95% CI 15% to 24%) vs 23% manual record review overuse (95% CI 19% to 28%)). The adjusted PPV of the electronic measure was 33% (95% CI 21% to 42%) and the adjusted NPV was 95% (95% CI 93% to 97%).

Measurement of overuse in VHA

A total of 269 572 outpatient colonoscopies were performed in VHA in 2017 (36% screening, 64% non-screening indications). After applying exclusion criteria, 88 143 screening colonoscopy encounters remained. Patients were predominantly male (91.6%) and healthy (median Charlson-Deyo Comorbidity Index Score16=0), with a median age of 62 years (IQR 54–68) (table 2). Facilities were predominantly academically affiliated (95.0%) and high-complexity (64.7%).

Table 2

Characteristics of patients undergoing (N=88 143) and facilities performing (N=119) screening colonoscopies in the Veterans Health Administration, 2017

Applying the electronic measure to all eligible patients in 2017, 24.5% (21 600/88 143) of VA screening colonoscopy encounters in 2017 met the definition for probable (13.3%, 11 759) or possible (11.2%, 9841) overuse. Of the 21 600 colonoscopies meeting a consensus definition of overuse, the top two reasons for overuse were screening colonoscopy performed <9 years after a previous colonoscopy (45% in 2017) and screening colonoscopy performed <6 months after a negative FOBT (23% in 2017) (table 3). Median facility-level overuse was 22.5% (IQR 19.1%–27.0%), with fourfold to fivefold variability among facilities based on crude percentages (figure 1).
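The percentages in this paragraph follow directly from the reported counts, as the short check below shows. The variable and function names are chosen for the example only.

```python
total_screening = 88_143  # screening colonoscopy encounters after exclusions
probable = 11_759         # encounters meeting a "probable" overuse criterion
possible = 9_841          # encounters meeting a "possible" overuse criterion

overuse = probable + possible


def pct(count):
    """Percentage of all screening colonoscopies, to one decimal place."""
    return round(100 * count / total_screening, 1)


print(overuse, pct(overuse), pct(probable), pct(possible))
```

The two categories sum to the 21 600 overuse encounters reported, which is 24.5% of the 88 143 screening colonoscopies (13.3% probable and 11.2% possible).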

Figure 1

Variation in overuse of screening colonoscopy across 119 Veterans Health Administration facilities (n=88 143). Each marker represents a single VA facility, with error bars indicating 95% CIs (median overuse=23%, IQR=19%–27%). (Created by the authors).

Table 3

Reasons for screening colonoscopy overuse in the Veterans Health Administration, 2017 (N=21 600)

Examining predictors of overuse

Examining the association between screening colonoscopy overuse and facility characteristics, none of the facility-level factors examined were found to be associated with screening colonoscopy overuse except academic affiliation (RR 1.41, 95% CI 1.06 to 1.87) (table 4).

Table 4

Association between facility-level factors and screening colonoscopy overuse (N=119) in the Veterans Health Administration, 2017

Discussion

In this study, we developed and validated a new ICD-10-based measure of screening colonoscopy overuse and demonstrated that it measures overuse with robust specificity and markedly better sensitivity than a previous ICD-9-based measure.3 The new ICD-10-based measure could be used to track facility-level screening colonoscopy overuse over time in the ICD-10 era with little to no burden to clinicians or patients. Such information can be used to limit low-value colonoscopies, thus resulting in both improved quality and expanded capacity for high-value colonoscopies. This is particularly important in systems where access to colonoscopy is limited. Decreasing screening colonoscopy overuse also saves patient time and the anxiety, stress and discomfort associated with undergoing an invasive procedure. Despite increased focus on reducing low-value care and enhancing access, approximately 24% of screening colonoscopies in VHA in 2017 were identified as potential low-value procedures with substantial facility-level variability. The ICD-10-based measure was substantially more sensitive in identifying overuse than the previous ICD-9-based measure,3 meaning that it detects more potential cases of overuse. While drawing direct comparisons between 2013 and 2017 data has limitations given differences in ICD-9 and ICD-10 measure characteristics, it is worth noting that screening colonoscopy overuse rates did not meaningfully change between 2013 (as measured by an ICD-9 based measure) and 2017 (as measured by an ICD-10 based measure).3 This rate of potential overuse is within the credible range found in non-VA health systems in the ICD-9 era in a recent systematic review.17

High rates of potential low-value screening colonoscopy across VHA medical centres (ie, approximately one in four screening procedures) may result in part from the absence of assessment of colonoscopy overuse in VHA’s centralised performance measurement and improvement infrastructure. However, in response to high-profile access challenges, there has been increased focus by VHA leadership since 2017 on proposing innovative solutions to address these access challenges, including a focus on reducing procedural overuse. For example, in response to ongoing specialty care backlogs, the VHA Office of Veterans Access to Care convened a VA Gastroenterology Access Meeting in Washington, DC in September 2018 focused specifically on development and implementation of a coordinated, multicomponent access strategy to reduce wait times. This included development and implementation of guidelines and strategies to address overuse, including facility-level monitoring over periods of time (eg, quarterly) with targeted interventions for sites with relatively high levels of overuse. This work remains ongoing at the national VHA level in conjunction with the VA Office of Reporting, Analytics, Performance, Improvement and Deployment and other operational offices, with continued efforts focused on implementing this ICD-10-based measure into national reporting systems and development of facility-specific reports that can be used to communicate performance data to sites and explicitly highlight the link between reducing low-value colonoscopy and improving overall endoscopy access. Thus, there is reason for optimism regarding VHA’s ability to achieve a meaningful reduction in low-value screening colonoscopy in the future. To our knowledge, other well-regarded integrated healthcare systems, including the Kaiser Permanente integrated managed care consortium, do not presently employ a screening colonoscopy overuse monitoring programme. Our updated colonoscopy overuse measure could easily be adopted in these non-VHA settings to improve access and enhance overall performance. In addition, non-integrated healthcare systems, such as academic medical centres, community practices and others could benefit from monitoring of screening colonoscopy overuse, particularly as healthcare reimbursement systems shift to value-based payment/alternative payment models.

The substantial variability in performance across VHA’s 119 endoscopy facilities may be reflective of cultural or other unmeasured facility-level characteristics that enhance or impede efforts to reduce low-value care at those sites. Indeed, substantial variability in use of other low-value diagnostic testing has been demonstrated across facilities in other studies.18 19 For example, sites may have varying levels of available resources to carefully triage colonoscopy consults to detect low-value procedures, differences in leadership support and stakeholder engagement to support these efforts, and varying levels of recognition of the strong link between minimising low-value care (ie, decreasing demand) and improving overall endoscopy access. Because overuse is a complex problem, accomplishing meaningful and sustainable improvements in facility-level performance will require not only rigorous performance measurement and performance feedback, but also collaboration with willing leaders and front-line providers and patients to facilitate necessary changes to organisational culture.20

Our study adds meaningfully to existing knowledge in several ways. First, these findings are among the first to suggest that ICD-10 codes can substantially improve the performance characteristics of electronic quality measures.21 The markedly improved sensitivity of our updated colonoscopy overuse measure was largely due to significant improvement in the sensitivity of the underlying electronic measure for screening indication due to the addition of more specific codes for non-screening indications in ICD-10. Specifically, the sensitivity of the ICD-9-based electronic measure for screening indication was only 36%3 as compared with 79% for the ICD-10-based measure. Examples of cases that might still be missed by the ICD-10-based measure include situations in which an overuse screening colonoscopy is misclassified as non-screening (eg, due to a diagnostic code inappropriately leading to exclusion—see online supplemental appendix 1) or in which a patient undergoes colonoscopy outside the VA healthcare system (which is not captured electronically in VA data). Second, development and validation of an ICD-10-based electronic measure will allow health systems (whether in VHA, other integrated healthcare systems, academic centres or community practices) to monitor overuse rates longitudinally in the ICD-10 era, which is essential for tracking performance over time and targeting interventions to low-performing sites. The majority of performance measures track underuse rather than overuse,22 making this measure one of only a handful of overuse measures that can be implemented in clinical performance monitoring programmes to enhance healthcare value and optimise access.
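As a reminder of what the sensitivity figures above express, the following is a minimal sketch of how sensitivity and specificity are computed when an electronic measure is judged against manual record review as the reference standard. The function name and the counts are hypothetical and chosen only for illustration; they are not the study's data.

```python
def diagnostic_characteristics(tp, fp, fn, tn):
    """Return (sensitivity, specificity) of an electronic measure,
    treating manual record review as the reference standard.

    tp: flagged by the measure and confirmed on record review
    fp: flagged by the measure but not confirmed on record review
    fn: not flagged by the measure but found on record review
    tn: not flagged by the measure and not found on record review
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    sensitivity = tp / (tp + fn)  # share of true cases the measure detects
    specificity = tn / (tn + fp)  # share of non-cases the measure correctly leaves unflagged
    return sensitivity, specificity


# Hypothetical counts for illustration only (not the study's data).
sens, spec = diagnostic_characteristics(tp=40, fp=5, fn=10, tn=45)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A measure with higher sensitivity misses fewer true cases (fewer false negatives), which is why a more sensitive measure detects more potential cases of overuse at a given level of specificity.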

Applying such a measure at the health system level can be particularly effective, because health systems can leverage efficiencies of scale to apply such metrics across a large number of affiliated clinicians, use such measures to track performance over time, and develop and disseminate new programmes or technologies (including clinical decision support tools, evidence-based guidelines or other tools) to improve system-wide care delivery and change organisational culture to value stewardship of healthcare resources.23 As an electronic measure, the screening colonoscopy overuse measure could be easily integrated into EHR platforms such as Epic or Cerner, and used in clinical decision support (ie, at the time a procedure is ordered) to flag a potential low-value procedure or used for quality monitoring and improvement efforts such as what is being considered in VHA. While this measure is currently not included in national performance programmes such as the Merit-Based Incentive Payment System or other pay-for-performance programmes, which have tended to favour more typical underuse measures, widespread adoption of a screening colonoscopy overuse or other low-value care measure in this manner could aid in dissemination across practice settings and enhance its potential impact.

We acknowledge that varying incentives (including financial incentives) may influence the degree of motivation present in these various practice settings to adopt such low-value care measures into routine performance-monitoring programmes. However, adoption of alternative payment models that modify these organisational incentives could be effective at reducing low-value care, as suggested by a recent study of variation in use of low-value services in provider organisations.24

Several study limitations also are worthy of mention. First, our measure relies on accurate coding of colonoscopy indication in administrative data, which is difficult to verify. However, comparing our measure to medical record abstraction did show high levels of accuracy. Further, facility-level variation in screening colonoscopy overuse rates may have been confounded by unmeasured factors including the potential for systematic differences in coding of procedural indication across sites and existing implementation of facility-specific interventions designed to curb screening colonoscopy overuse. The potential for coding differences across sites is mitigated, however, by use of automated endoscopy reporting software across VA endoscopy sites that auto-populates ICD/CPT codes into the reports. It also is important to acknowledge that our measure is a composite measure that combines cases of ‘possible’ and ‘probable’ overuse. However, prioritisation of measure elements is possible (eg, prioritising efforts towards facilities with the highest rates of probable overuse) and is being discussed as part of our efforts to use this measure for quality improvement within the VHA. Additionally, some reasons for overuse (eg, repeat colonoscopy 5 years after prior negative colonoscopy) are arguably more important to target than others. Finally, we acknowledge that there can never be perfect adherence to guidelines, both because guidelines themselves are not perfect and because guidelines must be applied in the context of the individual patient. While the ‘acceptable’ level of overuse has not been precisely defined, this measure can aid in identifying outlier facilities whose practices should be more closely scrutinised.
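The idea of using the measure to identify outlier facilities can be sketched as follows. This is an illustrative example only: the record layout, the function names and the 0.3 cut-off are assumptions made for the sketch, not part of the measure specification, and as noted above no 'acceptable' level of overuse has been defined.

```python
from collections import defaultdict


def facility_overuse_rates(procedures):
    """Compute the share of screening colonoscopies flagged as potential overuse per facility.

    procedures: iterable of (facility_id, is_potential_overuse) pairs,
    one pair per screening colonoscopy (hypothetical record layout).
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for facility_id, is_overuse in procedures:
        totals[facility_id] += 1
        if is_overuse:
            flagged[facility_id] += 1
    return {facility: flagged[facility] / totals[facility] for facility in totals}


def flag_outliers(rates, threshold):
    """Return facilities whose rate exceeds the threshold, highest rate first.

    The threshold is an assumption supplied by the caller.
    """
    above = [facility for facility, rate in rates.items() if rate > threshold]
    return sorted(above, key=lambda facility: rates[facility], reverse=True)


# Hypothetical records for three facilities (not the study's data).
procedures = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False),
    ("C", False), ("C", False),
]
rates = facility_overuse_rates(procedures)
print(rates)
print(flag_outliers(rates, threshold=0.3))
```

In practice a relative criterion (for example, facilities in the highest decile of overuse) or a separate ranking on 'probable' overuse alone may be preferable to a fixed cut-off; the sketch leaves that choice to the caller.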

With respect to our analysis of facility-level predictors of screening colonoscopy overuse, we acknowledge that, because some of the CIs are wide and the point estimates substantial, there could be associations between overuse rates and facility-level predictors other than academic affiliation that were not detected in our analysis. However, it is not possible to increase the power of our study because the number of VHA endoscopy facilities is fixed. Therefore, while this component of our analysis may leave some unanswered questions, we believe the findings still add substantially to the literature and presenting our results is valuable to the field. Certainly, as the largest integrated healthcare delivery system in the USA, VHA is an optimal setting in which to conduct this type of analysis. It is also important to recognise that, while our study demonstrates variation in facility-level overuse, the drivers of overuse among individual low-performing facilities may be quite different, such that tailored strategies to reduce low-value use of screening colonoscopy will likely be needed rather than a uniform approach. We also could have underestimated the true rate of screening colonoscopy overuse among VA-enrolled veterans by not capturing use of non-VA care, which is increasingly prevalent given legislative initiatives such as the VA Maintaining Internal Systems and Strengthening Integrated Outside Networks Act of 2018 (reference 15) and its predecessors. However, we were able to capture these cases in our manual record review. We also separately validated our measure using VHA community care data, suggesting the measure could be applied to data and sites outside of VHA. Finally, emerging changes in screening colonoscopy guidelines, including a newly updated USPSTF recommendation (released May 2021) to begin colorectal cancer screening at age 45 years (rather than 50 years) in average-risk patients, may require refinement of overuse criteria in the future to ensure that the measure does not penalise appropriate care.12 However, it is important to note that only a small proportion (12%) of potential overuse was due to performance of a screening colonoscopy in patients 45–49 years of age.

Conclusion

Our updated ICD-10-based measure reliably measures screening colonoscopy overuse with similar specificity but markedly better sensitivity than a previous ICD-9-based measure, allowing VHA to track facility-level performance over time and target sites with higher rates of low-value procedures for improvement. Despite increased focus on reducing low-value care and enhancing access, levels of screening colonoscopy overuse in VHA were substantial in 2017, with significant facility-level variability. None of the facility-level factors examined were found to be associated with screening colonoscopy overuse, except for academic affiliation. However, recent systematic efforts to address specialty care access barriers, including through reducing procedural overuse, reflect increasing recognition of the impacts of overuse on overall procedural access and hold promise for reducing overuse of procedural services such as low-value screening colonoscopies in the future.

Data availability statement

Data are available upon reasonable request. Members of the scientific community who would like a copy of the final data sets (ie, data sets underlying publication) from this study can request a copy by emailing Jennifer Burns at jennifer.burns@va.gov. They should state their reason for requesting the data and their plans for analysing the data. Final data sets will be copied onto a DVD. The DVD will be sent to the requester via FedEx. Each data set will be accompanied by documentation that lists all variables described in the publication and links them with variable names in the data set. De-identified data will be provided after requesters sign a letter of agreement (LOA) detailing the mechanisms by which the data will be kept secure. The LOA will also state that the recipient will not attempt to identify any individual in the data, will not share the data outside of their research team, and will provide information on any files to be linked to the data. The data set will not include PII and all dates will be changed to integers to allow for calculation of time periods.

Footnotes

  • Twitter @jnmafi

  • Contributors MAA, MD, JD, MSc: study concept and design, acquisition of data, analysis and interpretation of data, statistical analysis, drafting the manuscript. EAK, MD, MPH: analysis and interpretation of data, critical revision of the manuscript for important intellectual content. JAD, MD, MHS: analysis and interpretation of data, critical revision of the manuscript for important intellectual content. YG, MS: acquisition of data, analysis and interpretation of data. NY, MPH, MSW: acquisition of data, analysis and interpretation of data. JNM, MD: analysis and interpretation of data, critical revision of the manuscript for important intellectual content. FPM, MD, PhD: analysis and interpretation of data, critical revision of the manuscript for important intellectual content. SDS, MD, MS: study concept and design, analysis and interpretation of data, critical revision of the manuscript for important intellectual content, study supervision.

  • Funding This study was supported by the VA Office of Reporting, Analytics, Performance, Improvement, and Deployment (RAPID). The first author is supported by a 2018 American College of Gastroenterology Junior Faculty Development Grant. A coauthor is supported by a National Institute on Aging (NIA) K76 Beeson Emerging Leaders career development award 1K76AG064392-01A1 and NIA R01 R01AG070017-01 Award.

  • Competing interests None declared.

  • Provenance and peer review Not commissioned; externally peer reviewed.
