Background Many healthcare organisations (HCOs) use peer review to evaluate clinical performance, but it is unclear whether these data provide useful insights for assessing the sharp end of patient safety.
Objective To describe outcomes of peer review within the Department of Veterans Affairs (VA) healthcare system and identify opportunities to leverage peer review data for measurement and improvement of safety.
Design We partnered with the VA's Risk Management Program Office to perform descriptive analyses of aggregated peer review data collected from 135 VA facilities between October 2011 and September 2012. We determined the frequency of screening factors used to initiate peer review and processes contributing to substandard care. We also evaluated peer review data for diagnosis-related performance concerns, an emerging area of interest in the patient safety field.
Results During the study period, 23 287 cases were peer reviewed; 15 739 (68%) were sent to local peer review committees for final outcome determination after an initial review and 2320 cases were ultimately designated as substandard care (mean 17 cases/facility). In 20% of cases, the screening source was unspecified. The most common process contributing to substandard care was ‘timing and appropriateness of treatment’. Approximately 16% of committee reviewed cases had diagnosis-related performance concerns, which were estimated to occur in approximately 0.5% of total hospital admissions.
Conclusions Peer review may be a useful tool for HCOs to assess their sharp end clinical performance, particularly safety events related to diagnostic and treatment errors. To address these emerging and largely preventable events, HCOs could consider revamping their existing peer review programmes to enable self-assessment, feedback and improvement.
- Performance measures
- Risk management
- Diagnostic errors
- Healthcare quality improvement
- Patient safety
The success of efforts to improve patient safety hinges in part upon the availability of valid measures of errors and other safety-related problems and outcomes.1–3 Current methods used to quantify safety concerns are generally incomplete, underestimate the magnitude of risk to safety4–6 and are often limited by the means available to collect the necessary information.7 While efforts to refine measurement of patient harm or treatment errors have increased, targeted surveillance of patient safety outcomes in both established areas of risk (e.g. surgical site infections, adverse drug events) and emerging areas (e.g. diagnostic errors, treatment delays) remains underdeveloped.8–12
There is a compelling need for measures of risks at the ‘sharp end’ of patient care, that is, the events or circumstances most proximal to adverse outcomes.1 ,13 One resource for capturing such information is peer review: the evaluation of a professional's performance by a similar professional. Peer review can shed light on provider-related errors and associated safety concerns, some of which may be modifiable. Although there is evidence that current peer review processes are not effectively contributing to quality improvement,14–18 peer review data examined in aggregate, rather than case by case, may be a useful adjunct to other methods of safety assessment.
Aggregated data from large healthcare organisations (HCOs) provide a rich sample from which to evaluate the value of peer review as a safety assessment strategy. For instance, the Department of Veterans Affairs (VA) has implemented a structured peer review mechanism throughout its healthcare system.19 Healthcare provided by VA professionals who exercise autonomous judgment (e.g. physicians, nurses, pharmacists, allied health professionals) is eligible for peer review. A recent Government Accountability Office evaluation found that although the process is fairly structured and national VA policy provides guidance to all facilities, current implementation and monitoring of peer review may lead to missed opportunities to improve patient safety.17
We partnered with the VA Risk Management Program Office to evaluate aggregated peer review data and explore its potential use in safety measurement. Our aims were to: (1) describe the outcomes of peer review processes within the VA healthcare system and (2) highlight opportunities for measurement and improvement of patient safety through peer review.
Development of the partnership
The VA's Risk Management Program Office is responsible for professional oversight and risk management activities. An advisory council was formed in 2009 to assess the efficacy of clinical risk management activities in VA and advise the Risk Management Program Director about possible programme improvements in three areas: peer review of individual healthcare providers, tort claim management and disclosure of adverse events to patients. Researchers (DWM and HS) participating on this council recognised that VA peer review was a potentially underexplored data source for studying patient safety. Since 2008, the Risk Management Program Office has maintained a centralised database of peer review outcomes. In 2012, it began to collect additional data about clinical processes contributing to safety concerns. The Risk Management Program Office also conducted an internal needs assessment survey of field-based risk managers and found several areas requiring development: transforming risk management strategies of VA to become more proactive in measurement and improvement of safety risks, coordinating risk management with other quality improvement activities, and developing mechanisms to detect and address diagnosis-related problems. Therefore, research interests and operational needs aligned to use available peer review data to describe outcomes of peer review processes and highlight opportunities to improve safety measurement.
Sources of data
The Veterans Health Administration is the largest integrated health system in the USA. Risk management professionals at all VA medical centres screen episodes of care for peer review using predefined criteria. Facilities are given flexibility in their methods for identifying cases for peer review. Some types of events, such as unexpected readmissions and inpatient deaths, can be identified through computer-based triggers or automatically generated reports, whereas others are brought to the attention of risk managers by clinicians and administrators. Once an episode of care has been selected for peer review, individual reviewers (usually providers of the same specialty) are assigned to evaluate care delivery. Reviewers are asked to assign one of the following levels to the case (also see figure 1):
Level one: Most experienced, competent practitioners would have managed the case in a similar manner.
Level two: Most experienced, competent practitioners might have managed the case differently.
Level three: Most experienced, competent practitioners would have managed the case differently.
A list of contributing processes, which was formulated by the Risk Management Program Office in 2004, guides reviewers to identify elements of care that might have contributed to substandard performance. Each review can be associated with more than one contributing process, although there were no prescribed definitions of the contributing processes. Upon initial peer review, episodes of care judged as potentially substandard (levels two and three) are forwarded to a multidisciplinary peer review committee, chaired by the facility's Chief of Staff. The professional under review is invited to submit written comments or appear in person to provide relevant information and discuss his or her decision-making during the episode. The committee makes the final determination regarding performance and assigns one of the three levels of care. The objective of this comprehensive process is a consistent, balanced and fair review in order to improve organisational performance and patient outcomes.19
US law requires that records and documents created through medical quality assurance programmes be kept confidential and privileged. In general, individual reviews may not be disclosed except for use in licensing or accreditation, for use by healthcare quality assurance programmes, as required for law enforcement investigations, or in aggregate form.19 Therefore, the peer review narratives and clinical information, including details of patients or providers, were unavailable for analysis, and the smallest available unit of analysis was the facility. The study was exempted from institutional review board approval.
Data collection and statistical analyses
Aggregate data are reported by VA facilities to the Risk Management Program Office and maintained in an administrative database. We analysed several data elements from the database, including the number of initial and final reviews associated with each level of determination, the source used to initiate peer review (e.g. mortality screen, adverse events, executive concern) and the contributing processes associated with substandard care. Facilities were grouped by complexity (1a=highest complexity, 3=lowest complexity), a VA-assigned category based upon several factors including volume, teaching, research and intensive care unit capability.20 We used descriptive statistics (mean, median and IQR) to summarise these findings across facilities. Additionally, we compared, for each facility, the proportion of cases in which initial review levels were upgraded (changed from a lower to a higher level of care, among cases that had the potential to be upgraded) with the proportion in which initial reviews were downgraded (changed from a higher to a lower level of care, among cases that had the potential to be downgraded) by the peer review committee, using a paired samples t test. We also compared the per cent of diagnosis-related performance concerns per facility as a function of facility complexity using analysis of variance. Diagnostic errors have been identified as an emerging safety concern in the VA; therefore, we used admission data for each facility and estimated a system-wide prevalence of diagnosis-related performance concerns detected by peer review.21 ,22 Microsoft Excel and SPSS V.21.0 were used in all analyses.
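The paired comparison of upgrade versus downgrade proportions can be sketched as follows. This is a minimal illustration, not the study's SPSS analysis: the five facility-level proportions are hypothetical, and the t statistic is computed directly from the paired differences.

```python
import math
import statistics

def paired_t(upgraded, downgraded):
    """Paired-samples t statistic for per-facility proportions of
    initial reviews upgraded vs downgraded by the committee."""
    diffs = [u - d for u, d in zip(upgraded, downgraded)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD of the paired differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1  # (t, degrees of freedom)

# Hypothetical proportions for five facilities; the study analysed 135
# facilities, which is what yields the reported t(134)
up = [0.10, 0.08, 0.12, 0.05, 0.09]
down = [0.20, 0.18, 0.25, 0.15, 0.22]
t, df = paired_t(up, down)  # t is negative here: downgrades predominate
```

With 135 paired facility observations, the same calculation gives 134 degrees of freedom, matching the t(134)=5.97 reported in the results.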
Complete peer review and admissions data were available from 135 of 141 VA medical centres from October 2011 to September 2012. During this time period, 765 112 admissions occurred in the included facilities, which ranged from large, urban tertiary care centres to rural hospitals delivering primarily outpatient care. A total of 23 287 cases (approximately 3% of admissions) underwent initial peer review, of which 15 739 cases (67.6% of initial reviews, or approximately 2% of admissions) were reviewed by the local peer review committees for final determination. Table 1 summarises initial and final peer review outcomes. Peer review committees designated 2320 cases as level three care (i.e. most experienced, competent practitioners would have managed the case differently), with a mean of 17.2 (IQR 8.0–23.0) cases per facility, or 15.1% (IQR 8.5%–19.4%) of cases per facility. Committees were more likely to downgrade than upgrade the initially assigned level of care, t(134)=5.97, p<0.001. The IQRs demonstrate noticeable variation in peer review outcomes among individual facilities (figure 2). Most notably, there is wide variation in the number of level one reviews. This variation is less pronounced, but still discernible, in level two and three reviews, with several outliers present.
Table 2 presents the sources that led to an initial peer review. The largest source for initial peer review was classified as ‘other’, with no further description provided; this source accounted for a mean of 19.7% (IQR 7.0%–30.1%) of cases per facility. Most reviews were initiated by manual efforts: executive concern, identification of adverse events and mortality reviews. Use of automated screens to identify cases was limited, except for inpatient mortality (i.e. Screen 4), with a mean of 16.5% (IQR 5.3%–24.9%) of reviews per facility.
Care processes implicated in peer review
The care processes that contributed to substandard performance as determined by peer review committees for level two and three cases are presented in table 3. Treatment-related processes were commonly determined to have contributed to substandard care. Specifically, problems related to ‘timing of treatment initiation and appropriateness of treatment’ and ‘performance of a procedure or treatment’ were reported in a mean of 23.4% (IQR 8.7%–31.6%) and 15.8% (IQR 3.0%–19.5%) of reviews, respectively. Other frequently cited factors were ‘other relevant aspects of care’ (uncategorised) in 20.5% of cases (IQR 3.9%–25.0%) and ‘medical record documentation’ in 16.3% (IQR 7.0%–22.2%).
Because few existing measures are specific to diagnosis-related performance concerns, we examined the subgroup of cases potentially related to these concerns. We defined a diagnosis-related peer review as involving one or more of the following contributing processes: choice and ordering of diagnostic tests; addressing abnormal results of diagnostic tests; and timeliness and appropriateness of diagnosis. We estimated the prevalence of diagnosis-related peer reviews conservatively: for each facility, we counted only the most frequent of the three contributing processes (although, hypothetically, all three processes could be assigned in a single review). Averaged across facilities, the mean proportion of peer reviewed cases involving a diagnosis-related performance concern was 15.5% (IQR 6.8%–19.8%) per facility, or about 0.5% of admissions (95% CI 0.37% to 0.64%): mean 50.3 per 10 000 admissions; median 25.9 per 10 000 admissions; and IQR 14.7–52.4 per 10 000 admissions. The mean per cent of diagnosis-related performance concerns did not differ by complexity level of the facility (analysis of variance; p=0.66). However, there are clear outliers within most complexity levels (figure 3). Learning opportunities may therefore exist with regard to diagnostic performance, and outliers may be facilities where concentrated quality improvement initiatives would have the greatest yield.
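The conservative per-facility estimate described above can be illustrated with a small sketch. The process counts and admissions figure below are hypothetical, not drawn from the study data:

```python
def dx_rate_per_10k(process_counts, admissions):
    """Conservative diagnosis-related rate: count each facility at most
    once by keeping only the most frequent of the three diagnosis-related
    contributing processes (test ordering, abnormal results, timeliness),
    since a single review could carry more than one of them."""
    conservative = max(process_counts)
    return conservative * 10_000 / admissions

# Hypothetical facility: 12, 9 and 15 reviews flagged the three processes,
# out of 5000 annual admissions; only the largest count (15) is kept
rate = dx_rate_per_10k((12, 9, 15), admissions=5000)  # 30.0 per 10 000
```

Taking the maximum rather than the sum guards against double-counting reviews that were assigned two or three diagnosis-related processes, which is why the resulting system-wide figure of about 0.5% of admissions is described as conservative.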
As part of a larger effort within the VA to strengthen quality and safety initiatives, we analysed aggregated data from VA facilities to explore the potential of peer review to assess the ‘sharp end’ of patient safety. Approximately one in seven reviewed cases was assigned a level of care that suggested substandard performance, most of which was related to diagnosis and treatment issues. While we did not have access to narratives within this data source, our findings suggest that HCOs that have peer review programmes could use these as data sources for improvement at a local, institutional level and address the types of safety concerns that emerge. To our knowledge, this is the first description of aggregated peer review processes and outcomes in a large integrated HCO.
Peer review focuses on individual decision-making, an area that few current safety measurement methods address. Our data were useful in identifying the relatively high frequency of diagnosis- and treatment-related performance concerns, an increasingly recognised safety issue in the VA and elsewhere.10 Nevertheless, our evaluation also reveals that further work needs to be done to define contributory care process elements and outcomes. For instance, approximately 20% of peer reviews were initiated from a source identified as ‘other’, and a similar proportion of reviews were determined to have a contributing process of ‘other relevant aspects of care’. There was substantial variation in peer review outcomes, as demonstrated by the wide IQRs; for example, committee-determined (final) level three outcomes varied from 8.5% to 19.4% of committee reviews. Variation was also noted in numbers of reviews by facility complexity, with the greatest variation in the absolute number of level one reviews. The present data do not allow us to understand this variation beyond the fact that it was not dependent on the VA-assigned level of facility complexity. Future efforts, perhaps of a qualitative nature, may allow greater understanding of this variation and identification of best peer review practices for both VA and non-VA programmes. These practices could strengthen the peer review infrastructure within the VA and beyond, potentially leading to findings that improve patient care.
Our study should provide the impetus to open the dialogue on how to use peer review data locally for quality and safety improvement. Ideally, peer review should be non-punitive, evaluative and corrective.18 The Joint Commission's standards maintain that effective peer review should be consistent, timely, defensible, balanced, useful and ongoing.18 However, several problems with peer review processes have prevented effective use in improving quality and safety. For instance, a 2008 commissioned study of peer review events reportable to the Medical Board of California concluded that peer review often fails because of excessive variation in policies, poor tracking of providers with substandard care, biased or ineffective reviews, lack of transparency, lengthy processes, omission of some physicians from review, and burdensome cost.16 Other barriers to effective peer review include lack of trained evaluators, lack of measurement instruments, time constraints, confidentiality requirements and difficulties with providing immunity to reviewers. Proposed modifications to improve the validity and reliability of peer review include use of multiple reviewers, objective or explicit assessment procedures, elimination of systemic reviewer bias, higher standards for peer reviewers, and assessment of patients’ outcomes.14 ,23–29 Nevertheless, peer review practices remain variable across healthcare systems and institutions.30 ,31 While some top-down or system-level guidance may be needed to mitigate these barriers, local efforts to encourage provider engagement in peer review processes may facilitate organisational learning and quality improvements.32 ,33
Confidentiality constraints on the final peer review ‘product’ should not thwart opportunities for learning and feedback on safety issues, or for improving the peer review process itself. Although the final ‘product’ is confidential and unavailable for researchers to study, individual institutions have ready access to these data, which are not being optimally used for improvement despite significant investment of time and resources. Risk managers and other quality and safety personnel who have access to these data could use them for improvement at a local, institutional level. For example, repeated level three determinations for the same physician within a year might signal either a competency problem or unprofessional behaviour, prompting additional examination by the institution. Similarly, peer review data could be used by institutions to identify additional ‘red flags’ for action and feedback, improving both sharp end (provider) and blunt end (system) performance.34
Our findings underscore the importance of emerging safety concerns related to diagnosis and treatment. A 2010 Medicare Inspector General report on adverse events in hospitals found that almost half of preventable events were largely attributable to delays in diagnosis and treatment.35 Furthermore, the prevalence of diagnostic errors in the Netherlands has been reported to be approximately 0.4% of hospital admissions.36 Our estimate of 0.5%, though obtained through different methods, is comparable and lends potential credibility to our data source. Peer review processes may help uncover deficiencies related to both diagnosis- and treatment-related concerns. For example, diagnostic errors often involve provider-related factors37 ,38 and recent research has brought attention to diagnostic errors as a priority area in both VA and non-VA settings.22 ,39–43 Since peer review focuses on the performance and decisions of individual providers, it may serve as a unique source of insight into the cognitive and provider factors affecting diagnostic performance within the larger context of the healthcare system. Given the relative lack of attention to diagnostic errors and the absence of robust measurement tools for them, we believe it is important to consider peer review as a potential tool to address this safety issue.44 ,45 While it has the limitations described above, peer review is a readily available tool that organisations can leverage to make progress on improving diagnostic safety and performance.
Currently, there are no formal feedback methods for clinical performance related to diagnosis.46 We posit that HCOs could use peer review data to perform a self-assessment of diagnosis-related performance concerns and to generate confidential, non-punitive feedback that helps frontline providers better ‘calibrate’ their diagnostic performance.47 Existing quality measures such as readmission rates may capture some sharp end clinical performance problems, but are also likely to reflect a variety of other quality concerns (e.g. medication adherence, iatrogenic complications, care coordination, patient engagement). Other measurement methods, such as voluntary reporting and direct observation, have their own limitations for this purpose, and random record reviews can be expensive. Future research could help determine how best to use peer review for assessment and feedback related to diagnosis-related performance concerns. Recognising that peer review potentially offers new types of knowledge, the VA Risk Management Program Office is now considering strategies to strengthen this programme by: (1) improving the tracking of providers who exceed a predetermined threshold of substandard care; (2) more rigorously categorising peer review screens and contributory processes; and (3) using peer review to study diagnosis-related problems.
A notable strength of this study is the large size derived from a fairly structured peer review process within a single-payer system. However, limitations include the unknown sources of variation in the safety concerns leading to peer review in facilities that differ in size and populations served. The peer review data were originally collected for administrative reporting and not for research purposes. Therefore, several elements of peer review have not yet met the rigour of research-related definitions and VA facilities might implement the national peer review policy according to locally developed standards. Furthermore, potentially valuable information contained in the narrative report of peer review documents was not available for analysis. The present data set does not contain information regarding patient outcomes or the actions taken to improve a provider's performance, if any, after a peer review. Therefore, we were not able to assess the extent to which peer review can be used to change provider behaviour and improve care delivery. Newer research models such as participatory research or researcher-in-residence may facilitate better learning opportunities while respecting the legal restrictions to protect peer review.48 Despite our study's limitations, peer review processes warrant further scrutiny as tools for identifying safety concerns that are not consistently captured within other safety measurement systems.
In conclusion, our partnership-based research found that peer review may be a useful tool for HCOs to assess their sharp end clinical performance, particularly related to diagnostic and treatment errors. Comprehensive peer review programmes could provide HCOs with data for self-assessment and feedback to address these emerging and preventable safety events.
Portions of this work were presented as a poster at the Diagnostic Error in Medicine 7th International Conference, September 2013.
Contributors Study concept and design: DWM and HS. Acquisition of data: ANDM, BR and YNW. Statistical analysis: ANDM. Analysis and interpretation of data: DWM, ANDM and HS. Drafting of the manuscript: DWM. Critical revision of the manuscript for important intellectual content: DWM, ANDM, BR, HS and YNW. Administrative, technical or material support: DWM, ANDM, BR, HS and YNW. Study supervision: DWM and HS.
Funding This project is supported by the VA Health Services Research and Development Service (CRE 12-033; Presidential Early Career Award for Scientists and Engineers USA 14-274), the Agency for Health Care Research and Quality (R01HS022087), the VA National Center of Patient Safety and in part by the Houston VA HSRD Center of Innovation (CIN 13-413), and Baylor College of Medicine's Department of Family and Community Medicine Primary Care Research Fellowship/Ruth L Kirschstein National Research Service Award T32HP10031 (Meeks).
Disclaimer The views expressed in this article are those of the authors and do not necessarily reflect the position or policy of the Department of Veterans Affairs or the United States government.
Competing interests None.
Ethics approval Institutional review board waived review.
Provenance and peer review Not commissioned; externally peer reviewed.