Knowledge and application of non-technical skills (NTS) may represent the greatest challenge facing medical education today. For centuries, medical education focused on developing individual clinical knowledge and technical skills. But the modern complexities of healthcare delivery and the rapid expansion of medical knowledge necessitate a high-functioning team approach, which requires human factors engineering and NTS to operate effectively.
Other complex high-risk industries—like aviation, oil drilling, nuclear power and the military—have aligned their educational systems to match.1–3 While certain healthcare disciplines have developed frameworks to ensure the acquisition and maintenance of clinical and technical skills, no standard framework for NTS exists. With the increasing computational support for clinical and technical skills—decision aids, predictive algorithms, robotic surgery and image interpretation—the true added value of human clinicians may lie with their mastery of NTS.
Failure of NTS has been linked to poor quality and safety of care.2 A prospective observational study of 28 laparoscopic cholecystectomies found a strong correlation between surgical team situational awareness and fewer technical errors.4 In Japan, a 3-year retrospective review of fatal medical accidents submitted to a third-party safety organisation found roughly half to be due to failures of NTS, most often related to situational awareness, teamwork and decision-making.5 A review of malpractice claims found that more than one person was involved in 83% of errors, but that only 24% were directly attributable to communication breakdown, which was the only NTS specifically studied.6 A review of trauma and orthopaedic-related adverse events from the National Reporting and Learning System found many to be related to NTS: situational awareness (52%), communication/teamwork (21%), leadership (16%) and decision-making (12%).7 Prospective direct observation of 293 surgical procedures found a strong association between less effective teamwork behaviours, as measured by the Behavioural Marker Risk Index, and a higher risk of death or serious complication, even after controlling for American Society of Anesthesiologists risk category.8 A multi-institutional retrospective review within the Veterans Health Administration found that interventions to increase NTS were associated with reductions in perioperative mortality.9 Although these studies demonstrate a link between NTS and quality and safety, efforts to quantify the impact and compare across studies are limited by a lack of standard definitions of NTS and relevant outcomes. Advancing our understanding of NTS requires a more thoughtful and standardised framework.
Boet et al review assessments of team performance in crisis situations,10 and Higham et al review assessments of broader NTS domains across multiple healthcare disciplines.11 These rigorous and thorough reviews of the literature provide a snapshot of the contemporary science underpinning the measurement of NTS in healthcare and give researchers guidance for selecting the best tools available to measure NTS. However, the greatest lesson to take from these reviews may be the current gaps in the measurement of NTS and how to advance the field.
The first gap is the lack of a standardised definition of NTS and its subdomains. The framework used by Boet et al for teamwork-related NTS includes a total of 14 subdomains, whereas Higham et al include four domains for all NTS.10 11 Each tool included in these reviews measures a unique subset of overlapping domains and often applies only to a narrow context.2 In addition, each of these tools is based on external observation, which may not capture some domains effectively, such as situational awareness and decision-making. These may be better assessed through direct measurement via a mental task load index, eye-gaze patterns to assess attention and written examinations.2 As in the parable of the blind men and the elephant, each tool measures only a portion of NTS and so may instead interpret the elephant as a wall, spear, snake, tree, fan or rope.12 Generalising a tool beyond the context in which it was developed, combined with Goodhart’s Law (‘when a measure becomes a target, it ceases to be a good measure’), leaves policy changes based on any of the existing measurement tools open to potentially damaging unintended consequences.13
The studies cited by both reviews also exhibit heterogeneity in the psychometric principles used to assess each tool. The Boet et al framework for reliability and validity includes nine domains, whereas Higham and colleagues compare two domains of validity and treat reliability and usability as distinct single domains.10 11 Again, each tool is often assessed against only a subset of these psychometric test domains. Emphasising validity and reliability overlooks feasibility testing, which is critical for effective implementation. Assessing NTS may at first appear straightforward, but doing it properly requires thorough training.14
Finally, the included studies provide no benchmark for adequate performance in NTS. While NTS have been associated with improved quality and safety, the dose–response relationship remains poorly understood. Providing actionable feedback to clinicians requires comparison with a standard that establishes what counts as ‘good’ or ‘poor’ performance. Currently, each individual institution or researcher is left to interpret the results on their own. Without standard definitions for NTS, psychometric assessment inclusive of feasibility and benchmarks for determining competence, any system-wide intervention to increase assessment and training in NTS will likely fail to reach critical mass and acceptance.
An important first step is to substantially increase the data collected on the current state of NTS and their impact on patient outcomes. Automated real-time data collection of clinician interactions may provide better insight into the association between NTS and patient outcomes and decrease the burden of assessment. In aviation, the cockpit black box records all communications between pilot and team, which are then rigorously analysed after adverse events. In healthcare, the medical record and clinician recollection of events are used as a proxy for this black box, which under-represents the nuances of decisions and clinical communications.15 Some surgeons are routinely capturing video to assess technical skill.16 17 A system to automatically collect intraoperative events, dubbed the OR Black Box, has been used to assess intraoperative distractions and adverse events but may also provide insight into NTS.18 Ongoing fears of loss of prestige and of litigation stand in the way of routine recording of clinical interactions.2 Legal protection against the subpoena of clinical recordings is necessary.19
Once this robust data set is collected, we must establish a parsimonious set of domains—prioritised according to impact on patient outcomes, transferable across specialties and clinical domains and benchmarked for adequate performance. We must design assessment tools that prioritise domains of NTS most closely associated with poor outcomes. The emphasis should be on a small set of generalisable domains. Opportunities may then arise to automate data analysis through natural language processing and machine learning algorithms.20 21
The final step will be to link incentives to performance in NTS and to the ability to surpass the benchmark. In the UK and the USA, the General Medical Council and the Accreditation Council for Graduate Medical Education, respectively, outline general principles but no specific standards for certification.22 23 The American College of Surgeons offers a curriculum geared towards surgical residents.24 Guidebooks are available but lack a broader systematic approach with external accountability for organisations and clinicians.25 Some countries have attempted to incorporate human factors and ergonomics into healthcare more systematically.26 Any system for accountability must provide a blame-free framework to support and remediate poor performers (both trainees and active clinicians) and thus prevent a failure to fail.27 Mature team simulation for NTS has already been initiated by malpractice insurers.28 We can look to the Crew Resource Management (CRM) framework for licensing and competence assurance from civil aviation, nuclear power, offshore oil drilling, mining, rail and emergency services.29 30 However, we must not forget the importance of aligning systems design with any training paradigm to foster the application of CRM in the workplace. Training alone, without system redesign that keeps human factors in mind, will be insufficient to ensure appropriate quality and safety.3 31
The authors of both of these systematic reviews10 11 should be commended for their rigorous work to aggregate and compare a wide range of disparate instruments. They provide guidance on how to navigate the current literature on assessing NTS in healthcare. However, the strongest takeaway from their work may be the recognition of how much work remains to quantify the impact of NTS in healthcare and to standardise assessment. We need more robust data, a parsimonious set of NTS domains and a set of benchmarks and incentives to guide adoption among clinicians.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests RA is a consultant for Applied Medical.
Patient consent for publication Not required.
Provenance and peer review Commissioned; internally peer reviewed.