In the surgical context, proficiency refers to expert, independent execution of an operation. It is best modeled by a zone rather than a sharp threshold, since surgeons bring different levels of innate ability to the task [1]. In this model, the proficiency zone represents what society expects of fully trained surgeons: an outcome that varies from one surgeon to another within very narrow limits defined by the upper and lower thresholds. The proficiency-gain curve represents the course (with time and number of procedures performed) by which an individual surgeon reaches the proficiency zone and is then able to perform the operation consistently well with a good outcome [2].

Unfortunately, in the surgical literature the proficiency-gain curve is inappropriately referred to as the learning curve. There are many reasons against the use of this terminology, but the most important is its connotation: learning refers only to cognitive knowledge (e.g., knowledge of the steps of the operation), whereas proficiency requires knowledge and, in addition and especially, the ability to execute the procedure consistently well. Obviously, the transition from one to the other with training and experience is ultimately dependent on the innate psychomotor abilities (eye–hand coordination skills) that individuals bring to operative surgery. The various published reports on “learning curves” for specific operations that are based exclusively on clinical endpoints (e.g., operative time, conversion rates, morbidity rates) and that reach conclusions on the number of operations required for completion of the “learning curve” (equated with proficiency in execution) lack both science and validity. On a priori grounds, in the absence of data, the proficiency-gain curve is likely to be as specific to the individual as it is to the intervention.

The acquisition of proficiency in the execution of an operation can be studied by human reliability assessment (HRA) techniques [3], which we have adapted to surgical care [4, 5]. In contradistinction to industrial HRA, clinical HRA is based on observational data capture and is thus referred to as observational clinical HRA (OCHRA). In the present study, OCHRA was used to study the proficiency-gain curve of a trained laparoscopic surgeon for one specific operation, following return to his hospital on completion of training. To our knowledge, this is the first detailed documentation of the proficiency-gain curve for any operation, open or laparoscopic.

Methods

Series

Following completion of 8 months of fellowship training at Ninewells Hospital, Dundee, which included operative exposure to advanced laparoscopic surgery and training in intracorporeal suturing and hand-sutured visceral anastomosis in the Cuschieri Skills Centre, the trained surgeon performed a prospective series of palliative bypass procedures at the Imam Khomeini Hospital, Tehran, Iran on 14 patients with advanced gastric and pancreatic cancer. The series involved the construction of 20 anastomoses: gastro-jejunostomy (GJ) and cholecysto-jejunostomy (CJ). The indications for bypass were gastric outlet obstruction and/or jaundice. All the procedures were commenced with a staging laparoscopy to confirm unresectable disease.

All interventions were performed by the total laparoscopic approach with a positive pressure capnoperitoneum of 12.0 mmHg. Each patient was placed in the Trendelenburg position, thus enabling displacement of the transverse colon superiorly and identification of the ligament of Treitz. The proximal end of the jejunal loop for anastomosis to stomach or gallbladder (or both) was chosen at 15–20 cm below the Treitz ligament. The selected loop was grasped along its antimesenteric border and brought to the intended site (stomach or gall bladder) to establish easy reach without any tension.

Anastomotic technique

The anastomotic technique was standardized and consisted of a single-layer, deep seromuscular continuous suture with atraumatic 3/0 Vicryl mounted on an endo ski (Ethicon). In both GJ and CJ the posterior suture line was performed first and, on completion, two enterotomies were made using the electrosurgical hook knife 3 mm on either side of the posterior suture line. The anterior suture line was then performed with inverting deep seromuscular continuous sutures. All knots (starter and terminal) were performed intracorporeally. Any prominent vessel was coagulated initially and any bleeding from the cut edges of the enterotomy was also controlled by electrocoagulation.

Follow-up

All patients were followed up to the time of discharge. In patients undergoing GJ, a contrast radiological study was performed on the fifth postoperative day to evaluate the diameter of the anastomosis and exclude any leak. An abdominal ultrasound examination was performed in CJ cases on the fifth postoperative day to exclude any bile collections. The subsequent analysis by OCHRA of the unedited videotapes was performed in Ninewells Hospital under supervision of the senior author (A.C.) and a lecturer trained in human factors work including OCHRA (B.T.).

OCHRA analysis of videotapes

For the OCHRA analysis, each anastomosis was subdivided into nine tasks, subtasks (n = 38), and related steps (Table 1, web only). With the aim of determining the proficiency-gain curve, the OCHRA analysis documented for each task (with its subtasks and steps) instrument movement, errors and their types, execution time, and instrument traffic. The quality of the suturing was also assessed from the bite intervals and depths and knot (starter and terminal) quality. Performance shaping factors that contributed to the errors were also identified.

Table 1 Tasks and subtasks

Data analysis

The data were collated in Excel® (Microsoft, Redmond, Washington, USA) and analyzed by using the SPSS® software statistical package (SPSS®, Chicago, Illinois, USA). Normally distributed data (Shapiro–Wilk and Kolmogorov–Smirnov tests) were analyzed by the independent-samples t-test, whereas non-normally distributed data were analyzed by a nonparametric test (Mann–Whitney U test). The change in performance with experience was evaluated by multivariate analysis of variance (MANOVA) by using estimated population marginal means for the various measures of performance. In all instances, significance was set at 5%.
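As an illustration of the nonparametric comparison named above, the following minimal Python sketch computes a two-sided Mann–Whitney U test by using a normal approximation with tie correction. It is not the SPSS procedure used in the study, which may apply exact methods for small samples. The function name and the two example groups are hypothetical and do not reproduce study data.

```python
import math
from statistics import NormalDist


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U, p), where U is the smaller of the two U statistics.
    No continuity correction is applied; this is an illustrative sketch.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])

    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        ties = j - i
        tie_term += ties ** 3 - ties
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u = min(u_a, n_a * n_b - u_a)

    n = n_a + n_b
    mean_u = n_a * n_b / 2.0
    var_u = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all observations identical: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)  # z <= 0 because u is the smaller U
    p = min(1.0, 2.0 * NormalDist().cdf(z))
    return u, p


# Hypothetical completion times (min) for two groups of anastomoses
early_times = [52.0, 47.0, 45.0]
late_times = [30.0, 28.0, 31.0]
u_stat, p_value = mann_whitney_u(early_times, late_times)
significant = p_value < 0.05  # significance level of 5%, as in the study
```

With samples this small the normal approximation is coarse, so a statistical package with exact p-values is preferable for real data.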

Results

Clinical outcome

There were no postoperative leaks or collections, but complications occurred in four patients and included acute tubular necrosis (n = 1), cardiac arrhythmia (n = 1), pleural effusion (n = 1), pulmonary collapse (n = 1), and persistent jaundice after CJ (n = 1). One patient died on the tenth postoperative day for reasons unrelated to the anastomosis.

Proficiency gain

Proficiency of execution in terms of anastomosis completion time was achieved by the surgeon after 14 hand-sutured anastomoses (Table 2, web only) and is shown graphically in Fig. 1. Another measure of proficiency in execution is obtained from analysis of the number of movements necessary to perform the various component tasks of the procedure. This reached a plateau around the 13th or 14th anastomosis (p = 0.006). A more quantitative measure of productive (as distinct from nonproductive) movements is derived from the economy of movement index (EMI) given by:

EMI = observed number of movements to complete the procedure / ideal number of productive movements necessary to complete the procedure

Table 2 Duration of tasks (min), total task time (min), and total operative time (min) for all the anastomoses in chronological order (web version only)
Fig. 1 Improvement in execution time (task efficiency) across 20 consecutive anastomoses. Line drawn after log transformation
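The caption of Fig. 1 does not state which variable was log-transformed. As a hedged sketch only, the Python below assumes the case number was transformed and fits time = a + b × ln(case number) by ordinary least squares. The function name is hypothetical and the example times are synthetic values, not data from the study.

```python
import math


def fit_log_curve(times):
    """Least-squares fit of time = a + b * ln(case_number).

    Cases are numbered 1, 2, ... in chronological order.
    Returns (a, b); a negative b means times fall with experience.
    """
    if len(times) < 2:
        raise ValueError("at least two cases are needed for a fit")
    xs = [math.log(i) for i in range(1, len(times) + 1)]
    n = len(times)
    mean_x = sum(xs) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, times))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic series of 20 completion times (min), for illustration only
example_times = [60.0 - 12.0 * math.log(i) for i in range(1, 21)]
intercept, slope = fit_log_curve(example_times)
```

A curve of this form falls steeply over the first few cases and flattens later, which is the general shape expected as a plateau is approached.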

In this context, productive movements were defined as purposeful intended movements resulting in a positive result, i.e., a surgical step/task; as opposed to nonproductive movements, which achieved no discernible result or surgical step or which caused tissue damage.

The ideal value indicating perfect execution with maximal possible economy is 1.0. In the present study, the change in EMI with increasing experience by the surgeon is shown in Fig. 2. This demonstrates that his economy of movement improved from an EMI of around 6.0–7.0 to a consistent index of 2.0–3.0 after he had performed 14 anastomoses.
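The EMI calculation can be sketched in a few lines of Python. The function name and the movement counts below are hypothetical, chosen only so that the results fall in the ranges reported above; they are not counts from the study.

```python
def economy_of_movement_index(observed_movements, ideal_movements):
    """EMI = observed movements / ideal number of productive movements.

    A value of 1.0 represents perfect execution; larger values indicate
    more nonproductive movement.
    """
    if ideal_movements <= 0:
        raise ValueError("ideal_movements must be positive")
    return observed_movements / ideal_movements


# Hypothetical counts for one early and one late anastomosis
early_emi = economy_of_movement_index(350, 50)
late_emi = economy_of_movement_index(100, 50)
```

Here `early_emi` is 7.0 and `late_emi` is 2.0, so the later case used fewer movements relative to the ideal.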

Fig. 2 Economy of movement index (EMI) across 20 consecutive anastomoses. EMI is the observed number of movements to complete the task divided by the ideal number of productive movements necessary to complete the task. Perfect (ideal) performance is represented by a value of 1.0

Errors

The classification of total errors and their relative incidence for the procedures is shown in Table 3. More detailed information is obtained from the data on errors for each of the component tasks of the operations. The tasks (T) associated with high errors were T3 (mean 20), T4 (mean 20.5), T7 (mean 14.5), and T8 (mean 25.7)—all related to intracorporeal suturing. The error probability declined significantly (p = 0.006) with increasing experience (across the 20 consecutive anastomoses). The reduction in errors (analysis of estimated marginal means) with increasing experience is shown in Fig. 3.

Table 3 Classification and incidence of errors for entire procedures
Fig. 3 MANOVA analysis of estimated population marginal means, indicating a significant reduction in error probability after the 14th anastomosis (p = 0.006)

Although the total number of errors enacted during use of the high-frequency electrosurgical hook knife was relatively small (90 for all anastomoses, mean 4.5), tissue damage was encountered in 31 instances (34%) and included liver surface burns (n = 13), gallbladder surface burns (n = 3), gastric serosal superficial burn (n = 1), and omental/parietal burns (n = 14).

Quality of suturing

The metrics for suturing errors were based on depth of suture bites, interval between bites, needle orientation not being at right (RT) angles to the jaws of needle driver or away from the jaw tips, needle swivel, and noncircular needle passage (details have not been included but are available from A.C.).

The majority of errors occurred during performance of the anterior suture line (n = 25). There were 75 instances of tissue damage, with 9 needing repair. In line with improved proficiency, the instrument traffic during suturing declined with experience, and the reduction became significant after the 14th anastomosis (p = 0.0001).

Performance shaping factors

The predominant factor involved was concentration lapses (55%). Concentration lapses were identified by lack of demonstrable activity (productive or nonproductive) seen on the recording for more than 5 s. Other human factors were less frequent: misjudgment (9%), poor camera work (8%), and fatigue (5%). In only a small minority (1.4%) could no detectable cause for the error be established by the OCHRA analyses.
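The 5-s criterion for a concentration lapse can be expressed as a simple gap search. The Python sketch below assumes that each observed activity has been annotated with a timestamp in seconds from the start of the recording; that annotation format, the function name, and the example timestamps are assumptions for illustration, not part of the reported OCHRA method.

```python
def find_concentration_lapses(event_times_s, threshold_s=5.0):
    """Return (start, end) pairs where no activity was seen for more than threshold_s.

    event_times_s: timestamps (s) of observed activities, productive or not.
    """
    ordered = sorted(event_times_s)
    return [
        (previous, current)
        for previous, current in zip(ordered, ordered[1:])
        if current - previous > threshold_s
    ]


# Hypothetical activity timestamps (s) from a video annotation
activity_times = [0.0, 2.0, 9.5, 11.0, 16.0, 22.5]
lapses = find_concentration_lapses(activity_times)
```

For these timestamps the gaps of 7.5 s and 6.5 s are flagged, whereas the gap of exactly 5.0 s is not, because the criterion is "more than 5 s".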

Discussion

To our knowledge, this is the first detailed report on the operative proficiency of a surgeon immediately after completion of the higher (fellowship) training and commencement of independent (unsupervised) practice in another hospital. The data, of course, refer to an individual surgeon and to a specific operation.

Proficiency is best regarded as a composite function of knowledge and innate ability relevant to the task against skill acquisition and experience (reinforcement and time) in execution. The proficient surgical operator eventually reaches the plateau or proficiency plateau—semi-automatic perfect execution of an operation—as distinct from the controlled-conscious mode which operates during the training and proficiency-gain period. In the absence of hard data, we hypothesize that each operation has its own average/median proficiency slope, i.e., the number of operations required to reach proficient performance by the majority. There is no information on the nature of the distribution of the slope of the curve for the majority of surgical trainees for any given operation; but whatever its nature, two outliers (extremes) are foreseen: the very gifted (with high-level natural aptitude) with a high-gradient slope (i.e., requiring fewer cases than the norm) and those with a low-gradient slope (requiring more cases to reach proficiency, for a variety of reasons, including hand dominance, varying levels of eye–hand coordination abilities etc.).

The continued use of “learning curves” for specific operations (usually laparoscopic) in the surgical literature is regrettable, especially when adopted in otherwise sound studies involving factors that may influence acquisition of competence by trainee surgeons [6]. This terminology is inappropriate and confusing, especially when qualified by adjectives such as “steep,” implying that the operation is difficult but imparting no other useful feedback or analyzable information. In addition, the measures for evaluating learning/proficiency-gain curves have, to date, been exclusively clinical: operating times (a poor index of quality), conversion rates, morbidity, mortality, etc., and none has included errors and related ergonomic variables.

One study on the “learning curve” for percutaneous nephrolithotomy (PCNL), which was based on three parameters (operating time, fluoroscopic screening time, and radiation dose), reported “competence” for this procedure after execution of 60 cases; but attainment of “excellence” required considerably more reinforcement and experience, and was acquired only after execution of more than 115 such interventions [7]. This grading of proficiency into “competent” and “excellent” (compared with an experienced consultant who had done more than 1,600 PCNLs) is unsound. In that study, the stage classified as “competent” had a longer screening time and a larger radiation dose (manifestly still suboptimal compared with the expert), indicating that proficiency had not been reached, although the individual could do the operation. In this respect, proficiency for the individual performance of an operation is absolute (an all-or-none phenomenon), which is either reached or not reached. We also disagree with the view that the “learning curve” for the execution of a specific operation is never completely finished, as reported by Bollens et al. in relation to laparoscopic radical prostatectomy [8]. We do agree, however, that the proficiency slope declines (“tails”) before the plateau of proficiency is reached. This has been demonstrated semiquantitatively in the present study and has been previously reported in relation to laparoscopic cholecystectomy [9, 10].

Proficiency testing programs (PTPs) are used in approving/licensing analytical chemical and biochemical laboratories. There is evidence that such PTPs can be used to improve the performance and quality assurance of such laboratories [11]. In like manner, we believe that the laparoscopic cases performed by trainees as first surgeons, assisted by their tutors, should be recorded for subsequent critical but constructive review. In this exercise, the services of a human factors specialist, though helpful, are not essential, as an experienced senior surgeon can provide the necessary critical feedback to the trainee based on the videotapes. Despite the extra work involved, we recommend that regular (3-monthly) reviews of unedited tapes of the operations done by surgical trainees, with emphasis on those attended by postoperative complications, should become an integral component of the surgical curriculum. In contrast, scientific studies to establish the proficiency-gain curves for specific operations require a prospective OCHRA-type approach with involvement of human factors scientists working closely with surgeons, rather than inaccurate and misleading estimates derived from retrospective analysis of morbidity and operative times.