Teamwork and team performance in multidisciplinary cancer teams: development and evaluation of an observational assessment tool
  1. Benjamin W Lamb1,2,
  2. Helen W L Wong1,
  3. Charles Vincent1,
  4. James S A Green2,
  5. Nick Sevdalis1
  1. 1Department of Surgery and Cancer, Centre for Patient Safety and Service Quality, Imperial College London, London, UK
  2. 2Department of Urology, Whipps Cross University Hospital, London, UK
  1. Correspondence to Benjamin W Lamb, Department of Surgery and Cancer, Imperial College London, 5th Floor Medical School Building, St Mary's Hospital, London W2 1PG, UK; benjamin.lamb{at}


Aim Team performance is important in multidisciplinary teams (MDTs), but no tools exist for assessment. Our objective was to construct a robust tool for scientific assessment of MDT performance.

Materials and methods An observational tool was developed to assess performance in MDTs. Behaviours were scored on Likert scales, with objective anchors. Five MDT meetings (112 cases) were observed by a surgeon and a psychologist. The presentation of case history, radiological and pathological information, chair's effectiveness, and contributions to decision-making of surgeons, oncologists, radiologists, pathologists and clinical nurse specialists (CNSs) were analysed using descriptive statistics, a comparison of average scores (Mann–Whitney U) to test interobserver agreement, and intraclass correlation coefficients (ICCs) to further assess interobserver agreement and learning curves.

Results Contributions of surgeons, chair's effectiveness, presentation of case history and radiological information were rated above average (p≤0.001). Contributions of histopathologists and CNSs were rated below average (p≤0.001), and others average. The interobserver agreement was high (ICC≥0.70) for presentation of radiological information, and contribution of oncologists, radiologists, pathologists and CNSs; adequate for case history presentation (ICC=0.68) and contribution of surgeons (ICC=0.69); moderate for chairperson (ICC=0.52); and poor for pathological information (ICC=0.31). Differences between the observers' average scores were found only for case-history presentation (p≤0.001). ICCs improved significantly in the assessment of case history and oncologists, and were consistently high for CNSs, radiologists and histopathologists.

Conclusions Scientific observational metrics can be reliably used by medical and non-medical observers in cancer MDTs. Such robust assessment tools provide part of a toolkit for team evaluation and enhancement.

  • Multidisciplinary
  • neoplasm
  • team
  • assessment
  • observation
  • decision-making
  • healthcare quality improvement
  • human factors
  • shared decision-making
  • teamwork


Cancer care in many countries is now delivered by multidisciplinary teams (MDTs), which typically consist of surgeons, physicians, oncologists, radiologists, pathologists, clinical nurse specialists (CNSs) and palliative-care specialists.1–3 The MDT performs a variety of tasks including presentation and consideration of information that is relevant to the patient and to the treatment of their disease; discussion of the information; reaching a decision about a possible diagnosis; deciding on and recommending one or more treatment plans for a patient; and recommending further investigations as required to allow treatment to be planned. The forum for these tasks is the ‘MDT meeting,’ which occurs approximately once a week.4 The outcomes of an MDT can fall into one of several categories, including recommendation of one or more treatment plans for a patient; recommendation of further investigation/opinion in order to allow treatment to be planned; deferral of the case discussion or any of the decisions above to a later meeting owing to a lack of information; neither deferral nor decision. Good performance should take account of both the process (tasks) and the outcome, namely: comprehensive consideration of holistic information relating to the patient and their disease with open discussion between interested healthcare professionals; and recommendation of one or more treatments that are both clinically appropriate and acceptable to the patient, on the first presentation of the case at an MDT meeting.5 6 Many countries are adopting MDTs as the preferred structure for clinical decision-making in cancer care.1–3 A solid body of evidence shows that MDTs can bring about improvements in clinical care by consensual decision-making and good teamworking.7–10

Although MDTs can enhance patient care, decision-making and overall performance in these teams are variable.7 The existing evidence base reveals several factors affecting the performance of cancer MDTs. These include the presence of proper imaging, histopathological information and detailed consideration of comorbidities and patient preferences11–17; open and equal contribution of team-members18 19; and effective leadership.7 However, there are currently no structured or standardised methods for conducting a case discussion and no current requirements for minimum data sets for radiology, pathology, case history or any other information.3 Optimising these factors can enhance team performance by ensuring that decision-making is based on a consideration of complete information, all team-members have had a chance to take part in case discussions, and the decision made can be implemented.5 6

The lack of an agreed-upon standard extends into the existing methods of reviewing the performance of MDTs. Such processes typically involve subjective assessment by a clinical expert. In the UK, this is a mandatory process known as ‘peer review’ and involves gathering data largely focused on the organisation of cancer services. Little attention is paid to teamworking and quality of clinical decision-making—despite their importance in the management of patients with cancer.3 Some research studies have attempted to address this gap by using qualitative methods to assess teamwork in MDTs.18 19 Although useful for exploring novel research questions in depth, such methods are time-consuming and resource-intensive for practical clinical use.

Quantitative approaches to assess team performance are yet to be trialled in the context of cancer MDTs. In recent years, there have been advances in quantitative research into team performance in healthcare. Research has been translated from industries such as commercial aviation and the military20 21 to specialties such as surgery and anaesthesia, and reliable and valid tools for assessment and feedback to improve team performance have been developed and are currently in use.22 23 The aim of the research reported here was to build on research in other healthcare specialties and construct a robust (ie, reliable and valid) and feasible observational tool to assess team performance in MDTs. Specific objectives are:

  • to construct a robust tool for systematic assessment of MDT team performance;

  • to assess which aspects of MDT team performance can be validly and reliably assessed by observers;

  • to assess the usability of the tool by clinical and non-clinical observers (including observers' learning curves, ie, whether observers' reliability improves as their volume of observations increases).


Tool development

To ensure content and face validity, tool development proceeded in two phases:

In Phase 1, the evidence base on team performance in cancer MDTs was reviewed to ensure that the assessment tool captures all relevant aspects of team functioning.5 6 The review revealed that aspects of team performance can be organised into an input–process–output model, with component parts further grouped depending on whether they were technical or non-technical. This model has been described in more detail elsewhere5 6; the key aspects of team performance that emerged from the review are outlined below:

  • Information presentation to the team: coverage of all relevant domains for all patients in the discussion list.

  • Team leadership: aspects of effective and ineffective leadership and their effect on MDT decision-making.

  • Team decision-making processes: level of involvement of different professional groups; ability to reach and implement a decision.

Phase 2 involved the modification of an existing validated tool that assesses team performance to include the themes of cancer team functioning that emerged from Phase 1. The Observational Teamwork Assessment for Surgery (OTAS), a validated structured observational tool for use in operating theatre teams,23–25 was chosen as a basis for MDT observation. In OTAS, behaviours of operating theatre team-members are scored by an observer (clinician or psychologist) on Likert scales with reference to predefined observable anchor behaviours. OTAS thus captures a range of behaviours (including communication, leadership, team monitoring and others) of all members of a team (nurses, surgeons and anaesthetists) using systematic, objective anchors. OTAS remains one of the most comprehensive and best validated tools to assess teamwork within a healthcare context,23–25 and therefore in this phase of our tool development, we sought to replicate these aspects. A team of experts, comprising a consultant surgeon (JSAG), a patient safety expert (NS) and a health psychologist (HWLW), converted the themes from Phase 1 into behaviours applicable to cancer teams. An oncologist (HP) and a CNS (PA) were consulted to check the face validity of the chosen example behaviours. These behaviours, which members of MDTs could reasonably be expected to exhibit during an MDT meeting, were used as examples of optimum behaviour. Observed behaviours could be compared with the example behaviours (anchors). Five-point Likert scales were introduced to define a range of anchors. A score of 5 represents the evidence-based optimal behaviour, and a score of 1 a behaviour contrary to the defined optimum. A score of 3 was chosen as the midpoint of the scale, representing behaviours that are exhibited to some degree, but not consistently. Finally, scores of 2 and 4 were included to allow the observers to grade their observations. 
This approach to scoring via observation (low, medium and high anchors, plus scores that allow grading) has been used extensively in relation to team skills and behaviours,24 and also in relation to technical aspects of clinical performance.26 The behaviours included in the assessment tool were: presentation of radiological and pathological information, performance of the chair and contribution to team decision-making of surgeons, oncologists, radiologists, pathologists and CNS. Upon completion of the study, the wording of the behaviour anchors was modified to match the observation method. An abbreviated version of the tool is shown in figure 1.
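The anchored five-point scale described above can be represented as a small data structure. The following is only a minimal sketch: the class name `Rating`, the field names and the wording of the anchor summaries are hypothetical and are not taken from the tool itself, whose full anchors are given in figure 1.

```python
from dataclasses import dataclass

# Paraphrased summaries of the three anchored scale points (illustrative wording).
# Scores 2 and 4 have no anchor of their own; they let observers grade between anchors.
ANCHORS = {
    1: "behaviour contrary to the defined optimum",
    3: "behaviour exhibited to some degree, but not consistently",
    5: "evidence-based optimal behaviour",
}


@dataclass(frozen=True)
class Rating:
    """One observer's score for one behaviour in one case discussion."""
    case_id: int
    behaviour: str   # e.g. "chair" or "radiological information"
    observer: str    # e.g. "surgeon" or "psychologist"
    score: int       # integer from 1 to 5

    def __post_init__(self) -> None:
        if self.score not in (1, 2, 3, 4, 5):
            raise ValueError(f"score must be an integer from 1 to 5, got {self.score!r}")


def describe(score: int) -> str:
    """Return the anchor text for a score, or a note that it is an intermediate grade."""
    if score in ANCHORS:
        return ANCHORS[score]
    if score in (2, 4):
        return f"intermediate grade between anchors {score - 1} and {score + 1}"
    raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
```

Rejecting out-of-range scores at construction time keeps later summaries (means, ICCs) from silently absorbing data-entry errors.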

Figure 1

Multidisciplinary team observation assessment tool with behavioural markers for scale ratings (a full version is available from the authors upon request).


Data were collected from five MDT meetings (112 patients with cancer), attended by a total of 78 team-members, across three different MDTs at three separate hospitals in England. The observed MDTs were general urology MDTs.


Ethical approval for the study was given by the South East London 5 Research Ethics Committee. A surgeon of Registrar level (BWL) and a psychologist researcher with expertise in observing healthcare teams (HWLW) sat in and observed the MDT meetings. Oral informed consent was given by team members. The observers used the tool to rate team decision-making for every patient discussed and were kept blinded to each other's ratings. At the end of the observation period, data were collated for statistical analyses.

Data analyses

Mean and median, SD and range are reported for behavioural ratings. Differences in ratings of the various behaviours were assessed statistically using the non-parametric Kruskal–Wallis test. All behaviours were assessed against the scale midpoint (3) using a one-sample t test. The interobserver reliability was assessed statistically, first, by demonstrating that there is adequate correlation between the scores awarded by the observers across cases (using intraclass correlation coefficients (ICCs) and 95% CIs)27; and second, by comparing the average score per behaviour awarded by each observer (non-parametric Mann–Whitney U tests). Finally, to assess improvement in tool utilisation over time, observed cases were grouped into cohorts of 10 and ICC calculated for each cohort. Improving ICCs would demonstrate learning curves in tool usage by the observers.
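To make the reliability analysis concrete, the sketch below computes an intraclass correlation coefficient from a cases-by-observers table of scores and splits an ordered list of cases into cohorts. It rests on assumptions: the text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), is chosen here purely for illustration, and the function names `icc_2_1` and `cohorts` are hypothetical. The analyses in the study itself were run in SPSS.

```python
from typing import List, Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` has one row per case and one column per observer.
    Assumption: this ICC form is illustrative; the source does not name one.
    """
    n = len(ratings)       # number of cases
    k = len(ratings[0])    # number of observers
    if n < 2 or k < 2:
        raise ValueError("need at least two cases and two observers")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: cases (rows), observers (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denom


def cohorts(cases: Sequence, size: int = 10) -> List[list]:
    """Split cases, in observation order, into consecutive cohorts of `size`.

    Assumption: a final shorter cohort is kept; the source does not say
    how a remainder was handled.
    """
    return [list(cases[i:i + size]) for i in range(0, len(cases), size)]


# Illustrative (made-up) scores: each row is a case, columns are two observers.
example = [[4, 4], [3, 2], [5, 5], [2, 3], [4, 5], [1, 1], [3, 3], [5, 4], [2, 2], [4, 4]]
per_cohort_icc = [icc_2_1(chunk) for chunk in cohorts(example, size=10)]
```

Computing the coefficient separately for each cohort, in observation order, gives the sequence of values from which a learning curve can be read.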

All statistical analyses were performed using SPSS version 17.0. Significance was taken at the 0.05 level, and Bonferroni correction was used to correct for multiple tests.
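The Bonferroni correction mentioned above divides the significance level by the number of tests in a family. The sketch below illustrates the arithmetic only. The family size of nine (one test per assessed category) is an assumption made for the example, since the text does not state how many tests were grouped, and the function names are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value falls at or below the corrected threshold."""
    return p_value <= bonferroni_alpha(alpha, n_tests)


# Assumption for illustration: nine tests, one per assessed category.
threshold = bonferroni_alpha(0.05, 9)   # 0.05 / 9
```

Under this assumed family size, a reported p≤0.001 remains significant after correction, whereas a p value of 0.09 does not.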


Descriptive information on the meetings observed is given in table 1. Between 18 and 34 cases were discussed at each meeting. The meetings were busy, with an average of 19 cases discussed per hour (average of 3.2 min spent on team discussion per case). Cases were presented by a range of MDT members, predominantly consultant surgeons and oncologists. The majority of the time for each case was taken with information presentation, and often discussion occurred concurrently with presentation of information.

Table 1

Descriptive data of multidisciplinary team meetings observed

Ratings of quality of information presentation and team-members' contributions

Table 2 summarises the observers' ratings. Both observers (surgeon and psychologist) rated the various aspects of information presentation and team-member contribution in the same rank order. Significant differences were observed between different information presentation and team members' contributions (Kruskal–Wallis, p<0.05) indicating that the observers rated some behaviours significantly higher than others.

Table 2

Observers' ratings across all categories

Regarding the quality of presented information to the team, case history information was rated highest by both observers (observers' mean=3.93, SD=0.89), followed by radiological information (observers' mean=3.56, SD=1.72) and pathological information (observers' mean=3.03, SD=1.27). Regarding each team-member's contribution to discussion, surgeons were scored highest (observers' mean 4.05, SD=1.26) and CNS lowest (observers' mean=1.60, SD=1.07), with other team members in between (p<0.05).

Ratings were tested with a one-sample t test to see whether they were significantly different from the scale midpoint (3)—higher than 3 indicates good teamworking; lower than 3 indicates poor teamworking; and non-significant differences indicate average teamworking. The contribution of the surgeons and the behaviour of the chair, as well as presentation of case-history information, and of radiological information were rated above average (p≤0.001), whereas the contributions of histopathologists and CNS were rated below average (p≤0.001). Presentation of pathological information, and contribution of oncologists and radiologists were rated average (p=0.78; p=0.77; p=0.09 respectively).

Inter-rater reliability

Table 2 also summarises the observers' ICCs. The ICCs obtained were very high (>0.70) for five of the nine assessed categories, and close to adequate for rating of case-history information (0.68) and contribution of surgeons (0.69). Rating of the quality of presentation of pathological information reached only poor reliability (0.31). Finally, reliability for assessment of the chairperson was moderate (0.52).

Table 2 also presents analyses of average differences in the scoring between the two observers. These revealed no significant differences in eight of the nine categories, demonstrating that the tool allows both observers to score consistently, without one being significantly more or less lenient than the other. Differences were obtained for case-history information, where the surgeon observer gave significantly lower scores than the psychologist.

Observers' learning curves

Figure 2 displays ICCs for cohorts of 10 cases in each category of observation. The plots are annotated with a solid line at ICC=0.00 and a dotted line at ICC=0.70. Reliability coefficients below the solid line (≤0.00) indicate serious disagreement; observations between the solid and the dotted line (≥0.00 and ≤0.70) indicate some agreement, and observations above the dotted line (≥0.70) indicate very good agreement between surgeon and psychologist observers.
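The three bands used to read the plots can be written as a simple classification rule. This is only a sketch: the function name `agreement_band` is hypothetical, and because the stated ranges share their end points (0.00 and 0.70 each appear in two bands), the handling of values exactly on a boundary below is an assumption.

```python
def agreement_band(icc: float) -> str:
    """Map an ICC value to the band used for the learning-curve plots.

    Boundary handling is an assumption: exactly 0.00 is treated as
    "serious disagreement" and exactly 0.70 as "very good agreement".
    """
    if icc <= 0.0:
        return "serious disagreement"
    if icc < 0.70:
        return "some agreement"
    return "very good agreement"


# Overall ICCs reported in the text for three of the categories.
reported = {
    "pathological information": 0.31,
    "chairperson": 0.52,
    "case-history information": 0.68,
}
bands = {name: agreement_band(value) for name, value in reported.items()}
```

Applied to the overall coefficients quoted in the text, all three of these categories fall in the middle band, which is consistent with their description as poor, moderate and close to adequate rather than very good.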

Figure 2

Graphs depicting intraclass correlation coefficients (ICCs) for cohorts of 10 cases in each category of observation. The solid line marks ICC=0.00, and the dotted line ICC=0.70. Reliability coefficients below the solid line (≤0.00) indicate serious disagreement; those between the solid and the dotted line (≥0.00 and ≤0.70) indicate some agreement; and those above the dotted line (≥0.70) indicate very good agreement.

The two observers showed good agreement in their assessment of CNS, radiologists and histopathologists. From the other behaviours, learning was noted in the assessment of case history information and oncologists. Assessment of surgeons and radiological information showed a more mixed pattern, with some good agreement, but not consistent improvement over time. Finally, the observers did not reach adequate agreement for the presentation of pathological information and the contribution of the chair.


Cancer care MDTs need to be able to assess their own performance objectively and reliably. This study demonstrates that observational metrics can be used reliably by medical and non-medical observers to assess a range of aspects of MDT performance. The two observers statistically agreed on ratings of quality of information presentation and team members' contribution, with the exception of the rating of the performance of the MDT chair. Reliable assessment was demonstrated for seven of the nine aspects of the MDT performance (case history and radiological information; contribution of surgeons, oncologists, radiologists, histopathologists and CNS).

As these findings come from the first study of its kind, they are scientifically very encouraging. The poorer reliability in the assessments of the chair's contribution requires further clarification, as this is an important aspect of MDT working. Assessment of quality of presentation of pathological information was not reliable, suggesting that assessment of this information may be feasible only for medically trained observers. Finally, reliability was consistently high for CNS, radiologists and histopathologists; learning curves were evident for oncologists and case-history information but mixed for surgeons and presentation of radiological information. Taken together with our research group's extensive experience of training observers to assess behaviours in healthcare teams,23–25 these findings offer evidence that MDT behaviours can indeed be observed robustly in real time.

These findings are subject to certain limitations. First, the sample of MDTs was small and potentially unrepresentative of MDTs in general. However, with three different hospitals in the study and variation in the number and length of team discussions, these MDTs were mixed and representative of urological MDTs. We do recognise, of course, that although certain aspects of MDT working are shared across tumour types, other aspects of urological MDTs are peculiar to this specialty. For example, cases of renal disease tend to be staged without biopsies, with consequently little input from histopathologists. Furthermore, many patients referred with suspected cancer are not seen by the CNS before the initial MDT discussion, which limits the potential for involvement of the CNS at the team discussion. Once patients are given their diagnosis, the CNS is usually very involved. Further research is needed to discover whether our findings are applicable across tumour types. Second, at times, the meetings were extremely busy, and the case discussions were rapid. Closer inspection of figure 2 reveals a consistent ‘dip’ across most behaviour ratings in cohort 4. This cohort consists of observations carried out in meetings, in which 21 MDT members discussed 25 patients. In such conditions, it is difficult for an observer to capture and rate all aspects of a case discussion. This has been acknowledged as a limitation of all observational assessment of teamwork.28 An additional limitation that affects all observational research is the interpretation of silence.29 It is difficult for observers to rate whether a lack of input is constructive or not. For clarity in this study, a low mark was recorded. However, in reality, such silence during an MDT meeting may be necessary to permit ordered discussion, and thus further investigation of warranted and unwarranted silences in these meetings is required. 
Further, the unsystematic presentation of information and contribution to team discussion that was often witnessed made data capture difficult. Information on pathology or radiology was often presented within the case history, not by the relevant professional (ie, the histopathologist or the radiologist), making it difficult for the non-medical observer to tease out. The presence of the psychologist observer was intended to demonstrate whether an expert in human behaviour (rather than a clinical expert) would be able to capture human interaction in MDTs, and it remains for subsequent stages of the research to replicate these findings with non-domain expert observers. Finally, the correlations between observers for assessment of the chair were variable. One possible explanation is that the chair is the only member of the MDT with a dual role (as clinician and chair), the two parts of which were at times difficult to separate for the purposes of observation. However, a learning curve was apparent across the course of the study, which implies that the observers can eventually learn to separate these two functions.

The lack of standardisation of case discussion and team decision-making in MDT meetings was apparent during the course of this study. The observers found that information was presented in an unsystematic manner, and discussions did not always include all team-members. This was evident from the observers' assessment of the contribution of CNS—which were typically low. Future research should document current methods for conducting case discussions and common data sets for radiology, pathology, case history and other information, and assess the effect of more structured ways of working on MDT performance. This may have double benefits: both improving MDT performance and facilitating development of tools for team assessment and feedback. In addition, this study has demonstrated that a non-medical observer can capture many aspects of MDT working. Future research should investigate the effect of training on the ability of non-clinicians to assess MDT working—the type of training that is most effective (eg, simulation vs live observation), and the amount of it that is necessary to achieve good reliability across all aspects of team performance.30

Robust measures of cancer MDT performance will allow identification of factors that improve performance as well as those which are potentially detrimental.23 We envisage that future research should attempt to link team performance with clinical processes in cancer care—and potentially patient outcomes, though these are so multifactorial that direct associations may be difficult to detect.10 From a team development perspective, research on surgical teams has shown that structured assessment for formative feedback using evidence-based tools is well received by teams, who want to know how best to improve the ways they work.31 In the same way, the tool developed here could be used by members of MDTs, or by assessors from outside the team, as a vehicle to inform teams about some of their own strengths and weaknesses in relation to team working. A key benefit of the tool, compared with current systems such as peer review in the UK is that it offers a science-driven, standardised and transparent approach via well-defined behaviours that are evidence-based and a behaviourally anchored scoring system based on these behaviours. MDTs are a typical incarnation of a multiteam system, as described in the team literature.32 33 The interactions within and between the teams that make up the MDT, as well as between the MDT and the other teams in the care pathway, should be explored in order to define and improve teamwork across the entire cancer care pathway, with the ultimate goal of enhancing patient safety.


Healthcare providers and policy makers are seeking cost-effective and decentralised ways for healthcare teams to self-monitor and improve their performance. In this study, we have piloted an observational tool for use in multidisciplinary cancer teams and demonstrated content validity, face validity, feasibility and interobserver agreement. This alone, without further interventions (such as team training, team building or others) may help to highlight problems and issues in a systematic and transparent manner—such that some teams will be able to resolve these issues internally. Other teams may require further support, in the form of training or other interventions.34 With further validation, this observational assessment may therefore provide the first phase in a staged approach to assessing and improving teamworking within cancer MDTs.


H Payne and P Allchorne were consulted during the course of this study, and we are grateful for their help.



  • Funding This research was supported by the UK's National Institute for Health Research through the Imperial Centre for Patient Safety and Service Quality and Whipps Cross University Hospital NHS Trust R&D Department.

  • Competing interests None.

  • Ethics approval Ethics approval was provided by the South East London 5 REC.

  • Provenance and peer review Not commissioned; externally peer reviewed.