Insightful practice: a reliable measure for medical revalidation
  1. Douglas J Murphy1,
  2. Bruce Guthrie1,
  3. Frank M Sullivan1,
  4. Stewart W Mercer2,
  5. Andrew Russell3,
  6. David A Bruce4
  1. 1Quality, Safety and Informatics Research Group, University of Dundee, Dundee, UK
  2. 2Institute of Health and Wellbeing, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK
  3. 3Medical Directorate, NHS Tayside, Dundee, UK
  4. 4Postgraduate General Practice Education, NHS Education for Scotland, UK
  1. Correspondence to Dr Douglas Murphy, Senior Clinical Research Fellow, University of Dundee, Mackenzie Building, Kirsty Semple Way, Dundee DD2 4BF, UK; d.y.murphy{at}


Background Medical revalidation decisions need to be reliable if they are to reassure on the quality and safety of professional practice. This study tested an innovative method in which general practitioners (GPs) were assessed on their reflection and response to a set of externally specified feedback.

Setting and participants 60 GPs and 12 GP appraisers in the Tayside region of Scotland, UK.

Methods A feedback dataset was specified as (1) GP-specific data collected by GPs themselves (patient and colleague opinion; open book self-evaluated knowledge test; complaints) and (2) Externally collected practice-level data provided to GPs (clinical quality and prescribing safety). GPs' perceptions of whether the feedback covered UK General Medical Council specified attributes of a ‘good doctor’ were examined using a mapping exercise. GPs' professionalism was examined in terms of appraiser assessment of GPs' level of insightful practice, defined as: engagement with, insight into and appropriate action on feedback data. The reliability of assessment of insightful practice and subsequent recommendations on GPs' revalidation by face-to-face and anonymous assessors were investigated using Generalisability G-theory.

Main outcome measures Coverage of General Medical Council attributes by specified feedback and reliability of assessor recommendations on doctors' suitability for revalidation.

Results Face-to-face assessment proved unreliable. Anonymous global assessment by three appraisers of insightful practice was highly reliable (G=0.85), as were revalidation decisions using four anonymous assessors (G=0.83).

Conclusions Unlike face-to-face appraisal, anonymous assessment of insightful practice offers a valid and reliable method to decide GP revalidation. Further validity studies are needed.

  • Audit and feedback
  • continuous quality improvement
  • general practice
  • governance
  • medical education
  • evaluation methodology

This is an open-access article distributed under the terms of the Creative Commons Attribution Non-commercial License, which permits use, distribution, and reproduction in any medium, provided the original work is properly cited, the use is non-commercial and is otherwise in compliance with the license.


Revalidation of practising doctors has prompted a wave of worldwide interest and remains a high-stakes challenge.1 Doctors' capacity to self-regulate has been questioned,2 but the measurement of quality of patient care is complex and agreement on a UK revalidation system has been problematic and implementation repeatedly delayed (currently scheduled for introduction from late 2012). Unfortunately, there is a sparse evidence base to inform its implementation.3 Understandably, the public and government want clinically effective, safe and person-centred care delivered by competent and, ideally, excellent doctors.4 In the UK, the domains and attributes required of Good Medical Practice have been defined (box 1).5

Box 1

General Medical Council domains and attributes of a doctor for appraisal and revalidation

Domain 1: knowledge, skills and performance

  • 1. Maintain your professional performance.

  • 2. Apply knowledge and experience to practice.

  • 3. Ensure that all documentation (including clinical records) formally recording your work is clear, accurate and legible.

Domain 2: safety and quality

  • 4. Contribute to and comply with systems to protect patients.

  • 5. Respond to risks to safety.

  • 6. Protect patients and colleagues from any risk posed by your health.

Domain 3: communication, partnership and teamwork

  • 7. Communicate effectively.

  • 8. Work constructively with colleagues and delegate effectively.

  • 9. Establish and maintain partnerships with patients.

Domain 4: maintaining trust

  • 10. Show respect to patients.

  • 11. Treat patients and colleagues fairly and without discrimination.

  • 12. Act with honesty and integrity.

Revalidation aims to promote quality improvement as well as demonstrate that a doctor is up to date and fit to practise.5 Current proposals in the UK include an annual appraisal to check the quantity and quality of workplace and continuous professional development data collected over a 5-year cycle.6 Satisfactory completion will lead to recommendation by an appointed Responsible Officer to the General Medical Council (GMC) for successful revalidation.6 This moves appraisal from its current focus on supporting professional development to judging evidence.7 Two issues need to be considered. First, continuous professional development has at its heart a practitioner's ability to self-assess his or her educational needs. However, difficulties in recognising one's own (in)competence can lead to inflated or pessimistic self-assessments.8 Second, there is no evidence that assessment at appraisal of this type is reliable enough for use in a process as high-stakes as revalidation.9 As a possible alternative, formal examinations, such as those used by the American Board of Medical Specialties, could be used for revalidation in the UK, but knowledge on its own is unlikely to measure all the professional attributes of a doctor.10

To protect patients and ensure trust in doctors, we argue that we need a system of revalidation that is valid, reliable and supports reflective practice. Medical professionalism has been defined as a partnership between patient and doctor based on mutual respect, individual responsibility and appropriate accountability.11 This definition formed the rationale for a new concept tested in this study: insightful practice. Insightful practice was defined as doctors' willingness to engage with and show insight into independent credible feedback on their performance and, where applicable, take appropriate action for improvement.

The aim in promoting insightful practice was to help individuals build beyond the conscientious collection and reflection of evidence to include independently verified outcomes for professional improvement. A doctor's professionalism and suitability for revalidation would be evidenced by testing his or her levels of insightful practice by measuring his or her willingness to engage with revalidation (responsibility and accountability); to show insight12 ,13 into external feedback on his or her performance (mutual respect); and take action as needed to improve his or her patient care (partnership, responsibility and accountability). The study design took account of GMC attributes4 and was further underpinned by GMC guidance to Post-Graduate Deans and GP Directors on professional remediation.14 The GMC guidance advises that remedial training is only a practicable solution if a doctor demonstrates insight into his or her deficiencies and accepts that a serious problem exists, and that a remedial training programme can only be successful with the doctor's willingness and commitment.14 In addition, the same guidance advises that, when deciding whether the doctor is suitable for remedial training, the panel should consider whether the doctor has insight into and is willing to address the problem.14

The purpose of this study was to test if:

  • 1) Specified independent feedback (box 2) could validly cover necessary GMC attributes (box 1)15

  • 2) Participants' level of insightful practice offered a reliable basis for making recommendations on revalidation.

Box 2

Study's suite of independent feedback

Personal feedback

  • 1. Colleague (clinical and non-clinical) feedback: multi-source feedback.

  • 2. Patient feedback: patient satisfaction questionnaires.

  • 3. Open book self-evaluated knowledge test.

Team feedback

  • 4. Clinical governance data: prescribing safety and quality of care data.

  • 5. Patient complaints.


Included here is a summary of the methods. More information is available as a data supplement in the web appendices 1 and 2.16 ,17

In this study, recruited general practitioners (GPs) collected a suite of specified feedback on their performance. At the start and end of the study, participants completed a mapping exercise to test how far they agreed that the content of the specified sources of feedback was valid. Participants received an appraisal from a GP colleague approved by the Health Board to help demonstrate their insightful practice by showing appropriate reaction to collected feedback. Doctors' success in showing insightful practice was subsequently assessed by the face-to-face appraiser and then again by three anonymous assessors. The reliability of assessment of insightful practice (AIP) and subsequent recommendations on GPs' revalidation by face-to-face and anonymous assessors was investigated using Generalisability G-theory.9 Decision (D) studies were conducted to determine the number of assessors required to achieve a reliability of 0.8, as required for high-stakes assessment.9

Participants and sample size calculation

Sixty-one participants were recruited from all GPs (n=337) within the National Health Service in Tayside in Scotland. Three information meetings were held, in different geographical locations, at the end of which GPs signed a register to confirm their interest in taking part. A consent form was then sent to each participant along with a covering letter and study information sheet. Participating GPs received financial reimbursement: equivalent to 17 h extra payment per GP participant in addition to existing reimbursement for participation in the Health Board's existing statutory annual appraisal system. This additional payment was to allow for the estimated additional time commitment to collect the study's multiple sources of evidence on more than one occasion. The power calculation was based on Fisher's ZR transformation of the intraclass correlation coefficient.9 Given a required reliability intraclass correlation coefficient R of 0.8 for a high-stakes assessment of portfolios,9 specified SE of the reliability of 0.05 and three assessors of each subject, Fisher's ZR transformation specified a minimum of 46 subjects.
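The sample size figure above can be checked with a short calculation. The sketch below is one reading of the method, not necessarily the authors' exact procedure: it assumes the large-sample variance of Fisher's Z for an intraclass correlation, Var(Z) ≈ k / (2(k−1)(n−2)), and takes the SE on the Z scale as the distance between the transformed values of R and R minus one SE. The function names are illustrative only.

```python
import math


def fisher_z_icc(r, k):
    """Fisher's Z transformation of an intraclass correlation r with k raters."""
    return 0.5 * math.log((1 + (k - 1) * r) / (1 - r))


def min_subjects(r, se, k):
    """Smallest number of subjects giving the specified SE around reliability r.

    Assumption: the SE on the Z scale is the gap between Z(r) and Z(r - se),
    and Var(Z) is approximated by k / (2 * (k - 1) * (n - 2)).
    """
    se_z = fisher_z_icc(r, k) - fisher_z_icc(r - se, k)
    n = 2 + k / (2 * (k - 1) * se_z ** 2)
    return math.ceil(n)


# Values stated in the text: R = 0.8, SE = 0.05, three assessors per subject
print(min_subjects(0.8, 0.05, 3))  # 46
```

Under these assumptions the calculation gives 45.6 subjects, which rounds up to the stated minimum of 46.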

Performance measures and data collection

The study appraisal process was facilitated by a website called Tayside In-Practice Portfolio developed to administer, collect and assess all participant data,18 making the allocation of tasks and feedback feasible. GPs were asked to collect specified data (patient and colleague feedback including complaints) and were also provided with feedback on their practice team's quality of care and prescribing safety (table 1). GPs were then asked to reflect on this specified suite of feedback in a portfolio to be submitted for appraisal.

Table 1

Summary of tools used and processes followed*

Content validity of feedback

To test content validity, that is, whether the proposed suite of feedback covered the required GMC attributes,5 each participant completed a mapping exercise recording his or her perception (prestudy) and experience (poststudy) of each feedback tool's capacity to test the GMC attributes (see online appendix 1).

Study steps: reflection, appraisal and assessment

Step 1: Mapping exercise 1: June–July 2009 (online appendix 1).

This measured participant prestudy perceptions of the specified suite of feedback (table 2).

Step 2: Collection of specified feedback: July–September 2009.

Study participants were provided with data via the study website including:

  • a. Colleague and patient feedback (existing available tools)

  • b. Report on undesirable co-prescriptions (developed for study)

  • c. Quality outcome framework data (currently used in UK General Practice System of Remuneration).

Some additional data were personally collected by participants:

  • d. Patient complaints

  • e. Self-evaluated knowledge test: developed by the Royal College of General Practitioners.

Step 3: Reflection on feedback and setting personal objectives for improvement (September–October 2009).

Having reflected on their performance feedback, participants used a reflective template with four 7-point Likert scales to rate each source of feedback data as having:

  • 1) Highlighted important issues

  • 2) Demonstrated concern in performance

  • 3) Led to planned change

  • 4) Given valuable feedback.

GPs then wrote a free-text commentary and framed any planned actions as Specific, Measurable, Achievable, Relevant and Timed (SMART) objectives (table 2).23

Step 4: Participants then received a face-to-face appraisal under the existing appraisal system, after which they had the opportunity to amend or add any personal objectives (October–December 2009).

Step 5: Assessment of participants' level of insightful practice by face-to-face appraiser postappraisal (October–December 2009).

Following the appraisal, the GP's appraiser rated the GP using an AIP template with four 7-point Likert scales. These related to GPs' engagement with the appraisal process, insight into the data collected, planning of appropriate action in response, and a global rating of their engagement, insight and action as a marker of GPs' insightful practice. Additionally, the appraiser was asked to assess whether the GP was ‘on track for revalidation’ (table 2).

Step 6: Anonymous assessment of participants' level of insightful practice by three additional appraisers was completed postappraisal by the same process as in step 5 (October–December 2009).

Step 7: Mapping exercise 2: November 2009–January 2010 (online appendix 1).

This measured participant experience post study of the specified suite of feedback.

Table 2

Rating questions completed by general practitioner (GP) participants (preappraisal), by appraisers (after face-to-face appraisal) and by anonymous web-based portfolio assessors


The reliability of insightful practice as a measure was calculated using Generalisability G-theory following a web-based anonymous marking exercise after appraisal.9 Anonymous assessors were recruited from study appraisers (n=5) and included one Deanery assessor. Two groups of assessors (n=3) each marked 30 GP portfolios (raters nested within group). Reliabilities (internal consistency and inter-rater) of anonymous assessor decisions for AIP (Questions 1–3) and inter-rater reliabilities, intraclass correlation coefficients, and the associated CIs were calculated for AIP Questions 4 and 5 using Generalisability G-theory.9 Decision (D) studies were conducted to determine the number of assessors required to achieve a reliability of 0.8, as required in high-stakes assessment9 (see online appendix 2).
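The logic of a D study can be sketched with a simplified calculation. The example below is not the authors' analysis: it assumes a single-facet design in which reliability for n assessors follows the Spearman-Brown relation, whereas the study nested raters within groups. It starts from the two coefficients reported in this article (G=0.85 with three assessors for the global rating, G=0.83 with four assessors for the revalidation decision), back-calculates the implied single-assessor reliability, and then finds the smallest panel reaching 0.8. All function names are illustrative.

```python
def single_rater_reliability(g, n_raters):
    """Implied reliability of one assessor, given G for the mean of n_raters."""
    return g / (n_raters - (n_raters - 1) * g)


def projected_reliability(r1, n_raters):
    """Reliability of the mean of n_raters assessors (Spearman-Brown)."""
    return n_raters * r1 / (1 + (n_raters - 1) * r1)


def raters_needed(r1, target=0.8):
    """Smallest number of assessors whose projected reliability meets target."""
    n = 1
    while projected_reliability(r1, n) < target:
        n += 1
    return n


# Coefficients reported in the article (simplified one-facet assumption)
r1_global = single_rater_reliability(0.85, 3)        # global AIP rating
r1_revalidation = single_rater_reliability(0.83, 4)  # revalidation decision

print(raters_needed(r1_global))        # 3
print(raters_needed(r1_revalidation))  # 4
```

Under this simplification, two assessors fall just short of 0.8 for the global rating and three fall just short for the revalidation decision, which agrees with the panel sizes of three and four reported in the article.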

Participant experience

Participants' evaluation of the provided suite of feedback was investigated by comparing four groups:

  1. GPs with a satisfactory score (4 or above) in insightful practice.

  2. GPs with an unsatisfactory score (<4) in insightful practice.

  3. Face-to-face appraisers

  4. Anonymous assessors.

Mean scores for each participant's rating of the value of each source of feedback were calculated and any significant differences between participant groups (1–4) examined using ANOVA with post hoc testing of differences.


Included here is a summary of the results. More information is available as a data supplement in the web appendices 1 and 2.

In all, 61 GP participants were recruited to the study. Of these, 60 were established independent GPs and one was a GP practice locum practitioner. Participants worked in a range of urban (n=48), accessible (n=9), and remote (n=3) practices.24 Overall, 60 GPs (98.4%) completed the study, with one dropping out after completing an initial content validity (mapping) exercise.

Mapping exercise

GP participants completed a mapping exercise of their perception (prestudy) and experience (poststudy) on each feedback tool's capacity to test the GMC attributes5 (see online appendix 1).

Results for the poststudy mapping exercise are given in table 3.

Table 3

Mean general practitioner (GP) ratings of perceived ability of each feedback tool (columns) to assess the 12 General Medical Council (GMC) attributes (rows) after feedback received. Scale (1–7) for each GMC attribute, with a score of 4 as the neutral point*

Mean GP scores in the mapping exercise (1–7) for each GMC attribute (row) and tool (column) are given in table 3 with a score of 4 as the neutral point. All GMC attributes were covered (score>4) by at least one tool.

Reliability of participants' AIP as measured by face-to-face and anonymous assessors

There was a highly significant difference in the mean scores of global AIP (Q4) with face-to-face assessment scoring more highly than anonymous assessment (mean difference 1.07, 95% CI 0.73 to 1.41, t=6.29, 59 df, p<0.001). Dichotomous judgment on GPs' suitability for revalidation (AIP Q5) also revealed significant differences between face-to-face and anonymous assessment. No portfolio was considered unsatisfactory at face-to-face assessment, while 42/180 (23.3%) of the three anonymous markings of each of the 60 portfolios were considered unsatisfactory (χ²=16.97, p<0.001). Face-to-face appraisal did not discriminate between GPs and therefore could not be classed as reliable. In contrast, high reliability was demonstrated by anonymous global assessment by three assessors (G=0.85) of GPs' insightful practice. A recommendation on GPs' suitability for revalidation was also highly reliable by four assessors (G=0.83) (table 4, online appendix 2).
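The comparison of unsatisfactory judgements can be reproduced from the counts given above. The sketch below assumes a Pearson chi-square test without continuity correction on a 2×2 table (0 of 60 face-to-face judgements versus 42 of 180 anonymous markings unsatisfactory); the text does not state which variant was used, and treating the 180 markings as independent observations is part of that assumption. The function name is illustrative.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Rows: face-to-face, anonymous; columns: unsatisfactory, satisfactory
chi2 = pearson_chi2_2x2(0, 60, 42, 138)
print(round(chi2, 2))  # 16.97
```

The statistic matches the reported value, which suggests that the uncorrected Pearson test on these counts is what the figure reflects.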

Table 4

Reliability of assessment of insightful practice (AIP) questions 1–5

Participant experience

The four groups of participants rated the suite of five feedback sources positively (mean value rating over all feedback tools for each participant group above a neutral score of 4), with anonymous assessors giving significantly higher ratings than other groups (mean 5.4 vs 4.7–4.9, p=0.05) (table 5).

Table 5

Mean scores for reflective template questions (1–4) for feedback sources for each group (n=4)



This study demonstrates that a valid suite of independent feedback covering necessary GMC attributes can be created for use in GP appraisal and revalidation. Doctors' insightful practice, measured by GPs demonstrating accountability for making quality improvement where needed, offers a reliable basis for a recommendation on revalidation.


A system of revalidation is needed that is valid and reliable.23 Revalidation goals appear to include restoring public trust, promoting quality improvement and identifying doctors in difficulty, but there is a sparse evidence base to inform the introduction of an agreed system.3 This is the first study of which we are aware to formally use medical professionals' insightful practice as a proxy of workplace-based performance and to include a form of knowledge testing, an element of competency testing demanded by the Shipman Inquiry.2 Study methods were robust and the tested system included recently developed and innovative reliable indicators on high risk prescribing for participants to reflect on practice improvement.23


This work contributes to the limited evidence in this important area for both public and profession.3 ,25 The proposed role of insightful practice is to act as the hub within a continuous cycle to generate, monitor and maintain objective evidence of personal responsibility and accountability for quality improvement as needed (figure 1).

Figure 1

Cycle of insightful practice.

The nature of reflective practice makes its quantification a challenge.26 Doctors' capacity to show insight, overcome challenges and incorporate new behaviours and attitudes has previously been described as a fundamentally personal and subjective concept called mindful practice.27 Reflection and facilitation are known to prove useful for the assimilation of feedback and acceptance of change.28 Insightful practice is arguably a useful conceptual development, which both lends itself to the reliable measurement of objective outcomes and combines the subjective consideration of self-perceptions with the reflections and facilitation by others on needed insight for improvement. In addition, the study's combination of feedback from multiple methods, reflection and mentoring is consistent with the call for innovation in assessing professional competence and shows how assessment instruments might be used together to promote performance improvement.29 ,30 By placing a focus on productive reflection (engagement and insight) and needed action (life-long learning and appropriate response to performance feedback), measurement of insightful practice may also offer an answer to the call for innovation in measuring professionalism to cover previously poorly tested areas of seeking and responding to feedback and results of audit.31

A challenge for revalidation will be whether the system benefits all doctors, while still identifying those at risk of poor performance. If adopted, the tested system could meet this challenge by early and reliable identification of doctors' level of, and progress with, improvements in care, as well as allowing the monitoring of progress towards satisfactory revalidation. The collection of specified data is feasible if spread over the proposed 5-year cycle. The role, frequency and targeting of appraisal would need further consideration should such a system be implemented, with a possible reduction in ongoing scrutiny and support for those doctors shown to be ‘on track’. In cases of unsatisfactory progress, early identification would give maximum opportunity to target professional support (figure 1). No system guarantees identification and protection from criminal behaviour, but valid and reliable external monitoring should help to reassure the public about the quality and safety of their doctors. The participant mapping exercise gave evidence of content validity of the specified feedback. The subsequent agreement between participants that the suite of feedback was of value added further face validity to the system. Anonymous assessors' significantly higher rating of the value of the suite of feedback possibly reflected its help in quantification and discrimination of those assessed. It is interesting that opinions of patient feedback and of self-evaluated knowledge testing feedback both improved significantly with experience.


This study had limitations and there is a need for significant further research. The assessors' role and process of making judgements in a ‘live’ system of revalidation will need to be explicit to inform further research. While many health professionals believe that more objective is equivalent to better, this is not always the case. Much research in medical education has suggested that expertise is not always characterised by comprehensiveness. As a result, assessment processes that are scored by simple frequency counts of whether or not particular actions were taken tend to be less valid indicators of performance than more subjective global ratings provided by informed raters.32 This concept underpinned this study's investigation of insightful practice as a possible foundation for revalidation recommendations.

While reliabilities reported in this study were generalised across assessors, using G-theory and associated D studies,9 the participants were limited to GPs in a single region of Scotland. Future research needs to focus on the capacity of insightful practice to offer a reliable and valid measure of performance across other settings and specialties, as the measurement properties of every instrument are specific to the population on which the instrument is tested.9

In addition, although the literature supports insightful practice as a proxy measure for successful performance improvement,11–14 the construct validity of this was not possible to test within this study. Engagement in appraisal is needed to promote improved clinical management,33 and GMC recommendations on remediation highlight the importance of insight and capacity to address problems.14 Although there is evidence that well-founded and well-planned change is still a reasonable surrogate for successful implementation,34 it was not possible in this study to track whether GPs' SMART personal objectives were carried through.23 This requires further research to demonstrate.


The real test of revalidation will be whether its introduction leads to improvement in the quality and safety of healthcare. Further research will be needed, but public trust in doctors requires them to be held to account for their own performance and urgent progress is long overdue. The appraiser's role in revalidation could lie anywhere between coach, educational advocate and supporter at one end, and assessor accountable for revalidation and the quality of its outcome at the other. This study's findings suggest that a single face-to-face appraiser is unlikely to be able to make a valid or reliable judgement about fitness for revalidation, but that anonymous measurement of insightful practice offers an alternative platform from which a robust system of revalidation could be developed and implemented.


We thank all the general practitioners and general practice appraisers who took part in the study; programmers Jill JeanBlanc and Keith Milburn who helped develop materials and the study website; and Selene Ross who acted as the study administrator.


  • Funding The study was funded by the Chief Scientist Office (CSO) Scottish Government, Royal College of General Practitioners (RCGP), NHS Education for Scotland (NES) and Scottish Patient Safety Research Network (SPSRN). DM, BG and FS are employed by University of Dundee. SM is employed by the University of Glasgow; AR is employed by NHS Tayside, and DB by NHS Education for Scotland. All authors had full access to all the data and agreed responsibility for the decision to submit for publication independently from any funding source. DM is supported by a Primary Care Research Career Award from the Chief Scientist Office, Scottish Government.

  • Competing interests None.

  • Ethics approval Formal application and submission of the research proposal was made and ethical approval granted for all of the work contained in this paper by the Tayside Committee on Medical Research Ethics A. Participants gave informed consent before taking part.

  • Provenance and peer review Not commissioned; externally peer reviewed.