Background As quality improvement (QI) programmes have become progressively larger scale, the risks of implementation having unintended consequences are increasingly recognised. More routine use of balancing measures to monitor unintended consequences has been proposed to evaluate overall effectiveness, but in practice published improvement interventions hardly ever report identification or measurement of consequences other than intended goals of improvement.
Methods We conducted 15 semistructured interviews and two focus groups with 24 improvement experts to explore the current understanding of balancing measures in QI and inform a more balanced accounting of the overall impact of improvement interventions. Data were analysed iteratively using the framework approach.
Results Participants described the consequences of improvement in terms of desirability/undesirability and the extent to which they were expected/unexpected when planning improvement. Four types of consequences were defined: expected desirable consequences (goals); expected undesirable consequences (trade-offs); unexpected undesirable consequences (unpleasant surprises); and unexpected desirable consequences (pleasant surprises). Unexpected consequences were considered important but rarely measured in existing programmes, and an improvement pause to take stock after implementation would allow these to be more actively identified and managed. A balanced accounting of all consequences of improvement interventions can facilitate staff engagement and reduce resistance to change, but has to be offset against the cost of additional data collection.
Conclusion Improvement measurement is usually focused on measuring intended goals, with minimal use of balancing measures, which, when used, typically monitor trade-offs expected before implementation. This paper proposes that improvers and leaders should seek a balanced accounting of all consequences of improvement across the life of an improvement programme, including deliberately pausing after implementation to identify and quantitatively or qualitatively evaluate any pleasant or unpleasant surprises.
- healthcare quality improvement
- quality measurement
- qualitative research
- patient safety
- implementation science
Unintended consequences with negative or positive effects on care processes and outcomes can occur with any change in complex systems like healthcare organisations,1–3 and so are an important potential problem in quality improvement (QI).4–6 More routine use of balancing measures to account for and manage unintended consequences of improvement interventions is recommended by a number of organisations.7–10 The Institute for Healthcare Improvement (IHI), for example, describes measurement in improvement programmes in terms of process and outcome measures focused on delivering predefined intended benefits, and balancing measures in terms of negative unintended consequences in other parts of the healthcare system (box 1).7 8 Reflecting this perspective, hospital readmission rates are often used as a balancing measure for interventions aiming to reduce the length of hospital stay, since it is plausible that shortening length of stay could mean discharging patients who are then unable to manage at home.11–13
Box 1 Institute for Healthcare Improvement (IHI) recommended types of measures7 8
‘Use a balanced set of measures for all improvement efforts: outcomes measures, process measures, and balancing measures.
Outcome Measures: How does the system impact the values of patients, their health and wellbeing? What are impacts on other stakeholders such as payers, employees, or the community?
Process Measures: Are the parts/steps in the system performing as planned? Are we on track in our efforts to improve the system?
Balancing Measures (looking at a system from different directions/dimensions): Are changes designed to improve one part of the system causing new problems in other parts of the system?’
Adapted from IHI (text is verbatim quote but examples are omitted and text is renumbered).
Despite calls for a more systematic accounting of all side effects of improvement interventions,14 15 a number of systematic reviews have shown that balancing measures appear to be rarely used or reported in practice. A review of the application of Plan Do Study Act (PDSA) methods found that only 6 (6.4%) of 94 included studies reported any ‘disconfirming observations’ about the intervention,16 and only 1 of 100 included studies in a systematic review of perioperative care improvement interventions reported an ‘unfavourable or unintended sign, symptom or event’.17 These findings are consistent with other reviews, including one of the applications of improvement methodologies in surgery which found that none of 34 included studies reported on unintended consequences,18 and another where only 1 of 121 studies of interventions to reduce patient falls and catheter-associated infections measured any unintended consequences.19 Several other studies in the latter review provided anecdotal evidence of ‘unexpected occurrences’,19 but robust evaluation of such claims is rare in improvement programmes more generally.20 There is additionally little evidence that improvers routinely consider the potential for unexpected consequences postimplementation,21 and the amount of missing data about outcomes other than goals is often significant.22 23
The aim of this paper is to explore current understanding of balancing measures in healthcare improvement, including the range of consequences that could, or should, be considered to inform a more balanced accounting of the overall impact of improvement interventions.
Design and participants
The research was carried out in two phases. In the initial phase, semistructured interviews were used to formulate a draft conceptual framework for considering all consequences of improvement. In the second phase, focus group interviews were used to refine and elaborate the framework and to consider its wider applicability.
We used purposive sampling to include a broad spectrum of stakeholders with expertise in metrics and measure design in healthcare QI or relevant clinical and/or academic experience in improvement implementation. Participants in both phases of the study included improvement advisors, clinical academics, providers of health and social care services, policymakers and patient representatives identified from relevant publication records and major conferences on QI, members of QI groups, online searches of open-access information and research teams’ networks and contacts. Participants were largely based in Scotland, where comprehensive healthcare, which is free at the point of care, is provided to all residents by the taxpayer-funded National Health Service (NHS). Digital maturity of the system varies, with all primary care practices exclusively using electronic medical records (EMR) with widespread electronic data sharing (including, for example, primary care sharing of data for hospital use in an emergency care summary, electronic transmission of letters and discharge summaries, and automated laboratory results transmission), but hospitals being at various stages of EMR implementation. NHS Scotland has invested significantly in staff training in improvement and introduced a number of centrally led national safety and QI programmes,24 largely (but not exclusively) based on the IHI Model for Improvement. Additional participants with particular expertise or known interest in measurement were purposively recruited from England and the USA. All participants were actively involved in service improvement across various settings including social care, mental health, public health, medicine for the elderly, maternity, neonatal and paediatric care.
Data collection and analysis
Phase 1: semistructured interviews to formulate the framework
Twelve face-to-face semistructured interviews and three telephone interviews, each lasting approximately 1 hour, explored participants’ understanding of balancing measures as part of a broader discussion about QI methods in health and social care. Individual interviews followed a topic guide based on the published literature and two pilot interviews. Data were analysed according to the principles of the framework approach25 by developing codes and categories from the transcripts and grouping them into a preliminary coding matrix. The Diffusion of Innovation literature26–30 was used to reinterpret the initial matrix and generate a more structured framework reflecting participants’ conceptualisation of balancing measures. The researcher who conducted the interviews (MT) coded all transcripts, and a second experienced researcher (BG) reviewed a selection of transcripts and the emerging framework to refine the coding.
Phase 2: focus group interviews to refine the framework
Two focus groups were conducted to explore the current understanding of balancing measures in QI and to elaborate the framework generated in phase 1. The draft framework was shared in a briefing paper prior to the focus group meeting and was used to inform initial discussions within the groups. Focus groups were facilitated by two experienced moderators (MT and BG), lasted about 75 min each and took place on a single day. Interviews and focus groups were conducted in a non-directive manner, with participants encouraged to talk openly and with relative freedom to steer the discussion. The main researcher kept a journal with field notes reflecting on the research process, including prior assumptions that might have influenced the findings. Data were analysed using an iterative and stepwise process. The framework developed in phase 1 was used as a coding matrix in the analysis. Codes from focus group transcripts were grouped into subthemes, which were then allocated to one of the domains of the initial framework. One researcher (MT) coded all data and the wider team met regularly to reach consensus on the final framework structure, discuss additional categories and resolve any disagreement.
All interviews and focus group data were audio-recorded, transcribed verbatim and analysed using NVivo V.11.
Semistructured interviews with 15 participants and 2 focus groups with 24 participants (two of whom were also interviewed in phase 1) were completed. Participants had a wide range of roles in improvement and implementation science. Thirty-two participants came from Scotland, four from England and one from the USA (table 1).
Phase 1: semistructured interviews
Identifying key themes and concepts
When asked about their overall understanding of balancing measures, participants initially emphasised negative consequences of improvement in other parts of the healthcare system, paralleling the IHI definition.
‘My understanding is that a balancing measure is essentially something that you put in place because you recognise that often you can go in with the best of intent to improve an issue, you can deliver the improvement but you just end up creating more problems somewhere else.’ (Improvement advisor)
Specific examples were again typically framed negatively, often as ‘adverse’ or ‘knock-on’ effects. Some of these were described as predictable from the outset, and measured routinely in the local improvement context.
‘The mental health safety programme has balancing measures around recovery, about being very clear that one way of improving safety could lead to less positive risk-taking, which would be a very negative unintended consequence. We always use the Scottish recovery indicator, making sure that we promote recovery-oriented practices and we’re not clamping down on folk.’ (Mental healthcare provider)
Other negative consequences were described as only emerging as a potential problem after initial implementation, requiring improvers to be sensitive to the possibility of harm, and to be ready to ask themselves ‘right, what are we going to put in place to measure these adverse effects and see whether the improvement is actually causing any harm?’ (Academic and public health specialist), in order to inform further investigation or action.
‘Work to increase rates of early discharge and reduce length of stay led to patients being discharged into inappropriate conditions which in turn caused an increase in costs and readmission rates (…) That should be a wee bit of a red flag for you to think ‘why is everybody coming back? Are they coming back in because of surgical site infections or because you didn’t get their medicines reconciliation right on discharge?’ (…)’ (Improvement advisor)
Less commonly, participants described unanticipated positive or beneficial consequences. Although they were often uncertain whether these could be considered ‘balancing measures’ since they did not balance the benefits of improvement, they were highly valued by those who had experience of them.
‘A QI initiative aimed at improving writing and reading skills in secondary schools led to a reduction in absence rates as a result of better students’ engagement with different activities across the school (…) It was actually quite surprising and certainly a delightful outcome that we can now flip into a new piece of work to support children to become more engaged across their whole learning journey.’ (Provider of social care services)
However, in practice, the use of balancing measures was perceived to be rare in large-scale healthcare improvement programmes.
‘Most safety programmes haven’t paid much attention to balancing measures. From forty-nine pages of measures [in a safety improvement programme], there’s probably only two or three balancing measures like readmission rates, average length of stay or reintubation rates when reducing the time patients spend on a ventilator after surgery (…)’ (Policymaker health and social care)
Formulating the framework
In summary, when first asked about balancing measures, participants typically started from the position that measures should be implemented to assess undesirable unintended consequences of improvement work. However, their subsequent description of balancing measures also included unanticipated desirable consequences, and considerable discussion of the extent to which all consequences were predictable from the outset. Drawing on the Diffusion of Innovation literature,26–30 we developed an initial framework that describes the range of consequences that improvement could have, in terms of their desirability and the extent to which they were anticipated when planning improvement.
Four types of consequence were defined at this stage and described as goals, trade-offs, classic negative unintended consequences and serendipities (figure 1, sent to phase 2 participants before the focus groups).
Phase 2: focus group interviews
Mapping key themes and concepts
Similar to the individual interviews, focus group participants initially described balancing measures in terms of trade-offs, that is, negative unintended consequences of QI that were expected from the outset.
‘A lot of potential consequences are known at the start. ‘Oh, we need to actually count that, it will be an interesting balancing measure’. In a recent project focused on improving growth by early enteral feeding and maximise use of parenteral nutrition, the rates of necrotising enterocolitis and community-acquired bloodstream infections had reasonable potential for a balancing measure.’ (Provider of neonatology services)
However, as in the individual interviews, participants discussed several examples when undesirable consequences only became apparent after implementation, with examples from the same area of care targeted by improvement, as well as other parts of the wider system.
‘Inducing pregnant women at 40 weeks aimed to decrease the risk of stillbirth and newborn death but led to the use of extra interventions such as continuous fetal monitoring (…) which in turn increased costs and decreased overall patient satisfaction. Also woman who had a serious medical need for an induction could not get on the schedule because all of the hospital beds were occupied by women being electively induced.’ (Provider of maternal and infant healthcare)
Participants also mentioned desirable unintended consequences referring to ‘serendipitous side effects or bonuses which are not planned as original programme outcomes’ (academic and primary care provider), which they said were important to consider in order to obtain a balanced view of the overall impact of improvement interventions.
‘The Book Bug sessions were established to strengthen attachment between parents and children by encouraging them to share and enjoy books together. One of the measures, which wasn’t a balancing measure in the first instance but turned into one, was an increased interest from parents to improve their own literacy, bearing in mind that they had a young child that would need supported through school.’ (Public health specialist)
However, even when unintended consequences were clearly identified, concerns were raised about the difficulty of creating or implementing a fully balanced set of measures, since data were not usually available from the outset unless routinely accessible from an existing source.
‘I think we struggle with balancing measures. We always know we should think about them beforehand, but don’t know how to deal with what comes up during the project (…) I think in safety we probably talk more about negative expected consequences, and the unexpected ones are the tip of the iceberg stuff (…) I don’t think we become aware of them very often and we tend to then think ‘oh it would have been nice to have data on that at the beginning’. (…) they almost feel like a missed opportunity.’ (Academic capacity building)
Barriers and facilitators to using balancing measures
In terms of measure design, the majority of interviewees found the distinction between ‘process’, ‘outcome’ and ‘balancing’ measures in some of the improvement literature confusing, since balancing measures could relate to processes and outcomes depending on the context.
‘We tend to be quite prescriptive about the family of measures and putting things into baskets of process and outcome and balancing measures is not always helpful. I don’t think we pay enough attention to balancing measures and I’m not sure whether they’re the right ones either (…) Readmission rates and average length of stay are balancing measures, but they could also be outcomes or processes that we might measure.’ (Academic and palliative care provider)
Participants broadly perceived balancing measures to be important and relatively underused but reflected on the increasing burden of data collection in already resource-constrained systems.
‘The time that we spend collecting or looking for data is time we don’t spend delivering patient care, so there’s a cost to this. Having balancing measures could be disproportionately expensive (…) just one of those things when measures are added on and on and nothing’s changing. You’re just collecting for the sake of collecting. You need to consider these measures very carefully or it’s a waste of peoples’ time.’ (Provider of geriatric healthcare)
However, there was a general agreement that engaging those involved in delivering care in the choice and design of measures from the outset would likely lead to better understanding of the rationale for measuring and could help minimise the burden of data collection.
‘If the work is owned by the frontline staff, if it’s their piece of improvement and if they’ve developed their own balancing measures then they’re not going to think that measurement is too onerous in the same way as other would if they don’t understand why they’re measuring.’ (Policymaker education and early years)
More importantly, the overall process of considering unintended consequences and implementing balancing measures was perceived to have value in its own right in terms of improving staff engagement with improvement and overcoming resistance to change.
‘You find a lot of latent resistance because people are genuinely worried about an unintended consequence and they don’t engage in the work. You can introduce your checklist and it is fantastic, but it really annoys the staff because ‘this is just going to take up a huge amount of time’ (…) Using a balancing measure can convince your communities that improvement is needed and could be a goodwill builder if people know that you’re monitoring and taking their concerns seriously.’ (Academic community engagement)
Refining the framework
Figure 2 shows a revised version of the framework that takes account of focus group findings, including the language used (eg, ‘expected’ rather than ‘anticipated’). Desirability was described as a clear dichotomy, but expectations were perceived as more of a spectrum. While an initial measurement plan can define consequences expected from the outset (goals and trade-offs), participants thought that improvement programmes might need to plan for a ‘pause’ after implementation to account for unexpected consequences, both desirable and undesirable. Participants disliked the language of ‘serendipities’ and ‘classic negative unintended consequences’, so these categories were renamed. The four types of consequences in the revised framework (figure 2) were therefore:
- improvement goals: the expected and desirable consequences of the improvement programme, defined by the initial measurement plan
- improvement trade-offs: the expected but undesirable consequences of the improvement programme, which are implicitly believed to be smaller in magnitude than the goals (and so an acceptable compromise)
- pleasant surprises: unexpected and desirable consequences emerging after implementation
- unpleasant surprises: unexpected and undesirable consequences emerging after implementation
All four consequences can be measured using either process or outcome measures and can arise in the same area of care targeted by improvement, or elsewhere in the health and social care system.
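For readers who maintain a measurement plan electronically, the two dimensions of the framework can be sketched as a simple classification. The following is an illustrative sketch only and is not part of the study: the names Consequence, ConsequenceType and classify are hypothetical, and treating expectation as a yes/no value is a simplifying assumption, since participants described expectation as more of a spectrum. The example consequences are taken from the participant quotes reported above.

```python
from dataclasses import dataclass
from enum import Enum


class ConsequenceType(Enum):
    """The four types of consequence in the revised framework (figure 2)."""
    GOAL = "improvement goal"
    TRADE_OFF = "improvement trade-off"
    PLEASANT_SURPRISE = "pleasant surprise"
    UNPLEASANT_SURPRISE = "unpleasant surprise"


@dataclass(frozen=True)
class Consequence:
    """Hypothetical record of one consequence of an improvement programme."""
    description: str
    desirable: bool  # desirability was described as a clear dichotomy
    expected: bool   # assumption: a yes/no simplification of a spectrum


def classify(consequence: Consequence) -> ConsequenceType:
    """Map desirability and expectation onto the four framework types."""
    if consequence.expected:
        if consequence.desirable:
            return ConsequenceType.GOAL
        return ConsequenceType.TRADE_OFF
    if consequence.desirable:
        return ConsequenceType.PLEASANT_SURPRISE
    return ConsequenceType.UNPLEASANT_SURPRISE


# Examples drawn from the interviews and focus groups reported above.
examples = [
    Consequence("Reduced length of hospital stay", desirable=True, expected=True),
    Consequence("Increased readmission rates", desirable=False, expected=True),
    Consequence("Reduced school absence rates", desirable=True, expected=False),
    Consequence("Beds unavailable for medically needed inductions",
                desirable=False, expected=False),
]

for item in examples:
    print(f"{item.description}: {classify(item).value}")
```

In this sketch, consequences flagged as expected would belong in the initial measurement plan, whereas those flagged as unexpected could only be recorded during a pause after implementation.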
Summary of findings
Participants started by discussing balancing measures in terms of undesirable consequences which were expected before or early in implementation (trade-offs) and which could offset some of the intended benefits of improvements (goals). Although a range of examples were discussed, most participants agreed that such measures were relatively rarely used. Participants additionally emphasised that many consequences only became apparent after implementation, and these unexpected consequences could be either desirable or undesirable (pleasant or unpleasant surprises) and could accrue in the same part of the system as the improvement work, or other parts. There was frequent confusion as to what a balancing measure should measure, since the implication of many existing framings7 8 is that balancing measures are distinct in some way from process and outcome measures, rather than any type of consequence being measurable in terms of processes and outcomes. Involving front-line staff in identifying unintended consequences and balancing measure design was perceived to increase engagement with improvement and reduce resistance to change. Balancing measures were seen as a necessary and integral part of evaluating the impact of an improvement programme, as well as a pragmatic way of engaging sceptics constructively by understanding their legitimate concerns around implementation. However, the value of designing and implementing balancing measures has to be offset against their cost in the context of overall measurement burden.
Strengths and limitations of the study
A strength of the study is that it drew on both empirical data from a purposively wide range of stakeholders and existing literature on unintended consequences. A limitation is that the sample was largely recruited from Scotland, which may limit generalisability. However, NHS Scotland has a history of centrally led and broadly successful efforts to introduce system-wide improvement interventions, most commonly based on the IHI Model for Improvement, including training and implementation of national safety programmes in acute hospitals, mental healthcare and primary care.24 Participants therefore had experience of a number of improvement programmes to draw on. However, because implementation of electronic medical records in hospitals is limited, the data used in national improvement programmes currently consist almost entirely of bespoke data collected by clinical staff, and perceptions of the burden of data collection will at least partly reflect this. Findings were consistent across the diverse range of stakeholders (including those outside Scotland), and we believe that the measurement issues faced by improvement programmes in Scotland are likely to be relevant in other countries and systems worldwide.
Comparison with existing literature
The existing improvement literature on measurement design emphasises the importance of developing a balanced set of measures during the planning of an improvement programme,7–10 31–33 often distinguishing between process and outcome measures for goals, and balancing measures for expected undesirable consequences (trade-offs) which are easily predictable from the outset (box 1). However, participants in this study found this framing too narrow because they were concerned about unexpected undesirable consequences (unpleasant surprises) and valued unexpected desirable consequences (pleasant surprises), neither of which could be defined prior to intervention implementation.
Although there are some studies of trade-offs,34–36 and pleasant 37 and unpleasant 38 39 surprises (table 2), published improvement interventions rarely report data relating to unintended consequences.15–19 40 This may partly reflect publication bias, since authors are known to emphasise positive results and ‘tuck away’41 negative contextual features and failures.23 However, it also likely reflects a more general lack of consideration or measurement of unintended consequences, consistent with an observed preoccupation with measuring prespecified local processes and outcomes (goals).42–44 The implementation of PDSA cycles in healthcare, for example, has been criticised for often involving an oversimplified ‘Do, Do, Do’ approach15 focused on ‘little and often’ measurement and delivery of goals at the expense of thinking ahead and looking to the future (for trade-offs) and reflecting on potential hazards during implementation (for surprises).45 46
Implications for improvement programme design
Balancing measures are an integral and core element of commonly used improvement models like the IHI Model for Improvement,7 8 but they are sometimes poorly specified and do not appear to be commonly implemented in practice.15–19 40 Based on the literature and the findings of this study, we believe that rather than focusing on balancing measures to implement at the start of improvement, improvers and leaders at all levels of management should consider how best to achieve a balanced accounting of the overall impact of improvement across the life of a programme. This requires consideration of all four types of consequence, any of which can be measured in terms of process and outcome (figure 2). Such a balanced accounting of impact can be achieved by articulating clear assumptions and formulating explicit predictions for both goals and trade-offs before implementation,14 40 47 and having a planned improvement pause after implementation to deliberately step back from goal delivery to take stock and reflect on potential surprises.46 48 In an ideal world, improvers would consult the available evidence base and seek external input from key stakeholders in order to identify potential trade-offs, speculate on and investigate potential surprises, and if necessary, to design relevant process and outcome measures to account for them. However, improvement takes place in resource-constrained environments, which will confine what is possible, including, for example, the feasibility of measurement in other areas of a complex system. Focusing on a balanced accounting rather than balancing measures also emphasises that qualitative methods have much to offer both for the identification of trade-offs before implementation, and for understanding surprises after implementation where retrospective measurement may be difficult.49 50
Implications for reporting QI projects
Few improvement reports mention unintended consequences, despite the Standards for Quality Improvement Reporting Excellence (SQUIRE) guidance14 including a requirement that reporting should include ‘unintended consequences, such as unexpected benefits, problems, failures or costs associated with the intervention’ (standard 13e). Notably, however, the SQUIRE explanation and elaboration for this standard51 focuses more on exploring variation in implementation effectiveness and does not provide any examples of significant elaboration of unintended consequences. As the volume of publications in QI is growing, it would be helpful to modify SQUIRE to clarify that improvement reports should report any measured or qualitatively assessed unintended consequences, or state that these were not assessed, in order to contextualise any evidence presented about the achievement of improvement goals.
This study is largely based on analysis of data from interviews carried out in Scotland, which has an integrated single-payer healthcare system and relatively well-developed QI infrastructure.52 However, improvement interventions in complex systems will often result in unintended consequences irrespective of context, so we believe that the conclusions apply more widely, although the ability of improvers to evaluate or measure unintended consequences will vary, being lower in more fragmented healthcare systems. Overall, the evidence is that improvement programme measurement is usually focused on evaluating intended goals, with minimal use of balancing measures, which typically monitor trade-offs expected before implementation. We conclude that a more balanced accounting of the effects of improvement should consider goals and predictable trade-offs early in the design of an improvement programme, and also pause to take stock of pleasant and unpleasant surprises after a period of implementation.
This work was undertaken by and on behalf of the Scottish Improvement Science Collaborating Centre (SISCC). We are grateful to our improvement experts for their time and willingness to cocreate the ‘balanced accounting’ framework. We also acknowledge Julie Anderson and Anita Lee who provided invaluable feedback on the revised versions of this paper.
Contributors MT and BG were responsible for planning the study and led the data collection and analysis. TD, NMG and DC contributed to data analysis. MT drafted and led the writing of the manuscript. BG, TD, NMG and DC participated in critically appraising and revising the intellectual content of the manuscript. All authors read and approved the final manuscript.
Funding This paper presents core work from the Scottish Improvement Science Collaborating Centre, funded by the Scottish Funding Council, Health Foundation, Chief Scientist Office, and NHS Education Scotland with in-kind contributions from participating partner universities and health boards. The funding bodies had no role in the design of the study, in the collection, analysis and interpretation of data, or in writing the manuscript.
Competing interests None declared.
Patient consent Detail has been removed from this case description/these case descriptions to ensure anonymity. The editors and reviewers have seen the detailed information available and are satisfied that the information backs up the case the authors are making.
Ethics approval The University of Dundee Research Ethics Committee granted ethical approval for this study (UREC 15069).
Provenance and peer review Not commissioned; externally peer reviewed.
Data sharing statement The data used and/or analysed during the current study are available from the corresponding author on reasonable request.