Abstract
Background Large-scale improvement programmes are a frequent response to quality and safety problems in health systems globally, but have mixed impact. The extent to which they meet criteria for programme quality, particularly in relation to transparency of reporting and evaluation, is unclear.
Aim To identify large-scale improvement programmes focused on intrapartum care implemented in English National Health Service maternity services in the period 2010–2023, and to conduct a structured quality assessment.
Methods We drew on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews guidance to inform the design and reporting of our study. We identified relevant programmes using multiple search strategies of grey literature, research databases and other sources. Programmes that met a prespecified definition of improvement programme, that focused on intrapartum care and that had a retrievable evaluation report were subject to structured assessment using selected features of programme quality.
Results We identified 1434 records via databases and other sources. Fourteen major initiatives in English maternity services could not be quality assessed due to lack of a retrievable evaluation report. Quality assessment of the 15 improvement programmes meeting our criteria for assessment found highly variable quality and reporting. Programme specification was variable and mostly low quality. Only eight reported the evidence base for their interventions. Description of implementation support was poor and none reported customisation for challenged services. None reported reduction of inequalities as an explicit goal. Only seven made use of explicit patient and public involvement practices, and only six explicitly used published theories/models/frameworks to guide implementation. Programmes varied in their reporting of the planning, scope and design of evaluation, with weak designs evident.
Conclusions Poor transparency of reporting and weak or absent evaluation undermine large-scale improvement programmes by limiting learning and accountability. This review indicates important targets for improving quality in large-scale programmes.
- health services research
- healthcare quality improvement
- health policy
- obstetrics and gynecology
- women's health
WHAT IS ALREADY KNOWN ON THIS TOPIC
Large-scale improvement programmes are a key strategy for addressing unwarranted variations in quality and safety of care, but their impact is mixed and often limited.
Previous research suggests a number of features of improvement programmes that need to be optimised, but how well these quality criteria are routinely met remains unknown.
WHAT THIS STUDY ADDS
Many large-scale maternity improvement initiatives in the English National Health Service—including some major national programmes of the last decade—lack an evaluation report.
Where an evaluation report was available, the quality and design of programmes against prespecified criteria were highly variable, often demonstrating significant weaknesses.
HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY
Poor transparency of reporting and weak evaluation in large-scale improvement programmes undermine learning and accountability; explicit attention to features of quality is necessary to improve the design, conduct and impact of large-scale programmes.
Introduction
Variations in quality and safety of healthcare have remained troubling and persistent across health systems globally. Efforts to address these challenges often take the form of large-scale improvement programmes,1–4 including, for example, multiorganisational collaborative approaches, major initiatives commissioned by policy and professional bodies, implementation programmes and research projects. These programmes are variably effective, with often disappointing results.5–10 Clarity is, however, now emerging on some of the key features of ‘what good looks like’ for such programmes.4 10–13 In this article, we report a study that both identifies large-scale improvement initiatives in a clinical area experiencing major patient safety challenges, and offers a structured quality assessment of improvement programmes where an evaluation report was retrievable.
The available literature suggests that a number of features are especially important in large-scale improvement programmes. First, such programmes should be well specified and reported14–16 to ensure shared understanding of what the programme comprises and its mechanisms.2 10 17 A second feature of high-quality improvement programmes is that the interventions they use and their delivery should be supported by best available clinical evidence.11 18–21 Third, high-quality programmes should recognise and meet the requirements for implementation support in participating organisations.2 22–25 Such support needs to be sensitive to the highly heterogeneous nature of local capability, which has been implicated in variable responses to improvement programmes,2 10 16 22–24 with lower performing organisations having distinctive support needs.11 13 25–31 Fourth, consistent with published policy objectives,32–34 programmes should explicitly address inequalities between socioeconomic and ethnic groups.35 Fifth, patient and public involvement (PPI) has an important role in enhancing the impact of improvement efforts.36–42 Sixth, improvement programmes benefit from use of formal published theories, models and frameworks from implementation science to guide their work.43 Finally, an important feature of good improvement programmes is a commitment to sound evaluation,11 20 21 including, where possible, assessment of effectiveness, process evaluation and economic evaluation.44–46
The extent to which large-scale improvement programmes routinely meet these seven criteria is unclear. Many programmes, including those commissioned or delivered by national-level organisations, are conducted in a context where expectations of some features (eg, programme specification) may be insufficiently explicit, and incentives for high-quality evaluation and reporting may be lacking.14 15
English National Health Service (NHS) maternity services are an important example of where quality problems are especially prominent in public discourse47–49 arising from high-profile organisational failures,50–53 evidence of persistent unwarranted variation in outcomes,54–56 rising clinical negligence claims,49 culture and workforce challenges57 58 and inequalities linked to socioeconomic status and ethnicity.33 35 59–63 These challenges have not yet proved tractable, despite multiple large-scale improvement initiatives.64 65 Maternity services are therefore an important setting in which to assess quality of large-scale improvement programmes, particularly in relation to their reporting and evaluation.
We aimed to identify large-scale improvement programmes that had been implemented in English NHS maternity services between 2010 and 2023 and, for those with an available evaluation report, to conduct a structured quality assessment based on the selected features identified above.
Methods
Design
Our design was a review with two components: a search for large-scale improvement initiatives implemented in English NHS maternity services in the period 2010–2023 and, for those that met definitional criteria as improvement programmes and had a retrievable evaluation report, a structured quality assessment. We focused specifically on programmes that primarily addressed quality or safety of intrapartum care, which has been consistently implicated in variations in adverse clinical outcomes in maternity care.66–68 Initiatives primarily focused on antenatal or postnatal care were therefore not in scope.
Our initial exploratory work found that the majority of maternity improvement initiatives were not research projects and had not been reported in the academic literature; relevant information was mostly available in diverse sources such as websites, policy documents and programme reports. To ensure that both our search and our assessment of programmes were nonetheless structured and systematic, our review was informed by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension for Scoping Reviews guidance69 (online supplemental material A). We also completed all five steps in the Arksey and O’Malley framework,70 although not sequentially.
As our review was not intended as a full scoping review of research, we did not register the review in an online database; we did, however, produce a protocol that was used to guide the conduct of the review (online supplemental material B).
Eligibility criteria
We developed prospective criteria to guide the identification of eligible improvement programmes and sources of evidence.
Identification of improvement programmes
Initial scoping identified that a large number of highly heterogeneous improvement efforts had taken place in English NHS maternity services in the period we were studying. The following types of initiatives, strategies and interventions were excluded from our review: incident investigation and inspection programmes; national clinical audits and confidential enquiries (eg, the Mothers and Babies: Reducing Risk through Audits and Confidential Enquiries (MBRRACE) programme)71; organisational restructuring, major system change and service transformation programmes; health technology assessments and trials of digital technologies without an existing evidence base; clinical guidelines or recommendations without an accompanying implementation programme; and single-site quality improvement projects. Additionally, programmes implemented in clinical specialties other than maternity, outside the English NHS, or implemented in full before 2010 were excluded.
Only initiatives that met our definitional criteria as improvement programmes, had an evaluation report available and focused on intrapartum care were eligible for quality assessment. For this purpose, we defined ‘improvement programmes’ as encompassing a set of planned activities applied at a scale larger than local quality improvement projects, and requiring participation of more than one organisation or clinical service (see box 1 for our full definition).1–3 72
Definition of ‘improvement programme’
For the purpose of our study, we defined a healthcare improvement programme as a set of planned activities:
Seeking to address a known quality or safety deficit; or seeking to implement evidence-based recommendations or standards of care or practice.
Implemented at scale, that is, in two or more healthcare organisations or clinical services.
With the characteristics of an organised programme, for example, with a structured set of goals, resources, a programme team and report.
Primarily concerned with improving clinical care quality or safety including structures, processes or outcomes.
Only improvement programmes where an evaluation report could be retrieved were included in our quality assessment, since exploratory work indicated it would not be possible to make reliable judgements about programme quality and reporting without such a report. We defined an ‘evaluation report’ as a published assessment of programme design, implementation or outcomes, including formative and/or summative evaluation activities.44 We classified reports as ‘retrievable’ where they were available for full-text review following their identification in search results, or were otherwise publicly available (eg, published on organisational websites). In determining eligibility for quality assessment, evaluation reports were included without regard to where they had been published (eg, in academic or grey literature) or to the design and quality of the evaluation.
To ensure comprehensive quality assessment of programmes where an evaluation report was available, we supplemented our analysis of evaluation reports with available information from policy reports, programme reports, website entries, peer-reviewed research articles, reviews and study protocols. We excluded editorials, viewpoints, commentaries and letters, non-English language articles and sources published before 2010. Consistent with our focus on intrapartum care, we also excluded sources that primarily addressed quality or safety of care in the antenatal or postnatal periods, or that had only limited focus on intrapartum care (eg, the Getting It Right First Time (GIRFT) programme).
Information sources and search strategy
Information sources
We searched two research databases, determining that this number was both proportionate to our aims and consistent with published guidance regarding the conduct of scoping studies.73 Database searches were performed in MEDLINE via Ovid and CINAHL via Ebsco from 1 January 2010 to 8 February 2023. These databases were chosen because of their high subject relevance to maternity care. Subject headings (eg, Medical Subject Headings) and free text search terms and synonyms were included. We did not apply restrictions to publication type. Filters were applied for ‘England’ or ‘NHS’, adapted from Ayiku et al.74 The database searches were designed and performed by a health librarian (IK) in collaboration with JM.
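For illustration, the fragment below sketches the general shape of an Ovid MEDLINE strategy of this kind, combining subject headings with free-text terms and an England/NHS filter and date limit. The line contents are hypothetical and simplified for this article; the search strategies actually used are provided in online supplemental material C.

1. exp Maternal Health Services/ or exp Delivery, Obstetric/
2. (maternity or intrapartum or obstetric* or labo?r ward*).ti,ab.
3. exp Quality Improvement/ or exp Patient Safety/
4. (improvement programme* or safety programme* or care bundle*).ti,ab.
5. (1 or 2) and (3 or 4)
6. (england or NHS or national health service).ti,ab.
7. 5 and 6
8. limit 7 to yr="2010-Current"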
A series of structured searches was performed to identify literature or other information relevant to maternity improvement programmes in Google, Google Scholar and websites of national organisations active in UK maternity quality and safety, supplemented by purposive hand searches. Online searches were based on the search strings ‘maternity safety programme’, ‘maternity quality programme’ and ‘maternity improvement programme’, and were performed independently by two researchers (JM and BA).
Search strategy
Our search strategy was based on a modified ‘PICOS’ framework:
Patient population—women receiving NHS care during the intrapartum period.
Intervention—quality and safety improvement programmes.
Comparison—not applicable.
Outcome—clinical and other outcomes related to quality and safety of maternity care.
Setting—NHS maternity services in England.
Four studies identified during preliminary scoping searches were used as ‘golden bullets’75 to assess and improve the sensitivity and specificity of the search strings in identifying relevant literature.76–79 We supplemented structured searches with forward and backward citation tracking of a purposively selected sample of included studies to improve the sensitivity of the search.
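As a minimal sketch of this validation step (assuming searches return record identifiers; the function name and identifiers are hypothetical), in Python:

# Return 'golden bullet' studies missed by a candidate search string;
# a non-empty result suggests the strings need broadening.
def missed_golden_bullets(retrieved_ids: set, golden_bullets: set) -> set:
    return golden_bullets - retrieved_ids

missed = missed_golden_bullets({'pmid1', 'pmid2'}, {'pmid1', 'pmid3'})
# {'pmid3'}: one known-relevant study was not retrieved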
The full list of information sources, search strategies for database and grey literature searches and search record templates are provided in online supplemental material C.
Supplemental material
Eligibility screening
Bibliographic database search results were deduplicated in EndNote, imported into Rayyan80 and screened on the basis of title and abstract. Screening of search results from non-bibliographic sources was performed onscreen by JM and BA; for each search, the first 100 (Google Search and organisational websites) and 500 (Google Scholar) search results were screened and unique sources identified by consensus. Screening of all search results and full-text sources to identify improvement programmes eligible for assessment and relevant sources of evidence was performed independently by two researchers (JM and BA); disagreements regarding the eligibility of both programmes and sources were resolved by discussion.
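To illustrate the logic of this step, the Python sketch below shows one way the deduplication and dual-reviewer consensus rules could be expressed (the record fields and decision labels are hypothetical; in practice deduplication was performed in EndNote and screening in Rayyan):

from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    title: str
    year: int
    source: str  # eg, 'MEDLINE', 'CINAHL', 'Google Scholar'

def deduplicate(records):
    # Collapse duplicates on a normalised title/year key, keeping the first occurrence
    seen, unique = set(), []
    for r in records:
        key = (r.title.strip().lower(), r.year)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def consensus(decision_a, decision_b):
    # Two independent screeners; records are included or excluded only on
    # agreement, and disagreements are flagged for resolution by discussion
    return decision_a if decision_a == decision_b else 'discuss'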
Data categories and charting
Data categories
For improvement programmes eligible for quality assessment, we charted data under the seven categories in table 1. These categories correspond to the selected features of quality we had identified from the wider improvement literature, together with basic programme characteristics.
Our assessment of programmes was supported by published standards in two areas. First, our assessment of programme specification (category 1 in the charting framework) and implementation support (category 3) used the Template for Intervention Description and Replication (TIDieR) checklist,81 modified to apply to improvement programmes (table 2).
Second, our assessment of programme evaluation (category 7 in the charting framework) was supported by a multi-item checklist informed by UK government guidance on programme evaluation (the ‘Magenta Book’)44 (table 3).
Charting process
The charting process was supported by a tool built in Microsoft Excel that mapped to the charting framework in table 1 (online supplemental material D). Consistent with scoping review methodology,70 we developed the data items for extraction into the charting framework iteratively, modifying them as new data were identified and analysis progressed. The charting tool was piloted using a small sample (n=2) of sources and amended prior to formal charting. Two researchers (JM and BA) independently charted all data for six of the seven data categories; data relating to use of theories, models and frameworks were charted by JM. Disagreements in assessment gradings were resolved by discussion.
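As an illustration only, the charting and grading structure might be represented as follows (the item names and grading labels are hypothetical simplifications of the framework in tables 1 and 2; the actual tool was an Excel workbook):

from enum import Enum

class Grade(Enum):
    NONE = 'no description'
    PARTIAL = 'partial description'
    FULL = 'described in full'

# Illustrative subset of modified TIDieR items (cf table 2)
ITEMS = ['goals', 'mechanism_of_action', 'local_tailoring',
         'fidelity_assessment_process', 'fidelity_assessment_outcome']

def summarise(gradings):
    # Summarise completeness of programme description for one programme,
    # given a mapping from item name to Grade
    return {
        'full': [i for i in ITEMS if gradings.get(i) is Grade.FULL],
        'missing': [i for i in ITEMS if gradings.get(i, Grade.NONE) is Grade.NONE],
    }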
Appraisal of evidence and reporting of findings
Though conducting a quality assessment of eligible programmes (based on the seven features in table 1) was a key objective of our analysis, we did not seek to review evidence of effectiveness of programmes, nor did we aim to conduct an appraisal of the methodological quality of individual sources of evidence. Consistent with the norm in scoping reviews,70 we also did not formally aggregate or synthesise evidence, instead developing summaries of the data organised by the charting framework. Key findings and themes are reported under the categories of this framework in the Results section82 and are summarised in a series of supplemental tables.
Patient and public involvement
Patients and the public were not involved in the design or conduct of the review.
Results
We identified 1434 records via bibliographic databases and searches of the grey literature and other sources. From these, 93 full-text sources were retrieved and assessed for eligibility, including one evaluation report that was not initially available, but was subsequently retrieved through personal correspondence.83 The process by which sources of evidence were identified and screened to determine their eligibility is summarised in our PRISMA diagram (figure 1).
Following full-text eligibility screening, we found 38 improvement initiatives (reported across 50 sources) that could not be included in the quality assessment. These initiatives are documented, together with exclusion reasons, in online supplemental material E. Of these, 14 initiatives—including most major initiatives in English maternity services that were implemented during the time period under study, such as NHS England’s Maternity and Neonatal Safety Improvement Programme and the Maternity Safety Support Programme—lacked a retrievable evaluation report. A further 13 initiatives did not meet our definitional criteria as improvement programmes (eg, because they did not report implementation in two or more clinical services). We also excluded nine initiatives that focused primarily on antenatal or postnatal care from the quality assessment, since our scope was limited to intrapartum care. Two other initiatives that had not been implemented in the relevant setting or time period were also excluded at this stage.
We identified 15 initiatives implemented in maternity services in England between 2010 and 2023 that had a principal focus on intrapartum care, met our definitional criteria as improvement programmes and had a retrievable evaluation report. These 15 initiatives were included in the structured quality assessment and were reported across 43 sources of evidence. Grey literature constituted the majority of these sources (n=24); the remaining 19 were peer-reviewed academic journal articles, of which nine were original research articles reporting findings of evaluations.30 76–79 84–87 Other article types included study protocols (n=5),88–92 quality improvement reports (n=3)93–95 and review articles (n=2).96 97
We report the findings of our quality assessment of the 15 programmes organised around the seven features of ‘what good looks like’ below, with additional data in online supplemental material F.
Feature 1: quality of programme specification
Reporting of the basic characteristics of the programmes included in our quality assessment was reasonable (online supplemental material F: table 1). For example, it was possible to identify the clinical setting, target recipients, programme type and source of funding for all 15 programmes. Information on the scale of the programme was available for 14 of the 15 programmes, and on time period of implementation for 13 programmes.
Beyond these basic characteristics, completeness of programme description varied markedly between programmes and was generally of low quality (online supplemental material F: table 2). No programme described all 10 items of the modified TIDieR checklist in full. Five of the included programmes provided no description for half of the checklist items. Two programmes, the Saving Babies’ Lives Care Bundle and Prevention of cerebral palsy in pre-term Labour (PReCePT), provided full description of six checklist items, while three programmes (Maternity Incentive Scheme, Maternity Safety Training Fund and the Safer Births Project) did not describe any checklist item in full.
Completeness of description also varied between checklist items. Goals (described in full for 11 programmes) were generally well described, as were mechanisms of action and theories of change (described in full for eight programmes). In contrast, frequency, duration and time period of local implementation, local tailoring and modifications, processes for assessing or maintaining fidelity and outcome of fidelity assessment were described poorly. For example, no programmes offered a full description of frequency, duration and time period of local implementation or of local tailoring and modifications, and only three programmes offered any level of description for the latter item. Items relating to fidelity were particularly poorly described; only two programmes (Saving Babies’ Lives Care Bundle and PReCePT) offered any level of description of a fidelity assessment outcome. Only the Saving Babies’ Lives Care Bundle provided a full description of both the processes and outcome of fidelity assessment.
Feature 2: use of evidence-based interventions
Programmes varied in the extent to which they explicitly based their improvement interventions on evidence. Of the 15 programmes we assessed, only eight reported that their interventions were developed with explicit reference to published evidence (online supplemental material F: tables 3 and 4). Of these, most (n=6) based their interventions on recommendations from national clinical guidance or quality standards, for example, those published by Royal Colleges and the National Institute for Health and Care Excellence. Two programmes, PReCePT and Perinatal Excellence to reduce injury in preterm birth (PERIPrem), cited a range of studies in support of their interventions, including systematic reviews and meta-analyses, randomised controlled trials and cohort studies.79 95 98 99 The remaining seven programmes did not cite the evidence base for interventions used, if any.
Feature 3: description of implementation support for services
Completeness of description of the implementation support available to clinical services (final column, online supplemental material F: table 2) was poor. Only three programmes (the Obstetric anal sphincter injury Care Bundle (OASI-CB), PReCePT and PERIPrem) were judged to have described implementation support and activities in full. Whether or not implementation support had been provided was unclear for seven programmes, as no level of description was offered.
We did not identify any programme report that described customisation of implementation support for challenged services. The PReCePT programme did offer enhanced implementation support to a subset of participating services, though the authors did not report whether this was targeted specifically at challenged services.100 Recruitment to the Labour Ward Leadership programme was partially informed by Care Quality Commission (CQC) inspection reports, with ‘Trusts with identified problems’ given priority.83 Two programmes91 100 101 accounted for variation in local service contexts in their evaluation methods by using, among other criteria, CQC inspection ratings to inform sampling of study participants. The interim evaluation of the Maternity Incentive Scheme102 stated that additional support was provided to Trusts that were not in compliance with all 10 incentivised actions to enable them to ‘achieve full compliance’, but we did not find evidence that this support was targeted to support challenged services. In contrast, NHS trusts in ‘special measures’ and those in receipt of support from national regulators were specifically excluded from participating in the Each Baby Counts Learn & Support programme.103 An incidental finding was the potential for programmes to exacerbate inequalities of resourcing between services, as noted by a PERIPrem report (parentheses added):
In Trusts where there was a pre-existing QI (quality improvement) culture and a desire to embed new practice and change, implementation was easier. Those with active hospital QI teams were able to access additional support and training for the project.104
Feature 4: commitment to reducing inequalities
Although we identified several evaluation reports that adjusted for socioeconomic status, ethnicity or levels of deprivation in their analysis,78 79 84 95 100 101 105 we did not find any examples among the 15 programmes we assessed that identified the reduction of health or care inequalities as an explicit goal (online supplemental material F: table 4).
Feature 5: patient and public involvement
Of the 15 programmes we assessed, only seven made explicit use of PPI practices. These included five that demonstrated comprehensive attention to involvement at all stages of the programme (including design, development, implementation, evaluation and dissemination). The remaining eight programmes did not mention the role of patients or the public (online supplemental material F: table 4).
Feature 6: use of formal published theories, models or frameworks
A minority (n=6) of assessed programmes made explicit use of formal published theories, models or frameworks from implementation science to guide programme implementation or evaluation (online supplemental material F: tables 4 and 5).106 We identified 12 documented examples of use of theories, models or frameworks from four of the five categories proposed by Nilsen,107 including implementation theories (n=6), classic theories (n=2), determinant frameworks (n=2) and evaluation frameworks (n=2).
Feature 7: programme evaluation
Programmes varied in their reporting of the planning, scope, design and conduct of evaluation (online supplemental material F: table 6). Evidence of a prospective evaluation plan (eg, in the form of a published protocol) was identified for seven of the 15 programmes, and four programmes conducted a pilot that informed programme development or implementation.
The evaluations we assessed were dominated by weak designs, often relying on post hoc methods of data collection, such as self-report questionnaires and staff surveys. Of the 19 peer-reviewed research articles identified (which collectively reported on nine programmes), five76–79 84 reported evaluations of effectiveness, but only two studies reported on cost-effectiveness.78 79 Only three evaluation reports76 84 100 (including one preprint100) employed a randomised design. Three evaluations employed quasi-experimental approaches.78 79 92 Only four articles reported findings from process evaluations, implementation research or qualitative studies.30 85–87
Discussion
Large-scale improvement programmes have been a key strategy for addressing quality deficits in healthcare globally, but the programmes assessed in our review of one exemplar clinical area often fell short on key features of quality. Though a large number of improvement initiatives have been undertaken in a 13-year period in maternity services, a particularly challenged clinical specialty in the English NHS, many—including several major national programmes of the last decade—did not meet a basic requirement of a retrievable evaluation report. This represents a major threat to learning and accountability. Among programmes with a focus on intrapartum care that offered some form of evaluation report and could be assessed, there was considerable variability and very often evident flaws in transparency and quality of programme specification, use of evidence-based interventions, implementation support, PPI, use of formal published theories, models and frameworks, and evaluation. Notably, no programme that we quality assessed had explicitly set reduction of inequality as a goal. Our findings are unlikely to be unique to maternity settings or to the English NHS, and both the findings and the methods used to generate them are likely to be relevant to many other clinical areas targeted by large-scale improvement programmes in healthcare settings internationally.
A first step in improving the quality of improvement programmes is to ensure that they are sufficiently well specified to permit identification of their components and the mechanisms through which they work, not least so that they can be scaled with fidelity if shown to be effective,10 14 108 and modified or abandoned if not. Programmes included in our quality assessment demonstrated highly variable completeness of programme description, with some demonstrating weaknesses across several items of a modified TIDieR checklist.81 We also identified an important lack of transparency in reporting of the implementation support available to participating services, despite its recognised role in effective improvement,25 29 including in maternity care.30 100 Crucially, despite recurrently reported organisational failures in NHS maternity services, we were unable to identify any examples of adaptation of implementation support specifically to account for the context of challenged services, with one programme actively excluding challenged units.103 The extent to which programmes explicitly grounded their interventions in evidence or drew on theories, models or frameworks from implementation science43 to guide implementation and evaluation was also highly variable, with many examples of poor reporting practice.
Second, a commitment to evaluation and public reporting of all findings should be seen as fundamental to high-quality commissioning of large-scale improvement programmes in healthcare. Many (14) of the initiatives we identified—including some of the high-profile national maternity programmes of the last decade—lacked a retrievable evaluation report. In these cases, it is not possible to determine whether the programmes worked (made a difference to outcomes) and should be scaled, to assess whether these programmes represent a good use of resources, or to identify how programme design or implementation might have been improved. Even where evaluation reports were available, they often demonstrated substantial weaknesses. Future programme design and delivery should be organised to facilitate the use of rigorous evaluation designs that allow reliable assessments to be made about effectiveness and cost-effectiveness, as well as what works, what doesn’t and why across diverse clinical settings.44
Third, despite repeated policy commitments to improve equity,32 34 our findings add to concerns about inequalities in NHS maternity care.33 35 59 We found no examples of improvement programmes included in our quality assessment that identified the reduction of health and care inequalities as an explicit goal. Indeed, there was some evidence of programme design having potential to contribute to widening inequalities between high-performing and low-performing services, which is likely to impede efforts aimed at improving equity.33 Finally, despite the emphasis placed by national policy on including those who use maternity services in the design and delivery of improvement programmes,41 42 PPI appeared to be lacking in over half of the programmes we assessed. Improving the impact of large-scale improvement efforts on the quality and safety of care in future will likely require the gaps between these enshrined policy objectives and programme design to be closed.
Strengths and limitations
Our adaptation of the principles of scoping review methodology was successful in addressing our aims, given that most programmes we identified were not research projects and therefore unsuitable for full scoping or systematic review designs. The design of our search strategy was improved by specialist librarian input, by pilot testing the search strings to test their capacity to identify relevant literature, by selecting databases with relevant scope and by supplementing our structured searches with extensive hand searches. We sought to lend rigour to the charting process by requiring that assessment decisions be agreed by two researchers, by pilot testing the data charting tool to improve its reliability and by basing our assessment criteria, where possible, on published standards relating to programme specification81 and evaluation.44
Our selection of the TIDieR checklist81—developed to improve published descriptions of healthcare interventions, including complex interventions—on which to base our assessments of reporting was appropriate given the sociotechnical nature of improvement programmes. Our modified checklist is likely to be valuable in future studies to enable better description of features specific to improvement programmes. Relatedly, our definition of ‘improvement programme’ offers clarity to those engaged in programme commissioning, design and evaluation regarding specific aspects of these interventions that distinguish them from other types of improvement intervention in healthcare.
We acknowledge several limitations. It is possible that sources of evidence relevant to eligible programmes were missed by our search strategy. The list of quality features we identified was necessarily selective and may not be comprehensive. It remains possible that researcher-related factors may have biased our assessments in some non-transparent way. Practical considerations meant our search was limited to programmes implemented in England since 2010, meaning that potentially important learning from programmes implemented elsewhere in the UK and prior to 2010 was excluded. Restriction of scope to intrapartum care may have excluded some programmes with different characteristics of quality and reporting.
Conclusions
Transparent reporting and high-quality evaluation are critical to learning and accountability in healthcare systems facing persistent quality of care challenges. This review of large-scale maternity improvement programmes in the English NHS since 2010 has identified widespread poor practice in programme design, transparency of reporting and evaluation. These findings are both cause for concern and unlikely to be unique to this clinical setting. Our study suggests important targets for improving the design, delivery, evaluation and reporting of large-scale programmes in healthcare to maximise their impact on quality and safety, ensure accountability, including for how resources are used, and better aggregate learning to improve care for patients.
Data availability statement
Data are available upon reasonable request. Please contact the first author.
Footnotes
Twitter @jgmcgowan, @ilk21, @LisaHinton4, @graham_p_martin, @MaryDixonWoods
Contributors MD-W is the guarantor of the study. Conceptualisation: JM, MD-W. Data curation: JM, BA. Formal analysis: JM, BA. Funding acquisition: JM, MD-W. Investigation: JM, BA, IK. Methodology: JM, BA, IK, LH, TD, GPM, MD-W. Project administration: JM, IK. Resources: JM, IK, MD-W. Software: JM, IK. Supervision: GPM, MD-W. Validation: LH, TD, GPM, MD-W. Visualisation: JM. Writing—original draft: JM, MD-W. Writing—review and editing: all authors.
Funding This work was supported by the Health Foundation's grant to THIS Institute.
Competing interests TJD: Research and Innovation lead for PROMPT Maternity Foundation and has a part-time appointment at NHS Resolution where he leads on a Safety Action within the Maternity Incentive Scheme.
Provenance and peer review Not commissioned; externally peer reviewed.