“Authors, editors and publishers all have ethical obligations with regard to the publication of the results of research. Authors have a duty to make publicly available the results of their research on human subjects and are accountable for the completeness and accuracy of their reports.” - Declaration of Helsinki, Ethical Principles for Medical Research Involving Human Subjects, 2008

Despite concerns that academic medicine is in crisis,1,2 authors and publishers of biomedical literature are increasingly productive. Over 9,000 articles relating to anesthesia were indexed by PubMed (National Library of Medicine, Bethesda, MD, USA) in 2011 compared with approximately 6,000 articles in 1990. As the number of publications in medicine has increased, the quality of reporting in the biomedical literature has not necessarily improved.3 Original research has been hindered by inconsistencies, omissions, and errors in reporting by authors and by inadequate policing of reporting standards by peer reviewers and journals. Complete and accurate reporting of original research in the biomedical literature is essential if healthcare professionals are to translate research outcomes appropriately into clinical practice.

This narrative review aims 1) to inform investigators, peer reviewers, and authors of original research in anesthesia on reporting guidelines for frequently reported study designs; 2) to describe the evidence supporting the use of reporting guidelines and checklists; and 3) to discuss the implications of widespread adoption of reporting guidelines by biomedical journals.

Background

Deficiencies of reporting are diverse and common.4 Inadequate reporting may affect all aspects of published research, but it is perhaps most important in the reporting of methods, research interventions, and outcome measures. These fundamentals may be reported either incompletely or not at all, and these inadequacies can influence the interpretation, translation, and application of published research. Selective reporting of research methodology is prevalent.5,6 Blümle et al. found that almost all clinical trials had sample populations that differed from their protocol eligibility criteria or failed to report some aspects of the eligibility criteria,6 which could affect their external validity. From 2005 to 2006, research interventions were inadequately described in more than half of the reports in several high-impact publications, limiting implementation of the interventions into clinical practice.7 Similarly, selective reporting of outcomes is common in the biomedical literature. In a systematic review of bias in publication and outcome reporting, Dwan et al. found that at least one primary outcome was changed, introduced, or omitted in 40-62% of published studies when compared with the protocol.8 Methodological issues, such as power calculation, primary outcomes, randomization, and handling of attrition, were found to have been reported adequately in less than half of randomized controlled trials (RCTs).4 These deficiencies, omissions, and errors in reporting occur in all types of publications.5 Additionally, the publication of negative results in the biomedical literature is decreasing as a consequence of several factors,9 and this can lead to a positive-outcome bias.

Reporting deficiencies have been recognized to impair the appropriate interpretation of research.3,10 Independent groups of research methodologists and journal editors have subsequently promoted standardized reporting of research in the biomedical literature in an effort to improve the quality, completeness, and accuracy of original research.10,11 Early efforts resulted in the landmark publication of the CONSORT (CONsolidated Standards Of Reporting Trials) Statement in 1996.12 This evidence-based guideline for reporting parallel-group RCTs represented a milestone for investigators, authors, editors, and publishers of biomedical research, as it heralded the development of reporting guidelines as a subspecialty of research methodology. Since then, the number of reporting guidelines has grown substantially each year, and there are currently more than 200 guidelines available to authors of research in healthcare, each applicable to specific study designs or research specialties.13

The EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network (http://www.equator-network.org) defines reporting guidelines as “statements that provide advice on how to report research methods and findings. Usually in the form of a checklist, flow diagram or explicit text, they specify a minimum set of items required for a clear and transparent account of what was done and what was found in a research study, reflecting in particular issues that might introduce bias into the research.”14 In an effort to improve the quality and transparency of reporting of original research, biomedical journals have endorsed several of these reporting standards, some of which have become a minimum requirement for submitting original research for peer review.15

The recent incorporation of the CONSORT, STROBE (Strengthening The Reporting of OBservational studies in Epidemiology), and PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) checklists (Appendices A, B, & C, available as electronic supplementary material) into the editorial policy of the Canadian Journal of Anesthesia reflects the increasing use of reporting guidelines and checklists by biomedical journals.16-18 In addition, guidelines are used not only by authors and publishers to maintain the quality and accuracy of reporting, but also by peer reviewers, to whom they offer a systematic approach for assessing the completeness of reported research.

While there are many validated reporting guidelines for various study designs and research specialties, for the purpose of this review, we focus on the reporting guidelines that anesthesiologists are most likely to use when writing or reviewing research reports.

Description of reporting guidelines by study design

Randomized controlled trials

In 1994, two groups, namely, the Standards of Reporting of Trials group and the Asilomar Working Group on Recommendations for Reporting of Clinical Trials in the Biomedical Literature, independently published recommendations for reporting RCTs.10,11 The groups subsequently amalgamated to become the CONSORT group,19 and their reporting guideline for parallel-group RCTs, the CONSORT Statement, was published in 1996.20 Eleven stakeholders, including editors, authors, clinical epidemiologists, and statisticians,20 used a modified Delphi process21 to distill the original CONSORT Statement from the two groups’ previous independent efforts. The Delphi technique, which is used in the development of many reporting guidelines, is a structured communication technique for determining consensus agreement. It uses the principles of anonymity, structured information flow, and feedback to optimize communication between experts. The CONSORT group aimed to develop a guideline using the minimal number of descriptors needed to maintain adequate standards of reporting.20 They included only those items that, if not reported, could bias estimates of the effects of interventions, selecting items for which there was empirical evidence of bias as well as items whose inclusion was dictated by “common sense” despite a lack of empirical evidence.20
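To make the Delphi mechanics concrete, the following minimal sketch (in Python) simulates a panel anonymously rating a candidate checklist item, receiving structured feedback, and revising toward consensus over successive rounds. The panel size, rating scale, consensus threshold, and revision rule are illustrative assumptions, not the CONSORT group’s actual procedure.

```python
import random
import statistics

random.seed(1)  # reproducible illustration

def run_delphi(n_experts=11, n_rounds=3, threshold=0.8):
    """Toy Delphi exercise: experts anonymously rate a candidate
    checklist item (1-9) and revise toward the group median each round."""
    ratings = [random.randint(1, 9) for _ in range(n_experts)]
    for rnd in range(1, n_rounds + 1):
        group_median = statistics.median(ratings)
        # Agreement: fraction of experts within one point of the median.
        agreement = sum(abs(r - group_median) <= 1 for r in ratings) / n_experts
        print(f"Round {rnd}: median={group_median}, agreement={agreement:.0%}")
        if agreement >= threshold:
            return "include item"  # consensus reached
        # Anonymous structured feedback: each expert moves halfway
        # toward the group median before the next round.
        ratings = [round(0.5 * r + 0.5 * group_median) for r in ratings]
    return "discuss further"  # no consensus after the final round

print(run_delphi())
```

Real Delphi exercises add free-text justifications and may drop or add items between rounds, but the core loop of anonymous rating, summary feedback, and revision is as sketched.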

The original CONSORT Statement consisted of 21 items that referred mainly to the methods, results, and discussion of reports of RCTs and identified key pieces of information deemed necessary to evaluate the internal and external validity of the study. The CONSORT Statement has since been revised twice.22,23 The most recent publication, the 2010 CONSORT Statement, consists of a checklist (Appendix A, available as electronic supplementary material) and flow diagram. The checklist includes 25 items in six categories: title and abstract, introduction, methods, results, discussion, and other (funding and registration). The flow diagram includes data from four phases of the trial process: enrolment, allocation, follow-up, and analysis. As well as clarifying some items and improving the consistency of style, the current version continues the evolution of the original statement by recognizing and incorporating emerging empirical evidence.8,24-28
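As a rough illustration of how the 2010 checklist’s structure might be captured for an editorial workflow, the sketch below encodes the six categories and four flow-diagram phases named above as plain data. The completeness function and the example tally are hypothetical; the entries are labels, not the official CONSORT item wording.

```python
# Structure of the 2010 CONSORT Statement as plain data. The category
# and phase names come from the text above; everything else is illustrative.
CONSORT_2010 = {
    "checklist_categories": [
        "title and abstract", "introduction", "methods",
        "results", "discussion", "other (funding and registration)",
    ],
    "flow_diagram_phases": ["enrolment", "allocation", "follow-up", "analysis"],
    "n_checklist_items": 25,
}

def completeness(items_reported: int, guideline: dict = CONSORT_2010) -> float:
    """Fraction of checklist items reported in a manuscript."""
    return items_reported / guideline["n_checklist_items"]

# Hypothetical manuscript reporting 18 of the 25 items:
print(f"{completeness(18):.0%} of CONSORT items reported")  # 72%
```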

In addition to the CONSORT Statement and checklist, the CONSORT group also publishes an accompanying “explanation and elaboration” document to “enhance the use, understanding, and dissemination” of the Statement.16 This publication explains the rationale, provides empirical evidence, and offers examples of good reporting for each of the checklist items. This approach has been adopted by other reporting guidelines,18,29-32 and these explanation and elaboration articles are valuable resources for authors unfamiliar with such guidelines.

CONSORT has also been extended to apply to several other types of clinical trials and reports. These include reports of journal and conference abstracts,29 harms in RCTs,33 non-inferiority and equivalence RCTs,34 cluster RCTs,35 herbal intervention RCTs,36 non-pharmacological treatment interventions,37 pragmatic trials,38 and controlled trials of acupuncture.39 Unofficial extensions of CONSORT have also been developed, including modified reporting guidelines for behavioural medicine RCTs and eHealth interventions.40,41 These extensions are based on the CONSORT Statement, but they also include essential reporting items for important aspects of trial design or outcomes that are specific to the respective studies and are not included in the original statement.

The CONSORT group has produced 12 reporting guidelines for RCTs.42 More than 150 journals have endorsed the revised CONSORT Statement,15 and it has become the template for the development and design of most other reporting guidelines.

Observational studies

Published research is often observational in design.43 Observational studies vary strikingly in design, ranging from case reports and case series to cohort, case-control, and cross-sectional studies. Although observational studies are reported more frequently in the biomedical literature than RCTs, the initial development of reporting guidelines focused largely on RCTs. There could be a variety of reasons for this emphasis, the most likely being the difficulty of devising guidelines that accommodate the varied designs of observational studies compared with the relative uniformity of RCTs. As a result of this diversity of study designs and the need for field-specific observational study designs, more than 30 guidelines have been published for reporting various observational studies.

The STROBE initiative (http://www.strobe-statement.org) was established in 2004 to assist authors in reporting observational studies. The subsequent STROBE Statement consists of a checklist (Appendix B, available as electronic supplementary material) of 22 items and is intended to provide guidance for reporting cohort, case-control, and cross-sectional studies.44 The six domains explored by STROBE are title and abstract, introduction, methods, results, discussion, and other information. The 22-item checklist consists of 18 items that are common to all three study designs and four items that are specific to each study design, i.e., participants, statistical methods, descriptive data, and outcome data. Given the diversity of observational studies, the STROBE initiative does “not aim at standardized reporting” but encourages authors to convey essential information outside of a regulated style and terminology. As with the CONSORT initiative, several extensions of STROBE have been developed to assist authors in reporting specialty-specific observational studies, including genetic association studies, molecular epidemiology, and studies of adverse event reporting,45-47 as well as other aspects of reporting, such as conference abstracts. STROBE is not intended to be used with other observational study designs, such as case reports or case series, as each of these study designs has purpose-specific reporting guidelines.48,49

Systematic reviews

Systematic reviews are considered the criterion standard in the hierarchy of research methods50 and are increasingly performed and published. They are preferred by healthcare professionals and policymakers as comprehensive summaries of evidence and by researchers as a method of strengthening available evidence by collating results. Despite this increased popularity, however, the conduct and reporting of systematic reviews have been largely unregulated, to the extent that the National Library of Medicine does not index either “systematic review” or “meta-analysis” as publication types.51

QUOROM (QUality Of Reporting Of Meta-analyses) was the first guideline developed with the intention of improving the reporting of meta-analyses of RCTs.52 It was developed in 1996 at a conference of 30 clinical epidemiologists, clinicians, statisticians, editors, and researchers using a modified Delphi technique.21 The original QUOROM Statement consists of a 21-item checklist and a flow diagram.52 Beginning in 2005, it was updated, revised, and expanded to become the PRISMA Statement, published in 2009.53 PRISMA was developed to address the reporting of both systematic reviews and meta-analyses. The PRISMA Statement consists of a 27-item checklist (Appendix C, available as electronic supplementary material) and a four-phase flow diagram. The checklist is structured in seven domains: title, abstract, introduction, methods, results, discussion, and funding. Several items differ from QUOROM; for example, the review question should be framed in a PICO (Population, Intervention, Comparison, and Outcome) format, and there should be a full description of access to the study protocol and of at least one electronic search strategy.18 The flow diagram requests data from four phases of the review process: identification, screening, eligibility, and included studies. The authors intended PRISMA to improve the reporting of meta-analyses, but it “can also be used as a basis for reporting systematic reviews of other types of research, particularly evaluations of interventions. PRISMA may also be useful for critical appraisal of published systematic reviews. However, the PRISMA checklist is not a quality assessment instrument to gauge the quality of a systematic review.”53 The PRISMA group has recently published PRISMA-Equity, an extension to the original statement for reporting systematic reviews with a focus on health equity,54 and the group is also developing three further extensions for reporting systematic reviews of abstracts, harms, and protocols.
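To illustrate the PICO framing that PRISMA recommends for review questions, here is a minimal sketch; the class and the example question are invented for illustration only.

```python
from dataclasses import dataclass

@dataclass
class PicoQuestion:
    """A review question framed in PICO format, as PRISMA recommends."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def __str__(self) -> str:
        return (f"In {self.population}, does {self.intervention}, compared "
                f"with {self.comparison}, affect {self.outcome}?")

# Hypothetical review question:
print(PicoQuestion(
    population="adults undergoing general anesthesia",
    intervention="prophylactic ondansetron",
    comparison="placebo",
    outcome="postoperative nausea and vomiting",
))
```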

Some interventions and exposures cannot be randomized for practical or ethical reasons, and observational studies are needed to evaluate their outcomes. Because of the differing study designs and the inherent risks of bias and confounding in observational studies, meta-analyses of these studies require careful consideration, and a separate reporting guideline was developed in 1997. The MOOSE (Meta-analysis Of Observational Studies in Epidemiology) guideline is intended to assist authors in reporting systematic reviews of observational studies only.55 Guided by the results of a systematic review of the conduct and reporting of meta-analyses of observational studies, a conference of 27 experts developed this checklist, which consists of 35 items across six domains: background, search strategy, methods, results, discussion, and conclusions. It differs from PRISMA by considering each of the different study designs and assessing the risk of confounding and heterogeneity in observational studies.

Other study designs

As indicated by the numerous extensions of the CONSORT Statement, a “one-size-fits-all” approach cannot be applied to reporting guidelines. Methodological differences and key concerns for specific items that could bias estimates of the effects of interventions for some study designs or types of interventions have necessitated the development of specialty-specific reporting guidelines. Examples of reporting guidelines for other study designs include TREND (Transparent Reporting of Evaluations with Non-randomized Designs),56 SQUIRE (Standards for QUality Improvement Reporting Excellence),30 COREQ (COnsolidated criteria for REporting Qualitative research),57 STARD (STAndards for the Reporting of Diagnostic accuracy studies),58 and GRRAS (Guidelines for Reporting Reliability and Agreement Studies).59

Adherence to reporting guidelines by biomedical journals

It was reported recently that more than 150 biomedical journals endorse CONSORT for reporting RCTs.15 Although most editors of high-impact biomedical journals endorse CONSORT for reports of RCTs,60 this endorsement is not always reflected in the journals’ published “Instructions to Authors”. In 2003, 22% (36/167) of high-impact medical journals mentioned the CONSORT Statement in their published “Instructions to Authors”, and by 2007, this had increased to 38% (62/165),60,61 but only 37% (23/62) of these journals required authors to use CONSORT when preparing a manuscript.60

Biomedical journals endorse other reporting guidelines less readily than CONSORT. The uptake of reporting guidelines for systematic reviews has been disappointing. In a sampling of 146 leading biomedical journals in 2011, only 27% of the included journals referred to the PRISMA statement in their “Instructions to Authors”, and most of the journals used ambiguous language to describe what was expected of authors for reporting systematic reviews.62

Despite increasing levels of endorsement of reporting guidelines by biomedical journals, adherence to reporting guidelines is still disappointing. A review of all RCTs evaluating healthcare interventions in humans published in December 2000 revealed very poor adherence to the CONSORT Statement.4 These investigators found that CONSORT items, including power calculation, primary outcomes, random sequence generation, allocation concealment, and handling of attrition, were adequately described in less than half of the included reports.4 Similarly, Hopewell et al. compared the quality of RCT reports indexed in PubMed in 2000 and 2006. They found that adherence to some CONSORT items improved only minimally, and the overall quality of reporting remained below acceptable standards.63

Improving quality of reporting of published research

As endorsement of reporting guidelines by biomedical journals has increased, the empirical evidence that use of reporting guidelines improves the quality of manuscripts and published reports has become more substantial.64 Evidence remains sparse for reporting guidelines other than CONSORT, but overall, the results are encouraging.

Between 2000 and 2006, there were only minimal improvements in the quality of RCT reports indexed in PubMed.63 Nevertheless, a recently updated Cochrane review by Turner et al. confirmed that the CONSORT checklist improves the completeness of RCT reports published in biomedical journals.64 When comparing journals that endorse CONSORT with those that do not, most outcomes assessing completeness of reporting in RCTs appeared to favour journals endorsing CONSORT; however, only five of these outcomes (allocation concealment, introduction, sample size, sequence generation, and total sum score) differed with statistical significance. As an example, Turner et al. found that allocation concealment was reported adequately in only 45% of RCTs in journals endorsing CONSORT compared with 22% in other journals; both reported rates are far less than ideal.64
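For readers unfamiliar with what “differed with statistical significance” entails for proportions such as 45% vs 22%, the sketch below runs a standard two-sided two-proportion z-test. The denominators are invented, since the text quotes only percentages; this is an illustration of the method, not a reanalysis of Turner et al.’s data.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test comparing two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # pooled proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p1, p2, z, p_value

# Hypothetical denominators of 200 trials per group, giving 45% vs 22%:
p1, p2, z, p = two_proportion_z(90, 200, 44, 200)
print(f"{p1:.0%} vs {p2:.0%}: z = {z:.2f}, p = {p:.1g}")
```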

Similarly, when comparing completeness of reporting in journals before and after endorsement of CONSORT, only a few outcomes (7/27) showed statistically significant improvement after endorsement.64 The CONSORT participant flow diagram has also improved the quality of RCT reporting.65 Among reports in PubMed core clinical journals published in 2009, journals endorsing CONSORT were more likely to publish a flow diagram (62% vs 29%); however, many of these diagrams remained incomplete, particularly with respect to reporting reasons for exclusion before randomization.65

The quality of reporting of meta-analyses has also improved since the introduction of reporting guidelines. In a cohort of meta-analyses of diagnostic research, Willis and Quigley found that overall compliance with the PRISMA guideline was poor but that the quality of the included meta-analyses improved after its introduction. Five of the PRISMA items were found to have been reported more completely after the introduction of PRISMA, i.e., eligibility criteria, risk of bias across studies (methods), study selection results, results of individual studies, and risk of bias across studies (results).66

Similar improvements in quality have been found in reports of other study designs. Smidt et al. assessed the quality of reporting of diagnostic accuracy studies in 12 medical journals before and after the publication of the STARD statement.67 They found some improvement in the overall completeness of reporting (a mean of 11.9 vs 13.6 items before and after, respectively), but again, the minimum expected reporting remained suboptimal. None of the articles published before STARD reported more than 20 of the 25 checklist items, and only 2% (3/141) did so after the introduction of STARD.67

Some reporting guidelines have not been as successful in improving the reporting of publications in biomedical journals. Whereas the reporting of CONSORT items in clinical trials of acupuncture had improved consistently over time, Prady et al. found that the introduction of the specialty-specific reporting guideline, STRICTA (STandards for Reporting Interventions in Controlled Trials of Acupuncture), resulted in “little meaningful evidence of change” in the completeness of reporting. In a cohort of 90 peer-reviewed journal articles, only two of 32 items (rationale and intervention needle type) in the checklist had improved since publication of the guideline.68 The validity of these results is questionable, however, considering that compliance with expected standards of reporting prior to the publication of STRICTA was already reasonably high (48.1%) and that the study was probably underpowered. In addition, the study was conducted only three years after the publication of STRICTA, possibly before journal editors, peer reviewers, and authors were fully aware of the reporting guideline.

Using reporting guidelines for scientific peer review

Most journals rely on scientific peer review for determining the merits and quality of research submitted for publication69; however, the process has been described as being “expensive, slow, prone to bias, open to abuse, anti-innovatory, and unable to detect fraud”.70 Reviewers often do not, or cannot, identify weaknesses in design, analysis, or interpretation of clinical trials.71 In an effort to develop a “gold standard” of best practice, the COPE (Committee On Publication Ethics) code of conduct for journal editors and publishers suggests that editors “should provide guidance to reviewers on everything that is expected of them” (http://publicationethics.org/resources/code-conduct) (accessed November 26, 2012). To improve the scientific peer-review process, many journals are using reporting guidelines to assess the completeness and quality of reporting.72

Given the improvement in methodological quality seen when authors use reporting guidelines, we might speculate that the methodological quality of manuscripts submitted to journals would improve further if peer reviewers also used reporting guidelines. Cobo et al. studied a peer-review system that uses reporting guidelines and evaluated the quality of manuscripts submitted to a biomedical journal.73 In this study, 51 of 126 consecutive manuscripts considered suitable for publication in Medicina Clínica received an additional review based on reporting guidelines after conventional peer review. Using a manuscript quality assessment instrument,74 these investigators found an increase in the overall quality of the revised manuscripts when the additional guideline-based review was compared with conventional peer review alone.73

The EQUATOR Network has endorsed the use of reporting guidelines for peer review, suggesting that they “will increase the completeness, clarity and transparency of research papers without restricting researchers’ creativity”.75 But it is also possible that this practice would distract peer reviewers from evaluating aspects of research not measured by reporting guidelines, such as the relevance, usefulness, novelty, or importance of the research being evaluated. The EQUATOR Network does provide guidance and examples of how journals can best implement reporting guidelines into the peer-review process (www.equator-network.org/resource-centre/editors-and-peer-reviewers/editors-and-peer-reviewers/#editperr) (accessed November 26, 2012).

Future developments and directions

To be effective, reporting guidelines need to be developed to the highest standards, disseminated and implemented effectively, and evaluated for their impact.

Processes used by different groups to develop reporting guidelines have been inconsistent. These variations in methodology can reduce the robustness and effectiveness of the resulting guidelines. For example, an inadequate search strategy used to identify empirical evidence of biases relevant to a particular study design could result in either incomplete or inappropriate guideline content, thereby limiting the effectiveness of the guideline. Moher et al. have proposed a strategy for the future development and implementation of reporting guidelines.76 Based on their collective experience in developing reporting guidelines, these authors suggested a framework for guideline development that extends from the initial steps of identifying the need for a new reporting guideline through to continued updating of the established guideline. These experts recognize that other methods of developing reporting guidelines can be used effectively; however, they suggest that a strategy of expert consensus mediated by both a modified Delphi exercise and structured discussions is optimal to ensure the involvement of all stakeholders. In addition, whichever process is used for guideline development, it is essential to provide complete and transparent descriptions and explanations of guideline development and content.

Although reporting guidelines are among the most cited articles in the biomedical literature, authors are often not aware of particular items required by the respective reporting guidelines prior to manuscript submission.77 This problem could be improved by better dissemination of reporting guidelines and by ensuring that research projects are designed to include the essential reporting items.78 Recently, the SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) Statement was published with the aim of improving the quality of study protocols for clinical trials.78 The SPIRIT Statement does not aim to standardize design or conduct of clinical trials; rather, it aims to ensure that a complete description of the planned study is provided, which should influence the quality of study design and subsequent reporting.

Reporting guidelines aim to improve the completeness and quality of reporting of original research by ensuring that the minimal essential information is conveyed by authors.20 However, the ability of guidelines to improve the quality of published research can be diminished by other factors influencing research outcomes, particularly statistical design and analysis. Many articles contain errors in statistical analysis.79,80 Inadequate knowledge of biomedical statistics, inappropriate analysis of data, and even a lack of data integrity can lead authors to misleading conclusions and ultimately undermine valid research. Most reporting checklists give only broad guidance on describing statistical methods; for example, the CONSORT checklist asks authors to describe “statistical methods used to compare groups for primary and secondary outcomes” and “methods for additional analyses”. This approach, i.e., requiring only a basic description of the statistical analysis, appears reasonable, as there are many correct approaches to statistical analysis, and in most cases, the guidelines’ explanation and elaboration articles provide discussion and examples of the essential elements for reporting statistical analysis. Nevertheless, a compromise must be struck between keeping guidelines practical and usable and including additional, possibly inessential, items. A rigorous approach to the appraisal of statistical design and analysis of data, especially before publication in the biomedical literature and entry of research outcomes into the public domain, may improve the accuracy of research reporting and limit inappropriate translation of research outcomes into clinical practice. These considerations could lead to the development of reporting guidelines with an increased focus on statistical design and analysis.79

A lack of consistency in the endorsement and implementation of reporting guidelines by biomedical journals has been problematic.77 Some reporting guidelines suggest a preferred approach for journals to implement their respective guidelines,23 but often the language used in the “Instructions for Authors” to endorse the guidelines is vague and differs amongst journals.61 Additionally, to ensure that authors comply with reporting guidelines, editors and peer reviewers must check individual checklist items, a practice that can be time consuming and may be ineffective depending on the approach of the reviewer. Emerging Web-based platforms and technologies may help address these issues. For example, Web-based applications could ensure that a uniform approach is used to endorse reporting guidelines across journals. Furthermore, Web-based content management systems that allow authors to tag checklist items in manuscripts or facilitate automated text mining for essential data could improve adherence to reporting guidelines.77
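As a rough outline of how such automated text mining might work, the sketch below scans a methods section for keywords suggestive of a few CONSORT items. The patterns and item names are naive, illustrative assumptions; a production system would need validated patterns or natural-language processing, not this simple keyword list.

```python
import re

# Illustrative keyword patterns for a few CONSORT items (not validated):
ITEM_PATTERNS = {
    "sample size calculation": r"sample size|power (calculation|analysis)",
    "random sequence generation": r"randomi[sz](ed|ation)",
    "allocation concealment": r"allocation concealment|sealed envelope",
    "blinding": r"blind(ed|ing)|mask(ed|ing)",
}

def screen_manuscript(text: str) -> dict:
    """Flag which checklist items appear to be addressed in the text."""
    return {item: bool(re.search(pattern, text, re.IGNORECASE))
            for item, pattern in ITEM_PATTERNS.items()}

methods = ("Participants were randomized using a computer-generated "
           "sequence; the sample size was based on a power calculation.")
for item, present in screen_manuscript(methods).items():
    print(f"{item}: {'found' if present else 'MISSING'}")
```

Even so crude a screen could flag possibly missing items for a human editor or reviewer to confirm, which is the adherence-checking role the text envisages for these platforms.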

Optimal development and implementation strategies alone are not sufficient to make reporting guidelines meaningful. In a systematic review by Moher et al., most of the reporting guidelines studied had not been evaluated, nor was there any intention to evaluate the effect of the guideline on the completeness of reporting.42 Where the effect of reporting guidelines has been evaluated, investigators have for the most part considered only the CONSORT Statement, and there is a dearth of evidence evaluating the effectiveness of reporting guidelines for other study designs. It is important that the impact of future reporting guidelines be evaluated and that each guideline incorporate a framework for identifying new evidence of biases relevant to its study design.

Conclusions

The use of reporting guidelines is common among journals, peer reviewers, and authors. Reporting guidelines improve the quality of reporting of published research in biomedical journals. However, despite increased adherence by authors, the quality of reporting in the biomedical literature remains suboptimal. Reporting guidelines continue to be refined and applied to different processes of trial design, peer review, and publication. With a plethora of new reporting guidelines being developed, users need to be cautious that these guidelines are developed with the same level of scrutiny and rigour as more established guidelines and that the resulting interventions are meaningful.