Outcome reporting recommendations for clinical trial protocols and reports: a scoping review

Background: Clinicians, patients, and policy-makers rely on published evidence from clinical trials to help inform decision-making. A lack of complete and transparent reporting of the investigated trial outcomes limits reproducibility of results and knowledge synthesis efforts, and contributes to outcome switching and other reporting biases. Outcome-specific extensions for the Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT-Outcomes) and Consolidated Standards of Reporting Trials (CONSORT-Outcomes) reporting guidelines are under development to facilitate harmonized reporting of outcomes in trial protocols and reports. The aim of this review was to identify and synthesize existing guidance for trial outcome reporting to inform extension development.
Methods: We searched for documents published in the last 10 years that provided guidance on trial outcome reporting using: an electronic bibliographic database search (MEDLINE and the Cochrane Methodology Register); a grey literature search; and solicitation of colleagues using a snowballing approach. Two reviewers completed title and abstract screening, full-text screening, and data charting after training. Extracted trial outcome reporting guidance was compared with candidate reporting items to support, refute, or refine the items and to assess the need for the development of additional items.
Results: In total, 1758 trial outcome reporting recommendations were identified within 244 eligible documents. The majority of documents were published by academic journals (72%). Comparison of each recommendation with the initial list of 70 candidate items led to the development of an additional 62 items, producing 132 candidate items. The items encompassed outcome selection, definition, measurement, analysis, interpretation, and reporting of modifications between trial documents. The total number of documents supporting each candidate item ranged widely (median 5, range 0–84 documents per item), illustrating heterogeneity in the recommendations currently available for outcome reporting across a large and diverse sample of sources.
Conclusions: Outcome reporting guidance for clinical trial protocols and reports lacks consistency and is spread across a large number of sources that may be challenging to access and implement in practice. Evidence and consensus-based guidance, currently in development (SPIRIT-Outcomes and CONSORT-Outcomes), may help authors adequately describe trial outcomes in protocols and reports transparently and completely to help reduce avoidable research waste.


Background
Clinical trials, when appropriately designed, conducted, and reported, are a gold-standard study design for generating primary evidence on treatment efficacy, effectiveness, and safety. In clinical trials, outcomes (sometimes referred to as endpoints or outcome measures) are measured to examine the effect of the intervention on trial participants. The findings of the trial thus rest critically on the trial outcomes. As data accumulate across different clinical trials for specific interventions and outcomes, the outcome data published in clinical trial reports are ideally synthesized through systematic reviews and meta-analyses into a single estimate of effect that can inform clinical and policy-making decisions. This evidence generation and knowledge synthesis process enables the practice of evidence-based medicine. This process is facilitated by the complete and prospective definition of trial outcomes. Appropriate outcome selection and description are important for obtaining ethical and regulatory approvals, ensuring the trial team conducts the trial consistently and, ultimately, provides transparency of methods and facilitates the interpretation of the trial results.
Despite the importance of trial outcomes, it is well established in the biomedical literature that key information about how trial outcomes were selected, defined, measured, and analysed is often missing or poorly reported across trial documents and information sources [1][2][3][4][5][6][7][8]. A lack of complete and transparent reporting of trial outcomes limits critical appraisal, reproducibility of results, and knowledge synthesis efforts, and enables the introduction of bias into the published literature by leaving room for outcome switching and selective reporting. There is evidence that up to 60% of trials change, omit, or introduce a new primary outcome between the planned trial protocol and the published trial report [3,[9][10][11][12]. Secondary outcomes have been less studied, but may be even more prone to bias and inadequate reporting [12,13]. Deficient outcome reporting, either through selective reporting of the measured outcomes or incompletely pre-specifying and defining essential components of the reported outcome, facilitates undetectable data "cherry-picking" in the primary reports and has the potential to impact the conclusions of systematic reviews and meta-analyses [14,15].
Although there is an established need among the scientific community to improve the reporting of trial outcomes [5,[16][17][18][19], it remains unknown what actually constitutes useful, complete reporting of trial outcomes to knowledge users. The well-established Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT) [20] and Consolidated Standards of Reporting Trials (CONSORT) [21] reporting guidelines provide guidance on what to include in clinical trial protocols and reports, respectively. Yet although SPIRIT and CONSORT provide general guidance on how to report trial outcomes [20,21], and have been extended to cover patient-reported outcomes [22,23] and harms [24], there remains no standard evidence-based guidance that is applicable to all outcome types, disease areas, and populations for trial protocols and published reports.
An international group of experts and knowledge users [25] has therefore convened to develop outcome-specific reporting extensions for the SPIRIT and CONSORT reporting guidelines. Originally referred to as the SPIRIT-InsPECT and CONSORT-InsPECT (Instrument for reporting Planned Endpoints in Clinical Trials) reporting extensions, the final products will be referred to as the SPIRIT-Outcomes and CONSORT-Outcomes extensions in response to stakeholder and end-user input. These extensions will be complementary to the work of the Core Outcome Measures in Effectiveness Trials (COMET) Initiative and core outcome sets; core outcome sets standardize which outcomes should be measured for particular health conditions, whereas SPIRIT-Outcomes and CONSORT-Outcomes will provide standard harmonized guidance on how outcomes should be reported [26].
The SPIRIT-Outcomes and CONSORT-Outcomes extensions are being developed in accordance with the methodological framework created by members of the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network for reporting guideline development, including a literature review to identify and synthesize existing reporting guidance [27]. The protocol to develop these guidelines has been published previously [28]. An initial list of 70 candidate trial outcome reporting items was first developed through an environmental scan of academic and regulatory publications, and consultations with methodologists and knowledge users including clinicians, guideline developers, and trialists [28][29][30]. These 70 items were organized into ten descriptive categories: What: description of the outcome; Why: rationale for selecting the outcome; How: the way the outcome is measured; Who: source of information of the outcome; Where: assessment location and setting of the outcome; When: timing of measurement of the outcome; Outcome data management and analyses; Missing outcome data; Interpretation; and Modifications.
The purpose of this scoping review was to identify and synthesize existing guidance for outcome reporting in clinical trial protocols and reports to inform the development of the SPIRIT-Outcomes and CONSORT-Outcomes extensions. The results of this scoping review were presented during the web-based Delphi study and the in-person consensus meeting. A scoping review approach, which is a form of knowledge synthesis used to map concepts, sources, and evidence underpinning a research area [31,32], was selected given the purpose of this review. The specific research questions that this review sought to address were: what published guidance exists on the reporting of outcomes for clinical trial protocols and reports; does the identified guidance support or refute each candidate item as a reporting item for clinical trial protocols or reports; and does any identified guidance support the creation of additional candidate items or the refinement of existing candidate items?

Methods
This review was prepared in accordance with the PRISMA extension for Scoping Reviews reporting guideline (see Additional File 1: eTable 1) [33]. The protocol for this review has been published elsewhere [30,34]. This scoping review did not require ethics approval from our institution.

Eligibility criteria
Documents that provided guidance (advice or formal recommendation) or a checklist describing outcome-specific information that should be included in a clinical trial protocol or report were eligible if published in the last 10 years in a language that our team could read (English, French, or Dutch). Dates were restricted to the last 10 years from the time of review commencement so that the review could inform the update and extension of existing guidance provided by CONSORT (published in 2010) and SPIRIT (published in 2013) on outcome reporting, and to increase feasibility given the large number of documents identified in our preliminary searches. There were no restrictions on population, trial design, or outcome type. We only included documents that provided explicit guidance ("stated clearly and in detail, leaving no room for confusion or doubt" [35], such that the guidance must specifically state that the information should be included in a clinical trial protocol or report) [36]. An example of included guidance follows from the CONSORT-PRO extension: "Evidence of patient-reported outcome instrument validity and reliability should be provided or cited, if available" [36].

Information sources
Documents were searched for using: an electronic bibliographic database search (MEDLINE and the Cochrane Methodology Register; see eTable 2 in Additional file 2 for search strategy), developed in close consultation with an experienced research librarian, and searched from inception to 19 March 2018; a grey literature search; solicitation of colleagues; and reference list searching. Eligible document types included review articles, reporting guidelines, recommendation/guidance documents, commentary/opinion pieces/letters, regulatory documents, government reports, ethics review board documents, websites, funder documents, and other trial-related documents such as trial protocol templates.
The grey literature search methods included a systematic search of Google (www.google.com) using 40 combinations of key words (e.g., "trial outcome guidance", "trial protocol outcome recommendations"; see eTable 3 in Additional file 3 for a complete list). The first five pages of the search results for each key term were reviewed (10 hits per page, leading to 2000 Google hits screened in total). Documents were also searched for using a targeted website search of 41 relevant websites (e.g., the EQUATOR Network, Health Canada, the Agency for Healthcare Research and Quality; see eTable 3 in Additional file 3) identified by the review team, solicitation of colleagues, and use of a tool for searching health-related grey literature [37]. Website searching included screening of the homepage and relevant subpages of each website. When applicable, the term "outcome" and its synonyms were searched for using the internal search feature of the website. We searched online for forms and guidelines from an international sample of ethics review boards, as ethics boards are responsible for evaluating proposed trials including the selection, measurement, and analyses of trial outcomes. We restricted the ethics review board search to five major research universities and five major research hospitals (considered likely to be experienced in reviewing and providing guidance on clinical trials) in four English-speaking countries: United States, United Kingdom, Canada, and Australia (see eTable 3 in Additional file 3). This approach helped to limit the search to a manageable sample of international ethics review board guidance. 
To ensure diverse geographic representation of documents from ethics review boards, as some countries yielded substantially more documents than others, documents were randomly selected from each of the four selected countries (i.e., 25% of documents were from each country), amounting to approximately half of the number of the total ethics review board documents initially identified.
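The equal-allocation selection of ethics review board documents described above can be illustrated with a minimal sketch. The document identifiers, pool sizes, seed, and the function name `sample_equal_per_stratum` are hypothetical placeholders chosen for illustration; the source does not report the per-country counts or the tool used for random selection.

```python
import random


def sample_equal_per_stratum(docs_by_country, per_country, seed=2018):
    """Draw the same number of documents at random from every country."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    selected = {}
    for country, docs in docs_by_country.items():
        if len(docs) < per_country:
            raise ValueError(f"{country} has fewer than {per_country} documents")
        # Sampling without replacement within each country (stratum).
        selected[country] = rng.sample(docs, per_country)
    return selected


# Hypothetical pools of identified documents; the counts are placeholders,
# not values reported in the review.
pools = {
    "United States": [f"US-{i}" for i in range(30)],
    "United Kingdom": [f"UK-{i}" for i in range(20)],
    "Canada": [f"CA-{i}" for i in range(18)],
    "Australia": [f"AU-{i}" for i in range(12)],
}

total_identified = sum(len(docs) for docs in pools.values())
# Keep roughly half of all identified documents, split equally (25% each).
per_country = total_identified // (2 * len(pools))
sample = sample_equal_per_stratum(pools, per_country)

for country, docs in sample.items():
    print(country, len(docs))
```

Equal allocation keeps a country with many documents from dominating the sample, at the cost of discarding more documents from that country.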
Additional documents and sources from experts were obtained by contacting all founding members of the "InsPECT Group" [25]. This included 18 trialists, methodologists, knowledge synthesis experts, clinicians, and reporting guideline developers from around the world [28]. We asked each expert to identify documents, relevant websites, ethics review boards, and additional experts who may have further information. All recommended experts were contacted with the same request. Given the comprehensiveness of our search strategies and the large number of documents identified as eligible for inclusion, we performed reference list searching only for included documents identified via Google searching, as this document set encompassed the diversity of sources and document types eligible for inclusion (e.g., academic publications, websites).

Selection of sources of evidence
A trained team member (L. Saeed) performed the final electronic bibliographic database searches and exported the search results into EndNote version X8 [38] to remove all duplicates. All other data sources were first deduplicated within each source manually, and then deduplicated between already screened sources, leaving only new documents to move forward for "charting" (in scoping reviews, the data extraction process is referred to as charting the results) [32,33].

Initial screening
All screening and data charting forms are available on the Open Science Framework [39]. Titles and abstracts of documents retrieved from the electronic bibliographic database search were screened for potential eligibility by one of two reviewers with graduate-level epidemiological training (AM, EJM) before full texts were thoroughly examined. The two reviewers assessed 90 citations as a practice set and reviewed the results with a senior team member (NJB). The reviewers then screened a randomly selected training set of 100 documents from the electronic bibliographic database search and achieved 93% observed agreement and 71% chance agreement, yielding a Cohen's κ score of 0.76 (substantial agreement [40]). The remaining search results were then divided and each independently screened by one of the two reviewers, with periodic verification checks performed by NJB. One reviewer (AM) screened and charted all website search results. Documents gathered from the ethics review board searches (by L. Saeed) and from the solicitation of experts moved directly to full-text review and charting by EJM.
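The reported κ can be reproduced from the two agreement figures using the standard definition of Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The short sketch below is illustrative only; the function name is an assumption and is not taken from the review's materials.

```python
def cohens_kappa(observed_agreement: float, chance_agreement: float) -> float:
    """Cohen's kappa from observed (p_o) and chance-expected (p_e) agreement."""
    if not 0.0 <= chance_agreement < 1.0:
        raise ValueError("chance agreement must be in [0, 1)")
    return (observed_agreement - chance_agreement) / (1.0 - chance_agreement)


# Values reported for the training set of 100 documents.
kappa = cohens_kappa(observed_agreement=0.93, chance_agreement=0.71)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.76"
```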

Full-text screening
The reviewers (AM, EJM) performed full-text screening for eligibility using a similar process as for title and abstract screening. A sample of 35 documents identified from title and abstract screening were assessed for eligibility. The observed agreement rate was 94% (33 of 35 documents). The included documents (n = 14) were charted in duplicate, and the reviewers examined their charting results and resolved any discrepancies through discussion. Following review of the agreement results by a senior team member (NJB), the remaining search results were divided and independently screened and charted by one of the two reviewers, with periodic verification checks performed by NJB. Full-text screening and reasons for exclusion were logged using a standardized form [39] developed using Research Electronic Data Capture (REDCap) software [41].

Data charting process
The included documents proceeded to undergo data charting using a standardized charting form [39] developed using REDCap software [41]. Prior to data charting, 11 documents were piloted through the full-text screening form and the charting form by EJM and AM (AM was not involved in developing the forms), and the forms were modified as necessary following review of the form testing with NJB and MO. The reviewers (AM, EJM) charted data that included information such as characteristics of the document (e.g., publication type, article title, last name of first author, publication year, publisher) as well as the scope and characteristics for each of the specific recommendations extracted from each included document (e.g., whether the recommendation was specific to clinical trial protocols or reports, or specific to type of outcomes, trial design, or population). Given the nature of this review, a risk of bias assessment or formal quality appraisal of included documents was not performed. To help gauge the credibility of recommendations gathered, we categorized the type(s) of recommendation as made with supporting empirical evidence provided within the source document (e.g., based on findings from a literature review or expert consensus methods) and/or citation(s) provided to other documents (e.g., citation provided to an existing reporting guideline), or neither.

Synthesis of results
Recommendations identified within the included documents were compared with the candidate outcome reporting items to support, refute, or refine item content and to assess the need for the development of additional candidate items. To achieve these aims, the reviewers (AM and EJM) mapped each recommendation gathered to existing candidate items or one of the ten descriptive categories, supported by full-text extraction captured in free text boxes within the charting form. Recommendations that did not fall within the scope of any existing candidate items or categories were captured in free text boxes. Eight in-person meetings were held by members of the "InsPECT Operations Team" [25, 28] over a 2-month period to review these recommendations and to develop any new candidate reporting items or refine existing candidate items to better reflect the concepts/wording in the literature. Attendance was required by the review lead author (NJB), the senior author (MO), and at least three other members of the Operations Team (EJM, AM, L. Saeed, A. Chee-a-tow). After completion of data collection, the mapping results of recommendations to each candidate item were reviewed by NJB in their entirety and finalized by consensus with the two reviewers (EJM, AM). The wording of the candidate items was then clarified as necessary and finalized by the Operations Team. Data analysis included descriptive quantitative measures (counts and frequencies) to characterize the guidance document characteristics and their recommendations.

Results
The full dataset is available on the Open Science Framework [39]. The electronic database literature search yielded 2769 unique references, of which 153 documents were found to be eligible and included (Fig. 1). The Google searches (2000 hits assessed in total) led to the inclusion of 62 documents. An additional seven documents were identified and included from the targeted website search (41 websites assessed). There were five documents included from 12 experts (33 were contacted in total), 15 documents from 40 ethics review boards websites, and two from reference list screening. In total, 244 unique documents were included (Fig. 1).
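As a quick consistency check, the per-source counts reported above can be tallied against the stated total. The sketch below uses only figures given in the text; the dictionary keys and variable names are illustrative labels rather than terms from the review's dataset.

```python
# Included documents by information source, as reported in the Results.
included_by_source = {
    "electronic bibliographic databases": 153,
    "Google searches": 62,
    "targeted website search": 7,
    "solicitation of experts": 5,
    "ethics review board websites": 15,
    "reference list screening": 2,
}

total_included = sum(included_by_source.values())
print("Total included documents:", total_included)  # prints 244

# Google hits screened: 40 key-word combinations x 5 pages x 10 hits per page.
google_hits_screened = 40 * 5 * 10
print("Google hits screened:", google_hits_screened)  # prints 2000

# Share of included documents contributed by each source.
for source, count in included_by_source.items():
    print(f"{source}: {count} ({count / total_included:.1%})")
```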
The majority of the included documents were published by academic journals (72%; Table 1). Other publishers include hospitals, universities, and research organizations as well as governments and nongovernmental organizations. All but one document were published in English. The types of documents included varied but were primarily literature reviews (30%), recommendation/guidance documents (24%), commentary/opinion pieces/letters (12%), or reporting guidelines (14%; Table 1).
Of the included documents, 45 (18%) had a primary focus on trial outcome reporting (e.g., the SPIRIT-PRO reporting guideline [22], a journal commentary on selective outcome reporting [42]). Approximately 40% of the documents were focused on specific age group(s) and/or clinical area(s). Of the 18 documents with a focus on a specific age group, most (n = 12) were focused on paediatric populations (Table 2). The clinical areas ranged widely (Table 2), with the highest numbers of documents focused on the areas of oncology (n = 15), mental health (n = 10), and oral and gastroenterology (n = 10). Approximately one-third of all included documents (n = 85) came from such discipline-specific documents (Table 2).
There were 1758 trial outcome reporting recommendations identified in total within 244 eligible documents. The median number of unique outcome reporting recommendations per guidance document was 4 (range …). Assessment of the focus of each recommendation (Table 3) showed that most recommendations were specifically focused on clinical trial protocols (43%) and/or reports (44%). Others were focused on outcome reporting in trial documents generally, ethics board submissions, and clinical trial proposals in grant applications (Table 3). Only 15% of recommendations focused on a specific trial phase and/or design (Table 3), although nearly half (n = 836, 47%) focused on a specific outcome classification (e.g., primary, secondary) or type (e.g., patient-reported outcomes or harms; Table 3 and Additional file 4: eTable 4).
Table 2 footnote: Anaesthesiology, cardiovascular and metabolism, endocrinology, nephrology, obesity, ophthalmology, palliative care, physical rehabilitation, radiology, and urology (n = 1 each).
Of all the recommendations identified, approximately 40% were not supported by any empirical evidence or citations; the remaining 60% were supported by empirical evidence provided within the document and/or citations to other documents (Table 4). The type of empirical evidence provided was most often generated from literature reviews and/or through expert consensus methods (Table 4). Supporting citations to other documents were provided for about one-third of all recommendations (Table 4); cited documents included a wide range of sources, although they were often existing reporting guidelines or guidance documents such as SPIRIT, CONSORT, and their associated extensions.
Comparison of each of the 1758 recommendations with the initial list of 70 candidate items led to the development of an additional 61 unique candidate reporting items (Table 5). Team discussions produced two additional candidate reporting items, producing a list of 133 candidate reporting items categorized within the ten descriptive categories. One item was excluded by consensus by the Operations Team as the recommendation was consistent with recognized poor methodological practice, yielding 132 candidate reporting items in total (Table 5). The number of candidate items that mapped to each of the ten descriptive categories was variable (range 2-41 items per category), with the largest number of candidate items mapped to the "Outcome data management and analyses" (n = 41 items) and the "How: the way the outcome is measured" (n = 26 items) categories (Table 5). Most of the recommendations made (n = 1611, 91%) could be mapped to a specific candidate reporting item; 153 (9%) were general in nature and were mapped generally to the appropriate category. For example, the recommendation "state how outcome was measured" would be too general in nature to map to a specific candidate item and instead would be mapped to the overall "How: the way the outcome is measured" category. No documents provided an explicit recommendation that refuted or advised against reporting any of the 132 candidate items.
The number of documents containing an outcome reporting recommendation supporting the description of each of the 132 candidate items ranged widely (median 5, range 0-84 documents per item, from a total possible sample of 244 documents; Table 5 and Additional file 5: eFigure 1). Of the 132 candidate reporting items, 104 were applicable to both trial protocols and reports, 24 were not applicable to trial protocols (e.g., pertained specifically to known trial results), and 4 were not applicable to trial reports (e.g., pertained to trial planning only).
Comparison with the items and concepts covered in SPIRIT 2013 showed that 78 of the 108 (72%) candidate items relevant to protocols are not currently covered either completely or in part by items in the existing SPIRIT checklist. Comparison with items covered in CONSORT 2010 showed that 106 of the 128 (83%) candidate items relevant to trial reports are not currently covered either completely or in part in the existing CONSORT checklist.
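The denominators and percentages in the two preceding paragraphs follow directly from the item counts. The sketch below retraces that arithmetic using only numbers reported in the text; the variable names are illustrative.

```python
# Candidate item accounting, as reported in the Results.
initial_items = 70
new_from_review = 61
new_from_team_discussion = 2
excluded_by_consensus = 1
candidate_items = (
    initial_items + new_from_review + new_from_team_discussion - excluded_by_consensus
)

# Applicability of the final candidate items.
both_documents = 104      # applicable to protocols and reports
reports_only = 24         # not applicable to trial protocols
protocols_only = 4        # not applicable to trial reports

protocol_relevant = both_documents + protocols_only
report_relevant = both_documents + reports_only

# Items not covered, completely or in part, by the existing checklists.
not_in_spirit_2013 = 78
not_in_consort_2010 = 106

print("Candidate items:", candidate_items)
print(f"Not covered by SPIRIT 2013: {not_in_spirit_2013}/{protocol_relevant} "
      f"({not_in_spirit_2013 / protocol_relevant:.0%})")
print(f"Not covered by CONSORT 2010: {not_in_consort_2010}/{report_relevant} "
      f"({not_in_consort_2010 / report_relevant:.0%})")
```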

Discussion
We performed a review of clinical trial outcome reporting guidance that encompassed all outcome types, disease areas, and populations from a diverse and comprehensive range of sources. Our findings show that existing outcome reporting guidance for clinical trial protocols and trial reports lacks consistency and is spread across a large number of sources that may be challenging for authors to access and implement in research practice. These results suggest that evidence- and consensus-based guidance is needed to help authors adequately describe trial outcomes in protocols and reports transparently and completely to help minimize avoidable research waste.
This review provides a comprehensive, evidence-based set of reporting items for authors to consider when preparing trial protocols and reports. The large number of documents included suggests there is much interest in improving outcome reporting in clinical trial documents. Identified outcome reporting items covered diverse concepts that we categorized across ten categories, and the number of items within each category ranged widely. However, authors wishing to use the reporting items identified in this review would face the challenge of fitting a large number of reporting concepts into what is typically expensive journal "real estate" (i.e., limited space for competing papers). To date, no published consensus exists on which of these items are essential and constitute best practice to report. For example, it seems unlikely that authors would commonly have the space allowance to provide descriptions of all 41 items within the "Outcome data management and analyses" category, and it is unknown, in the absence of a consensus process, which of these items may be appropriate or necessary to report for any given trial.
Table 5 (excerpt; only the fragments below survived extraction, and the column headings for the document counts were not preserved)
If a primary outcome, provide a rationale for classifying the outcome as primary
Specify if a relevant core outcome set is publicly available (e.g., via www.cometinitiative.org/), and if so, if the outcome is part of a core outcome set. If applicable, specify which core outcome set the outcome is part of
If applicable, describe discrepancies between the selected outcome and outcomes shown to be of interest to relevant stakeholder groups (e.g., through a core …
Describe or provide reference to an empirical study that establishes the responsiveness of the outcome measurement instrument in the study sample
Outcome data management and analyses (n = 41 items)
Category of "Outcome data management and analyses" in general (recommendation not specific to any candidate item) 24 10 14 3 1 6
Data management and processes
Describe outcome data entry, coding, security and storage, including any related processes to promote outcome data quality (e.g., double entry, range checks from outcome data values). Reference to where details of data management procedures can be found, if not included …
If applicable, describe any analyses conducted to assess the risk of bias posed by missing outcome data (e.g., comparison of baseline characteristics of participants with and without missing outcome data)
Provide justification for methods to handle missing outcome data. This should include: assumptions underlying the missing outcome data mechanism with justification (including analyses performed to support assumptions about the missingness mechanism); and how the assumed missingness mechanism and any relevant features of the outcome data would influence the choice of statistical method(s) to handle missing outcome data including sensitivity analyses 19 8 6 9 0 4
Interpretation (n = 11 items)
Category of "Interpretation" in general (recommendation not specific to any candidate item)
Table 5 footnotes: Outcome domain in this context refers to a relatively broad aspect of the effect of illness within which an improvement may occur in response to an intervention; domains may not be directly measurable themselves, so outcomes are selected to assess change within them [43]. c A new item generated through Operations Team discussions when the scoping review findings were reviewed for new items.

Notably, a considerable number of the recommendations we identified are not covered in content or in principle in the existing SPIRIT and CONSORT reporting guidelines [20,21]. Currently, SPIRIT requires more information on trial outcomes to be reported, and in greater detail, than CONSORT [20,21]. The results of this review, however, showed that most of the candidate items had a similar number of supporting documents that advocated for their inclusion in protocols and in reports, with a few notable exceptions.
For example, 24 documents explicitly supported describing the time period(s) for which the outcome is analysed in trial protocols, but only three suggested including this in trial reports. The exclusion of a clear statement of the planned time period(s) of analyses in trial reports enables the possibility of data analysis "cherry-picking" (e.g., multiple unplanned analyses are performed for multiple measurement time points, with only results for the significant analyses being reported). Consulting other trial documents, such as trial protocols and statistical analysis plans [44], may help mitigate the need for such information in the trial report itself. However, these trial documents may not be publicly available [45] and one must also consider the burden on the knowledge user of needing to consult multiple information sources in an era of transition to online publication methods and free sharing platforms.
To identify the minimum set of reporting items that are necessary to include in all clinical trial protocols and reports, respectively, the results of this scoping review were consolidated and presented during the recently held international Delphi survey and expert Consensus Meeting to determine which candidate items should be included or excluded in the SPIRIT-Outcomes and CONSORT-Outcomes extensions and to develop the wording of the final reporting items. The protocol for this process has been described in detail elsewhere [46] and the results are being prepared for publication as part of the extension statements.

Strengths and limitations
We used a scoping review methodology [32] to map guidance on trial outcome reporting from multiple information sources in an attempt to capture guidance produced and used by relevant stakeholders, including from academic journals, regulatory and government agencies, and ethics review boards [30]. Sensitivity and accuracy may have been reduced by not completing screening and charting in duplicate, although the reviewer training results and periodic data checks by the senior reviewer as well as the fact that all reviewers have graduate-level epidemiological training may have limited this risk. Furthermore, the mapping of every recommendation extracted to each candidate item was verified by the senior reviewer and all of the mapping results presented achieved consensus.
The development of new candidate reporting items followed a planned standardized process of team review and discussion that aimed to minimize item content redundancy and ensure correct interpretation of the extracted recommendations [30]. There may be relevant documents published outside the included date range, and the language restrictions employed yielded a sample of documents that were almost entirely published in English. The international ethics review board websites search represented a convenience sample and therefore may not be representative, for example, of guidance provided by non-English speaking and/or smaller institutions. We were limited to documents that were publicly available or available through our institutional access; in particular, ethics review boards may provide guidance to local investigators that is not publicly available to access. However, using sensitive search methods, saturation was reached such that no new items were identified well prior to the end of document review and charting. Most new items were identified in the initial stages of the review.
Our review focused on the quantity of documents supporting each recommendation and did not formally assess their quality. To help gauge the credibility of gathered recommendations, we categorized the type(s) of underpinning empirical evidence for each recommendation. Indeed, some candidate items were supported by multiple well-recognized sources and had an empirical evidence base or process that underpinned the recommendations as to why this item is recommended to be reported (e.g., from a systematic review or Delphi process). Others were less frequently recommended for reporting or did not provide supporting empirical evidence, but still may have important implications and merit for reporting. For example, a clear recommendation to "identify the outcomes in a trial report as planned (i.e., pre-specified) or unplanned" was found in only one document. However, selective outcome reporting and outcome switching have been well documented in trial reports, are often difficult to detect, and have been shown to impact treatment estimates in meta-analyses [3,[9][10][11]14]. The results from the Delphi and consensus processes will help clarify the relative importance and acceptability of the candidate items by an international group of expert stakeholders.

Conclusions
There is a lack of harmonized guidance to lead authors, reviewers, journal editors, and other stakeholders through the process of ensuring that trial outcomes are completely and transparently described in clinical trial protocols and reports. Existing recommendations are spread across a diverse range of sources and vary in breadth and content. The large number of documents identified, despite limiting our search to the last decade, indicates substantial interest in, and need for, improving outcome reporting in clinical trial documents. To determine which outcome reporting recommendations constitute best practices for outcome reporting for any clinical trial, a minimum, essential set of reporting items will be identified through evidence- and consensus-based methods and ultimately developed into the SPIRIT-Outcomes and CONSORT-Outcomes reporting guidelines.