Using PubMed, we identified all phase III clinical trials that had primary results published between January 1, 2016, and June 30, 2017, in journals with a 2016 Journal Impact Factor of 10 or greater (Supplementary Table 1), were linked to an NCTID, and had results reported in ClinicalTrials.gov (Supplementary Figure; list provided in Supplementary Table 2). For all trials, we compared the information and results reported in the publication with those reported in ClinicalTrials.gov for the following study features: cohort characteristics (completion rate, age, sex, race/ethnicity), intervention details, primary efficacy endpoints, and serious adverse events. These four study features were examined because they were (1) objectively comparable between the two sources and (2), in our estimation, the most important when weighing the design, significance, and interpretation of a trial.
For cohort characteristics, information was deemed concordant if the types of properties reported and the values for each were the same in both sources. For intervention details, information was deemed concordant when the dosage, time course, and frequency of the intervention matched. For primary efficacy endpoints, information was deemed concordant when the measure description and the reported results matched. For serious adverse events, information was deemed concordant when the event description and reported results matched. Study features (cohort characteristics, trial intervention details, primary efficacy endpoint, serious adverse events) could not be compared between the two sources when they were reported in different formats. This occurred, for example, when the age distribution was reported as the number of participants in certain age ranges (18–30, 31–45, ...) in one source and as the mean age in the other, or when adverse events were stratified as serious vs. non-serious in one source but reported in aggregate in the other. We conducted a cross-sectional analysis, using descriptive statistics to characterize the rate of reporting and the consistency of the information and results reported for all study features between the two sources. All analyses were performed using Excel (version 16.24) and RStudio (version 1.1.447).
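The comparison rules above can be illustrated with a minimal sketch. It is written in Python for illustration only; the study's analyses were performed in Excel and RStudio. The record layout (a `format` label plus a `values` dictionary), the function names, and the example numbers are all hypothetical and are not taken from the study data.

```python
from collections import Counter

def classify_feature(publication, registry):
    """Classify one study feature for one trial.

    Each argument is either None (the source did not report the feature)
    or a dict with a hypothetical layout:
      {"format": <label for how the feature was reported>,
       "values": <dict of reported properties and their values>}
    """
    if publication is None or registry is None:
        return "not reported in both sources"
    if publication["format"] != registry["format"]:
        # e.g. mean age in one source, age-range counts in the other
        return "not comparable"
    if publication["values"] == registry["values"]:
        return "concordant"
    return "discordant"

def summarize(pairs):
    """Return {category: (count, proportion)} over (publication, registry) pairs."""
    counts = Counter(classify_feature(p, r) for p, r in pairs)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}

# Made-up example records for the "age" property of four hypothetical trials.
example_pairs = [
    ({"format": "mean_age", "values": {"mean_age": 54.2}},
     {"format": "mean_age", "values": {"mean_age": 54.2}}),
    ({"format": "mean_age", "values": {"mean_age": 54.2}},
     {"format": "age_ranges", "values": {"18-30": 40, "31-45": 60}}),
    ({"format": "mean_age", "values": {"mean_age": 61.0}},
     {"format": "mean_age", "values": {"mean_age": 60.5}}),
    (None,
     {"format": "mean_age", "values": {"mean_age": 48.0}}),
]

if __name__ == "__main__":
    for label, (n, share) in summarize(example_pairs).items():
        print(f"{label}: {n} ({share:.0%})")
```

In this sketch the format check runs before the value check, so features reported in different formats are counted as not comparable rather than discordant, which mirrors the rule stated above.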