Table 1 Steps taken to process the data

From: Some data quality issues at ClinicalTrials.gov

1a. Steps taken to process the data with each record representing one trial
| Number | Processing step | Records selected | Records rejected | Table(s) with the details |
| --- | --- | --- | --- | --- |
| 1 | Downloaded files | 112,013 | | Additional file 3: Table S1 |
| 2 | Selected trials with the start date between 1/1/2005 and 12/31/2014 | 79,838 | 32,175 | Additional file 4: Table S2 |
| 3 | Selected trials with “drug:” or “biological:” in the intervention field | 64,496 | 15,342 | Additional file 5: Table S3 |
| 4 | Selected trials with a completion date or primary completion date | 63,786 | 710 | Additional file 6: Table S4 and Additional file 7: Table S5 |
| 5 | Selected trials registered with a US authority | 35,121 | 28,665 | Additional file 8: Table S6 and Additional file 9: Table S7 |
| 6 | Selected trials that listed both the investigator’s name and role | 31,392 | 3729 | Additional file 10: Table S8 |
| 7 | Selected trials where the number of investigators’ names matched the number of their roles | 31,375 | 17 | Additional file 11: Table S9 |
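
The counts in part 1a form a simple attrition chain: at each step, the records selected plus the records rejected should equal the records selected at the previous step. The sketch below checks that arithmetic using only the figures reported in the table. The names `STEPS_1A` and `check_attrition` are illustrative and do not come from the article.

```python
# (step number, records selected, records rejected) as reported in Table 1a.
# Step 1 is the initial download, so it has no "rejected" count.
STEPS_1A = [
    (1, 112_013, None),
    (2, 79_838, 32_175),
    (3, 64_496, 15_342),
    (4, 63_786, 710),
    (5, 35_121, 28_665),
    (6, 31_392, 3_729),
    (7, 31_375, 17),
]


def check_attrition(steps):
    """Return the step numbers where selected + rejected differs from
    the number of records selected at the previous step."""
    mismatches = []
    previous_selected = steps[0][1]
    for number, selected, rejected in steps[1:]:
        if selected + rejected != previous_selected:
            mismatches.append(number)
        previous_selected = selected
    return mismatches


print(check_attrition(STEPS_1A))  # prints []
```

An empty list means every step is internally consistent: the 80,638 records rejected across steps 2 to 7 account for the full difference between the 112,013 downloaded records and the 31,375 retained.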
1b. Steps taken to process the data, with each row in Additional file 12: Table S10 and Additional file 13: Table S11 representing one investigator
| Number | Processing step | Names selected | Names rejected | Table(s) with the details |
| --- | --- | --- | --- | --- |
| 8 | In trials where there were multiple names, separated them to create pairs of single names and their corresponding roles | 71,359 | | Additional file 12: Table S10 |
| 9 | Selected trials where the investigator’s name was that of a real individual | 60,787 | 10,572* | Additional file 13: Table S11 |
*These 10,572 “names” came from 8907 NCT IDs
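
Step 8 turns one record per trial into one row per investigator by pairing each name with its role. A minimal sketch of that kind of split follows. It assumes, purely for illustration, that the names and roles of a trial are each stored as a single string with a `|` separator; the article does not state the storage format, and the function name `split_investigators` and the sample names are hypothetical.

```python
from typing import List, Tuple


def split_investigators(names: str, roles: str, sep: str = "|") -> List[Tuple[str, str]]:
    """Split delimiter-separated names and roles into (name, role) pairs.

    The "|" separator is an assumption made for this sketch, not a
    documented property of the source data.
    """
    name_list = [n.strip() for n in names.split(sep)]
    role_list = [r.strip() for r in roles.split(sep)]
    # Step 7 keeps only trials where the two counts match, so a mismatch
    # here is treated as an error rather than silently truncated.
    if len(name_list) != len(role_list):
        raise ValueError("number of names does not match number of roles")
    return list(zip(name_list, role_list))


pairs = split_investigators(
    "Jane Doe | John Roe",
    "Principal Investigator | Study Director",
)
print(pairs)

# The counts reported in part 1b are consistent with each other:
# names selected at step 9 plus names rejected equals names from step 8.
assert 60_787 + 10_572 == 71_359
```

Raising an error on mismatched counts mirrors the order of the table, where trials with unequal numbers of names and roles are removed at step 7 before any splitting takes place.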