Global mapping of randomised trials related articles published in high-impact-factor medical journals: a cross-sectional analysis

Background Randomised controlled trials (RCTs) provide the most reliable information to inform clinical practice and patient care. We aimed to map global clinical research publication activity through RCT-related articles in high-impact-factor medical journals over the past five decades. Methods We conducted a cross-sectional analysis of articles published in the highest ranked medical journals with an impact factor > 10 (according to Journal Citation Reports published in 2017). We searched PubMed/MEDLINE (from inception to December 31, 2017) for all RCT-related articles (e.g. primary RCTs, secondary analyses and methodology papers) published in high-impact-factor medical journals. For each included article, raw metadata were abstracted from the Web of Science. A process of standardization was conducted to unify the different terms and grammatical variants and to remove typographical, transcription and/or indexing errors. Descriptive analyses were conducted (including the number of articles, citations, most prolific authors, countries, journals, funding sources and keywords). Network analyses of collaborations between countries and co-words are presented. Results We included 39,305 articles (for the period 1965–2017) published in forty journals. The Lancet (n = 3593; 9.1%), the Journal of Clinical Oncology (n = 3343; 8.5%) and The New England Journal of Medicine (n = 3275 articles; 8.3%) published the largest number of RCTs. A total of 154 countries were involved in the production of articles. The global productivity ranking was led by the United States (n = 18,393 articles), followed by the United Kingdom (n = 8028 articles), Canada (n = 4548 articles) and Germany (n = 4415 articles). Seventeen authors who had published 100 or more articles were identified; the most prolific authors were affiliated with Duke University (United States), Harvard University (United States) and McMaster University (Canada). 
The main funding institutions were the National Institutes of Health (United States), Hoffmann-La Roche (Switzerland), Pfizer (United States), Merck Sharp & Dohme (United States) and Novartis (Switzerland). The 100 most cited RCTs were published in nine journals, led by The New England Journal of Medicine (n = 78 articles), The Lancet (n = 9 articles) and JAMA (n = 7 articles). These landmark contributions focused on novel methodological approaches (e.g. the “Bland-Altman method”) and trials on the management of chronic conditions (e.g. diabetes control, hormone replacement therapy in postmenopausal women, multiple therapies for diverse cancers, cardiovascular therapies such as lipid-lowering statins, antihypertensive medications, and antiplatelet and antithrombotic therapy). Conclusions Our analysis identified authors, countries, funding institutions, landmark contributions and high-impact-factor medical journals publishing RCTs. Over the last 50 years, publication production in leading medical journals has increased, with Western countries leading in research but with low- and middle-income countries showing very limited representation.


Background
Randomised controlled trials (RCTs) are considered one of the simplest and most powerful tools for assessing the safety and effectiveness of treatment interventions [1][2][3]. When appropriately designed, conducted and reported, RCTs can produce an immediate impact on clinical practice and patient care [4].
The evolution of RCTs has been an enduring and continuing process [5][6][7][8][9][10][11][12][13][14][15]. Since the 1970s the publication landscape for RCTs has exhibited exponential growth. For example, a 1965-2001 bibliometric analysis of the literature identified 369 articles published in 1970 compared to 11,159 published in 2000 [5]. The development of clinical trial registries (such as clinicaltrials.gov) [9,10], the exponential increase in journals publishing trial protocols, results and secondary studies, and growing support for data-sharing policies [11,12] have created an open research environment of transparency and accountability. Furthermore, the publication of reporting guidelines (such as CONSORT and SPIRIT) [4,13,14,15] has served to facilitate the transition between research and reporting to ensure standardisation and ease of readability.
RCTs published in major medical journals are highly cited and have an instrumental role in clinical practice and health policy decisions [5,16,17]. Previous studies have focused on the quality of the reporting of methods and results of RCTs [18][19][20][21][22] and publication practices [23][24][25][26][27][28] in selected samples of articles published in high-impact-factor (IF) medical journals. However, to the best of our knowledge, no mapping studies have been conducted on major medical journals to investigate the most common subjects, most productive scientists and countries, most prolific journals and "citation classics" across multiple specialties.
The objective of this study was to describe and characterise the global clinical research publication activity through RCT articles published in high-IF medical journals during the past decades.

Eligibility criteria
This cross-sectional analysis investigated RCT-related articles (that is, primary RCTs, secondary analyses and methodology papers using clinical data) published in major medical journals. We excluded narrative reviews, systematic reviews, meta-analyses, pooled analyses, letters and newspaper articles. To be eligible, RCT-related articles indexed in PubMed/MEDLINE had to be published in one of the major medical journals with an IF exceeding 10 (2016 IF according to the Journal Citation Reports [JCR] published in June 2017). These medical journals were chosen because they were identified as publishing clinical research with scientific merit and clinical relevance (see Table 1 for a list of the included medical journals).

Search
On March 22, 2018, we systematically searched MEDLINE through PubMed (National Library of Medicine, Bethesda, MD, United States) for all RCT-related articles published in high-IF medical journals (from inception to December 31, 2017). A senior information specialist (AA-A) and a clinical epidemiologist (FC-L) designed an electronic literature search using a validated research methodology filter for RCTs (with 97% specificity and 93% sensitivity) [29]. The search was peer reviewed by members of the study team, including a second (senior) information specialist (RA-B). The full search strategy is provided in Additional file 1. On May 7, 2018, we searched the Web of Science (WoS) (Clarivate Analytics, Philadelphia, Penn., United States) by using PubMed IDs (PMIDs) from the PubMed/MEDLINE searches. Merging MEDLINE with other citation indices such as the WoS combines the advantages of MEDLINE (e.g., Medical Subject Headings [MeSH], a comprehensive controlled vocabulary for indexing journal articles) with the relational capabilities and data of the WoS [30].

Data extraction and normalisation
For each included article, raw (meta) data on the journal and article titles, subject category, the year of publication, keywords, and the authors' names, institutional affiliation(s), funding source, and country were downloaded online through the WoS by one researcher (AA-A). We also used the WoS to determine the extent to which each article had been cited in the scientific peer-reviewed literature using the "times cited" number (that is, the number of times a publication has been cited by other publications). Two researchers (FC-L, RA-B) independently verified the data to minimise potential information errors. A process of normalisation was conducted by two researchers to bring together the different names of an author or country and the keywords (further details are available in Additional file 2). Specifically, one researcher (AA-A) checked the names by which an individual author appeared in two or more different forms (for example, "John McMurray" or "John J. McMurray" or "John J.V. McMurray") using coincidence in that author's place(s) of work as the basic criterion for normalisation (for example, University of Glasgow, Scotland, United Kingdom) [31], and a second researcher (FC-L or RA-B) verified the data. A threshold of 30 articles was applied to review 200 names by which an individual author appeared in two or more different forms.
We extracted both "author keywords" and "keyword plus," which are automatically assigned by the WoS from the titles of the references of the articles, as topical (also called textual, linguistic or semantic) data [32]. To ensure consistency in the data, one researcher (RA-B) corrected keywords by unifying grammatical variants and using only one keyword to name the same concept (for example, "randomized trial" or "randomized clinical trial" or "randomized controlled trial" or "randomised controlled trial"). In addition, the same researcher (RA-B) removed typographical, transcription and/or indexing errors, and a second researcher (FC-L) verified the data. All potential discrepancies were resolved via consensus amongst these investigators. All these data were collected and entered into a Microsoft Access® (Microsoft, Seattle, WA, United States) database between May 7, 2018, and January 9, 2019.
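As an illustration only, the keyword unification step described above can be sketched in a few lines of Python. This is not the procedure the study team used (the normalisation was performed and verified manually); the variant map, the choice of "randomised controlled trial" as the preferred term, and the function names are all assumptions made for this sketch.

```python
import re
from collections import Counter

# Hypothetical variant-to-preferred-term map (an assumption for illustration;
# the source does not state which spelling was kept as the unified keyword).
PREFERRED = {
    "randomized trial": "randomised controlled trial",
    "randomized clinical trial": "randomised controlled trial",
    "randomized controlled trial": "randomised controlled trial",
    "randomised controlled trial": "randomised controlled trial",
}


def normalise_keyword(raw: str) -> str:
    """Lower-case, collapse whitespace, then map known variants to one term."""
    cleaned = re.sub(r"\s+", " ", raw.strip().lower())
    return PREFERRED.get(cleaned, cleaned)


def keyword_frequencies(articles):
    """Count the number of articles in which each normalised keyword appears.

    `articles` is an iterable of keyword lists, one list per article.
    A keyword is counted at most once per article.
    """
    counts = Counter()
    for keywords in articles:
        counts.update({normalise_keyword(k) for k in keywords})
    return counts
```

Counting each keyword once per article keeps the frequencies comparable with article-based thresholds such as "at least 500 articles"; an automated map of this kind would still need the manual verification described in the text.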

Data analysis
We analysed data for the number of articles, citations, signatures (or total number of authors included in all the articles of each author), collaboration index (that is, the mean number of author signatures per article), countries, journals and keywords. Data were summarised as frequencies and percentages for the categorical items. The most prolific authors (>100 articles), countries (>100 articles), funding institutions (>100 articles), and the most cited papers ("top-100 citation classics") were identified. Network plots were generated for intense scientific collaboration between countries (applying a threshold of 100 articles in collaboration).
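The descriptive indicators defined above are simple to compute once each article is represented as a list of normalised author names. The following is a minimal sketch under that assumption; the function names and the data layout are hypothetical and do not describe the Microsoft Access database used in the study.

```python
from collections import Counter


def collaboration_index(author_lists):
    """Mean number of author signatures per article."""
    if not author_lists:
        raise ValueError("at least one article is required")
    signatures = sum(len(authors) for authors in author_lists)
    return signatures / len(author_lists)


def articles_per_author(author_lists):
    """Number of articles signed by each author (one count per article)."""
    counts = Counter()
    for authors in author_lists:
        counts.update(set(authors))
    return counts


def percentage(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100 * part / total, 1)
```

For example, `percentage(24496, 39305)` reproduces the 62.3% of articles reported in the Results as written by seven or more authors, and filtering `articles_per_author(...)` for counts of 100 or more would give the list of most prolific authors.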
We conducted an exploratory analysis of topical data using a set of unique keywords and their frequencies to examine the topic coverage, major topics ("word clouds" of keywords) and their interrelations ("co-words networks") in RCT articles. The main goal in topical analyses is to understand the topical distribution of a dataset, i.e. what topics are covered and how much of each topic is covered in a scientific discipline [32].
Table 1 Included high-impact-factor medical journals
The most frequently used keywords were identified for the most prolific journals (with at least 1000 articles). Based on the most frequently used keywords (with at least 500 articles), a word cloud was created in which greater emphasis was placed on words that appear with greater frequency in the source text. A "co-words network" was created to illustrate the co-occurrence of highly frequent words in the articles (applying a threshold of 100 articles in collaboration). The network analysis was carried out with the use of PAJEK (University of Ljubljana, Slovenia) [33], a software package for large network analysis that is free for noncommercial use to construct network graphs. The PRISMA checklist [34] (http://www.prisma-statement.org/) guided the reporting of the present analysis (and is available in Additional file 3).
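The idea behind a co-words network can be sketched as follows: each pair of keywords that appears together in an article forms a candidate edge, the edge weight is the number of articles in which the pair co-occurs, and only edges that reach a threshold are retained. The code below is a hedged illustration in plain Python, not the PAJEK workflow used in the study; the function name is hypothetical, and reading the threshold as "at least" the given number of articles is an assumption.

```python
from collections import Counter
from itertools import combinations


def coword_edges(articles, threshold):
    """Return keyword pairs that co-occur in at least `threshold` articles.

    `articles` is an iterable of keyword lists, one list per article.
    Each pair is stored in alphabetical order so that (a, b) and (b, a)
    are counted as the same undirected edge.
    """
    pair_counts = Counter()
    for keywords in articles:
        unique = sorted(set(keywords))  # one count per article, stable order
        pair_counts.update(combinations(unique, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= threshold}
```

The same counting logic applies to the country collaboration network if each article is represented by the list of countries in its author affiliations instead of its keywords. The resulting weighted edge list could then be exported to a network tool for layout and plotting.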

Results
A total of 39,329 records were identified by the PubMed/MEDLINE search (Fig. 1), and 39,305 articles met the study inclusion criteria (Additional file 4) after 24 records had been excluded (Additional file 5). Table 2 details the general characteristics of the articles.

Publication trend
The number of articles increased exponentially over the period 1965-2017 (Fig. 2). Approximately 60% (n = 23,635) of the articles have been published since 2000.

Authors, institutions and countries
Most articles (62.3%; n = 24,496) were written by seven or more authors, and only 11.4% (n = 4469) of the articles were written by three or fewer authors. The first authors of the articles were based most commonly in North America and Western Europe; first authors from the United States were responsible for 36.9% (n = 14,508) of the articles (Table 2). We identified 17 authors who published 100 or more articles.
Overall, 154 countries worldwide contributed to the analysed articles. The publication productivity ranking for countries (Table 4) was led by the United States (n = 18,393 articles, with 3.4 million citations), followed by the United Kingdom (n = 8028 articles, with 1.3 million citations), Canada (n = 4548 articles, with 1.0 million citations) and Germany (n = 4415 articles, with 0.9 million citations). A total of 37 countries had at least 100 articles in co-authorship. Figure 3 shows a visual representation of the most intense collaborative network between these 37 countries, illustrating the relationships between countries and the position that each occupies in the network.

Most cited articles
Overall, included articles received 5.9 million citations, of which 83.1% of the citations (n = 4,950,604) corresponded to 15,142 (38.5%) articles with more than 100 citations. In addition, 641 (1.63%) articles with more than 1000 citations accounted for 20.7% of the total citations (n = 1,234,462). The most cited articles by number of citations ("100 citation classics") are listed in Table 6. All of the most cited papers were published in English. These most cited articles were published in nine journals, led by The New England Journal of Medicine, with 78 articles, followed by The Lancet (n = 9) and JAMA (n = 7). The list of most cited papers contained innovative research methodologies. For example, the most cited article was a method paper published in The Lancet ("Bland-Altman method") [35]. This seminal paper changed how method comparison studies are performed in clinical research. The list of the most cited papers also reflected important studies examining the health effects of pharmacological interventions on patients with chronic diseases. Common themes in major advances in health interventions included diabetes control [36][37][38][39][40][41]; the effects of hormone replacement therapy in postmenopausal women [42,43]; therapies for diverse cancers such as glioblastoma, colorectal cancer, breast cancer, melanoma and hepatocellular carcinoma [44][45][46][47][48][49][50]; and important interventional studies in the field of clinical cardiology, such as lipid-lowering statin therapy trials, antihypertensive trials, and antiplatelet and/or antithrombotic trials [51][52][53][54][55][56][57][58][59][60][61][62][63].

Discussion
In this cross-sectional analysis, we presented a global mapping of RCT-related articles published in high-IF medical journals for the period 1965-2017. We identified the most prolific scientists, institutions and countries, main funding sources, most common subjects and topics, "citation classics" and most prolific high-IF medical journals from multiple specialties over the last 50 years. In general, we found a strong clustering of articles published in British and American medical journals (The Lancet, Journal of Clinical Oncology, The New England Journal of Medicine, The BMJ, Circulation, JAMA, JACC and Diabetes Care accounted for 53% of the RCT-related articles). Many of these journals have been developed by active medical associations, both nationally and internationally. We hypothesize that different publishing patterns between journals may potentially reflect editorial policies and/or preferences, with some general medicine journals (such as The Lancet and The New England Journal of Medicine) and specialty journals (such as Journal of Clinical Oncology and Circulation) being more interested in and/or promoting the publication of RCTs. In contrast, a substantial number of these articles are behind publication paywalls (very few of the medical journals in our study sample are Open Access), and thus, research results may not be accessible to a large fraction of the scientific community and society as a whole, including clinicians (and patients) who may want them to help inform their clinical practice.
The results of this study highlight the expanding collaborative networks between countries in multiple regions, revealing a discernible scientific community, with the most productive countries having an important number of collaborations. Publication activity efforts were global during the study period, with articles from scientists and institutions in more than 150 different countries. However, the scientific community is centred on a nucleus of scientists from Western countries, with the most intense global collaborations taking place among the United States, United Kingdom and Canada. The presence and influence that these countries have on biomedical research [64][65][66] may be due to their large multi-stakeholder research partnerships, greater financial investment in clinical research, and high population of active scientists and research centres compared to other countries.
Publication activity worldwide shows that low- and middle-income countries have few articles in high-IF medical journals. Difficulties in healthcare, education and research systems, information access and communication, language barriers and economic and institutional instability all represent challenges (and clear disadvantages) for productivity in low- and middle-income regions. In addition, restrictions and difficulties in conducting clinical research in resource-poor situations result in the exclusion of many of these countries from the planning, conduct and publication of RCTs [67][68][69]. As might be expected, our results support previous findings that low- and middle-income countries [31,70,71] had minimal contributions in articles published in major medical journals. For example, a previous study [70] showed that most of the authors of original papers published in five high-impact general medical journals (including The New England Journal of Medicine, The Lancet, JAMA, The BMJ and Annals of Internal Medicine) were more frequently affiliated with institutions in the same country as the journal. To address some of these problems, scientists, institutions and funders should promote collaborations (beyond historical, cultural and political factors) to share knowledge, expertise and innovative methodologies for clinical research. This may involve partnerships with Western countries to support capacity and resource development and research training. RCT-related articles were published most often in high-IF medical journals devoted to general and internal medicine, cardiology and oncology (nearly 57% of all articles). Similarly, the lists of the most cited articles identified topics which reflect major advances in the management of chronic conditions (such as diabetes, cardiovascular disorders and cancer).
The large relative productivity in general internal medicine, cardiology and oncology may be explained by the important role of randomised evidence in evaluating novel treatments and preventive strategies for these chronic diseases. In line with previous research [72][73][74][75], most of these highly cited RCTs addressed interventions for burdensome conditions that are health priorities in Western countries [76,77]. Funding of (international, collaborative) RCTs may come from varying sources including commercial and non-commercial sponsors. However, previous analyses of RCT-related articles published in high-IF journals have suggested that study sponsors may influence how RCTs are designed, conducted and reported, sometimes serving financial rather than public interests [78]. Given that research funding is often restricted, the scientific community is responsible for using the available resources most efficiently when exploring research priorities to address the needs of knowledge users and population health [76,77,79,80]. Our findings suggest that women are vastly underrepresented in the group of most prolific scientists publishing in high-impact medical journals. This is consistent with recent studies that have identified a gender gap in research publications [81][82][83][84]. For example, a previous study [84] showed that women in first authorship positions increased from 27% in 1994 to 37% in 2014 in leading medical journals (including Annals of Internal Medicine, JAMA Internal Medicine, The BMJ, JAMA, The Lancet and The New England Journal of Medicine), but progress has plateaued or declined since 2009. An urgent need exists to investigate the underlying causes of the potential gender gap to help identify publication practices and strategies to increase women's influence [82,84].
Several limitations exist in our study. First, we characterised the knowledge structures generated by a large number of articles published in major medical journals that are included in the WoS database. However, our results are limited to a subset of all clinical-trial-related articles published in 40 leading medical journals. We suspect that these articles represent those that have great implications for clinical practice and that are relevant to clinical practice guidelines and healthcare regulators. Although the publication production analysed has been drawn from an exhaustive analysis of the biomedical literature, the search may have missed some relevant articles (and journals). Some reports may be published in journals without being indexed as RCTs, making them difficult to identify. Second, as in many bibliometric analyses, the normalisation of the different names of an author, country and funding sources is fundamentally important to avoiding potential errors. We conducted a careful manual validation of the references and textual data to avoid typographical, transcription and/or indexing errors. However, we recognize this procedure does not assure complete certainty. Third, the affiliation addresses of authors do not necessarily reflect the country where the research was conducted or the research funding source. Fourth, topical analysis that extracts a set of unique keywords, word profiles and co-words may indicate intellectual organization in publication production, albeit with inherent limitations [85,86]. Fifth, the use of citation analysis carries some problems [87][88][89][90][91]. A potential length time-effect bias exists, which puts the more recent articles at a disadvantage.
In addition, the biomedical literature is rich in barriers and motivations for publication and citation preferences [87], including self-citation (bias towards one's own work) [88], language bias (bias towards publishing and citing English articles), omission bias (bias whereby competitors are purposely not cited), and selective reporting and publication bias (bias in which "negative" results are withheld from publication and citation) [89][90][91][92]. In addition, citations are also treated as equal regardless of whether research is being cited for its positive contribution to the field or because it is being criticized. Finally, our methods represent only a mapping approach, which could be complemented further by more detailed analyses such as by examining the content (e.g. differences in journal or author characteristics between publicly funded and industry-funded studies, designs/methodology, etc.), the reporting and the reproducible research practices through research of research ("meta-research") studies [92][93][94][95][96][97][98].

Conclusion
The global analysis presented in this study provides evidence of the scientific growth of RCT-related articles published in high-IF medical journals. Over the last 50 years, publication activity in leading medical journals has increased, with Western countries (most notably, the United States) leading but with low- and middle-income countries showing very limited representation. Our analysis contributes to a better conceptualization and understanding of RCT articles and identifies the main areas of research, the most influential publication sources chosen for their scientific dissemination and the major scientific leaders. Given the dynamic nature of the field, it will be interesting to see whether the growth trend remains the same in the coming years and how the characteristics of the field change over time.