Comparison of randomized controlled trials discontinued or revised for poor recruitment and completed trials with the same research question: a matched qualitative study

Background: More than a quarter of randomized controlled trials (RCTs) are prematurely discontinued, mostly due to poor recruitment of patients. In this study, we systematically compared RCTs discontinued or revised for poor recruitment and completed RCTs with the same underlying research question to better understand the causes of poor recruitment, particularly those related to methodological aspects and context-specific study settings.
Methods: We compared RCTs that were discontinued or revised for poor recruitment to RCTs that were completed as planned, matched on population and intervention. Starting from an existing sample of RCTs discontinued or revised due to poor recruitment, we searched the literature for systematic reviews that cited the discontinued or revised RCT and identified matching completed RCTs without poor recruitment among the RCTs included in these reviews. Based on extracted data, we explored differences in the design, conduct, and study settings between RCTs with and without poor recruitment, separately for each research question, using semi-structured discussions.
Results: We identified 15 separate research questions with a total of 29 RCTs discontinued or revised for poor recruitment and 48 RCTs completed as planned. Prominent research areas in the sample were cancer and acute care. The mean number of RCTs with poor recruitment per research question was 1.9 (range 1 to 4), suggesting clusters of research questions or settings prone to recruitment problems. The reporting quality of the recruitment process in RCT publications was generally low. We found that RCTs with poor recruitment often had narrower eligibility criteria, were investigator- rather than industry-sponsored, were associated with a higher burden for patients and recruiters, sometimes used outdated control interventions, and were often launched later than RCTs without poor recruitment, at a time when emerging evidence had already compromised the uncertainty about the tested interventions. Whether a multi- or single-center setting was advantageous for patient recruitment seemed to depend on the research context.
Conclusions: Our study confirmed previously identified causes of poor recruitment, i.e., narrow eligibility criteria, investigator sponsorship, and a reduced motivation of patients and recruiters. Newly identified aspects were that researchers need to be aware of all other RCTs on a research question so that compromising effects on recruitment can be minimized, and that a larger number of centers is not always advantageous.


Keywords: randomized controlled trials, early termination of clinical trials, poor recruitment, systematic reviews, qualitative analysis

Background
Evidence-based health care relies on high-quality clinical research. Randomized controlled trials (RCTs) are the method of choice to assess preventive and therapeutic interventions and are a cornerstone in the final phase of drug development and in comparative effectiveness research. Conducting high-quality RCTs, however, is challenging. More than a quarter of RCTs do not reach the planned sample size, mostly due to poor recruitment of patients, and are prematurely discontinued [1,2]. Investigator-initiated RCTs are particularly prone to this problem [1]. The implications of poor patient recruitment and premature discontinuation of RCTs are that up to 70% of such trials remain unpublished, that root causes of recruitment difficulties are not shared with the scientific community and may therefore be repeated in the future, that research questions remain unanswered, and that substantial amounts of scarce research resources are wasted [1,3].
To better understand the causes of poor recruitment, previous quantitative approaches using RCT protocols and registry information [1,2] as well as qualitative analyses of published reports and semi-structured interviews with trialists and other stakeholders in clinical research have already provided important insights [3,4]. For instance, investigator-initiated RCTs and RCTs in the acute care setting were found to be at much higher risk for discontinuation due to poor recruitment than industry-sponsored RCTs and RCTs in non-acute care settings [1,2], and insufficient preparation, overly narrow eligibility criteria, and prejudiced views of recruiters and patients on trial interventions were found to be common reasons for poor recruitment [3]. As suggested by others previously [5], we undertook a systematic comparison of RCTs discontinued or revised for poor recruitment and RCTs completed as planned with the same underlying research question. We aimed to provide additional evidence on potential causes of poor recruitment specifically related to design aspects and context-specific study settings of RCTs.

Methods
This is a matched qualitative study comparing RCTs that did not reach 90% of their originally planned sample size due to poor recruitment (cases; discontinued or revised RCTs with poor recruitment) to RCTs completed as planned (controls without poor recruitment), matched on patient population and experimental intervention. The study is reported according to the Standards for Reporting Qualitative Research (SRQR, http://www.equator-network.org/reporting-guidelines/srqr/) as described in Additional file 1.
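The case definition above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the procedure the authors used: the function name `classify_trial`, its parameters, and the label "excluded" are hypothetical. Only the 90% threshold relative to the originally planned sample size comes from the text.

```python
def classify_trial(planned_n: int, achieved_n: int,
                   poor_recruitment: bool, threshold: float = 0.90) -> str:
    """Label a trial as 'case', 'control', or 'excluded' (hypothetical labels).

    planned_n is the ORIGINALLY planned sample size, so a trial that halved
    its target and then met the revised target still counts as a case.
    """
    if planned_n <= 0:
        raise ValueError("planned_n must be positive")
    # Control: at least 90% of the originally planned sample size was reached.
    if achieved_n / planned_n >= threshold:
        return "control"
    # Below the threshold: a case only if poor recruitment was the reason.
    return "case" if poor_recruitment else "excluded"


print(classify_trial(planned_n=200, achieved_n=100, poor_recruitment=True))
```

A trial that fell short for a reason other than poor recruitment is neither a case nor a control in this sketch, which mirrors the exclusion of such trials from the study sample.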

Identification of discontinued and matching completed RCTs
In a previous study, we included 20 RCTs that were discontinued or revised for poor recruitment with the planned sample size being reported in a publication [1]. For these studies, we aimed to find matching RCTs reaching at least 90% of their originally planned sample size. We searched for systematic reviews that cited the discontinued or revised RCT using the "times cited" function in Web of Science (times cited "view all of the articles that cite this one", https://apps.webofknowledge.com) in January 2016. One reviewer (VG) screened the titles, abstracts, and full texts of potentially eligible systematic reviews for relevance. An eligible systematic review had to describe a literature search conducted in at least one electronic database (e.g., Medline), include RCTs, and have a research question similar to that of the discontinued or revised RCT. If more than one systematic review was eligible, we chose the most up-to-date, comprehensive systematic review (frequently a Cochrane review). If no systematic review was identified, we searched by the same means for similar narrative reviews. When VG was in doubt about the eligibility of a systematic review, she involved a second reviewer (MB) for discussion and a consensus decision.
From each eligible systematic review we retrieved the full-text articles of included RCTs (i.e., potentially eligible matching RCTs) and collected a small set of preliminary data. These included whether the RCT was discontinued or revised for poor recruitment, the planned and actually achieved sample size, the patient population, and the experimental and control interventions. This was done by two methodologically trained investigators (VG, BS) working independently and in duplicate. Any disagreements were resolved by consensus and, if needed, by involving another investigator (MB).
The matching criteria (inclusion criteria) were the tested intervention and included patient population, which needed to be similar enough so that the RCTs could be included in the same meta-analysis. Other trial characteristics such as comparator interventions, outcomes of interest, and trial settings already qualified as factors potentially associated with poor recruitment.

Data collection
The following data were collected from the eventually included RCTs: study design (e.g., superiority or noninferiority trial, factorial or parallel design, allocation ratio), sample size calculation, allocation concealment, blinding, reporting quality of the recruitment process, eligibility criteria, trial sponsor, country and place of patient recruitment, recruitment period, reporting of recruitment networks, support from a clinical trial unit or contract research organization, study population, interventions, comparators, and primary outcome. In addition, we extracted self-reported reasons (if any) for poor recruitment of the discontinued or revised RCTs. Data were collected by one investigator (VG or BS) and checked by another (VG, BS, or MB). Any disagreements were resolved by consensus.

Data analysis
We explored differences in the design, conduct, and study settings between RCTs with and without poor recruitment separately for each research question, and thus in a context-specific manner, using semi-structured discussions (MB, BS, and VG). In a first round, we gathered differences between RCTs with and without poor recruitment based on our extracted data; we systematically went through a prespecified checklist of items potentially associated with poor recruitment, including eligibility criteria, trial sponsor, single versus multiple centers, etc. (Additional file 2), and discussed each item in turn, followed by observations beyond the checklist items (e.g., the chronology of RCTs building up the evidence base for a specific research question). When we reached agreement, the identified differences and potentially relevant observations were captured by VG in free-text form. In a second round, we reflected on the identified differences between RCTs with and without poor recruitment and other relevant observations for each research question, and integrated the relevant information in order to come up with an explanation of why, for a certain research question, one or more RCTs had serious recruitment problems while others did not. In addition, we considered the self-reported reasons for poor recruitment (if any).

Researchers' reflexivity
The three researchers who carried out the analysis have diverse disciplinary backgrounds and training: medicine/clinical epidemiology (MB), nutrition/health technology assessment (VG), and epidemiology/public health (BS). To better understand specific characteristics of oncology trials, we involved a practicing oncologist in the respective discussions. For the other topics, the analysis team members felt that they had sufficient knowledge to assess trial characteristics, and their methodological expertise with clinical trials likely strengthened the analysis. None of us knew any of the included RCTs or their investigators. During the analysis, all researchers worked together as a team and extensively discussed the data interpretation to minimize bias.

Study sample
In a previous study, there were 20 published RCTs discontinued or revised for poor recruitment that reported the planned sample size [1]. For five of these RCTs our literature search could not identify a systematic or narrative review that cited the discontinued or revised RCT (Fig. 1). For the remaining 15 RCTs discontinued or revised due to poor recruitment, representing 15 different research questions, we identified 110 potentially matching RCTs. We excluded 48 RCTs because the planned sample size was not reported (n = 31); the RCT was discontinued for a reason other than poor recruitment (n = 9); the article was not published in English (n = 3); the research question was not similar enough (n = 2); there was no full-text publication available (n = 2); or the same results were published multiple times (n = 1). Of the 62 matching RCTs, 48 were completed as planned and 14 were newly identified RCTs discontinued or revised for poor recruitment. Hence, for 15 separate research questions a total of 77 RCTs (29 with and 48 without poor recruitment) were available for comparative analyses. Twenty-five of the 29 RCTs with poor recruitment explicitly reported that they were discontinued due to recruitment problems; the remaining four RCTs [6-9] had recruitment problems but revised their original target sample sizes during the trial (reduction of the originally planned number of patients in each trial by about 50%) and met the revised targets. References of all included RCTs are provided in Additional file 3.
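The trial counts in this paragraph can be cross-checked with simple arithmetic. The sketch below only re-adds the numbers reported in the text; the variable names are illustrative and not taken from the study.

```python
# Counts as reported in the text; variable names are illustrative only.
identified = 110  # RCTs retrieved from the eligible reviews
exclusions = {
    "planned sample size not reported": 31,
    "discontinued for a reason other than poor recruitment": 9,
    "not published in English": 3,
    "research question not similar enough": 2,
    "no full-text publication available": 2,
    "same results published multiple times": 1,
}

excluded = sum(exclusions.values())   # total exclusions
matching = identified - excluded      # matching RCTs kept
completed = 48                        # completed as planned
new_cases = matching - completed      # newly identified RCTs with poor recruitment
index_cases = 15                      # RCTs from the previous study that had a citing review
cases = index_cases + new_cases       # all RCTs with poor recruitment
total = cases + completed             # all RCTs in the comparative analyses

print(excluded, matching, new_cases, cases, total)  # 48 62 14 29 77
```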

Research questions, recruitment, and reporting quality
Medical areas of the 15 included research questions were cancer research (n = 5), research in acute care (n = 8; including preterm infants [n = 2] and laboring women [n = 1]), surgery (n = 1), and infectious diseases (n = 1). The mean number of RCTs with poor recruitment was 1.9 per research question (range 1 to 4). Table 1 summarizes the recruitment characteristics for the RCTs with and without poor recruitment across all research questions. In general, RCTs with poor recruitment recruited fewer patients per year per recruiting center than RCTs that were completed as planned. More detailed recruitment characteristics for each research question and for all included RCTs, as well as general study characteristics, are provided in Additional files 4, 5, and 6.
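The measure "patients per year per recruiting center" can be computed as follows. This is a hedged sketch: the text does not state the exact formula behind Table 1, so dividing the number of recruited patients by the product of centers and recruitment years is an assumption, and the function name and the example trial are hypothetical. The last lines reproduce the reported mean of 1.9 RCTs with poor recruitment per research question from the reported counts (29 RCTs, 15 questions).

```python
def recruitment_rate(patients: int, centers: int, years: float) -> float:
    """Patients recruited per center per year (assumed definition)."""
    if centers <= 0 or years <= 0:
        raise ValueError("centers and years must be positive")
    return patients / (centers * years)


# Hypothetical trial: 120 patients, 4 centers, 2.5 years of recruitment.
example_rate = recruitment_rate(patients=120, centers=4, years=2.5)
print(example_rate)  # 12.0

# Reported counts: 29 RCTs with poor recruitment across 15 research questions.
mean_cases_per_question = round(29 / 15, 1)
print(mean_cases_per_question)  # 1.9
```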
The recruitment process was generally reported with little detail in the included RCT publications. None of the articles reported who actually recruited patients or the anticipated prevalence of eligible patients. Only 6% (5/77) of the RCTs reported the anticipated recruitment duration, 51% (39/77) reported the location where patients were recruited, 27% (21/77) provided a detailed patient flow, and 90% (69/77) reported the actual recruitment period or duration (Additional file 7). Table 2 summarizes the differences observed between RCTs with and without poor recruitment for each research question as well as our context-specific conclusions on the possible reasons for poor recruitment. The most recurrent theme across research questions was that, in RCTs with poor recruitment, eligibility criteria were substantially narrower than in RCTs without poor recruitment (research questions #1, #2, #3, #4, #9, #12, #13 in Table 2).
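The reporting proportions given above can be recomputed from the stated numerators and the denominator of 77 RCTs. The snippet is a plain arithmetic check; the dictionary keys are shorthand labels chosen for illustration.

```python
TOTAL_RCTS = 77

# Number of RCT publications reporting each item, as stated in the text.
reported = {
    "anticipated recruitment duration": 5,
    "location of recruitment": 39,
    "detailed patient flow": 21,
    "actual recruitment period or duration": 69,
}

# Percentages rounded to whole numbers, as in the text.
percentages = {item: round(100 * n / TOTAL_RCTS) for item, n in reported.items()}

for item, pct in percentages.items():
    print(f"{item}: {pct}% ({reported[item]}/{TOTAL_RCTS})")
```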

Comparison of RCTs with and without poor recruitment
Fig. 1 Selection of included randomized controlled trials. *Five of the original 20 RCTs discontinued for poor recruitment that were published and reported a sample size calculation were excluded, because we did not identify a systematic review or narrative review citing the RCT. **Including four RCTs with recruitment problems that revised their original target sample sizes during the trial (reduction of the originally planned number of patients in each trial by about 50%) and met their revised targets. RCT randomized controlled trial

Table 1 Recruitment characteristics of included randomized controlled trials with and without poor recruitment across research questions

Table 2

[...]
Differences observed:
1) [...] [8] tested available drugs (beta-blocker, amiodarone, and sotalol)
2) Number of study centers: 3 to 4 times lower in the RCT with poor recruitment
3) Eligibility criteria: more restrictive in the RCT with poor recruitment [8] with respect to usage of antiarrhythmic agents prior to enrolment and the implantable cardioverter defibrillator device (only one specific type allowed)
Possible reasons for poor recruitment: The fact that already available antiarrhythmic drugs and no new drugs were tested may have compromised the motivation of recruiters or patients; recruitment capacities might have been insufficient due to too few study centers; and narrower eligibility criteria for patients with ventricular arrhythmia may correlate with overestimated recruitment rates and with slow recruitment.

#5: Prophylactic antibiotics compared to placebo or usual care in acute necrotizing pancreatitis
RCTs with poor recruitment n = 3; RCTs without poor recruitment n = 0
Differences observed: No differences observed because only RCTs with poor recruitment were identified
Possible reasons for poor recruitment: The fact that 3 RCTs were discontinued due to poor recruitment in this research area suggests that it seems generally problematic to recruit patients with acute necrotizing pancreatitis, which is associated with a high mortality rate (vulnerable patients).

[...]
[...] RCTs without poor recruitment n = 1
Differences observed:
1) Patient burden during follow-up: longer follow-up with more complex assessments of the neonates in the RCT with poor recruitment
2) Primary outcome: the RCT with poor recruitment had a more complex primary outcome (neonatal morbidity associated with long-term morbidity and perinatal mortality vs. number of days from randomization to delivery)
3) Number of study centers: the RCT with poor recruitment had over 3 times more study centers than the RCT without poor recruitment
4) Publication chronology/available evidence: the RCT with poor recruitment was still ongoing when the RCT without poor recruitment was published
Possible reasons for poor recruitment: Laboring women in both RCTs had to be at high risk of preterm delivery but also not too advanced so that they could still be randomized; however, the RCT with poor recruitment imposed a higher burden on patients with longer follow-up and a more complex assessment of the neonates, probably reducing the motivation of patients and recruiters to participate; in contrast, the primary outcome (i.e., number of days from randomization to delivery) in the RCT without poor recruitment did not require any effort from patients. A few well-picked study centers may have been advantageous in this acute care setting. The motivation of recruiters and patients was probably further reduced in the trial with poor recruitment when evidence on the effectiveness of the tested intervention (nitroglycerin) from other RCTs became available (compromised equipoise, Smith et al. 2007). Only 2 RCTs were considered in this group of RCTs, because the majority (n = 7) of RCTs included in the same systematic review did not report any sample size calculation/target sample size, potentially hiding recruitment problems in these trials.

#15: Vasopressin-containing regimen compared to epinephrine in cardiac arrest
RCTs with poor recruitment n = 1; RCTs without poor recruitment n = 5
Differences observed: Number of study centers: the RCT with poor recruitment was a multicenter RCT whereas all but one RCT without poor recruitment (Gueugniaud et al. 2008) were single-center RCTs.
Possible reasons for poor recruitment: In this acute care setting with vulnerable patients a single-center RCT may be advantageous (due to higher motivation, better prepared/trained recruiters and trial team, and closer monitoring of recruitment). The RCT with poor recruitment was conducted in 44 community-based emergency rooms. One other RCT without poor recruitment (Gueugniaud et al. 2008) had a similar number of study centers as the discontinued RCT, but the authors reported that the network of centers was well established and experienced in running acute care RCTs.

RCT randomized controlled trial. All of the articles cited in Table 2 are referenced in Additional file 3.

There was no consistent pattern as to whether an international or national multicenter setting or a single-center setting was advantageous for patient recruitment in RCTs. In research question #4, for instance, investigating antiarrhythmic drugs, the RCT with poor recruitment had three to four times fewer study centers than RCTs without poor recruitment; or in research area #2 (metastatic breast cancer therapy) the RCTs with poor recruitment were all restricted to a national setting, while the RCTs without poor recruitment were all done in large international collaborations. On the other hand, single-center RCTs or settings with only a few, carefully chosen study centers may have worked better in settings with particular logistical challenges (e.g., question #8 on primary angioplasty versus onsite thrombolysis and question #15 testing therapies for resuscitation) or the inclusion of particularly vulnerable patients (e.g., research questions #6 and #7 focusing on the recruitment of preterm neonates) in the absence of well-established and experienced trial networks. We observed that investigator-sponsored RCTs are associated with a higher risk for poor recruitment than industry-sponsored RCTs; in research questions #1 and #12 all RCTs with poor recruitment were investigator-sponsored, while all RCTs without poor recruitment were industry-sponsored.
The chronology of RCTs, e.g., when an RCT is launched while other RCTs on the same research question have already been completed, appears to affect recruitment. RCTs with poor recruitment were often initiated and published later, after other RCTs had already been successfully completed (research questions #2, #6, #10, #13, #14); i.e., evidence on the potential benefits and harms of a certain intervention was already available at some point during the conduct of such RCTs. This may have compromised the uncertainty about the tested treatments as an ethical precondition (equipoise) and the motivation of patients and recruiters for further randomization. In research questions #10 and #11 the control interventions were already outdated when the RCTs with poor recruitment were launched. In addition, in some RCTs with poor recruitment the burden for patients, such as numerous or invasive assessments during follow-up (research questions #1, #12, #14), or the burden for recruiters, such as the need to apply a complex scoring system in order to include patients (research question #9), was higher than in corresponding RCTs without poor recruitment. Furthermore, in some cases the tested interventions (already available drugs in RCTs with poor recruitment versus new drugs in RCTs without poor recruitment; research question #4) or the study design with side effects of the experimental drug being the primary outcome (research question #6) were less attractive in RCTs with poor recruitment than in corresponding RCTs without poor recruitment. If the interventions compared in an RCT differed not only in the administered substance or drug but also in the route (e.g., intravenous or oral administration) or timing of application, then patients' preferences could have compromised their willingness to be randomized (research question #11 in Table 2).

Main findings
In this qualitative comparison between RCTs that did not achieve their originally planned sample size due to recruitment problems and RCTs that were completed as planned, we identified several reasons for poor recruitment. We found that RCTs with poor recruitment often had narrower eligibility criteria than RCTs without poor recruitment, were investigator-sponsored rather than industry-sponsored, and were less attractive for patients and recruiters due to higher burden, outdated control interventions, or already existing or accumulating evidence on benefits and harms of interventions from other RCTs. An existing network of study centers experienced in the conduct of RCTs is instrumental for successful recruitment, but whether one or a few study centers or an international multicenter design was more advantageous seemed to depend on the research context. With challenging settings such as acute care, vulnerable patients, or complex logistics, one or a few carefully chosen centers may be preferable due to closer monitoring of recruitment, potentially better prepared and motivated study staff, established procedures and, perhaps most important, efficient communication among the trial team. Prominent research areas in our sample were cancer research and research in acute care with, on average, more than one RCT with poor recruitment per research question suggesting that there are clusters of research areas typically prone to recruitment problems.

Strengths and limitations
The strengths of our study include a new systematic approach to qualitatively compare RCTs discontinued or revised due to poor recruitment and RCTs completed as planned, an approach that had previously been suggested [5] but, to our knowledge, has never been used. We included 77 RCTs from a broad range of settings and research topics and analyzed them specifically in their context, thereby strengthening the applicability of our results.
Our study has several limitations. First, the present qualitative analysis was limited to the information provided in publications of RCTs and did not include information from other sources such as study protocols or interviews with trialists. This might have introduced selection bias because the majority of RCTs discontinued for poor recruitment were not published in a peer-reviewed journal [1]. Second, we were not able to assess reasons for poor recruitment that were not described (e.g., lack of funding, a theme recurrently coming up in an interview study on the topic [4]). Third, we did not comprehensively search the literature for reports of RCTs discontinued or revised due to poor recruitment, but pragmatically started out with a sample of discontinued or revised RCTs identified in a previous study [1], and we used existing systematic and narrative reviews to find matching RCTs. In addition, we excluded 31 RCTs from our analysis because the articles did not report a planned sample size, and therefore we were unable to judge whether the originally planned sample size was achieved or not. Fourth, the reporting of the patient recruitment process in included RCT publications provided little detail (irrespective of whether the RCT struggled with recruitment or not), compromising the effectiveness of our qualitative analysis. In particular, the fact that many articles did not report the number of patients screened for eligibility, the number not meeting eligibility criteria, and the number of patients declining to participate often limited a better understanding of potential recruitment problems. Fifth, one researcher extracted all relevant information from included RCT publications and another checked this information, rather than two researchers extracting relevant data independently and in duplicate. We chose this approach for feasibility reasons, risking a higher rate of extraction errors. However, information directly relevant for our interpretation of reasons underlying recruitment problems was verified by two methodologically trained researchers. Sixth, although our study captured a broad range of clinical areas, 12 of the 15 research questions were related to drug therapy, leaving uncertainty whether our findings are equally applicable to other interventions such as surgery, behavior change, or complex interventions. Finally, although we found evidence for saturation in our qualitative analysis, the size of our study sample was mainly determined by practical issues of our approach.

Comparison with other studies investigating poor recruitment in RCTs
Our results confirm several findings of previous studies using different methods. Investigator-sponsored (in the sense of investigator-initiated) RCTs, for instance, were found to be at higher risk for discontinuation due to poor recruitment than industry-sponsored RCTs by two studies using a quantitative approach with multivariable regression analysis [1,2]. The common interpretation is that industry sponsorship is associated with sufficient funding and better planning and conduct, factors that facilitate successful patient recruitment. Moreover, the acute care setting (e.g., emergency rooms, intensive care units, care for preterm neonates) seems particularly prone to insufficient recruitment [10].
Similar to the present study, a systematic review of published reports of RCTs discontinued due to poor recruitment found that overly narrow eligibility criteria and prejudiced views of recruiters and patients on trial interventions were the most frequent reasons for poor recruitment [3]. Prejudiced views of patients and recruiters may come from different sources. Our study found that RCTs with poor recruitment were often launched relatively late in the sequence of RCTs on the same research question. As evidence accumulates over time, the uncertainty about the benefits and harms of a certain intervention or about the superiority of one intervention over another (equipoise) may become increasingly compromised. Some control interventions were even considered outdated right from the start of an RCT (research questions #10 and #11 in Table 2), which confirms the observation by Habre et al. [11]. In some instances, the route of application of an intervention or the more complex logistics associated with an intervention were less appealing to patients or recruiters. This is also consistent with the finding by Bernardez-Pereira et al. that single-arm clinical trials were less prone to discontinuation due to poor recruitment compared to multiple-arm trials. That is because in single-arm trials all patients receive the same intervention and, thus, are not confronted with the fact that they could be randomized to a different, maybe less preferred, treatment [2].
Another common finding in RCTs with poor recruitment was a high burden or inconvenience for patients or recruiters due to trial procedures (e.g., many follow-up visits, blood draws, lengthy questionnaires or case report forms). This was mentioned previously as a problem in several other qualitative studies [3,12-14].
Our study highlights two aspects of recruitment challenges in RCTs that are, so far, not prominent in the published literature. First, investigators need to be aware of all other RCTs on a research question, their timing, and the accumulating evidence base, so that potentially compromising effects on the recruitment to their own trial can be minimized. Second, apart from the notion that well-established networks of collaborating study centers are an asset for the successful recruitment of patients to RCTs [4], it seems that a larger number of centers is not always advantageous. The performance of a study center typically depends on its commitment toward an RCT and on the enthusiasm, training, and quality of communication of its staff; therefore, challenging settings such as acute care, vulnerable patients, or complex logistics require careful selection and close monitoring of participating centers.
Finally, our study documents the urgent need for more detailed reporting of participant recruitment in RCTs, which is in line with previous reports [3,15,16]. Furthermore, we did not find that poor reporting of the recruitment process was a particular issue of RCTs discontinued or revised for poor recruitment; this indicates that, if such RCTs do get published, the reporting quality of the recruitment process is as poor as in RCTs that were completed as planned. The current Consolidated Standards of Reporting Trials (CONSORT) statement [17] explicitly recommends reporting the number of patients assessed for eligibility, the number of eligible patients, and the number of consenting patients. In the context of RCT discontinuation due to poor recruitment, however, investigators should additionally describe how they projected the number of eligible and consenting patients; whether a pilot study including informed consent was done; whether recruitment was closely monitored; which measures were undertaken to improve recruitment; and the specific root causes for recruitment failure in their case, so that future recruitment failures in that area of research can efficiently be prevented.
Given the magnitude and global presence of the problem, the evidence base on what actually works to improve recruitment in RCTs is still astonishingly thin [18,19]. Based on the numerous analyses of the nature, extent, and causes of recruitment failure, it is time for international collaborative efforts to overcome the problem. Specifically, we need more randomized 'studies within a trial' (SWATs), i.e., promising interventions to improve recruitment need to be empirically evaluated within a host trial as propagated by the Trial Forge initiative, the Medical Research Council's Network of Hubs for Trials Methodology Research in the UK, and the Health Research Board's Trials Methodology Research Network in Ireland [20,21].

Conclusions
This qualitative comparison of RCTs discontinued or revised due to poor recruitment and RCTs completed as planned on the same research question complements previous efforts to identify risk factors for recruitment failure in RCTs and to better understand the underlying mechanisms. Our study confirms previously identified causes such as narrow eligibility criteria, investigator sponsorship, and a high burden of trial procedures for patients and recruiters, but also stresses the importance of considering the accumulating evidence and the timing of other RCTs on the same topic as well as carefully selecting and closely monitoring participating centers for RCTs in challenging settings. More detailed reporting of patient recruitment in RCTs is urgently needed so that recruitment failure can provide lessons for other researchers in the future.