# Bayesian accrual prediction for interim review of clinical studies: open source R package and smartphone application

Yu Jiang^{1, 2}, Peter Guarino^{2, 3, 4}, Shuangge Ma^{2, 4}, Steve Simon^{5, 6}, Matthew S. Mayo^{7, 8}, Rama Raghavan^{7} and Byron J. Gajewski^{7}

**Received:** 6 November 2015

**Accepted:** 21 June 2016

**Published:** 22 July 2016

## Abstract

### Background

Subject recruitment for medical research is challenging. Slow patient accrual leads to increased costs and delays in treatment advances. Researchers need reliable tools to manage and predict the accrual rate. The previously developed Bayesian method integrates researchers’ experience on former trials and data from an ongoing study, providing a reliable prediction of accrual rate for clinical studies.

### Methods

In this paper, we present a user-friendly graphical user interface program developed in R. A closed-form solution for the total number of subjects that can be recruited within a fixed time is derived. We also present a Java-based implementation of the method for web browsers and Android mobile devices.

### Results

Using the accrual software, we re-evaluated the Veterans Affairs Cooperative Studies Program 558 (ROBOTICS) study. The application of the software in monitoring and management of recruitment is illustrated for different stages of the trial.

### Conclusions

The accrual software provides a convenient platform for estimation and prediction of the accrual process.

## Background

Subject recruitment is critical and often very challenging in clinical research studies. Investigators frequently overestimate the pool of available subjects and underestimate the time needed to achieve the proposed sample size for their studies. This is known as Lasagna’s law [1] and as Muench’s third law [2]. Studies have shown that more than 80 % of clinical trial studies ran longer than their original accrual time goals [3]. A delay in study participant recruitment or an insufficient sample size can have serious deleterious consequences. Extending the recruitment time frame leads to increased costs, while a delay in study completion can lower the scientific impact or relevance. If the proposed sample size is not achieved, the study may be seriously underpowered. Therefore, it is important for researchers to monitor the accrual rate closely throughout the enrollment period of a study.

A number of studies have been published to model and predict the patient accrual process. Both Barnard et al. [4] and Zhang and Long [5] recently reviewed prediction methods. The available accrual models include unconditional and conditional models, Poisson-based models, Brownian-motion-based models, and Bayesian models [4, 5]. Both reviews addressed the Bayesian methods developed by Gajewski et al. [6]. The Bayesian approach can utilize researchers’ previous experience from similar studies or clinical opinion and incorporate it as prior knowledge. When actual accrual data are available, the predictive distribution of the accrual data becomes a weighted average of the prior distribution and the observed data. As more data are collected, the weight of the observed data increases while the weight of the prior information decreases [6]. Such an approach provides an effective assessment of the accrual process.

Developing statistical methods is always important; however, providing reliable tools should also be a priority [7]. Typically, a clinical trial study team consists of clinicians, biostatisticians, project managers, software programmers, etc. Apart from biostatisticians, most team members might not be familiar with the Bayesian framework, and it is challenging for them to use the algorithm for accrual prediction directly. Therefore, a simple and easy-to-use interface is in demand. One of the most common and convenient approaches is to implement the Bayesian computation in R, which is a free software environment for statistical analysis [8].

Although R and its packages are very popular in the area of statistics, many clinicians and project managers might not be familiar with them. A web calculator plus a smartphone application will make the tool more user-friendly and more convenient.

In this study, we translate our prediction model into an R *accrual* package. To make it easier to use, we applied the graphical user interface options in R, which only require researchers to input the original design information and the updated accrual data using simple point-and-click methods [8, 9]. The *accrual* package has been published on the Comprehensive R Archive Network (CRAN) and is ready to be used by both statisticians and clinical researchers in evaluating and monitoring their subject accrual. We also derived a closed-form solution for the posterior prediction of the remaining subjects that can be recruited within a fixed time period, where the subject count is modeled as negative binomial. The credible interval for patient recruitment can then be calculated using a normal approximation to the negative binomial. In a recent study [10], we also derived a closed-form solution for the posterior prediction of accrual time, which follows an inverse beta distribution. Based on the closed form, we developed a web browser version and an Android version of the accrual calculator, which can be easily installed and used by a smartphone user.

The potential use of the accrual software for management and evaluation of the recruitment is discussed using the existing clinical studies conducted by the Department of Veterans Affairs Cooperative Studies Program Coordinating Center at West Haven, Connecticut [11, 12]. The application of the software is discussed in three aspects: (1) initial planning of the study, (2) interim review of the recruitment progress, and (3) evaluation of the site performance.

## Methods

### The general method and the closed form

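Suppose that an investigator plans a clinical trial with a target sample size of *n* subjects. Based on previous trial experience and the potential available patient population, the investigator plans to finish recruitment in *T* days. Suppose that the trial starts at time \( t_0 \), and that new patients enter the study sequentially, at \( t_1, t_2, \dots, t_m, \dots \) Then the waiting time for each successive patient is calculated as \( w_i = t_i - t_{i-1} \). The waiting time \( w_i \) is assumed to follow an exponential distribution, that is

\[ w_i \mid \theta \sim \operatorname{Exp}(1/\theta), \qquad f(w_i \mid \theta) = \frac{1}{\theta}\, e^{-w_i/\theta}, \]

where *θ* represents the average waiting time for each subject. To apply the Bayesian constant accrual model [6], we assume that the prior distribution of *θ* is inverse gamma, that is

\[ \theta \sim \operatorname{IG}(nP,\; TP), \]

where *P* is defined as the investigator’s confidence of finishing the trial in the original planned time, measured on a 0–1 scale [6]. In the process of a trial, suppose that *m* subjects have been recruited within time \( T_m \) (\( T_m = \sum_{i=1}^{m} w_i \)). Then the posterior distribution for *θ* is

\[ \theta \mid w_1, \dots, w_m \sim \operatorname{IG}(nP + m,\; TP + T_m). \]

For a fixed time *T*, let *η* denote the number of additional subjects that can be recruited after time \( T_m \). Given *θ*, *η* is Poisson with mean \( (T - T_m)/\theta \), and the posterior predictive distribution of *η* is

\[ \Pr(\eta = k \mid w_1, \dots, w_m) = \frac{\Gamma(r + k)}{k!\,\Gamma(r)}\; p^{r} (1 - p)^{k}, \qquad k = 0, 1, 2, \dots, \]

with \( r = nP + m \) and \( p = \frac{TP + T_m}{TP + T} \).

This formula shows that the posterior predictive distribution of *η* is negative binomial, NB(*r*, *p*).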
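The parameter update behind this result can be sketched in a few lines of Python. This is a minimal illustration, not the code of the *accrual* package: it assumes the inverse gamma prior IG(*nP*, *TP*) of the constant accrual model [6], which gives *r* = *nP* + *m* and *p* = (*TP* + \( T_m \))/(*TP* + *T*), and the names `AccrualPredictive` and `predictive_params` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccrualPredictive:
    """NB(r, p) predictive distribution of the subjects still to be recruited."""
    r: float  # shape parameter, n*P + m
    p: float  # probability parameter, (T*P + T_m) / (T*P + T)

    @property
    def mean(self) -> float:
        # Mean of NB(r, p) on the counts 0, 1, 2, ...
        return self.r * (1.0 - self.p) / self.p

    @property
    def variance(self) -> float:
        # Variance of NB(r, p)
        return self.r * (1.0 - self.p) / self.p ** 2


def predictive_params(n, T, P, m, T_m):
    """Combine the design (n, T, P) with interim data (m subjects by time T_m)."""
    if not 0.0 < P <= 1.0:
        raise ValueError("P must lie in (0, 1]")
    if not 0.0 <= T_m < T:
        raise ValueError("T_m must satisfy 0 <= T_m < T")
    r = n * P + m
    p = (T * P + T_m) / (T * P + T)
    return AccrualPredictive(r=r, p=p)


# Planning stage (no data yet): the prediction is centred on the target n.
prior = predictive_params(n=158, T=24.0, P=0.5, m=0, T_m=0.0)
print(prior.r, prior.p, prior.mean)

# Hypothetical interim look: 60 subjects after 12 of 24 months.
interim = predictive_params(n=158, T=24.0, P=0.5, m=60, T_m=12.0)
print(60 + interim.mean)  # expected total accrual by time T
```

With no data, the predictive mean equals the planned sample size; once data arrive, the observed counts and times enter *r* and *p* additively, which is what makes the closed form cheap to evaluate.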

As discussed by Jiang et al. [10], the closed form of the time frame of accrual follows an inverse beta distribution. We can use a normal approximation (Additional file 1), which greatly speeds up the calculation. The normal approximation works well if *r* is large and *p* is neither too small nor too large. To meet this requirement, we recommend that at the very beginning of the trial, when *m* and \( T_m \) are zero or small, the prior *P* should be relatively large (e.g., 0.5). After the trial starts, for example, when \( {T}_m=\frac{T}{2} \), *p* will be between 0.5 and 0.75, and the value of the prior *P* will have almost no effect on the normal approximation.
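A quick way to judge the approximation is to compare it with quantiles obtained by summing the exact negative binomial probabilities. The sketch below does this with the Python standard library only; it is an illustration under the assumed parameterization *r* = *nP* + *m*, *p* = (*TP* + \( T_m \))/(*TP* + *T*), and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def nb_log_pmf(k, r, p):
    """Log of Pr(eta = k) for eta ~ NB(r, p), k = 0, 1, 2, ..."""
    return (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def nb_quantile(q, r, p, k_max=1_000_000):
    """Smallest k with Pr(eta <= k) >= q, found by summing the exact pmf."""
    cumulative = 0.0
    for k in range(k_max):
        cumulative += math.exp(nb_log_pmf(k, r, p))
        if cumulative >= q:
            return k
    raise RuntimeError("quantile not reached; increase k_max")


def normal_interval(r, p, level=0.95):
    """Central interval from a normal distribution matched to the NB mean and SD."""
    mean = r * (1.0 - p) / p
    sd = math.sqrt(r * (1.0 - p)) / p
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * sd, mean + z * sd


# Planning-stage parameters for n = 158, T = 24, P = 0.5: r = 79, p = 1/3.
r, p = 79.0, 1.0 / 3.0
print("normal approximation:", normal_interval(r, p))
print("exact quantiles:     ", nb_quantile(0.025, r, p), nb_quantile(0.975, r, p))
```

Because the negative binomial is right-skewed, the symmetric normal interval tends to sit slightly to the left of the exact one; the gap shrinks as *r* grows.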

The derivation of the closed form also makes it possible to implement the Bayesian constant accrual model in Java, which lacks built-in sampling algorithms.

### R accrual package, web-based calculator and smartphone applications

The *accrual 1.1* package requires R (>= 2.8.0) and is available for download from the Comprehensive R Archive Network. It includes an example data set, three major functions (described next), and a graphical user interface that provides menu-driven access to these functions in R (Fig. 2). The three major R functions are *accrual.n*, *accrual.T* and *accrual.plots*. The function *accrual.n* predicts the number of patients to be recruited in a fixed time. The function *accrual.T* predicts the time to reach the targeted sample size. The function *accrual.plots* provides a panel of plots for data diagnostics. The *accrual.gui* provides an interface for users to choose any of the three options as needed. The supplementary document provides a full manual on how to use the R package and how to interpret the results, with examples.

As discussed here and in Jiang et al. [10], the closed form of the time frame of accrual is inverse beta, and the closed-form solution for the remaining subjects that can be recruited in a fixed time is negative binomial. Using a normal approximation for both distributions greatly speeds up the calculation.
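For the time prediction, a normal approximation only needs the first two moments of the closed form. The sketch below assumes that the "inverse beta" distribution is the beta prime distribution and that the posterior is IG(*nP* + *m*, *TP* + \( T_m \)); under those assumptions the remaining time to recruit *n* − *m* subjects is (*TP* + \( T_m \)) times a beta prime variable with parameters (*n* − *m*, *nP* + *m*). The function names are hypothetical and are not those of the *accrual* package.

```python
import math
from statistics import NormalDist


def remaining_time_moments(n, T, P, m, T_m):
    """Mean and SD of the time still needed to recruit the remaining n - m subjects."""
    a = n * P + m        # posterior shape
    b = T * P + T_m      # posterior scale
    k = n - m            # subjects still to recruit
    if a <= 2:
        raise ValueError("the variance requires n*P + m > 2")
    # Moments of b * BetaPrime(k, a)
    mean = b * k / (a - 1.0)
    var = b ** 2 * k * (k + a - 1.0) / ((a - 2.0) * (a - 1.0) ** 2)
    return mean, math.sqrt(var)


def total_time_interval(n, T, P, m, T_m, level=0.95):
    """Normal-approximation interval for the total time to reach n subjects."""
    mean, sd = remaining_time_moments(n, T, P, m, T_m)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return T_m + mean - z * sd, T_m + mean + z * sd


# Planning stage for a design of 158 subjects in 24 months with P = 0.5.
print(remaining_time_moments(158, 24.0, 0.5, 0, 0.0))
print(total_time_interval(158, 24.0, 0.5, 0, 0.0))
```

As with the subject count, the exact distribution is right-skewed, so the upper limit of the symmetric normal interval may be somewhat optimistic when *nP* + *m* is small.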

With the development and widespread use of smartphones, an accrual smartphone application can make the tool more user-friendly and convenient for clinical researchers than an R package or a web-based application. Using the closed-form solution for the Bayesian accrual model and the normal approximation, the methods are implemented in an Android application using Java. Figure 3b shows the use of an Android phone in the process of accrual monitoring.

## Results

Here we provide specific examples to illustrate how the accrual software can be used in the evaluation or management of a clinical trial. The ROBOTICS study was a randomized, multicenter, clinical trial to assess robot-assisted therapy for neuro-rehabilitation in chronic stroke patients [11]. The prediction of patient recruitment can be used at different stages of the study or for different purposes, such as study planning, interim review of the recruitment progress, and evaluation of the site performance.

### Initial planning of the study

*P* = 0.5. The graph and output can be used as a reference and guide for the investigator to plan the study, especially to estimate study budgets. As the total accrual in 24 months could be as low as 118, and the time to reach the full sample size could be as long as 31.7 months, the investigator should make alternative plans in case these extreme situations happen.

### Interim review of the recruitment progress

*P* = 0.5. The figure clearly shows that, starting from month 13, it is highly probable that the study will not be able to recruit 158 subjects within the study time frame given the current recruitment pace. The investigator should consider strategies to increase the accrual rate, such as adding one more study center or changing the study protocol. The final recruitment ended up as 126 in 24 months, which was within the credible interval of all predictions shown in Fig. 5.

Overall, if the predicted accrual rate is so slow that it threatens the ability to achieve the proposed sample size and increases the trial duration of a study, this objective assessment will allow the study sponsor or data monitoring committees [3] to suggest mid-course corrections in the trial, such as adding additional centers to a multicenter study, hiring additional study coordinators to broaden the search for volunteers, or updating the inclusion or exclusion criteria. Conversely, the method can also prevent a researcher from overreacting to slow accrual at the beginning of the study. If the accrual is faster than planned, the prediction model can provide an estimated closure date to avoid unnecessary patient recruitment.
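An interim review of this kind can be summarized by a single number: the predictive probability of reaching the target by the planned end date. The sketch below computes it from the exact negative binomial tail, assuming *r* = *nP* + *m* and *p* = (*TP* + \( T_m \))/(*TP* + *T*). The design values (158 subjects in 24 months, *P* = 0.5) come from the ROBOTICS example, but the interim counts are made up for illustration, and the function names are hypothetical.

```python
import math


def nb_cdf(k, r, p):
    """Pr(eta <= k) for eta ~ NB(r, p), by direct summation of the pmf."""
    if k < 0:
        return 0.0
    log_const = r * math.log(p) - math.lgamma(r)
    total = 0.0
    for j in range(k + 1):
        total += math.exp(log_const + math.lgamma(r + j) - math.lgamma(j + 1)
                          + j * math.log1p(-p))
    return min(total, 1.0)


def prob_reach_target(n, T, P, m, T_m):
    """Predictive probability that at least n subjects are recruited by time T."""
    if not 0.0 <= T_m < T:
        raise ValueError("T_m must satisfy 0 <= T_m < T")
    r = n * P + m
    p = (T * P + T_m) / (T * P + T)
    # The target is met when the additional accrual eta is at least n - m.
    return 1.0 - nb_cdf(n - m - 1, r, p)


# Hypothetical interim counts at month 13 of a 24-month, 158-subject design.
for m in (60, 86):
    print(m, round(prob_reach_target(158, 24.0, 0.5, m, 13.0), 4))
```

A probability near zero supports a mid-course correction, while a value near one half suggests the trial is roughly on schedule and that early alarm would be an overreaction.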

### Evaluation of the site performance

The accrual software is designed for single-site recruitment. However, most clinical studies are currently conducted at several sites. In the case of ROBOTICS, the study was conducted at four sites. The potential patient population and the experience of the site investigators often differ between participating sites. In the monitoring and management of recruitment in a multicenter trial, it is critical to evaluate the performance of each participating site separately. Sites that are significantly slow or inefficient in enrollment should receive focused attention to improve the situation, and if no improvement is made during a ‘probation’ period, the prudent response can be to end that site’s participation in the study.

The accrual software also applies to larger studies. Figures 6c and 6d display the recruitment process of another clinical trial (TEAM-AD), a double-blind, placebo-controlled, randomized clinical trial to assess the efficacy of α-tocopherol and memantine in Alzheimer’s disease [12]. The target sample size for TEAM-AD was originally 840 patients, to be recruited from 14 sites over three years. At 12 months and 24 months, about 10 sites do not seem to meet the recruitment target when using traditional methods (red reference line). The results from the Bayesian software show that the majority of sites perform within the 95 % confidence region, with approximately six sites well below expectation. As in the ROBOTICS study, the Bayesian method is more reasonable in evaluating site performance because it uses both expected recruitment and current accrual data. Overall, the accrual software helps to recognize low performance. Early monitoring and identification of low-performing sites will help study oversight committees make well-informed decisions and work out efficient strategies to improve the enrollment of each site, which in turn improves the management of recruitment for the entire study.
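One way to make such a site comparison concrete is to flag sites whose interim count falls below the lower limit of a predictive band around the planned accrual. The sketch below is an assumption about how this could be done, not a description of the authors' software: it treats each site's count at interim time *t* as NB(*r*, *p*) with *r* = \( n_s P \) and *p* = *TP*/(*TP* + *t*), where \( n_s \) is the site target, and uses a normal approximation. The site counts and all function names are hypothetical; only the design values (840 patients, 14 sites, 36 months) come from the TEAM-AD description.

```python
import math
from statistics import NormalDist


def site_lower_bound(n_site, T, P, t, level=0.95):
    """Lower limit of the prior predictive band for one site's count at time t."""
    r = n_site * P
    p = T * P / (T * P + t)
    mean = r * (1.0 - p) / p              # equals n_site * t / T
    sd = math.sqrt(r * (1.0 - p)) / p
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return max(0.0, mean - z * sd)


def flag_slow_sites(counts, n_site, T, P, t):
    """Names of sites whose observed count is below the predictive lower limit."""
    bound = site_lower_bound(n_site, T, P, t)
    return sorted(site for site, m in counts.items() if m < bound)


# Interim counts at month 12 are invented for illustration only.
hypothetical_counts = {"site_A": 22, "site_B": 7, "site_C": 15}
slow = flag_slow_sites(hypothetical_counts, n_site=840 / 14, T=36.0, P=0.5, t=12.0)
print(slow)
```

Compared with a single reference line at the planned count, a band of this kind only singles out sites whose shortfall is larger than the uncertainty expected from the design.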

## Discussion

Development of software can be as important as the development of novel statistical methods [7]. In this paper, we evaluated the completed ROBOTICS and TEAM-AD clinical trials using the proposed software and demonstrated that the recruitment tool can be used at all stages of a study.

The R *accrual* package and both the web-based and the smartphone applications for patient accrual are based on the assumption that the accrual rate is constant. Based on our previous findings [13], the assumption of constant accrual is reasonable and generally holds, especially for small single-site clinical trials conducted in an academic research institute. The performance of the constant accrual model has also been evaluated via real clinical studies and extensive simulation studies [10]. The general finding is that when the accrual is on target, a strong prior (a relatively larger *P*, such as *P* = 0.5) performs better. When the accrual is off target, a weak prior (a smaller *P*, such as *P* = 0.01) works better in prediction. The constant accrual model performs well when the accrual rate varies slightly, for example, by being slower at the beginning of a trial. Even in some situations where the model does not predict the completion time correctly, it can still recognize early that an accrual is off target.

In the constant accrual model, specification of the prior, *P*, is critical. In general, if the investigators have rich experience in the study and follow the definition of *P* correctly to choose a reasonable value, the model prediction will benefit. As illustrated in the simulation study, when the accrual is on target and the investigators have higher confidence, they tend to choose a larger *P*, which will lead to better prediction. When the accrual is slow, the investigators lack confidence and are more likely to choose a smaller *P*, which will also lead to a better prediction. In situations where users have a hard time specifying *P*, we suggest *P* = 0.1 as a reasonable value to start with.
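The influence of *P* can be made explicit with a short worked equation. This is a sketch that assumes the inverse gamma prior IG(*nP*, *TP*) of the constant accrual model [6]; the weight *ω* is notation introduced here for illustration. After *m* subjects have been recruited in time \( T_m \), the posterior mean of the accrual rate 1/*θ* is

\[ \frac{nP + m}{TP + T_m} = \omega\,\frac{n}{T} + (1 - \omega)\,\frac{m}{T_m}, \qquad \omega = \frac{TP}{TP + T_m}, \]

that is, a weighted average of the planned rate *n*/*T* and the observed rate *m*/\( T_m \). Halfway through the planned period (\( T_m = T/2 \)) the weight on the plan is \( \omega = P/(P + 0.5) \), which is 0.5 for *P* = 0.5, about 0.17 for *P* = 0.1, and about 0.02 for *P* = 0.01. A large *P* therefore keeps the prediction close to the plan, which helps when accrual is on target, while a small *P* lets the observed data dominate quickly, which helps when accrual is off target.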

Overall, the software is robust when the assumption is only slightly violated [10, 13]. However, it should still be used with caution, and it is worth examining the data distribution to check whether the assumptions are violated. In the current software, we only have diagnostic plots for the data distribution. In the future, we will add more model diagnostic methods, such as testing the assumption of independence and sensitivity analysis [13]. We assume that the waiting time is distributed exponentially. A more general approach in the future is to model the waiting time using a Weibull distribution. For studies with clearly non-constant accrual, we are working on piecewise linear regression models, where the accrual is divided into a certain number of stages. It has been shown that the choice of the prior, *P*, is critical in the prediction of accrual. To avoid poor choices of *P*, we have introduced adaptive priors, an accelerated prior and a hedging prior, which are shown to be more robust than a subjective prior when the constant accrual assumption is violated. The accrual software with these new priors will be available soon. In addition, the current accrual software is only designed for single-site clinical studies or the review of a single site. An upgraded version of the accrual software, which includes prediction for multisite recruitment, is under development.

## Conclusions

In summary, we present an R *accrual* package and a Java-based implementation for web browsers and Android mobile devices, both based on a previously developed Bayesian constant accrual model. The accrual software provides a convenient platform for researchers in the evaluation of the accrual process in clinical studies. We use specific examples to illustrate how the software can be used, as well as its strengths and limitations. Future planned work will involve continued assessment of our assumptions. We also plan to build new models for both single-site and multisite clinical trials and to translate these methods into easy-to-use software for monitoring clinical trials.

## Abbreviations

ROBOTICS, robot-assisted upper limb neuro-rehabilitation; TEAM-AD, the trial of vitamin E and memantine in Alzheimer’s disease; VA, Veterans Affairs

## Declarations

### Acknowledgements

We sincerely thank two reviewers for providing helpful comments on earlier drafts of the manuscript. We thank Dr. Lili Garrard and Dr. Yang Lei for their comments and review of this manuscript.

### Funding

This work was supported by the National Institutes of Health (grant number P30 CA168524) and the Cooperative Studies Program of the Department of Veterans Affairs, Office of Research and Development.

### Authors’ contributions

YJ, SS, MSM, and BJG worked on the method design. YJ, BJG, and RR worked on the R package. YJ, PG, SM, and BJG prepared the manuscript. All authors read and approved the final manuscript.

### Competing interests

The authors declare that they have no competing interests.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

## References

- Lasagna L. Problems in publication of clinical trial methodology. Clin Pharmacol Ther. 1979;25(5 Pt 2):751–3.
- Bearman JE, Loewenson RB, Gullen WH. Muench’s postulates, laws and corollaries, or biometrician’s views on clinical studies (Biometric Note 4). Bethesda, MD: Office of Biometry and Epidemiology, National Eye Institute, National Institutes of Health; 1974.
- van der Wouden JC, Blankenstein AH, Huibers MJ, van der Windt DA, Stalman WA, Verhagen AP. Survey among 78 studies showed that Lasagna’s law holds in Dutch primary care research. J Clin Epidemiol. 2007;60:819–24.
- Barnard KD, Dent L, Cook A. A systematic review of models to predict recruitment to multicentre clinical trials. BMC Med Res Methodol. 2010;10:63.
- Zhang X, Long Q. Modeling and prediction of subject accrual and event times in clinical trials: a systematic review. Clin Trials. 2012;9:681–8.
- Gajewski BJ, Simon SD, Carlson SE. Predicting accrual in clinical trials with Bayesian posterior predictive distributions. Stat Med. 2008;27:2328–40.
- Chambers J. Software for data analysis: programming with R. Berlin: Springer; 2008.
- The R Project for Statistical Computing. http://www.R-project.org/. Accessed 18 Dec 2014.
- Hoffmann TJ, Laird NM. fgui: a method for automatically creating graphical user interfaces for command-line R packages. J Stat Softw. 2009;30:1–14.
- Jiang Y, Simon S, Mayo MS, Gajewski BJ. Modeling and validating Bayesian accrual models on clinical data and simulations using adaptive priors. Stat Med. 2015;34:613–29.
- Lo AC, Guarino PD, Richards LG, Haselkorn JK, Wittenberg GF, Federman DG, et al. Robot-assisted therapy for long-term upper-limb impairment after stroke. N Engl J Med. 2010;362:1772–83.
- Dysken MW, Sano M, Asthana S, Vertrees JE, Pallaki M, Llorente M, et al. Effect of vitamin E and memantine on functional decline in Alzheimer disease: the TEAM-AD VA cooperative randomized trial. JAMA. 2014;311:33–44.
- Gajewski BJ, Simon SD, Carlson SE. On the existence of constant accrual rates in clinical trials and direction for future research. Int J Stat Probab. 2012;1:43–6.