Randomized controlled trials (RCTs), especially ‘pragmatic’ RCTs that measure the effectiveness of interventions in realistic settings, offer an attractive opportunity to obtain information on cost-effectiveness
[1]. In the context of such an RCT, many aspects of treatment, from clinical outcomes to adverse events and costs, are measured at the individual level, and these data can be used to formulate an efficient policy based on cost-effectiveness principles. A growing number of trials incorporate economic endpoints at the design stage, and there are established guidelines for conducting a cost-effectiveness analysis (CEA) alongside an RCT
[2, 3].

The statistic of interest in a CEA is the incremental cost-effectiveness ratio (ICER), which is defined as the difference in cost (∆*C*) between two competing treatments divided by the difference in their health outcome (effectiveness) (∆*E*). With patient-specific cost and health outcomes at hand, estimating the population value of the ICER from an observed sample becomes a classical statistical inference problem. However, given the awkward statistical properties of cost data and some health outcomes such as quality-adjusted life years (QALYs), and issues around parametric inference on ratio statistics, many investigators choose resampling methods for quantifying the sampling variation around costs, health outcomes, and the ICER
[4]. In parallel-arm RCTs, this can be performed by drawing a bootstrap sample within each arm of the trial and calculating the mean cost and effectiveness within each arm from the bootstrap sample; repeating this step many times provides a random sample from the joint distribution of arm-specific cost and effectiveness outcomes. This sample can then be used to make inferences on the ICER, for example by calculating its confidence or credible interval
[5].
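The within-arm bootstrap described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in any of the cited studies: the function name `bootstrap_arm_means`, the synthetic gamma-distributed costs and normally distributed QALYs, the sample sizes, and the number of replicates are all hypothetical. Patients are resampled as whole records (cost and effectiveness together) so that the within-patient correlation between the two outcomes is preserved.

```python
import numpy as np

def bootstrap_arm_means(cost_a, eff_a, cost_b, eff_b, n_boot=2000, seed=0):
    """Resample patients within each arm and return the bootstrap
    replicates of the incremental cost and incremental effectiveness."""
    rng = np.random.default_rng(seed)
    cost_a, eff_a = np.asarray(cost_a, float), np.asarray(eff_a, float)
    cost_b, eff_b = np.asarray(cost_b, float), np.asarray(eff_b, float)
    d_cost = np.empty(n_boot)
    d_eff = np.empty(n_boot)
    for i in range(n_boot):
        # Draw patient indices with replacement, separately for each arm.
        ia = rng.integers(0, len(cost_a), len(cost_a))
        ib = rng.integers(0, len(cost_b), len(cost_b))
        # Difference in arm-specific means for this bootstrap replicate.
        d_cost[i] = cost_b[ib].mean() - cost_a[ia].mean()
        d_eff[i] = eff_b[ib].mean() - eff_a[ia].mean()
    return d_cost, d_eff

# Hypothetical trial data (assumption): 200 patients per arm.
data_rng = np.random.default_rng(42)
cost_a = data_rng.gamma(2.0, 5000.0, 200)   # control arm costs
cost_b = data_rng.gamma(2.0, 6000.0, 200)   # treatment arm costs
eff_a = data_rng.normal(0.70, 0.10, 200)    # control arm QALYs
eff_b = data_rng.normal(0.75, 0.10, 200)    # treatment arm QALYs

d_cost, d_eff = bootstrap_arm_means(cost_a, eff_a, cost_b, eff_b)

# Point estimate of the ICER as the ratio of mean differences.
icer = d_cost.mean() / d_eff.mean()

# Percentile interval of the replicate-level ratio; this is only
# meaningful when the d_eff replicates stay away from zero.
icer_interval = np.percentile(d_cost / d_eff, [2.5, 97.5])
```

The pairs `(d_cost[i], d_eff[i])` are the draws from the joint distribution that the text refers to; plotting them on the cost-effectiveness plane is often more informative than the ratio interval alone, because the ratio becomes unstable when the incremental effectiveness is close to zero.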

Recently, such a framework for evaluating the cost and outcomes of health technologies has received some criticism
[6–8]. Specifically, critics argue that decisions on the cost-effectiveness of competing treatments should be based on all the available evidence, not just that obtained from a single RCT
[8]. In this context, evidence synthesis is the practice of combining multiple sources of evidence (from other RCTs, expert opinion, and case histories) to inform the treatment decision, a task that is performed quantitatively using Bayes’ rule
[9].
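As a reminder of how Bayes’ rule combines the two sources of information, it can be written in generic form as below. The symbols are our own notation, introduced here only for illustration: $\theta$ stands for the unknown quantities of interest, $D$ for the data from the current RCT, and $p(\theta)$ for the prior distribution that summarizes the external evidence.

$$
p(\theta \mid D) \;=\; \frac{p(D \mid \theta)\, p(\theta)}{\int p(D \mid \theta')\, p(\theta')\, d\theta'} \;\propto\; p(D \mid \theta)\, p(\theta)
$$

The posterior $p(\theta \mid D)$ therefore weights each candidate value of $\theta$ by both its compatibility with the trial data and its support in the external evidence.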

A conventional analysis of a clinical trial often involves making inference primarily on the effect size and secondarily on other aspects of treatment such as safety or compliance. These measures are conceptually distinct enough to be analyzed and reported separately, and trialists have a full arsenal of standard statistical methods at their disposal for such analyses. Evidence synthesis is often conducted separately, usually through quantitative meta-analysis, after the results of several studies are available. An economist, on the other hand, does not have the luxury of dissecting RCT results into different components, as cost-effectiveness is a function of all aspects of an intervention. As such, evidence external to the trial on any aspect of treatment has a bearing on the results of the CEA. In addition, when an RCT is used as a vehicle for the CEA, the incorporation of external evidence must be part of the analysis. Results of a CEA have direct policy implications and the economist cannot defer evidence synthesis to any subsequent stage
[8].

For trial-based CEAs, if external evidence on cost or effectiveness is available then the investigator can use standard parametric Bayesian methods to combine this information with trial results
[9]. This has been the dominant paradigm in the Bayesian analysis of RCT-based CEAs
[10–14]. However, prior information on cost and typical effectiveness outcomes such as QALYs is rarely available and, when it is, it is often inappropriate to transfer it to other settings
[15, 16]. This is because such outcomes are, to a large extent, affected by the specific settings in the jurisdiction in which they are measured (such as unit prices for medical resources). On the other hand, evidence on the aspects of the intervention that relate to the pathophysiology of the underlying health condition and the biologic impact of treatment, such as the effect size of treatment or the rate of adverse events, is less affected by specific settings and is therefore more transferable
[17]. This puts the investigator in a difficult situation for an RCT-based CEA: inference is made directly on cost and effectiveness using the observed sample, but external evidence is available on other aspects of treatment. One way to overcome this challenge is to create a parametric model that connects cost-effectiveness outcomes with parameters for which external evidence is available, and to use Bayesian analysis, for example through Markov chain Monte Carlo (MCMC) sampling techniques
[18]. But such a model must connect several parameters through link functions, regression equations, and error terms. This involves a multitude of parametric assumptions and there is always the danger of model misspecification
[19, 20]. In addition, even with the advent of generic statistical software for Bayesian analysis, implementing such a model and carrying out comprehensive model diagnostics is not an easy undertaking. For an investigator who uses resampling methods for the CEA and wishes to incorporate external evidence in the analysis, this paradigm shift to parametric modeling can be a challenge.

In this proof-of-concept study, we propose and illustrate simple modifications of the bootstrap approach for RCT-based CEAs that enable Bayesian evidence synthesis. Our proposed method requires a parametric specification of the external evidence while avoiding parametric assumptions on the cost-effectiveness outcomes and their relation with the external evidence. The remainder of the paper is structured as follows: after outlining the context, a Bayesian interpretation of the bootstrap is presented. Next, the theory behind incorporating external evidence into such a sampling scheme is explained. A case study featuring a real-world RCT is used to demonstrate the applicability and face validity of the proposed method. A discussion of the various aspects of the new method, and of its strengths and weaknesses compared to parametric approaches, concludes the paper.