 Methodology
Understanding the cluster randomised crossover design: a graphical illustration of the components of variation and a sample size tutorial
Trials volume 18, Article number: 381 (2017)
Abstract
Background
In a cluster randomised crossover (CRXO) design, a sequence of interventions is assigned to a group, or ‘cluster’, of individuals. Each cluster receives each intervention in a separate period of time, forming ‘cluster-periods’. Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design. Formulae are available for the two-period, two-intervention, cross-sectional CRXO design; however, implementation of these formulae is known to be suboptimal. The aims of this tutorial are to illustrate the intuition behind the design and to provide guidance on performing sample size calculations.
Methods
Graphical illustrations are used to describe the effect of the cluster randomisation and crossover aspects of the design on the correlation between individual responses in a CRXO trial. Sample size calculations for binary and continuous outcomes are illustrated using parameters estimated from the Australia and New Zealand Intensive Care Society – Adult Patient Database (ANZICS-APD) for patient mortality and length of stay (LOS).
Results
The similarity between individual responses in a CRXO trial can be understood in terms of three components of variation: variation in cluster mean response; variation in the cluster-period mean response; and variation between individual responses within a cluster-period; or equivalently in terms of the correlation between individual responses in the same cluster-period (within-cluster within-period correlation, WPC), and between individual responses in the same cluster, but in different periods (within-cluster between-period correlation, BPC).
The BPC lies between zero and the WPC. When the WPC and BPC are equal, the precision gained by the crossover aspect of the CRXO design equals the precision lost by cluster randomisation. When the BPC is zero, there is no advantage in a CRXO over a parallel-group cluster randomised trial. Sample size calculations illustrate that small changes in the specification of the WPC or BPC can increase the required number of clusters.
Conclusions
By illustrating how the parameters required for sample size calculations arise from the CRXO design and by providing guidance on both how to choose values for the parameters and perform the sample size calculations, the implementation of the sample size formulae for CRXO trials may improve.
Background
Individually randomised trials are considered the ‘gold standard’ for evaluating medical interventions [1]. However, situations arise where it is necessary, or preferable, to randomise clusters of individuals, such as hospitals or schools, rather than the individual patients or students, to interventions [2, 3]. A cluster randomised trial will generally require a larger sample size compared with an individually randomised trial to estimate the intervention effect to the same precision [4].
In a two-period, two-intervention, cluster randomised crossover (CRXO) design, each cluster receives each of the two interventions in a separate period of time, leading to the formation of two ‘cluster-periods’. In a cross-sectional design, each cluster-period consists of different individuals, while in a cohort design, each cluster-period consists of the same individuals. The order in which the interventions are delivered to each cluster is randomised to control for potential period effects [5, 6]. As with a crossover in an individually randomised trial, this adaptation has the benefit of reducing the required number of participants [7]. The key to understanding the CRXO design is to recognise how both the cluster randomisation and crossover aspects of the design lead to variation between individual responses in a trial; and how these aspects of the design give rise to similarities in the responses of groups of individuals.
Sample size formulae have been published for the two-period, two-intervention, cross-sectional CRXO design [8,9,10]. These formulae require a priori specification of two correlations: the similarity between two individuals in the same cluster-period, typically measured by the within-cluster within-period correlation (WPC); and the similarity between two individuals in the same cluster, but in different cluster-periods, typically measured by the within-cluster between-period correlation (BPC). However, there is little guidance for informing the value of the BPC, nor on the sensitivity of the sample size to the chosen values of both correlations [11, 12].
A 2015 systematic review of CRXO trials found that both the cluster randomisation and crossover aspects of the CRXO design were appropriately accounted for in only 10% of sample size calculations and 10% of analyses [13]. This suggests that the CRXO design is not well understood.
The aims of this tutorial are to illustrate the intuition behind the CRXO design; and to provide guidance on how to specify the WPC and BPC a priori and how to perform sample size calculations for two-period, two-intervention, cross-sectional CRXO trials.
In the ‘Understanding the CRXO design’ section, we describe how the cluster randomisation and crossover aspects of the design lead to variation between individual responses in a two-period, two-intervention, cross-sectional CRXO design, using intensive care unit (ICU) length of stay (LOS) as an example. In the ‘Performing a sample size calculation’ section, we outline how to perform sample size calculations and discuss how to specify values of the WPC and BPC for sample size calculations. In the ‘Common mistakes when performing a sample size analyses’ section, we outline common mistakes made by trialists when performing sample size calculations for CRXO trials and the likely consequences of those mistakes. We conclude with a general discussion, considering extensions and larger designs.
Understanding the CRXO design
In this section we illustrate graphically how the cluster randomisation and crossover aspects of the CRXO design lead to variation in the responses of individuals in a CRXO trial, and how these aspects of the design can be used to measure the similarity between individuals using the WPC and BPC.
We illustrate the sources of variation and measures of similarity that arise in the two-period, two-intervention, cross-sectional CRXO design by considering a hypothetical CRXO trial conducted in 20 ICUs over a 2-year period. We consider the ICU LOS of all patients admitted to these 20 ICUs, and assume (for ease of exposition) that the number of patients in each ICU is infinitely large (or at least very large). As LOS is non-normally distributed and right-skewed, we use the logarithmic transform of ICU LOS throughout our illustration.
Each ICU is randomly assigned to administer one of two interventions to all patients admitted during the first year (period 1). In the subsequent year, each ICU administers the alternate intervention (period 2). All patients admitted to a single ICU over the 2-year period can be thought of as belonging to a cluster. Within each ICU (cluster), the patients admitted during a 1-year period can be thought of as belonging to a separate cluster-period. Therefore, in each ICU (cluster) there are two cluster-periods.
The allocation of interventions to patients in the stratified, multicentre, parallel-group, individually randomised trial (IRCT) design, the parallel-group cluster randomised trial (CRCT) design, and the CRXO design is shown in Fig. 1. In each design, each intervention is given for one 12-month period. In the IRCT design, half the patients in each centre (ICU) receive each intervention. In the CRCT design, all patients in a single ICU are assigned the same intervention.
Variation in the length of stay between patients
To illustrate the sources of variation and measures of similarity that arise in the CRXO design, we assume that the true difference between interventions is zero. In the hypothetical situation where we have an infinite number of patients, the overall mean LOS for all patients in the trial will be equal to the true overall mean LOS for all patients who could be admitted to the 20 ICUs. The variation in LOS arises from both patient and ICU factors. In a CRXO design, the ICU (cluster) and the time period of admission (cluster-period) are both factors that could affect the patient’s LOS and, therefore, explain some of the variation seen in patient LOS. For example, each ICU may have a different case mix of patients, different operating policies and procedures, and different staff. And within an ICU, changes to staff or policy over time could lead to differences in LOS between time periods. The following sections describe how the ICU and time period of admission can explain part of the variation in the LOS between patients.
Variation in the length of stay between ICUs
Each ICU has a true mean LOS for the infinite number of patients who could be hypothetically admitted to that ICU. When there is true variability between ICUs, the true mean LOS for each ICU will differ from the mean of all true ICU mean LOS. In the hypothetical situation where we have an infinite number of patients, the overall mean LOS for all patients and the mean of all true ICU mean LOS will be equal to the same true overall mean LOS.
Figure 2a, b, e and f show four scenarios that each illustrate variation in the true mean LOS across ICUs (red circles). The true mean LOS in each ICU may be similar and, therefore, close to the true overall mean LOS (black line) (Fig. 2a); or the true mean LOS of each ICU may be more dispersed about the true overall mean (Fig. 2b). The difference in the spread of true ICU mean LOS between Fig. 2a and b indicates greater variability in the true ICU mean LOS across ICUs in Fig. 2b than in Fig. 2a. The same comparison can be made between Fig. 2e and f.
Variation in the length of stay between time periods in an ICU
Within each ICU, there is also a true mean LOS for the infinite number of patients who could be hypothetically admitted in each 1-year period (i.e. each cluster-period). Figure 2a, b, e and f also show that there is variation in the difference between the true cluster-period mean LOS (green circles) and the true ICU mean LOS (red circles). The true cluster-period mean LOS may be similar to the true ICU mean LOS (Fig. 2a); or the true mean LOS of each cluster-period may be more dispersed about the true ICU mean (Fig. 2e). The difference in the spread of the true cluster-period mean LOS between Fig. 2a and e indicates greater variability in true cluster-period mean LOS within ICUs in Fig. 2e than in Fig. 2a. The same comparison can be made between Fig. 2b and f.
Variation in length of stay between patients in a cluster-period
While there is a true mean LOS for all patients admitted in each cluster-period, the individual patients within each cluster-period will show variation in their LOS due to other patient factors (e.g. severity of their condition).
Two of the 20 example ICUs are depicted in Fig. 2c, d, g and h. ICU 1 is shown with solid lines and ICU 2 is shown with dashed lines. As previously, the mean LOS in each ICU is marked by a red line, and the mean LOS in each cluster-period is marked by a green line. The individual patient LOS within each cluster-period follows a normal distribution, shown by the four yellow or blue curves. The distributions of the LOS for patients receiving intervention S are coloured yellow, and those for patients receiving intervention T are coloured blue.
Within each cluster-period, patients have a range of individual LOS centred at the true cluster-period mean LOS (green line). Nonetheless, the patients in each cluster-period are from distinct distributions labelled as A, B, C, and D in Fig. 2h (these labels apply also to Fig. 2c, d and g). In each cluster-period, we assume that the variability of the individual patient LOS is the same, and hence the yellow and blue curves have the same shape and are only shifted in location between the four cluster-periods.
Summary of the sources of variation in the CRXO design
We have illustrated how the cluster randomisation aspect of the CRXO design leads to the formation of clusters of patients defined by ICU, while the crossover aspect of the design leads further to the formation of cluster-periods of patients within each cluster.
We have also illustrated how the cluster randomisation and crossover aspects of the CRXO design can lead to three sources (or components) of variation in the responses of patients in a CRXO trial: variation in the mean LOS between ICUs; variation in the mean LOS between cluster-periods; and variation between individual patient LOS within a cluster-period.
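The three components of variation summarised above can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions: the function name `simulate_crxo`, the use of normal distributions for all three components, and the variance values (0.3, 0.2, 0.5) are hypothetical choices for illustration, not estimates from the ANZICS-APD, and no intervention effect is simulated.

```python
import random
from statistics import fmean, pvariance


def simulate_crxo(n_clusters, m, var_c, var_cp, var_i, mu=0.0, seed=1):
    """Simulate outcomes for a two-period, cross-sectional CRXO layout.

    Returns data[i][j]: list of m patient outcomes in cluster i, period j.
    All component distributions are assumed normal (illustrative assumption).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        c = rng.gauss(0.0, var_c ** 0.5)        # cluster (ICU) deviation from the overall mean
        periods = []
        for _ in range(2):
            cp = rng.gauss(0.0, var_cp ** 0.5)  # cluster-period deviation from the ICU mean
            # individual deviation within the cluster-period
            periods.append([mu + c + cp + rng.gauss(0.0, var_i ** 0.5) for _ in range(m)])
        data.append(periods)
    return data


# Hypothetical variance components (not taken from the article)
data = simulate_crxo(n_clusters=2000, m=20, var_c=0.3, var_cp=0.2, var_i=0.5)

# Total variance should be close to the sum of the three components (1.0 here)
all_values = [y for cluster in data for period in cluster for y in period]
total_var = pvariance(all_values)

# Variance of observed cluster means: var_c + var_cp/2 + var_i/(2m) = 0.4125 here
cluster_means = [fmean(y for period in cluster for y in period) for cluster in data]
cluster_mean_var = pvariance(cluster_means)

print(f"total variance        ~ {total_var:.3f}")
print(f"cluster-mean variance ~ {cluster_mean_var:.3f}")
```

Because the number of simulated patients is finite, the printed values only approximate the theoretical ones; the article's exposition assumes an infinite number of patients.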
The within-cluster within-period correlation and the within-cluster between-period correlation
In this section we show how the three sources of variation outlined in the preceding section can be used to quantify the similarity in LOS between the groups of patients defined by ICU (cluster) and cluster-period.
The within-cluster within-period correlation (WPC) quantifies the similarity of outcomes from patients in the same cluster-period. The within-cluster between-period correlation (BPC) quantifies the similarity of outcomes from patients in the same cluster, but in different periods. Specification of these two correlations is required to perform sample size estimates for a CRXO trial.
In the hypothetical circumstance where the LOS of an infinite number of patients admitted to each ICU is measured, we can determine the true WPC and BPC. In practice, the LOS can only be measured on a sample of patients, and the true WPC and BPC will be estimated from this sample of patients, with some amount of random sampling error.
We first describe the sources of variation underlying the BPC, and then extend the description to the WPC.
Within-cluster between-period correlation (BPC)
The BPC measures how much of the total variability in the LOS is due to variability in the ICU mean LOS or, analogously, how similar patient responses are within the same cluster, but in different periods. The formula for the BPC, η, is:
\( \eta =\frac{\sigma_C^2}{\sigma_C^2+\sigma_{CP}^2+\sigma_I^2} \) (1)
where \( \sigma_C^2 \) is the variance in mean LOS between clusters (ICUs), \( \sigma_{CP}^2 \) is the variance in mean LOS between cluster-periods, and \( \sigma_I^2 \) is the variance in individual LOS within a cluster-period.
The BPC measures the similarity between the LOS of two patients from the same ICU with one patient from the first period (cluster-period C) and one patient from the second period (cluster-period D).
The similarity between the LOS of patients in an ICU between cluster-periods arises from the variability in the ICU mean LOS only. We now refer to Fig. 2 to describe how this relationship between similarity and variability arises. As the ICU mean LOS (red lines/red circles) become more dispersed between ICUs, relative to the dispersion (i.e. distance) between cluster-period mean LOS within an ICU (green lines/green circles), the distributions of the patient LOS (yellow/blue curves) in cluster-periods A and B become more similar to each other, as do the distributions of patient LOS in cluster-periods C and D.
For example, in Fig. 2c there is little variation in the ICU mean LOS around the overall mean LOS (black line) and the distributions of patient LOS in cluster-periods A, B, C and D almost all coincide. As a result, the similarity between the LOS of patients in different cluster-periods within the same ICU (e.g. one patient from cluster-period A and one patient from cluster-period B) is comparable to the similarity between the LOS of patients in different ICUs (e.g. one patient from cluster-period A and one patient from cluster-periods C or D). In contrast, in Fig. 2d, there is more separation between the ICU mean LOS and only the distributions of patient LOS from the same ICUs coincide (i.e. cluster-periods A and B, and cluster-periods C and D, coincide). As a result, the LOS of patients in different cluster-periods within the same ICU (e.g. one patient from cluster-period A and one patient from cluster-period B) are more similar to each other than to the patients in other ICUs (e.g. one patient from cluster-period A and one patient from cluster-periods C or D). Hence, the BPC is larger in Fig. 2d than in Fig. 2c. The same comparison can be made between Fig. 2g and h.
The within-cluster within-period correlation (WPC)
The WPC measures how much of the total variability in the LOS is due to variability in the ICU mean LOS and the cluster-period mean LOS or, analogously, how similar patient responses are within a cluster-period. The formula for the WPC, ρ, is:
\( \rho =\frac{\sigma_C^2+\sigma_{CP}^2}{\sigma_C^2+\sigma_{CP}^2+\sigma_I^2} \) (2)
The WPC measures the similarity in the LOS from two patients in the same cluster-period, e.g. cluster-period C.
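As a worked illustration of the two definitions, the sketch below computes the BPC (between-cluster variance over total variance) and the WPC (between-cluster plus between-cluster-period variance over total variance) from the three variance components. The function names and the numerical variance values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def bpc(var_c, var_cp, var_i):
    """Within-cluster between-period correlation (eta)."""
    return var_c / (var_c + var_cp + var_i)


def wpc(var_c, var_cp, var_i):
    """Within-cluster within-period correlation (rho)."""
    return (var_c + var_cp) / (var_c + var_cp + var_i)


# Hypothetical components: cluster 0.3, cluster-period 0.2, individual 0.5
eta = bpc(0.3, 0.2, 0.5)
rho = wpc(0.3, 0.2, 0.5)
print(f"BPC = {eta:.2f}, WPC = {rho:.2f}")

# No cluster-period variation: the BPC equals the WPC
eta_equal = bpc(0.3, 0.0, 0.5)
rho_equal = wpc(0.3, 0.0, 0.5)

# No between-cluster variation: the BPC is zero while the WPC is not
eta_zero = bpc(0.0, 0.2, 0.5)
rho_nonzero = wpc(0.0, 0.2, 0.5)
```

Since the cluster-period variance cannot be negative, the BPC computed this way can never exceed the WPC; the two coincide only when the cluster-period variance is zero.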
The similarity between the LOS of patients within a cluster-period arises from the variability in the ICU mean LOS and cluster-period mean LOS. We now refer to Fig. 2 to describe how this relationship between similarity and variability arises. We describe the relationship in two parts: variability in the ICU mean LOS; and variability in the cluster-period mean LOS.
As the ICU mean LOS (red circles/red lines) become more dispersed, relative to the dispersion (i.e. distance) between the cluster-period mean LOS (green circles/green lines), the distributions of the individual patient LOS (yellow/blue curves) in the four cluster-periods A, B, C and D become more distinct from each other, and hence patients within a cluster-period appear more similar to each other. For example, in Fig. 2c there is little variation between the ICU mean LOS around the overall mean LOS (black line) and the distributions of patient LOS in cluster-periods A, B, C and D almost all coincide. As a result, the similarity between the LOS of two patients in cluster-period A is comparable to the similarity between the LOS of one patient from cluster-period A and one patient from cluster-period B (or C or D). In contrast, in Fig. 2d, there is more separation between the ICU mean LOS and hence more separation of the patient LOS in ICUs 1 and 2. As a result, the LOS of two patients in cluster-period A are more similar to each other than the LOS of one patient from cluster-period A (cluster 1) and another patient from cluster-periods C or D (cluster 2). Hence, the WPC is smaller in Fig. 2c than in Fig. 2d. We note that the same comparison can be made between Fig. 2g and h.
Likewise, as the cluster-period mean LOS (green circles/green lines) become more dispersed, relative to the distance between the ICU mean LOS (red circles/red lines), the distributions of the individual patient LOS (yellow/blue curves) in the four cluster-periods A, B, C and D also become more distinct from each other, and hence patients within a cluster-period become more similar to each other. For example, in Fig. 2d there is little variation between the cluster-period mean LOS around the ICU mean LOS and thus the distributions of patient LOS in cluster-periods A and B (and equivalently C and D) almost coincide. As a result, the similarity between the LOS of two patients in cluster-period A is comparable to the similarity between the LOS of one patient from cluster-period A and one patient from cluster-period B. In contrast, in Fig. 2h, there is more separation between the cluster-period mean LOS and hence between the distributions of patient LOS. As a result, the LOS of two patients in cluster-period A are more similar to each other than the LOS of one patient from cluster-period A and another patient from cluster-period B (and even more similar than the LOS of one patient from cluster-period A and another patient from cluster-periods C or D). Hence the WPC is again smaller in Fig. 2d than in Fig. 2h. We note that the same comparison can be made between Fig. 2c and g.
Precision of the CRXO design compared to the parallel-group cluster randomised design and parallel-group, individually randomised design
In this section, we discuss how the WPC and BPC affect the precision of the estimate of the difference between interventions, and hence the sample size requirement, in a two-period, two-intervention, cross-sectional CRXO trial. We illustrate the two extremes of the CRXO design: when the precision in the CRXO design is equivalent to an IRCT design; and equivalent to a CRCT design. The allocation of interventions to patients in the IRCT, CRCT, and CRXO design are shown in Fig. 1.
To illustrate the effect of the WPC and BPC on precision (and equivalently the components of variation), we continue to assume that the true difference between interventions is zero. We consider a large sample of patients admitted to one cluster in a CRXO design, such that the sampling error in the estimated mean LOS for patients is assumed negligible. Therefore, in the single cluster shown in Fig. 3, the separation between the distribution of LOS from patients receiving intervention S (yellow curve) and intervention T (blue curve) arises solely from the variation in the mean LOS between cluster-periods (\( \sigma_{CP}^2 \)). In this section, we show which partitioning of the total variation in LOS into the components of variation leads to the most precision and to the least precision in the CRXO design.
In the CRXO design, the observed mean LOS of patients receiving each intervention can be compared within each cluster because each intervention is delivered in each cluster. As an illustration, in Fig. 3a, the observed difference in mean LOS between patients receiving each intervention could be due to a difference in true cluster-period mean LOS (green lines) but not due to differences in the true ICU mean LOS because this component of variation is removed when the two interventions are compared within an ICU.
As the variation in the true cluster-period mean LOS increases, and hence the separation between the green lines in Fig. 3a increases, the separation between the yellow and blue curves within an ICU increases. Correspondingly, from Eqs. 1 and 2, the difference between the WPC and BPC increases. In conclusion, increasing variability in the cluster-period means leads to increasing uncertainty in the observed difference in the mean LOS between patients receiving each intervention.
In the CRXO design, precision is maximised when there is no variation in LOS between periods within a cluster. In this scenario the separation between the green lines in Fig. 3a shrinks and the yellow and blue curves coincide, yielding Fig. 3b. The LOS of two patients in the same cluster-period are as similar as the LOS of two patients from the same ICU but in different cluster-periods. Also, from Eqs. 1 and 2, the WPC equals the BPC. Figure 3b now approximates the diagram that one would expect from an IRCT with two ICUs (with the mean LOS for each centre indicated by the green lines) and half the patients within each cluster receiving each intervention. This diagram arises in an IRCT because, for large sample sizes and under the assumption of no true differences between interventions, randomisation ensures that the distributions of LOS in each intervention (yellow and blue curves) are identical. The CRXO design will, therefore, have the same precision as an IRCT design.
Conversely, the precision of the CRXO design decreases when the cluster-period variability increases. As the variability between periods within a cluster increases, the separation between the green lines, and correspondingly the yellow and blue curves, in Fig. 3a increases. The increased separation results in greater variability in the comparison of patient LOS in each intervention within each cluster. For a fixed total variability in ICU LOS, as the variability between periods within a cluster increases, the variability between different clusters must reduce. In the limiting case there is no variation at all between clusters (\( \sigma_C^2=0 \)), resulting in the BPC equalling zero (Eq. 1). In this case each cluster-period effectively resembles a separate cluster (Fig. 3c). Two patients in different cluster-periods in the same ICU are no more similar than two patients in different ICUs. Therefore, there is no advantage to the crossover component of the CRXO design and the CRXO will have the same precision as a CRCT design.
In most situations, the BPC will lie between zero and the WPC. In the following section, ‘Performing a sample size calculation’, we discuss the effect of the BPC and WPC on the sample size required to be able to detect a specified true intervention effect in a CRXO trial with a given level of power, and provide guidance on how to choose values for the BPC and WPC for a sample size calculation.
Performing a sample size calculation
The sample size required to detect a specified true difference between interventions with a given level of power decreases as the precision of the estimate of the intervention effect increases. In the ‘Understanding the CRXO design’ section, we considered precision in the CRXO design when the true difference between interventions was assumed to be zero. However, even when the true difference is not zero, the effects of the WPC and BPC on precision described in the previous section continue to hold.
The sample size required for a CRXO trial increases as the cluster-period variability increases, or equivalently as the difference between the WPC and BPC increases. As the value of the BPC increases from zero to the WPC, the sample size required for the CRXO design will decrease from that required for a CRCT design towards the sample size for an IRCT. Therefore, using an appropriate specification of the difference between the WPC and the BPC is essential for performing sample size calculations for the CRXO design.
We now illustrate how to perform a sample size calculation for a two-period, two-intervention CRXO trial with a continuous and a binary outcome using ICU LOS and in-ICU mortality data, respectively, from the Australian and New Zealand Intensive Care Society (ANZICS) Adult Patient Database (APD) [14, 15]. There are 37 tertiary ICUs in Australia and New Zealand, of which 25 to 30 might be expected to participate in a trial.
We compare the sample size requirement for number of individuals and number of clusters (ICUs) from the CRXO design with the requirement from the stratified, multicentre, parallel-group, individually randomised design (IRCT) and the parallel-group cluster randomised design (CRCT) conducted over one period.
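The transition between the two extremes can be sketched numerically. The code below is an illustration under stated assumptions: it uses the variance inflation factors that appear in the sample size formulae of this tutorial (1 + (m − 1)ρ − mη for the CRXO design [8], 1 + (m − 1)ρ for the CRCT design, and 1 − ρ for the stratified IRCT design [16]), ignores the small-sample correction terms, and takes m = 200 and ρ = 0.038 purely as example values. The function names are hypothetical.

```python
def crxo_design_effect(m, wpc, bpc):
    """Inflation factor for the CRXO design: 1 + (m - 1)*rho - m*eta."""
    return 1 + (m - 1) * wpc - m * bpc


def crct_design_effect(m, wpc):
    """Inflation factor for a one-period CRCT: 1 + (m - 1)*rho."""
    return 1 + (m - 1) * wpc


def irct_design_effect(wpc):
    """Factor for an IRCT stratified by cluster: 1 - rho."""
    return 1 - wpc


m, wpc = 200, 0.038   # example cluster-period size and WPC (assumed values)
steps = 4
effects = []
for k in range(steps + 1):
    bpc = wpc * k / steps             # BPC moves from 0 up to the WPC
    effect = crxo_design_effect(m, wpc, bpc)
    effects.append(effect)
    print(f"BPC = {bpc:.4f}  CRXO inflation factor = {effect:.3f}")

print(f"CRCT factor = {crct_design_effect(m, wpc):.3f}")
print(f"IRCT factor = {irct_design_effect(wpc):.3f}")
```

At a BPC of zero the CRXO factor coincides with the CRCT factor, and at a BPC equal to the WPC it coincides with the IRCT factor, which mirrors the two limiting cases described in the text.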
Comparisons of the sample size requirements for these different designs can either be made by fixing the total number of clusters across all designs; or by treating the CRXO design as lasting twice as long, i.e. two periods, instead of one period as in the IRCT and CRCT designs. We take the latter approach here so that the WPC is the same in each period.
We include Stata do-files to estimate the required sample size for each trial design, for a chosen set of sample size parameters (see Additional files 1 and 2).
The sample size formulae for a one-period IRCT design, a one-period CRCT design, and a two-period, two-intervention, cross-sectional CRXO design
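The sample size formula for the total number of participants required for a normally distributed continuous outcome in a two-period, two-intervention CRXO trial, across all clusters and interventions, assuming a constant number of participants recruited to each cluster-period, is [8]:
\( N_{CRXO}=2{\left(z_{\alpha /2}+z_{\beta}\right)}^2\frac{2\sigma^2}{{\left(\mu_A-\mu_B\right)}^2}\left[1+\left(m-1\right)\rho -m\eta \right]+4m \) (3)
and for a one-period, two-intervention CRCT:
\( N_{CRCT}=2{\left(z_{\alpha /2}+z_{\beta}\right)}^2\frac{2\sigma^2}{{\left(\mu_A-\mu_B\right)}^2}\left[1+\left(m-1\right)\rho \right]+2m \) (4)
and for a one-period, two-intervention, parallel-group IRCT, stratified by cluster, across all clusters and interventions, is [16]:
\( N_{IRCT}=2{\left(z_{\alpha /2}+z_{\beta}\right)}^2\frac{2\sigma^2}{{\left(\mu_A-\mu_B\right)}^2}\left(1-\rho \right) \) (5)
where \( z_{\alpha /2} \) and \( z_{\beta} \) are the standard normal values corresponding to the upper tail probabilities of α/2 and β, respectively; α is the two-sided significance level, typically 0.05; 1 − β is the power to detect the specified difference \( \mu_A-\mu_B \) at significance level α; \( \sigma^2 \) is the variance of the outcome; \( \mu_A \) and \( \mu_B \) are the outcome means in each arm; m is the number of participants per cluster-period; ρ is the WPC; and η is the BPC.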
The formulae presented above include a correction for when the number of clusters is small, as suggested in Eldridge and Kerry (p. 149) [2] and Forbes et al. [9]. This leads to an additional 4m participants in the CRXO design and 2m participants in the CRCT design. No correction is necessary for the IRCT because the number of individual participants will be large in the example settings.
For a binary outcome we can replace \( \frac{2\sigma^2}{{\left(\mu_A-\mu_B\right)}^2} \) with \( \frac{p_A\left(1-p_A\right)+p_B\left(1-p_B\right)}{{\left(p_A-p_B\right)}^2} \) in the above formulae [12], where \( p_A \) and \( p_B \) are the proportions of the outcomes in each arm.
For the CRXO design, CRCT design and IRCT design, respectively, the formulae to determine the number of clusters (ICUs) needed to achieve the required number of participants are:
\( n_{CRXO}=\frac{N_{CRXO}}{2m} \), \( n_{CRCT}=\frac{N_{CRCT}}{m} \), and \( n_{IRCT}=\frac{N_{IRCT}}{m} \).
Australian and New Zealand Intensive Care Society – Adult Patient Database (ANZICS-APD): estimates of the WPC and BPC
The ANZICS-APD is one of four clinical quality registries run by the ANZICS Centre for Outcome and Resource Evaluation and collects de-identified information on admissions to adult ICUs in Australia and New Zealand. A range of data is collected during patients’ admissions, including ICU LOS and in-ICU mortality. In this section we use the ANZICS-APD data from 34 tertiary ICUs to estimate the correlations required to perform sample size calculations for CRXO trials. We estimate the values of the WPC and the BPC from two 12-month periods of data between 2012 and 2013 (Appendix 1).
Continuous outcomes
We follow the methods of Turner et al. to estimate the WPC and BPC (Appendix 1). Using the ICU LOS data, the estimated WPC was \( \widehat{\rho}=0.038 \), and the BPC was \( \widehat{\eta}=0.032 \) (Table 1). The overall mean LOS was 5.3 log-hours, with a standard deviation of 1.39 log-hours.
Binary outcomes
We follow the methods of Donner et al. to estimate the WPC and BPC (Appendix 1). Using the in-ICU mortality data, the estimated WPC was \( \widehat{\rho}=0.010 \), and the BPC was \( \widehat{\eta}=0.007 \). The overall mortality rate was 8.7%.
Sample size example for ICU LOS
Suppose we wish to design a two-period, two-intervention, CRXO trial to have 80% power to detect a true reduction in ICU LOS of 0.1 log-hours (1.1 h) using a two-sided test with a Type I error rate of 5%. In practice, the choice of reduction in ICU LOS should be the minimally clinically important reduction, determined in consultation with subject matter experts. A reduction of 0.1 log-hours is equivalent to a reduction of approximately 10%, and is a reasonable minimally clinically important reduction in ICU LOS.
The standard deviation is estimated to be 1.2 log-hours (3.3 h). As an illustration, we assume that in a 12-month period, 200 patients in each ICU will meet the inclusion criteria for the trial. The CRXO trial will, therefore, run for 2 years and include 400 patients per ICU, with 200 patients receiving each intervention in each ICU.
For comparison, we consider an IRCT and a CRCT run for a 12-month period, with 100 patients receiving each intervention in each ICU in the IRCT and all 200 patients receiving one intervention in each ICU in the CRCT.
Using the estimates that we calculated from the ANZICS-APD data for the WPC and BPC, the total number of patients and ICUs for each design are summarised in Table 2 (see Appendix 2 for calculations).
The total number of participants required for the CRXO design is \( N_{CRXO} \) = 10,564. To include 10,564 participants, we require \( n_{CRXO} \) = 27 ICUs, each recruiting 200 participants in each of the two 12-month periods. If instead we conducted a CRCT over a single 12-month time period, the total number of participants required would be \( N_{CRCT} \) = 39,065. Assuming that 200 patients are eligible in each ICU, we would need \( n_{CRCT} \) = 196 ICUs. The total number of participants required for an IRCT conducted over a 12-month period is \( N_{IRCT} \) = 4345. With 200 patients per ICU (100 patients per intervention), the total number of ICUs required is \( n_{IRCT} \) = 22.
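The figures quoted above can be checked with a few lines of code. This is a sketch, not the authors' Stata do-files: it assumes the standard sample size expressions for the three designs (variance inflation 1 + (m − 1)ρ − mη plus 4m for the CRXO design, 1 + (m − 1)ρ plus 2m for the CRCT design, and 1 − ρ for the stratified IRCT design), rounds results up to whole participants and whole ICUs, and uses the rounded normal quantiles 1.96 and 0.84. The rounded quantiles are an assumption made here because they reproduce the quoted numbers; exact quantiles give slightly larger totals. All function and variable names are hypothetical.

```python
import math

Z_ALPHA_2 = 1.96  # rounded upper 2.5% standard normal point (assumption)
Z_BETA = 0.84     # rounded upper 20% point, i.e. 80% power (assumption)


def total_participants(design, var_term, m, wpc, bpc=0.0):
    """Total participants for a design; var_term is 2*sd^2 / delta^2."""
    base = 2 * (Z_ALPHA_2 + Z_BETA) ** 2 * var_term
    if design == "CRXO":
        return base * (1 + (m - 1) * wpc - m * bpc) + 4 * m
    if design == "CRCT":
        return base * (1 + (m - 1) * wpc) + 2 * m
    if design == "IRCT":
        return base * (1 - wpc)
    raise ValueError(f"unknown design: {design}")


def clusters_needed(design, n_total, m):
    """ICUs needed: each CRXO cluster contributes 2m, the others m."""
    per_cluster = 2 * m if design == "CRXO" else m
    return math.ceil(n_total / per_cluster)


# ICU LOS example: SD 1.2 log-hours, difference 0.1 log-hours,
# 200 patients per cluster-period, WPC 0.038, BPC 0.032
sd, delta, m, wpc, bpc = 1.2, 0.1, 200, 0.038, 0.032
var_term = 2 * sd ** 2 / delta ** 2

results = {}
for design in ("CRXO", "CRCT", "IRCT"):
    n_total = math.ceil(total_participants(design, var_term, m, wpc, bpc))
    results[design] = (n_total, clusters_needed(design, n_total, m))
    print(design, results[design])
```

Under these assumptions the printed pairs agree with the participant and ICU counts stated in the paragraph above for all three designs.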
In this example, the CRXO design required five more clusters (ICUs) than the IRCT design; however, the CRXO design is run for twice as long. The CRCT design would require 7.3 times as many clusters as the CRXO design. Given that there are only 37 tertiary ICUs in Australia and New Zealand, a CRCT trial would not be feasible.
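The comparison of the three designs can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes the usual design effects (1 − WPC for an IRCT stratified by ICU, 1 + (m − 1)WPC for a CRCT, and 1 + (m − 1)WPC − m·BPC for a CRXO), normal quantiles rounded to 1.96 and 0.84, and a WPC for ICU LOS of 0.038, which is an assumed value not restated in this section. The function and variable names are hypothetical. Because the correlations are rounded, the CRXO and CRCT totals from the sketch differ somewhat from the figures reported in the text.

```python
import math

def base_sample_size(sd, delta, z_alpha=1.96, z_beta=0.84):
    """Total participants (two equal arms) if all outcomes were independent."""
    return 2 * (z_alpha + z_beta) ** 2 * 2 * sd ** 2 / delta ** 2

def design_effect(design, m, wpc, bpc=0.0):
    """Assumed design effects; m is the cluster-period size."""
    if design == "IRCT":   # individually randomised, stratified by ICU
        return 1 - wpc
    if design == "CRCT":   # parallel-group cluster randomised
        return 1 + (m - 1) * wpc
    if design == "CRXO":   # two-period cluster randomised crossover
        return 1 + (m - 1) * wpc - m * bpc
    raise ValueError(f"unknown design: {design}")

sd, delta, m = 1.2, 0.1, 200   # log-hours, log-hours, patients per ICU per period
wpc, bpc = 0.038, 0.032        # rounded; the WPC is an assumed value

base = base_sample_size(sd, delta)
totals = {d: math.ceil(base * design_effect(d, m, wpc, bpc))
          for d in ("IRCT", "CRXO", "CRCT")}

# Number of ICUs implied by the totals reported in the text
# (the CRXO recruits m patients in each of two periods).
icus = {
    "CRXO": math.ceil(10564 / (2 * m)),
    "CRCT": math.ceil(39065 / m),
    "IRCT": math.ceil(4345 / m),
}
print(totals, icus)
```

The ordering IRCT < CRXO < CRCT holds whenever the BPC lies strictly between zero and the WPC, which is the situation described in the text.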
We can examine the sensitivity of the CRXO sample size calculation to a different BPC. If the BPC was η = 0.010 rather than η = 0.032, then the CRXO design requires \( N_{CRXO} \) = 30,433 participants. The total number of ICUs required to obtain the required number of participants is \( n_{CRXO} \) = 77. The total number of ICUs required has now increased by 50, and the trial would no longer be feasible in the Australia and New Zealand region within tertiary ICUs only. Note that when the number of patients admitted in each clusterperiod is relatively large, we would observe a similar increase in the sample size if we had underestimated the WPC by 0.022, rather than overestimated the BPC by 0.022.
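The near-symmetry between misjudging the WPC and misjudging the BPC can be checked numerically. The sketch below assumes the CRXO design effect 1 + (m − 1)WPC − m·BPC, normal quantiles rounded to 1.96 and 0.84, and an illustrative WPC of 0.038 (an assumption; the size of the increase does not depend on this value). Lowering the BPC by d adds m·d to the design effect, whereas raising the WPC by d adds (m − 1)·d, so the two are almost identical when m is large.

```python
def crxo_design_effect(m, wpc, bpc):
    """Assumed CRXO design effect for cluster-period size m."""
    return 1 + (m - 1) * wpc - m * bpc

# Sample size for independent outcomes: SD 1.2, difference 0.1 log-hours.
base = 2 * (1.96 + 0.84) ** 2 * 2 * 1.2 ** 2 / 0.1 ** 2

m = 200
wpc = 0.038   # assumed illustrative value
d = 0.022     # size of the misjudgement (0.032 - 0.010)

# Extra participants if the BPC is 0.010 rather than 0.032.
extra_from_bpc = base * (crxo_design_effect(m, wpc, 0.032 - d)
                         - crxo_design_effect(m, wpc, 0.032))

# Extra participants if instead the WPC is larger by the same amount.
extra_from_wpc = base * (crxo_design_effect(m, wpc + d, 0.032)
                         - crxo_design_effect(m, wpc, 0.032))

# Increase implied by the two totals quoted in the text.
reported_increase = 30433 - 10564

print(round(extra_from_bpc), round(extra_from_wpc), reported_increase)
```

Under these assumptions the increase attributable to the BPC agrees with the difference between the two reported totals to within rounding.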
Sample size example for inICU mortality
In a second example, suppose that we wish to design a study to have 80% power to detect a true reduction in inICU mortality from 8.7% to 7.2% (absolute difference of 1.5%) using a twosided test with a TypeI error rate of 5%. From the ANZICSAPD admission data, we estimate that in a 12month period, 1200 patients will be admitted in each ICU and eligible for inclusion in the trial. The total number of patients and ICUs for each design are summarised in Table 3 (see Appendix 2 for calculations).
For a CRXO design, using the estimates for the WPC, the BPC, and the clusterperiod size we calculated from the ANZICSAPD, the total number of participants required is \( N_{CRXO} \) = 51,581. Since we expect 1200 patients in each ICU for each of the two 12month periods, the required number of ICUs is \( n_{CRXO} \) = 22. If we had used a CRCT, the required number of participants is \( N_{CRCT} \) = 134,792. Assuming that 1200 patients are admitted over a single 12month period, we would need \( n_{CRCT} \) = 113 ICUs. The total number of participants required for the IRCT design is \( N_{IRCT} \) = 10,090. For a trial run over 12 months, with 1200 patients per ICU (600 patients per intervention), the total number of ICUs required is \( n_{IRCT} \) = 9.
In this example, the CRXO design required 2.4 times as many clusters (ICUs) as the IRCT design, and is run for twice as long. Despite the increase in required clusters, the CRXO is still a feasible design, unlike the CRCT design, which would require 5.1 times as many clusters as the CRXO design.
We can examine the sensitivity of the CRXO sample size calculation to a different BPC. If the BPC was η = 0.006, rather than η = 0.007, then the total number of participants required is \( N_{CRXO} \) = 63,811. Since we expect 1200 patients for each clusterperiod, we would need to include \( n_{CRXO} \) = 27 ICUs, i.e. 54 clusterperiods. This demonstrates that a small change in the assumed BPC can have a marked impact on the number of required ICUs and patients.
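The size of this effect can be reproduced with a short sketch. It assumes the common two-proportion formula with unpooled variances, normal quantiles rounded to 1.96 and 0.84, and the design effects 1 − WPC (IRCT) and 1 + (m − 1)WPC − m·BPC (CRXO); the function name is hypothetical. A change of 0.001 in the BPC alters the CRXO design effect by m × 0.001 = 1.2, whatever the value of the WPC.

```python
import math

def base_binary(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Total participants (two equal arms) for independent binary outcomes."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return 2 * (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2

p1, p2 = 0.087, 0.072   # in-ICU mortality under each intervention
m = 1200                # patients per ICU per 12-month period
wpc = 0.010             # rounded estimate from the text

base = base_binary(p1, p2)

# IRCT stratified by ICU: assumed design effect of 1 - WPC.
n_irct = math.ceil(base * (1 - wpc))

# Extra CRXO participants when the BPC is 0.006 rather than 0.007.
extra = base * m * (0.007 - 0.006)
reported_extra = 63811 - 51581

# ICUs implied by the reported totals (two periods of m patients per ICU).
icus_before = math.ceil(51581 / (2 * m))
icus_after = math.ceil(63811 / (2 * m))

print(n_irct, round(extra), reported_extra, icus_before, icus_after)
```

Under these assumptions the sketch returns the IRCT total quoted in the text, and the extra participants match the difference between the two reported CRXO totals to within rounding.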
Unequal clusterperiod sizes
We have so far assumed that the clusterperiod size is constant. In reality, it is likely that different ICUs will include a differing number of participants [17, 18]. An extension to the sample size formula for this scenario is provided by Forbes et al. [9]. When the analysis is based on unweighted clusterperiod means, the arithmetic mean clusterperiod size in the sample size formula given for the CRXO design can be replaced by the harmonic mean of the clusterperiod sizes \( m_i \) across the \( n \) clusters: \( m_h = n / \sum_{i=1}^{n} (1/m_i) \).
We assume that the clusterperiod size is the same in each period within a cluster. For further extensions, see Forbes et al. [9].
From the ANZICSAPD data, we estimate that the harmonic mean is \( m_h \) = 900. The required number of patients is then \( N_{CRXO} \) = 41,208, and the required number of ICUs is: \( n_{CRXO}=\frac{N_{CRXO}}{2{m}_h}=\frac{41208}{2\times 900}\approx 22.9 \), rounded up to 23 ICUs.
Allowing for unequal clusterperiod sizes has increased the required number of clusters slightly from 22 to 23.
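As a small illustration of why the harmonic mean matters, the sketch below compares the arithmetic and harmonic means for a set of hypothetical clusterperiod sizes (the sizes are invented for illustration and are not taken from the ANZICSAPD), and then repeats the ICU calculation using the totals quoted in the text. The harmonic mean is never larger than the arithmetic mean, so using it can only increase the required number of clusters.

```python
import math
from statistics import harmonic_mean, mean

# Hypothetical cluster-period sizes for five ICUs (illustration only).
sizes = [600, 900, 1200, 1500, 1800]

m_arith = mean(sizes)          # simple average cluster-period size
m_harm = harmonic_mean(sizes)  # n / sum(1 / m_i), pulled towards small ICUs

# ICUs needed with the values quoted in the text: N = 41,208 and m_h = 900,
# with each ICU contributing two periods.
icus_unequal = math.ceil(41208 / (2 * 900))

print(m_arith, round(m_harm, 1), icus_unequal)
```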
Guidance on how to choose the WPC and the BPC for the sample size calculation
As was seen in the ‘Understanding the CRXO design’ section, the difference between the WPC and BPC is key in determining the sample size for a CRXO design.
Approaches for choosing the withincluster intracluster correlation (ICC) in sample size calculations for parallelgroup CRCTs have been discussed [19,20,21,22]. Similar considerations apply when choosing the WPC in a CRXO design. In particular, because the ICC estimates are subject to large uncertainty [23], reviewing multiple relevant estimates of the ICC is recommended. These ICC estimates may be obtained from trial reports, lists published in journal articles or from routinely collected data.
Identification of the factors which influence the magnitude of the withincluster ICC can assist trialists in selecting ICC estimates that are relevant to their planned trial. Typically, the trial outcome itself is less predictive of the value of the ICC than factors such as: the type of outcome variable (i.e. process outcomes that measure adherence to protocol and policy or individually measured outcomes) [19], the prevalence of the outcome [20], the size of the natural cluster of individuals that the randomised clusters are formed from [20], and the characteristics of the individuals and clusters [22].
The duration of time over which the outcome variables were measured may also affect the value of the withincluster ICC. As the measurements of individuals within a cluster become further apart, the similarity between the measurements might be expected to decrease. Using an estimate of the withincluster ICC that was determined over a different duration of time than the intended period length of the planned trial assumes that there is no variation in the withincluster ICC over time, and we are unaware of any research investigating if this is justified.
In contrast, we are aware of only two publications reporting estimates of the BPC [24, 25]. Therefore, until reporting of the BPC becomes more common [26], estimates of the BPC are likely to rely on the analysis of routinely collected data, pilot or feasibility study data, or a reasoned bestguess. As for the withincluster ICC in cluster randomised trials, estimating the BPC from a single feasibility study or a single routinely collected data source is likely to be subject to considerable uncertainty [27].
In forming a best guess, it is helpful to recognise that the difference between the WPC and BPC is a measure of changes over time within a cluster’s environment that affect the outcomes of each individual in that cluster (e.g. a change in policy in one ICU). Over short time periods or in clusters with stable environments and patient characteristics, it might be reasonable to expect little change over time and, therefore, the BPC will be similar to the WPC. However, if this assumption is untrue and the BPC is less than the WPC, a sample size calculation assuming that the two correlations are equal will lead to an underpowered study. It may be prudent to assume that the BPC is less than the WPC. To this end, suggestions have been made to set the BPC to: half the WPC [12]; and to 0.8 of the WPC [11].
In the ANZICSAPD the ratio of the BPC to WPC is 0.7 for ICU mortality and 0.8 for ICU LOS, which is consistent with the suggestion made by Hooper and Bourke [11]. In the absence of multiple estimates or precise estimates of the ICCs, a conservative approach in selecting the BPC is recommended to avoid an underpowered trial. Further, a sensitivity analysis exploring the effect of the choice of ICC on the sample size is recommended.
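A sensitivity analysis of this kind can be sketched as a simple loop over candidate BPC-to-WPC ratios. The sketch uses the ICU LOS design parameters from the earlier example, an assumed WPC of 0.038 (not restated in this section), the assumed CRXO design effect 1 + (m − 1)WPC − m·BPC, and normal quantiles rounded to 1.96 and 0.84. The ratios 0.5 and 0.8 correspond to the published suggestions; 1.0 is the optimistic case of equal correlations.

```python
import math

def crxo_clusters(base, m, wpc, bpc):
    """Clusters needed for a two-period CRXO with cluster-period size m."""
    design_effect = 1 + (m - 1) * wpc - m * bpc
    total = base * design_effect
    return math.ceil(total / (2 * m))

# Independent-outcome sample size: SD 1.2, difference 0.1 log-hours.
base = 2 * (1.96 + 0.84) ** 2 * 2 * 1.2 ** 2 / 0.1 ** 2

m = 200       # patients per ICU per period
wpc = 0.038   # assumed illustrative WPC for ICU LOS

ratios = (0.5, 0.8, 1.0)   # candidate BPC / WPC ratios
clusters = {r: crxo_clusters(base, m, wpc, r * wpc) for r in ratios}

for r in ratios:
    print(f"BPC/WPC = {r:.1f}: {clusters[r]} ICUs")
```

Under these assumptions, moving from a ratio of 0.8 to 0.5 roughly doubles the number of ICUs, which is why a conservative choice of BPC is recommended when estimates are scarce.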
Common mistakes when performing sample size calculations and analyses
Many trialists have made strong assumptions about the values of the WPC and the BPC in their sample size and analysis methodology [13]. In this section we illustrate the consequences of using incorrect sample size methodology on the estimated sample size and power.
Assume the outcomes are independent
In a review of CRXO trials, 34% of sample size calculations made the assumption that the observations were independent [13]. There are two scenarios where this assumption is reasonably appropriate: when the WPC and the BPC are equal and the sample size calculation was stratified by centre; or when the WPC and the BPC are both zero.
The first scenario arises when the outcomes of two individuals in the same cluster are equally similar if the individuals are in different periods as if the individuals are in the same period (i.e. there is no change in the WPC over time within a cluster). In this fortuitous case the precision gained by the crossover aspect of the CRXO design equals the precision lost by cluster randomisation (apart from a factor of 1 − WPC, which is usually close to one [16]). The second scenario arises when there is no similarity between the outcomes of any two individuals, which is unlikely.
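The first scenario can be made explicit with a short worked equation. Assuming the CRXO design effect takes the usual form \( 1+(m-1)\rho - m\eta \), where \( m \) is the clusterperiod size, \( \rho \) the WPC and \( \eta \) the BPC, setting the two correlations equal gives:

```latex
\begin{aligned}
\mathrm{DE}_{CRXO} &= 1 + (m-1)\rho - m\eta \\
                   &= 1 + (m-1)\rho - m\rho \qquad (\eta = \rho) \\
                   &= 1 - \rho .
\end{aligned}
```

The clusterperiod size cancels, leaving only the factor 1 − WPC. For a WPC of 0.010, for example, the sample size would be 99% of that obtained by assuming independent outcomes.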
The effect on power of assuming that the outcomes are independent will depend on the clusterperiod size and the difference between the WPC and the BPC. Loss of power will increase as both the difference between the two ICCs increases and the clusterperiod size increases.
We illustrate the potential effect on power and sample size of assuming that the outcomes are independent using a published sample size calculation. Roisin et al. [28] estimated that the seven wards (clusters) participating in their trial required a minimum of 3328 patients to have 80% power to detect a reduction in the proportion of hospital acquisition of methicillinresistant Staphylococcus aureus (MRSA) from 3% to 1.5%. From the ANZICSAPD data, we estimate a WPC of 0.010, and a BPC of 0.007 for inICU mortality in the ICU setting. As an example only, we assume that the estimates of the correlations for ICU mortality are similar to the correlations for ICU MRSA acquisition. Given that a total of 2505 patients were eligible for inclusion in the study, we determined the average clusterperiod size to be 179. From these estimates, we determine that a sample size of 5385 is required to achieve the specified power, which is a 62% increase from the published sample size requirement of 3328.
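The arithmetic behind this example can be laid out as a brief sketch. It assumes two periods per ward (inferred from the reported clusterperiod size) and the CRXO design effect 1 + (m − 1)WPC − m·BPC. With the rounded correlations quoted above the design effect is about 1.5, whereas the reported sample sizes imply about 1.6; the gap presumably reflects the use of unrounded estimates in the original calculation.

```python
wards, periods = 7, 2        # two periods per ward is an inference
eligible = 2505              # patients eligible for inclusion

# Average cluster-period size.
m = round(eligible / (wards * periods))

# Inflation implied by the two sample sizes quoted in the text.
published_n, required_n = 3328, 5385
inflation_pct = round((required_n / published_n - 1) * 100)

# Design effect using the rounded WPC and BPC (assumed formula).
wpc, bpc = 0.010, 0.007
de_rounded = 1 + (m - 1) * wpc - m * bpc

print(m, inflation_pct, round(de_rounded, 3))
```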
Assume a parallelgroup cluster randomised design instead of a cluster randomised crossover design
Another common approach when performing sample size calculations for CRXO trials is to use methods designed for parallelgroup CRCT trials. Applying CRCT sample size methodology to a CRXO design makes the assumption that: the BPC is zero; and that the WPC calculated over all periods in the trial is the same as the WPC calculated for a single period. Under the assumption that the BPC is zero, the outcomes of individuals within a cluster, but in different periods, are no more similar than outcomes of individuals in different clusters. That is, the individuals in different periods are assumed to be independent. When the BPC is not zero, the CRCT design effect does not account for the gain in precision achieved by the crossover aspect of the CRXO design, leading to a potentially overpowered trial. Trials that use CRCT sample size methods become progressively more overpowered as the true BPC becomes larger and the clusterperiod sizes increase.
We illustrate the potential effect on power and the sample size requirement of using CRCT sample size methodology by means of a published sample size calculation. van Duijn and Bonten [29] estimated that eight ICUs (clusters) participating in their trial would include 135 patient measurements per clusterperiod. Using CRCT sample size methodology, each of the 16 clusterperiods (two periods per ICU) was assumed to be a separate cluster of 135 patients. van Duijn and Bonten [29] assumed a withincluster ICC of 0.01, and hence they estimated that the trial required 1842 patients to have 80% power to detect a reduction in the proportion of ICU patients with antibioticresistant gramnegative bacteria from 55% to 45%. From the ANZICSAPD data, we estimate a WPC of 0.010, and a BPC of 0.007, as in the example in the previous section. From these estimates, we determine that a sample size of 1623 is required to achieve the specified power, which is 12% less than the sample size required for a CRCT.
Discussion
Sample size calculations for CRXO trials need to account for both the cluster randomisation and crossover aspects of the design to ensure that an appropriate number of participants are recruited to adequately address the trial’s hypotheses. Simple sample size formulae are available for a twoperiod, twointervention, crosssectional CRXO design; however, the implementation of these formulae has been limited [13]. Such limited use may be due to a lack of recognition that formulae are available, a lack of availability of estimates of the parameters required within the formulae, or a lack of trialists’ understanding of those parameters.
We have illustrated how the cluster randomisation and crossover aspects of the CRXO design give rise to similarity in both the responses of individuals within the same cluster and within the same clusterperiod; and have described the parameters required to perform sample size calculations for CRXO trials. We have provided guidance on how to choose the parameters required for the sample size calculation and perform sample size calculation using those parameters.
While our focus has been on the twointervention, twoperiod, crosssectional CRXO design, more complex designs with additional periods and interventions are possible. The sample size and analysis methodology is more complex in these designs. For example, in a design with more than two periods, additional assumptions are required about the similarity between individuals in the same cluster in the same time period, and 1, 2, or 3, etc. time periods apart. Careful consideration should always be given to whether cluster randomisation is necessary [30], and whether the risk of the intervention effect from one period carrying over to the next period is minimal [6].
In addition to consideration of the sample size methodology, it is also essential to appropriately account for the cluster and the clusterperiod in the analysis. Very few published trials do so [13]. Failure to account for the clusterperiod in an individual level analysis leads to inflated TypeI error rates [31]. Methods to analyse CRXO trials have been published by Turner et al. and Forbes et al. [5, 9].
Conclusions
Sample size calculations for CRXO trials must account for both the cluster randomisation and crossover aspects of the design. In this tutorial we described how the CRXO design can be understood in terms of components of variation in the individual outcomes, or equivalently, in terms of correlations between the outcomes of individual patients. We illustrated how to perform sample size calculations for continuous and binary outcomes, and provided guidance on selecting estimates of the parameters required for the sample size calculation.
Abbreviations
ANZICSAPD: Australia and New Zealand Intensive Care Society – Adult Patient Database
BPC: Withincluster betweenperiod correlation
CRCT: Cluster randomised controlled trial
CRXO: Cluster randomised crossover
ICC: Intracluster correlation
ICU: Intensive care unit
IRCT: Individually randomised controlled trial
LOS: Length of stay
WPC: Withincluster withinperiod correlation
References
1. Grimes DA, Schulz KF. An overview of clinical research: the lay of the land. Lancet. 2002;359(9300):57–61.
2. Eldridge S, Kerry S. A practical guide to cluster randomised trials in health services research. Chichester: Wiley; 2012.
3. Ukoumunne OC, Gulliford MC, Chinn S, Sterne JA, Burney PG. Methods for evaluating areawide and organisationbased interventions in health and health care: a systematic review. Health Technol Assess. 1999;3(5):iii–92.
4. Donner A, Birkett N, Buck C. Randomization by cluster. Sample size requirements and analysis. Am J Epidemiol. 1981;114(6):906–14.
5. Turner RM, White IR, Croudace T. Analysis of cluster randomized crossover trial data: a comparison of methods. Stat Med. 2007;26(2):274–89.
6. Parienti JJ, Kuss O. Clustercrossover design: a method for limiting clusters level effect in communityintervention studies. Contemp Clin Trials. 2007;28(3):316–23.
7. Hills M, Armitage P. The twoperiod crossover clinical trial. Br J Clin Pharmacol. 1979;8(1):7–20.
8. Giraudeau B, Ravaud P, Donner A. Sample size calculation for cluster randomized crossover trials. Stat Med. 2008;27(27):5578–85.
9. Forbes AB, Akram M, Pilcher D, Cooper J, Bellomo R. Cluster randomised crossover trials with binary data and unbalanced cluster sizes: application to studies of nearuniversal interventions in intensive care. Clin Trials. 2015;12(1):34–44.
10. Rietbergen C, Moerbeek M. The design of cluster randomized crossover trials. J Educ Behav Stat. 2011;36(4):472–90.
11. Hooper R, Bourke L. Cluster randomised trials with repeated cross sections: alternatives to parallel group designs. BMJ. 2015;350:h2925.
12. Donner A, Klar N, Zou G. Methods for the statistical analysis of binary data in splitcluster designs. Biometrics. 2004;60(4):919–25.
13. Arnup SJ, Forbes AB, Kahan BC, Morgan KE, McKenzie JE. Appropriate statistical methods were infrequently used in clusterrandomized crossover trials. J Clin Epidemiol. 2016;74:40–50.
14. Stow PJ, Hart GK, Higlett T, George C, Herkes R, McWilliam D, Bellomo R, Committee ADM. Development and implementation of a highquality clinical database: the Australian and New Zealand Intensive Care Society Adult Patient Database. J Crit Care. 2006;21(2):133–41.
15. Kaukonen KM, Bailey M, Pilcher D, Cooper DJ, Bellomo R. Systemic inflammatory response syndrome criteria in defining severe sepsis. N Engl J Med. 2015;372(17):1629–38.
16. Vierron E, Giraudeau B. Sample size calculation for multicenter randomized trial: taking the center effect into account. Contemp Clin Trials. 2007;28(4):451–8.
17. Konstantopoulos S. Power analysis in twolevel unbalanced designs. J Exp Educ. 2010;78(3):291–317.
18. Kerry SM, Bland JM. Unequal cluster sizes for trials in English and Welsh general practice: implications for sample size calculations. Stat Med. 2001;20(3):377–90.
19. Campbell MK, Fayers PM, Grimshaw JM. Determinants of the intracluster correlation coefficient in cluster randomized trials: the case of implementation research. Clin Trials. 2005;2(2):99–107.
20. Gulliford MC, Adams G, Ukoumunne OC, Latinovic R, Chinn S, Campbell MJ. Intraclass correlation coefficient and outcome prevalence are associated in clustered binary data. J Clin Epidemiol. 2005;58(3):246–51.
21. Gulliford MC, Ukoumunne OC, Chinn S. Components of variance and intraclass correlations for the design of communitybased surveys and intervention studies: data from the Health Survey for England 1994. Am J Epidemiol. 1999;149(9):876–83.
22. Adams G, Gulliford MC, Ukoumunne OC, Eldridge S, Chinn S, Campbell MJ. Patterns of intracluster correlation from primary care research to inform study design and analysis. J Clin Epidemiol. 2004;57(8):785–94.
23. Ukoumunne OC. A comparison of confidence interval methods for the intraclass correlation coefficient in cluster randomized trials. Stat Med. 2002;21(24):3757–74.
24. Martin J, Girling A, Nirantharakumar K, Ryan R, Marshall T, Hemming K. Intracluster and interperiod correlation coefficients for crosssectional cluster randomised controlled trials for type2 diabetes in UK primary care. Trials. 2016;17:402.
25. Feldman HA, McKinlay SM. Cohort versus crosssectional design in large field trials: precision, sample size, and a unifying model. Stat Med. 1994;13(1):61–78.
26. Hooper R, Teerenstra S, de Hoop E, Eldridge S. Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Stat Med. 2016;35(26):4718–28.
27. Eldridge SM, Costelloe CE, Kahan BC, Lancaster GA, Kerry SM. How big should the pilot study for my cluster randomised trial be? Stat Methods Med Res. 2016;25(3):1039–56.
28. Roisin S, Laurent C, Denis O, Dramaix M, Nonhoff C, Hallin M, Byl B, Struelens MJ. Impact of rapid molecular screening at hospital admission on nosocomial transmission of methicillinresistant staphylococcus aureus: cluster randomised trial. PLoS One. 2014;9(5):e96310.
29. van Duijn PJ, Bonten MJ. Antibiotic rotation strategies to reduce antimicrobial resistance in Gramnegative bacteria in European intensive care units: study protocol for a clusterrandomized crossover controlled trial. Trials. 2014;15:277.
30. Campbell MK, Piaggio G, Elbourne DR, Altman DG, Group C. CONSORT 2010 Statement: extension to cluster randomised trials. BMJ. 2012;345:e5661.
31. Morgan KE, Forbes AB, Keogh RH, Jairath V, Kahan BC. Choosing appropriate analysis methods for cluster randomised crossover trials with a binary outcome. Stat Med. 2017;36(2):318–33.
Acknowledgements
Not Applicable.
Funding
This research was in part supported by a National Health and Medical Research Council (NHMRC) project grant (1108283).
SJA was supported in part by a Monash University Graduate Scholarship and a National Health and Medical Research Council of Australia Centre of Research Excellence grant (1035261) to the Victorian Centre for Biostatistics (ViCBiostat).
JEM was supported by a National Health and Medical Research Council (NHMRC) Australian Public Health Fellowship (1072366).
Availability of data and materials
Not applicable.
Author information
Contributions
SJA led the development of all sections and drafted the manuscript. JEM contributed to the development of all sections and provided critical review of the manuscript. KH contributed to the development of the graphical illustrations and corresponding sections, and provided critical review of the manuscript. DP provided guidance on the ANZICSAPD data and contributed to the development of the sample size examples. ABF conceived of the graphical illustrations, contributed to the development of all sections and provided critical review of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Authors’ information
Not applicable
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional files
Additional file 1:
Continuous outcomes sample size Stata do file. Stata do file to perform sample size calculations for continuous outcomes using formulae presented in the ‘Performing a sample size calculation’ section, for a given set of sample size parameters. (DO 1 kb)
Additional file 2:
Binary outcomes sample size Stata do file. Stata do file to perform sample size calculations for binary outcomes using formulae presented in the ‘Performing a sample size calculation’ section, for a given set of sample size parameters. (DO 2 kb)
Appendices
Appendix 1
Estimates of the WPC and BPC
To illustrate the impact of the WPC and BPC on the sample size calculation, we estimate the values of the WPC and BPC by using previously published methods for continuous and binary outcomes [5, 12].
Continuous outcomes
ICU LOS is rightskewed, so we begin by logtransforming this variable, so that the assumptions of the model used to estimate the correlations are more likely to be met. We use LOS to represent log(LOS) throughout. We estimate the values of the WPC and the BPC from the variances estimated by fitting the following model [5]: \( Y_{ijk}=\mu +{\pi}_j+{u}_i+{v}_{ij}+{e}_{ijk} \)
where there are \( i = 1, \dots, n \) ICUs, \( j = 1, 2 \) 12month periods and \( k = 1, \dots, m_{ij} \) patients in the \( i \)th ICU (cluster) and \( j \)th period; \( Y_{ijk} \) is the LOS for the \( k \)th patient in the \( j \)th clusterperiod in the \( i \)th ICU (cluster); \( \mu \) is the overall mean LOS; \( \pi_j \) is the fixed effect of period \( j \); \( u_i \sim N(0, \sigma_C^2) \) is the difference from the overall mean LOS for each ICU mean LOS; \( v_{ij} \sim N(0, \sigma_{CP}^2) \) is the difference from the ICU mean LOS for each clusterperiod mean LOS; and \( e_{ijk} \sim N(0, \sigma_I^2) \) is the difference from the clusterperiod mean LOS for each patient LOS. \( \sigma_C^2 \), \( \sigma_{CP}^2 \), and \( \sigma_I^2 \) are the variances for the ICU (cluster) mean LOS, clusterperiod mean LOS and patient LOS within each clusterperiod, respectively.
Because we are fitting the model to registry data, rather than clinical trial data of the actual treatments to be considered, we estimate the model parameters under the assumption of a null treatment effect, and hence have not included a fixed treatment effect. A fixed treatment effect should be included when estimating the variance components from data from the actual clinical trial.
The model was fitted in Stata 14 with the mixed command using restricted maximum likelihood estimation: mixed log(LOS) periodeffect || cluster: || cluster_period:, reml.
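Once the three variance components have been estimated, the two correlations follow directly. The sketch below assumes the standard definitions for this model, namely WPC = (σ²_C + σ²_CP) / (σ²_C + σ²_CP + σ²_I) and BPC = σ²_C / (σ²_C + σ²_CP + σ²_I). The function name and the variance values are hypothetical; the values are chosen only so that the total variance is 1.44 (a standard deviation of 1.2 loghours) and are not the fitted ANZICSAPD estimates.

```python
def correlations(var_cluster, var_cluster_period, var_individual):
    """Return (WPC, BPC) from the three variance components of the model."""
    total = var_cluster + var_cluster_period + var_individual
    wpc = (var_cluster + var_cluster_period) / total  # same cluster, same period
    bpc = var_cluster / total                         # same cluster, different period
    return wpc, bpc

# Hypothetical variance components on the log-hours scale (illustration only).
var_c, var_cp, var_i = 0.046, 0.009, 1.385

wpc, bpc = correlations(var_c, var_cp, var_i)
print(f"WPC = {wpc:.3f}, BPC = {bpc:.3f}")
```

Because the clusterperiod variance cannot be negative, the BPC computed this way always lies between zero and the WPC.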
Binary outcomes
We estimate the value of the WPC for withinICU mortality by fitting the analysis of variance (ANOVA) estimator for the intracluster correlation [12]:
where there are \( i = 1, \dots, n \) ICUs and \( j = 1, 2 \) 12month periods; \( m_{ij} \) is the number of patients in the \( i \)th ICU (cluster) and \( j \)th period; \( N_j \) is the total number of patients in each period and \( N \) is the total number of patients overall; \( {\widehat{P}}_{ij} \) is the estimated mortality rate in each clusterperiod; and \( {\widehat{P}}_j \) is the estimated mortality rate in period \( j \).
And by fitting the Pearson pairwise estimator for the BPC [12]:
where \( Y_{1i} \) and \( Y_{2i} \) are the number of deaths in two adjacent time periods in the \( i \)th ICU.
Appendix 2
Sample size calculations
In this section we provide the details of the sample size calculations presented in the ‘Performing a sample size calculation’ section, using the estimates for the WPC and BPC that we calculated from the ANZICSAPD data in Appendix 1.
Sample size calculation for ICU LOS
Total number of participants and ICUs required for the CRXO design
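Since we expect 200 patients in each ICU for each of the two 12month periods, the number of ICUs needed to achieve the required number of participants is: \( {n}_{CRXO}=\frac{N_{CRXO}}{2m}=\frac{10564}{2\times 200}\approx 26.4 \), rounded up to 27 ICUs.
If the BPC was η = 0.010 rather than η = 0.032, then \( N_{CRXO} \) = 30,433.
The total number of ICUs required to obtain the required number of participants is: \( {n}_{CRXO}=\frac{N_{CRXO}}{2m}=\frac{30433}{2\times 200}\approx 76.1 \), rounded up to 77 ICUs.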
Total number of participants and ICUs required for the CRCT design
Assuming that 200 patients are eligible in each ICU over the 12month trial period, we would need to include: \( {n}_{CRCT}=\frac{N_{CRCT}}{m}=\frac{39065}{200}\approx 195.3 \), rounded up to 196 ICUs.
Total number of participants and ICUs required for the IRCT design
For a trial run over 12 months, with 200 patients per ICU (100 patients per intervention), the total number of ICUs required is: \( {n}_{IRCT}=\frac{N_{IRCT}}{m}=\frac{4345}{200}\approx 21.7 \), rounded up to 22 ICUs.
Sample size calculation for inICU mortality
Total number of participants and ICUs required for the CRXO design
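The number of ICUs needed to achieve the required number of participants is: \( {n}_{CRXO}=\frac{N_{CRXO}}{2m}=\frac{51581}{2\times 1200}\approx 21.5 \), rounded up to 22 ICUs.
If the BPC was η = 0.006, rather than η = 0.007, then the total number of participants required is \( N_{CRXO} \) = 63,811.
We would need to include: \( {n}_{CRXO}=\frac{N_{CRXO}}{2m}=\frac{63811}{2\times 1200}\approx 26.6 \), rounded up to 27 ICUs.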
Total number of participants and ICUs required for the CRCT design
We would need \( {n}_{CRCT}=\frac{N_{CRCT}}{m}=\frac{134792}{1200}\approx 112.3 \), rounded up to 113 ICUs.
Total number of participants and ICUs required for the IRCT design
The total number of ICUs required is: \( {n}_{IRCT}=\frac{N_{IRCT}}{m}=\frac{10090}{1200}\approx 8.4 \), rounded up to 9 ICUs.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Arnup, S.J., McKenzie, J.E., Hemming, K. et al. Understanding the cluster randomised crossover design: a graphical illustration of the components of variation and a sample size tutorial. Trials 18, 381 (2017). https://doi.org/10.1186/s13063-017-2113-2
Keywords
 Cluster randomised
 Crossover
 Sample size
 Intracluster correlation
 Withinperiod correlation
 Betweenperiod correlation
 Components of variability