Wound Healing In Surgery for Trauma (WHIST): statistical analysis plan for a randomised controlled trial comparing standard wound management with negative pressure wound therapy

Background In the context of major trauma, the rate of wound infection in surgical incisions created during fracture fixation amongst patients with closed high-energy injuries is high. One of the factors which may reduce the risk of surgical site infection is the type of dressing applied over the closed incision. The WHIST trial evaluates the effects of negative-pressure wound therapy (NPWT) compared with standard dressings. Methods/design The WHIST trial is a multicentre, parallel group, randomised controlled trial. The primary outcome is the rate of deep surgical site infection at 30 days after major trauma. Secondary outcomes are measured at 3 and 6 months post-randomisation and include the Disability Rating Index, the EuroQoL EQ-5D-5 L, the Doleur Neuropathique Questionnaire, a patient-reported scar assessment, and record of complications. The analysis approaches for the primary and secondary outcomes are described here, as are the descriptive statistics which will be reported. The full WHIST protocol has already been published. Discussion This paper provides details of the planned statistical analyses for this trial and will reduce the risks of outcome reporting bias and data driven results. Trial registration International Standard Randomised Controlled Trials database, ISRCTN12702354. Registered on 9 December 2015.


Background
Major trauma is the leading cause of death in patients under 45 years and a significant cause of short-and long-term morbidity [1]. In the context of major trauma, the wounds associated with surgery to fractured limbs are notoriously difficult to manage. Even in closed high-energy injuries associated with major trauma, the rate of infection in surgical incisions created during fracture fixation remains high; tibial plateau fractures are associated with infection rates of up to 27% [2][3][4][5][6] while pilon fractures have an incidence of deep infection ranging from 5 to 40% [7][8][9][10]. If surgical site infection does occur, treatment frequently continues for years after the trauma with significant personal and societal costs [11].
One of the factors which may reduce the risk of surgical site infection in the surgical wounds of major trauma patients is the type of dressing applied over the closed incision at the completion of the operative procedure. Traditionally, the surgical incision is covered with an adhesive dressing or gauze maintained in place with a bandage to protect the wound from contamination from the outside environment. Negative-pressure wound therapy (NPWT) is an alternative form of dressing which may be applied to closed surgical incisions. In this treatment, an open-cell, solid foam overlies the incision and is covered with a semipermeable membrane. A sealed tube is used to connect the foam to a pump which creates a partial vacuum over the wound.
There has only been one randomised trial comparing standard wound dressing with NPWT for patients with closed surgical wounds following major trauma to the limbs [12]. This trial demonstrated a reduction in the rate of late/deep wound infections in patients treated with NPWT (9%) versus those treated with standard dressings (15%); however, the reduction was of borderline statistical significance (p = 0.049), and the study has since been criticised for methodological flaws [13]. In addition, a recent Cochrane review concluded that further trials regarding the effects of NPWT are required [13].
The WHIST trial is a large-scale, multicentre, parallel group, randomised controlled trial designed to compare the rates of deep infection in patients with major trauma requiring surgical incisions for the treatment of lower limb fractures treated with NPWT compared to those treated with standard wound dressings. The protocol paper for the WHIST trial has been published previously [14]; the aim of this paper is to report in detail the analysis plan as agreed by the trial steering committee in March 2018. This paper has been prepared according to the published guidelines on the content of statistical analysis plans [15].

Methods and design
Trial design WHIST is a multicentre, two-arm, parallel-group, superiority randomised controlled trial designed to compare the rates of 'deep infection' in patients allocated to standard wound dressing versus those allocated to NPWT. Eligible patients are randomised on a 1:1 basis using minimisation to balance the two treatment groups by trial centre, open or closed fracture at presentation, and Injury Severity Score (ISS) ≤ 15 versus ISS ≥ 16. Neither participants nor their treating surgeons are blind to treatment allocation since wound dressings are clearly visible. The primary outcome is assessed at 30 days after randomisation with secondary outcomes assessed at baseline, 30 days, and 3 and 6 months after randomisation. Full details of the trial design, study population, and study procedures have been published previously [14].
The trial is registered with the International Standard Randomised Controlled Trials database, ISRCTN reference number ISRCTN12702354.

Objectives
The primary objective of this trial is to quantify and draw inferences on differences in the rate of deep infection of the lower limb in the 30 days after major trauma between participants receiving standard dressings and those receiving NPWT. Secondary objectives include assessing differences between the same groups in disability, wound healing, quality of life, neuropathic pain, and the number and nature of complications experienced at 3 and 6 months post-randomisation.

Primary outcome
The primary outcome for this study is the rate of deep infection; the Centre for Disease Control and Prevention (CDC) definition of a "deep surgical site infection"-a wound infection involving tissues deep to the skin that occurs within 30 days of injury [16]-is used. Since the trial began, the CDC definition of a deep surgical site infection has been modified to include wound infections occurring up to 90 days after injury if metal implants are used in the fracture fixation [17]. Although not all wounds associated with lower limb fracture surgery contain implants, this is the most common method of fixation and therefore we will incorporate this alternative definition of the primary outcome in a supplementary analysis to ensure the study can be utilised in future systematic reviews and meta-analyses.

Secondary outcomes
The secondary outcome measures, recorded at 3 and 6 months post-injury unless otherwise stated, are as follows: Disability Rating Index (DRI) [18]: a selfadministered, 12-item visual analogue scale (VAS). Each item is scored from 0 (carry out task without difficulty) to 100 (not at all). Total scores are calculated as an average across all 12 items with higher scores indicating greater disability. Euroqol EQ-5D-5 L [19,20]: a self-reported outcome measure consisting of five dimensions each with five possible responses which are converted to multiattribute utility scores where 1 represents perfect health, 0 represents death, and scores less than 0 are possible. The EQ-5D-5 L also includes a 0-100 VAS recording overall health status with higher scores representing better health. Doleur Neuropathique Questionnaire (DN4) [21]: seven yes/no questions with total scores being the number of questions answered yes. Scores of 3 or greater are considered indicative of neuropathic pain. Patient-reported scar assessment using the patient scale from the Patient and Observer Scar Assessment Scale (POSAS) [22]: six questions each scored out of 10. The answers are summed to give an overall score out of 60 with higher scores indicating better healing. This scale is also measured at 30 days post-injury. Complications grouped into three categories: (i) local complications related to the injury or operation-this will include wound healing assessment at 30 days using photographs, signs of infection up to 6 months, and other local complications; (ii) systemic complications related to the injury or operation-this will include other related SAEs; and (iii) unrelated SAEs.

Sample size
Only one previous randomised trial has compared NPWT to standard dressings for surgical incisions associated with major trauma to the lower limb [12]. This trial indicated that the rate of 'late' (deep) infection was reduced by 6%; from 15% in the standard treatment group to 9% in the NPWT group.
In the absence of a minimum clinically important difference for deep wound infection, we surveyed surgeons in the UK Orthopaedic Trauma Society who perform surgery for major trauma to the limbs (unpublished data, 2015). This survey showed that a 6% reduction in the rate of 'deep infection' would, universally, be sufficient to change clinical practice with regard to the choice of dressing. Therefore, assuming a reduction in the proportion of patients having a deep infection from 15% to 9%, 615 patients would be required in each group to provide 90% power at the 5% level (two-sided) when comparing two independent proportions. Previous experience in clinical trials of lower limb fracture surgery for major trauma indicates that up to 20% of the primary outcome data may be lost during the follow-up period due to death and loss to follow-up [23]; therefore, we propose to recruit 1540 patients in total for this trial (770 per arm).

Statistical analysis General analysis principles
Two analysis populations will be considered, the intent-to-treat (ITT) population and the per-protocol (PP) population. The ITT population will include all participants randomised with the exception of those who: (i) prospectively declined consent but were subsequently randomised in error; (ii) retrospectively declined consent and requested that all their data were removed; or (iii) withdrew and requested that all their data were removed. Participants will be analysed according to the group to which they were randomised. The PP population will be analysed according to the treatment they actually received. Participants with major protocol deviations or violations will be excluded from the PP population. Major protocol deviations are those who did not satisfy the eligibility criteria (for example their wound could not be closed primarily), those who did not receive the allocated treatment, and those for whom insufficient data are available on the primary outcome. The definition of the PP population will be finalised during a blinded analysis of the data prior to the primary analysis time point.
A significance level of 0.05 will be used throughout, and 95% confidence intervals will be reported. The primary conclusion of the trial will be based on the results from the primary analysis of the primary outcome. Sensitivity analyses of the primary outcome will be performed to assess whether these results are robust. All analyses of secondary outcomes will be considered as supporting the primary analysis, and conclusions of the trial will not be based on these outcomes.
All analyses will be carried out using appropriate, validated statistical software such as STATA [24] or R [25]. The relevant package and version number used for the analysis will be recorded and reported.

Descriptive analyses
The flow of participants through each stage of the trial, including the number of eligible individuals screened, randomised to each arm, receiving allocated treatment, and included in the primary analysis will be summarised using a CONSORT flow chart (Fig. 1). Reasons for ineligibility, loss to follow-up, and exclusion from the primary analysis will be summarised, as will the number of patients declining consent both prospectively and retrospectively.
The baseline comparability of the two randomised groups in terms of (i) minimisation factors, (ii) baseline characteristics (Table 1), (iii) operative procedure details (Table 2), and (iv) secondary outcomes at baseline will be presented. Numbers with percentages will be used to compare binary and categorical variables, and either means and standard deviations or medians and interquartile ranges will be used for continuous variables. There will be no tests of statistical significance nor confidence intervals for differences between the randomised groups.
Loss to follow-up, withdrawals, and missing data The numbers (and percentages) of losses to follow-up and withdrawals along with reasons for these will be reported by intervention arm at each time point. To ensure that differential losses do not occur between the two groups this will be tested using absolute risk differences (with 95% confidence intervals) and a chi-squared test. Any deaths and their causes will be reported separately.
The patterns of availability of data for primary and key secondary outcomes, from baseline to end of follow-up, will be summarised for the two treatment groups (as number and percentage of individuals missing). Reasons for missing-ness will also be presented, if known. Where appropriate, differentiation will be made between partially completed and fully missing outcome data. Two analysis datasets will be considered: (i) the available case dataset, consisting of all observed data; and (ii) the imputed dataset, where missing data are imputed. Missing data on the primary outcome will be imputed using a best case worst case analysis, considering the situation where all participants in the Standard dressing group with missing data are assumed to have a deep infection and all participants in the NPWT group with missing data are assumed not to have a deep infection and vice versa. Missing data on continuous outcomes will be imputed using multiple imputation (MI) under the missing at random (MAR) assumption. The suitability of the MAR assumption will be considered.

Compliance
The randomised intervention in this trial is the dressing (standard or NPWT) applied to the closed fracture wound at the end of surgery. This intervention occurs at a single time point, and compliance is therefore defined as the proportion of participants in each arm receiving the treatment to which they were randomised. The number (and percentage) of participants receiving the assigned dressing and receiving another dressing or no dressing in each arm will be summarised as well as the reasons for not receiving the randomised treatment. Details of what was provided instead will also be summarised.

Analysis of primary outcome
The numbers and percentages of 'deep infections' occurring up to 30 days post-randomisation in the two study intervention groups, NPWT and standard dressing, will be calculated. The rates of deep infection in the two study groups will be compared using a mixed effects logistic regression model. The model will include a random effect to account for any heterogeneity in the response due to recruitment centre. Fixed effects to adjust for open versus closed fractures, ISS level (≤ 15 vs ≥ 16), participant age, and participant gender will also be included. Participant age and gender are included in the model since there is evidence that older men have worse outcomes after major trauma. The result will be reported as an odds ratio (OR) with associated 95% confidence interval and p value for comparison between the two treatment groups. The unadjusted OR and associated 95% confidence interval will also be reported. This analysis will be conducted for the ITT population using the available case dataset. As sensitivity analyses, the analysis will be repeated for: (i) the ITT population using an imputed dataset (best case worst case analysis); and (ii) the PP population using the available case dataset. In addition, a sensitivity analysis taking account of the competing risk of death [26] will be conducted if a sufficient number of deaths have occurred prior to 30 days, that is if more than 5% of participants have died prior to this time point.  If a significant treatment effect of NPWT is identified in the primary analysis, an exploratory subgroup analysis will be conducted to investigate whether this effect is moderated by the underlying risk level of the wound. This will be done by repeating the primary analysis and including wound location (above or below the knee) as a covariate. Wound location will be used as a proxy for wound risk level due to differences in soft-tissue cover; there being less soft-tissue cover below the knee.
The main analysis of the primary outcome (ITT population using the available case dataset) will be repeated using the alternative definition of 'deep infection' , including infections occurring up to 90 days after injury. If any of the sensitivity analyses conducted for the primary endpoint (30 days) demonstrated substantially different results from the primary analysis, these analyses will be repeated for the rates of deep infection up to 90 days.

Analysis of secondary outcomes
For each of the continuous secondary outcomes (DRI, EQ-5D-5 L, and POSAS) the mean and standard deviation for each intervention arm will be reported. Assuming approximate normality is established, multilevel mixed-effects linear regression models, using repeated measures (level 1) nested within participants (level 2), will be used. The suitability of the assumption of approximate normality will be explored by plotting the residuals from this model. The model will include a random effect to account for any heterogeneity in response due to recruitment centre (level 3). The model will also include fixed effects to adjust for open versus closed fractures, ISS level (≤ 15 vs ≥ 16), participant age and participant gender, and, where appropriate, pre-injury values (DRI and EQ-5D-5 L). Trends over time in each intervention arm will be examined by plotting these, and, if trends differ between arms, interactions between treatment and time will be included in the model. The adjusted difference between the treatment arms at each time point will be reported. This analysis will be conducted for the ITT population using the available case dataset. The analysis of the DRI will be repeated using an imputed dataset. Data will be imputed using MI under the MAR assumption.
If, for any of these variables, approximate normality is not appropriate, the first approach will be to consider a transformation of the data or the use of a different metric such as change from baseline to attain normality. If normality cannot be achieved by transformation, the data will be analysed using a non-parametric equivalent (Mann-Whitney U-test) with no adjustment and medians and interquartile ranges will be reported for each treatment arm.
In addition, supplementary analyses of the DRI and EQ-5D utility variables will be conducted using area under the curve (AUC) summary statistics [27]. For each intervention arm, a linear combination of the parameter estimates at each time point (baseline to 6 months) from the mixed effects models will be used to calculate the AUC, thus providing an overall estimate of recovery over time for each intervention arm. This analysis will be conducted for the ITT population using the available cases dataset. The difference between the two groups will be calculated and compared using a t-test.
The DN4 will be analysed using similar methods to those outlined for the primary outcome. The number and proportion of individuals deemed to have neuropathic pain (DN4 ≥ 3) will be reported for each treatment arm. A multi-level mixed-effects logistic regression model with repeated measures (level 1) nested within participants (level 2) will be used. The model will be adjusted for recruitment centre as a random effect (level 3), and fixed effects will be included to adjust for open versus closed fractures, ISS level (≤ 15 vs ≥ 16), participant age, and participant gender. Trends over time will be examined, and, if appropriate, interactions between treatment and time will be included. Results will be presented as ORs with associated confidence intervals. The unadjusted OR and associated 95% confidence interval will also be reported. This analysis will be conducted for the ITT population using the available case dataset.
Similar methods will also be used to analyse complications. The number and percentage of people experiencing each complication in each treatment arm will be reported. If there are sufficient numbers of events, a mixed-effects logistic regression model will be used to compare the rates of complications between intervention arms. The model will include a random effect for recruitment centre and fixed effects for open versus closed fractures, ISS level (≤ 15 vs ≥ 16), participant age, and participant gender. If there are not sufficient numbers of events to fit an adjusted model, unadjusted differences between intervention arms will be calculated using a chi-squared test. This analysis will be conducted for the ITT population using the available case dataset. Temporal patterns of complications will be presented graphically and, if appropriate, a time-to-event analysis (Kaplan-Meier survival analysis) will be used to assess the overall risk and risk within individual classes of complications.

Discussion
The WHIST trial will initially provide data regarding the effects of NPWT on the outcomes of participants up to 6 months after surgery for major trauma, compared to those receiving standard wound dressings. This paper provides details of the planned statistical analyses for this trial and will help reduce the risks of outcome reporting bias and data-driven results [28].
The participants enrolled in the trial will subsequently be followed up annually for 5 years to assess their long-term outcomes, but the analysis of these data will be reported separately.

Trial status
Recruitment for the trial closed on 17 April 2018. In total 1548 patients from 24 study sites were recruited. Follow-up is currently ongoing and expected to finish in October 2018; the analysis of outcomes up to 6 months after randomisation will be conducted thereafter. The trial also includes long-term follow-up from 1 to 5 years and the analysis of these data will be reported separately.