Treatment comparisons
Two distinct treatment comparisons are planned in this study. The first is to compare oxygen supplementation with no oxygen supplementation (with the former group comprising both patients who were allocated nocturnal oxygen only for three nights and patients allocated continuous oxygen for 72 hours). The second is to compare the two oxygen supplementation groups with each other (nocturnal oxygen only for three nights versus continuous oxygen for 72 hours).
In addition to the two main comparisons, in the event that, in relation to the primary outcome measure, (a) the overall comparison between both oxygen supplementation groups and the control group is non-significant, but (b) one oxygen supplementation group shows significantly greater benefit than the other, a comparison will be performed (on the primary outcome measure only) between the better of the two oxygen supplementation groups and the control group. This is to examine the possibility that, for example, nocturnal oxygen supplementation may have a beneficial effect that is offset by, and thus masked by, a non-beneficial or harmful effect of continuous oxygen supplementation.
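The trigger for this conditional comparison can be sketched as a simple decision rule. This is an illustration only: the function name, the use of P values as inputs, and the 0.05 significance threshold are assumptions, not part of the analysis plan.

```python
def needs_best_arm_comparison(p_oxygen_vs_control, p_nocturnal_vs_continuous, alpha=0.05):
    """Return True when the conditional comparison described above is triggered.

    alpha is an assumed significance threshold (hypothetical default).
    """
    # (a) the overall oxygen-versus-control comparison is non-significant
    overall_non_significant = p_oxygen_vs_control >= alpha
    # (b) the two oxygen supplementation groups differ significantly
    arms_differ = p_nocturnal_vs_continuous < alpha
    return overall_non_significant and arms_differ
```

When the rule returns True, the better of the two oxygen groups would be compared with control on the primary outcome measure only.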
The analyses described next will be repeated for each of the two main comparisons.
Primary outcome measure
The primary outcome measure is the modified Rankin scale (mRS) at 90 days after randomization[17]. The mRS is an ordinal scale ranging from 0 (no disability) to 5 (extreme disability). Patients who die prior to the 90-day time point will be considered to have an mRS score of 6, thus creating a 0 to 6 scale.
The mRS will be analyzed using an ordinal logistic regression model. Both an unadjusted (primary) and an adjusted (secondary) analysis will be performed, the latter using the covariates specified earlier (see General considerations section).
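For orientation, one common parameterization of an ordinal logistic (proportional odds) model is shown below. The symbols are assumptions introduced for illustration: Y_i is the mRS score (0 to 6) of patient i, T_i is the treatment indicator, x_i is the vector of covariates (omitted in the unadjusted analysis), and α_j are the cut-point intercepts.

```latex
\operatorname{logit} P(Y_i \le j) = \alpha_j - \beta\, T_i - \boldsymbol{\gamma}^{\top}\mathbf{x}_i,
\qquad j = 0, 1, \dots, 5
```

Under this sign convention, exp(β) is the odds ratio for a higher (worse) mRS score in the treatment group relative to the comparator, assumed to be the same at every cut-point j.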
Secondary outcome measures at 7 days
Secondary outcome measures at 7 days comprise: the NIHSS; the number of patients with neurological improvement (≥4 point decrease from baseline or a value of 0 in the NIHSS); mortality; highest oxygen saturation during the first 72 hours; lowest oxygen saturation during the first 72 hours; the number of patients whose oxygen saturation falls below 90%.
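The definition of neurological improvement given above can be expressed as a small function. This is a minimal sketch; the function and argument names are hypothetical.

```python
def neurological_improvement(nihss_baseline, nihss_day7):
    """True if the NIHSS fell by at least 4 points from baseline or is 0 at day 7."""
    decrease = nihss_baseline - nihss_day7
    return nihss_day7 == 0 or decrease >= 4
```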
Secondary outcome measures at 90 days
Secondary outcome measures at 90 days comprise: mortality; the number of patients alive and independent (mRS ≤2); the number of patients living at home; ability to perform activities of daily living (using the Barthel index of activities of daily living[18]); quality of life (using the EuroQol EQ-5D questionnaire[19]); extended activities of daily living (using the Nottingham Extended Activities of Daily Living (NEADL) index[20]).
Participants who die before the assessment point will not have data for the NIHSS, Barthel index, EuroQol EQ-5D, or NEADL index. This could bias results in favor of the treatment arm with higher mortality. Death will therefore be included in the analysis of the NIHSS, Barthel index, EuroQol EQ-5D, and NEADL index as the worst outcome on the scale[21].
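Scoring death as the worst outcome might be implemented along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, and the worst value must be supplied by the analyst for each scale, because it depends on the scale's direction and range (for example, 42 is the maximum, and therefore worst, NIHSS score).

```python
def score_with_death(observed_score, died, worst_score):
    """Return the score used in analysis, with death set to the worst scale value.

    worst_score is supplied by the caller for the scale concerned.
    """
    if died:
        return worst_score
    return observed_score

# Illustrative use for the NIHSS, where higher scores are worse (range 0 to 42).
nihss_for_analysis = score_with_death(observed_score=None, died=True, worst_score=42)
```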
For numerical outcomes, means and standard deviations or medians and interquartile ranges will be reported, as appropriate. Unadjusted analyses will use an independent-samples t test, with the mean difference between treatments and appropriate confidence interval reported. In the event of major deviations from the assumptions of the t test, an appropriate alternative analysis will be used. The adjusted analysis will use analysis of covariance methods, with the covariates specified earlier included in the analysis.
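The unadjusted comparison of a numerical outcome can be illustrated as follows. This is a rough sketch, not the trial's code: the function name is hypothetical, the pooled-variance form of the t statistic is assumed, and the confidence interval uses a normal quantile as a large-sample approximation to the t quantile (the Python standard library has no t distribution).

```python
import math
import statistics

def mean_difference(treatment, control, level=0.95):
    """Mean difference (treatment minus control) with pooled-variance t statistic."""
    n1, n2 = len(treatment), len(control)
    diff = statistics.mean(treatment) - statistics.mean(control)
    # Pooled variance assumes equal variances in the two groups.
    pooled_var = ((n1 - 1) * statistics.variance(treatment)
                  + (n2 - 1) * statistics.variance(control)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Large-sample approximation: normal quantile in place of the t quantile.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "difference": diff,
        "t": diff / se,
        "df": n1 + n2 - 2,
        "ci": (diff - z * se, diff + z * se),
    }
```

With small samples the interval should instead be based on the t distribution with the reported degrees of freedom.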
For dichotomous outcomes, percentages will be compared across the treatment comparisons using a χ-squared or Fisher's exact test, as appropriate, for the unadjusted analysis. The adjusted analysis of dichotomous outcomes will use binary logistic regression, using the covariates listed earlier. Odds ratios and confidence intervals will be reported. The number needed to treat will also be calculated[22].
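The unadjusted quantities for a dichotomous outcome can be sketched from a 2 × 2 table as below. The function name and the example counts used to exercise it are hypothetical; the sketch assumes a Wald interval on the log odds ratio, a Pearson χ-squared test without continuity correction, and no empty cells.

```python
import math

def two_by_two_summary(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Odds ratio with Wald CI, chi-squared test (1 df), and number needed to treat."""
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - z * se_log_or),
          math.exp(math.log(odds_ratio) + z * se_log_or))
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    risk_diff = a / n_trt - c / n_ctl
    nnt = math.inf if risk_diff == 0 else 1 / abs(risk_diff)
    return {"odds_ratio": odds_ratio, "ci": ci, "chi2": chi2,
            "p_value": p_value, "nnt": nnt}
```

The number needed to treat is taken here as the reciprocal of the absolute difference in proportions.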
For ordinal secondary outcomes, the analyses described for the mRS will be applied.
Data at 6 and 12 months
The longer-term follow-up data at 6 and 12 months will be analyzed at each time point using the same methods described earlier. In addition, analyses will be performed across 3-, 6- and 12-month time points using a longitudinal repeated measures analysis, such as linear mixed models[23].
The treatment effect will initially be assumed to be constant over time; further analyses may be carried out to investigate the effects of including time and a treatment × time interaction in the models.
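As an illustration, a linear mixed model with a random intercept per patient could take the form below. The notation is assumed for this sketch: y_{it} is the outcome of patient i at time point t, T_i is the treatment indicator, and u_i is the patient-level random effect. Time is written as a single term for brevity, although it could equally be treated as categorical.

```latex
y_{it} = \beta_0 + \beta_1 T_i + \beta_2\, t + \beta_3\,(T_i \times t) + u_i + \varepsilon_{it},
\qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{it} \sim N(0, \sigma^2)
```

The initial model with a constant treatment effect corresponds to omitting the β_2 and β_3 terms; the further analyses add time and then the treatment × time interaction.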
Mortality will be analyzed using log-rank methods (unadjusted analysis) with Kaplan-Meier plots presented. The adjusted analysis will use Cox regression methods, including the covariates listed previously. In the covariates, the prognostic index for 30-day survival will replace that for independence at 6 months. The proportional hazards assumption associated with the Cox regression model will be tested via Schoenfeld residuals (this assumption was found to be tenable in the analysis of the 6-month survival in the pilot study[24]). Hazard ratios and 99% confidence intervals will be reported for both unadjusted and adjusted analyses.
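The Kaplan-Meier estimate underlying the plots can be sketched in a few lines. This is a minimal illustration with a hypothetical function name and input format, not a substitute for a validated survival analysis package.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each patient; events: 1 = death, 0 = censored.
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Process all patients who share the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve
```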
Planned subgroup analyses
These will be performed in respect of the primary outcome measure only, based on a risk-stratification approach[25]. The subgroups comprise: NIHSS score at baseline as an indicator of stroke severity (0 to 4, 5 to 9, 10 to 14, 15 to 20, >20); baseline oxygen saturation in % (<92, 92 to 93.9, 94 to 94.9, 95 to 97, >97); treatment with O2 prior to randomization (yes or no); time in hours since onset of stroke (≤3, >3 to 6, >6 to 12, >12 to 24, >24); type of stroke (hemorrhage or infarct); Glasgow Coma Scale motor score plus eye score (<10, 10); age (<50, 50 to 80, >80); history of chronic obstructive airways disease or asthma (yes or no); history of heart failure (yes or no); thrombolysis (yes or no); baseline SSV risk score for independence at 6 months (≤0.1, >0.1 and ≤0.35, >0.35 and ≤0.7, >0.7).
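Assigning patients to the subgroup categories listed above amounts to simple banding. The sketch below shows two of the factors; the function names and label strings are hypothetical, and the cut-points are those given in the text.

```python
def nihss_band(score):
    """Baseline NIHSS subgroup: 0 to 4, 5 to 9, 10 to 14, 15 to 20, >20."""
    if score <= 4:
        return "0 to 4"
    if score <= 9:
        return "5 to 9"
    if score <= 14:
        return "10 to 14"
    if score <= 20:
        return "15 to 20"
    return ">20"

def onset_band(hours):
    """Time since stroke onset in hours: <=3, >3 to 6, >6 to 12, >12 to 24, >24."""
    if hours <= 3:
        return "<=3"
    if hours <= 6:
        return ">3 to 6"
    if hours <= 12:
        return ">6 to 12"
    if hours <= 24:
        return ">12 to 24"
    return ">24"
```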
These subgroup effects will be analyzed by means of an interaction term[26]; however, pairwise hypothesis tests between the levels of the subgroup factor will not be performed, owing to the probably low level of statistical power. Subgroup-specific estimates will be reported descriptively with 95% confidence intervals and displayed graphically in a forest plot, and will be interpreted with caution (especially in respect of any subgroups with low numbers).
Exploratory analyses
Exploratory analyses will be conducted using data collected at 7 days. These will include details of the stroke diagnosis and imaging (for example, imaging results, final diagnosis, stroke syndrome[27], and etiological classification[28]), indicators of compliance with the trial treatment, oxygen saturation during the intervention, and clinical data that might indicate stress induced by the intervention, for example, sedative use, the number of participants whose highest heart rate was >100 beats per minute, whose highest systolic blood pressure was >200 mmHg, or whose highest diastolic blood pressure was >100 mmHg during the intervention, or who developed infections (antibiotic use during the first 7 days).
In addition, we will report data on symptoms that were highlighted as important for their quality of life by stroke patients and their carers[29], but not sufficiently covered in the validated tools used for the assessment of the primary and secondary outcomes. These are: the number of patients who reported their memory, eyesight, and sleep as being ‘as good as before the stroke’, and the number who reported that they had no significant speech problems (no problems or some problems but not interfering significantly with conversation). These outcomes were adapted from the ‘simple questions’ described by Lindley et al.[30], but are not validated.
For these exploratory analyses, data will be tabulated across the treatment comparisons at each time point, but will not be subjected to formal statistical testing.
Sensitivity analyses
The following sensitivity analyses will be performed.
Two comparative analyses with the observed case analysis will be performed to allow for missing data. First, a multiple imputation method, using at least ten imputed datasets, will be used. These imputations will be based on specified baseline covariates (age, sex, treatment group, oxygen saturation, SSV risk score, NIHSS score) and values of the outcome variable concerned at other time points. If necessary, missing data on baseline covariates used in the multiple imputation algorithm will be estimated[31]. Second, two additional imputations will be conducted to allow for the possibility that data are missing not at random, and missing values are (i) better or (ii) worse than would otherwise be expected.
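Estimates from the imputed datasets are usually combined with Rubin's rules. The source does not name the combining method, so its use here is an assumption, as are the function name and the example inputs used to exercise it.

```python
import math
import statistics

def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances using Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)       # average within-imputation variance
    between = statistics.variance(estimates)  # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)
```

The between-imputation term is what makes the pooled standard error reflect the uncertainty due to the missing values.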
Two comparative analyses with the intention-to-treat analysis will be performed[32]. First, a per-protocol ‘adherers only’ analysis will be conducted, where only patients who complied with treatment are analyzed. Second, a per-protocol ‘as treated’ analysis (if feasible) will be carried out, where patients are classified with respect to the intervention that they ultimately received rather than the intervention to which they were randomized.
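The three analysis populations differ only in which patients are retained and which arm they are counted in, as the following sketch shows. The `Patient` fields, the arm labels, and the `adherent` flag are hypothetical; the source does not define how compliance is determined.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    patient_id: int
    allocated: str   # arm assigned at randomization
    received: str    # intervention ultimately received
    adherent: bool   # hypothetical flag: complied with allocated treatment

def group_for_analysis(patients, population):
    """Map each arm to the patient IDs analyzed in it for a given population."""
    groups = {}
    for p in patients:
        if population == "intention_to_treat":
            arm = p.allocated
        elif population == "adherers_only":
            if not p.adherent:
                continue  # non-adherent patients are excluded
            arm = p.allocated
        elif population == "as_treated":
            arm = p.received
        else:
            raise ValueError(f"unknown population: {population}")
        groups.setdefault(arm, []).append(p.patient_id)
    return groups

patients = [
    Patient(1, "continuous", "continuous", True),
    Patient(2, "nocturnal", "control", False),
    Patient(3, "control", "control", True),
]
```

In this toy data, patient 2 is analyzed in the nocturnal arm under intention to treat, is dropped from the adherers-only analysis, and is counted with the control arm in the as-treated analysis.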
Additionally, in the event that the proportional odds assumption for analysis of the main outcome does not hold, an appropriate alternative method will be investigated, such as a sliding dichotomy analysis[33].
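In a sliding dichotomy analysis, the threshold for a favorable outcome is set according to each patient's baseline prognosis rather than fixed for all patients. The sketch below illustrates the idea only: the prognosis bands and mRS cut-offs are hypothetical and are not taken from the analysis plan.

```python
# Hypothetical cut-offs: the highest mRS score counted as favorable in each band.
ILLUSTRATIVE_CUTOFFS = {"good": 1, "intermediate": 2, "poor": 3}

def favorable_outcome(prognosis_band, mrs_90d, cutoffs=ILLUSTRATIVE_CUTOFFS):
    """True if the 90-day mRS is at or below the cut-off for the patient's band."""
    return mrs_90d <= cutoffs[prognosis_band]
```

The resulting dichotomous outcome could then be analyzed with the methods described for dichotomous outcomes.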
Serious adverse events
The proportion of patients who experience at least one serious adverse event will be analyzed as a categorical variable (with patients classified as either having experienced at least one serious adverse event, or not) using a χ-squared or Fisher’s exact test (as appropriate). If appropriate, a more complex model of serious adverse event occurrences will be constructed utilizing adverse events as count variables. Possible models for this analysis include Poisson regression and negative binomial regression. An analysis of subgroups of serious adverse events will be performed separately, but in an identical manner to the overall adverse events analyses.
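The dichotomization and a simple check that may inform the choice between Poisson and negative binomial regression can be sketched as follows. The function names are hypothetical, and using the variance-to-mean ratio as a guide to overdispersion is an assumption of this sketch rather than a step stated in the plan.

```python
import statistics

def any_serious_adverse_event(counts):
    """Classify each patient as having at least one serious adverse event or not."""
    return [count >= 1 for count in counts]

def dispersion_index(counts):
    """Variance-to-mean ratio of per-patient event counts.

    Values well above 1 suggest overdispersion relative to a Poisson model,
    in which case a negative binomial model may fit better.
    """
    return statistics.variance(counts) / statistics.mean(counts)
```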