Noninferiority test for a continuous variable with a flexible margin in an active controlled trial: an application to the “Stratall ANRS 12110 / ESTHER” trial
Trials volume 23, Article number: 202 (2022)
Abstract
Background
Noninferiority trials are becoming increasingly popular in public health and clinical research. The choice of the noninferiority margin is the cornerstone of such trials. Most of the time, the noninferiority margin is fixed and constant, determined from historical trials as a fraction of the effect of the reference intervention. In some circumstances, however, there may be uncertainty around the reference treatment that one would like to account for when performing the hypothesis test. In this case, the noninferiority margin is not fixed in advance and depends on the estimate for the reference intervention. The uncertainty surrounding the noninferiority margin should therefore be accounted for in statistical tests. In this work, we explore how to perform the noninferiority test for a continuous variable with a flexible margin.
Methods
In this study, we propose two procedures for the noninferiority test with a flexible margin for continuous endpoints, based on a test statistic and on confidence intervals, respectively. Simulations were used to assess the performance and properties of the proposed test procedures. An application to real-world clinical data was carried out to assess the efficacy of clinical monitoring alone versus laboratory and clinical monitoring in HIV-infected adult patients.
Results
For both proposed methods, the type I error estimate did not depend on the value of the reference treatment. In the test statistic approach, the type I error rate estimate was approximately equal to the nominal value. The confidence interval level was found to determine, approximately, the level of significance. For a given nominal type I error α, the appropriate one- and two-sided confidence intervals should have levels 1−α and 1−2α, respectively.
Conclusions
Based on the type I error rate and power estimates, the proposed noninferiority hypothesis test procedures performed well and are applicable in practice.
Trial registration
ClinicalTrials.gov NCT00301561. Registered on March 13, 2006, url: https://clinicaltrials.gov/ct2/show/NCT00301561.
Background
After developing a new health intervention (treatment or diagnostic test), the next step is to assess its effectiveness relative to the existing reference intervention. There are several strategies to do this, such as superiority trials, which test whether the new treatment is superior to another (placebo, reference, or active control treatment). However, when the active control intervention achieves maximum efficacy or the use of a placebo is unethical, it becomes difficult to statistically show the superiority of the new health intervention. Studies aimed at showing that a new intervention is not worse than the active control intervention by more than a prespecified amount of efficacy have become increasingly common in the past decade [1]. The expression “not worse than the active control intervention by more than a prespecified amount” means it is acceptable to lose a “little bit” of the main effect of the active control intervention in exchange for the new intervention’s benefits (e.g., fewer side effects, lower cost, better tolerability, greater safety). This acceptable loss of efficacy is expressed numerically as the noninferiority margin. A trial showing that the new intervention is noninferior to the active control intervention is called a noninferiority trial [1].
The Food and Drug Administration (FDA) [2] provided general principles for an appropriate choice of the noninferiority margin. The noninferiority margin is placed at the upper limit of the confidence interval, so the trial is designed to show evidence of no more than this “loss of maximum efficacy.” Generally, this margin is fixed, determined from historical trials as a fraction of the treatment effect. However, in some cases, the mean estimate of the reference treatment may be subject to such variability that adopting a fixed margin is no longer appropriate. A fixed margin cannot take into account the variability surrounding the reference treatment estimate; in this case, the margin should be a function of the reference treatment. For binary endpoints, tests that account for nonfixed margins have been studied [3–5]. Most work on the noninferiority test for continuous endpoints with a fixed or linear margin has focused on the confidence interval approach [6–8], which mainly consists of comparing the bounds of the treatment difference to the fixed margin. However, few studies have addressed a nonfixed or variable margin for continuous endpoints. This work aims at deriving noninferiority tests for continuous endpoints with a flexible margin in active randomized controlled trials. An application of the proposed methods is done on the Stratall ANRS 12110/ESTHER trial.
Methods
Notations
The basic notation used is defined as follows.

X_{R} and X_{N} are the random variables for the continuous primary endpoint in the active control group (reference) and the new intervention group (new group), respectively.

n_{R} and n_{N} are the sample sizes for the active control group and new group, respectively.

μ_{R} and μ_{N} are the means of the continuous primary endpoint for the active control group and new group, respectively.

\({\sigma }^{2}_{R}\) and \({\sigma }^{2}_{N}\) are the variances of the continuous primary endpoint for the active control group and new group, respectively.

Δ_{L}(μ_{R}) is the noninferiority margin, and Δ=μ_{N}−μ_{R} is the difference of true means.

H_{0} and H_{1} are the null and alternative hypotheses, respectively.
Approach using a test statistic
Without loss of generality, assume that an increase in the endpoint corresponds to greater efficacy. The noninferiority hypotheses can be formulated as follows:
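Consistent with the boundary condition μ_{N}=μ_{R}−Δ_{L}(μ_{R}) and the rejection region μ_{N}−μ_{R}+Δ_{L}(μ_{R})>0 used throughout the paper, the hypotheses referenced as Eq. (1) can be reconstructed as:

```latex
H_0:\; \mu_N - \mu_R \le -\Delta_L(\mu_R)
\quad \text{versus} \quad
H_1:\; \mu_N - \mu_R > -\Delta_L(\mu_R)
\tag{1}
```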
The formulation of the hypothesis test in Eq. (1) shows that noninferiority means that the new intervention is not worse than the active control intervention by more than a margin Δ_{L}. When Δ_{L} is fixed, testing the hypotheses (1) can be viewed as a classical composite hypothesis test for a mean difference [9]; therefore, based on the central limit theorem applied on the boundary of the null hypothesis, the asymptotic test Z_{fixed} can be obtained by:
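The displayed statistic, reconstructed here as the standard two-sample z statistic shifted by the fixed margin, is:

```latex
Z_{\text{fixed}} = \frac{\bar{X}_N - \bar{X}_R + \Delta_L}
{\sqrt{\frac{\sigma^2_N}{n_N} + \frac{\sigma^2_R}{n_R}}}
```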
In effect, when Δ_{L} is fixed, we have:
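That is, a fixed (constant) Δ_{L} contributes no variance, so:

```latex
\operatorname{Var}\{\bar{X}_N - \bar{X}_R + \Delta_L\}
= \operatorname{Var}(\bar{X}_N) + \operatorname{Var}(\bar{X}_R)
= \frac{\sigma^2_N}{n_N} + \frac{\sigma^2_R}{n_R}
```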
The null hypothesis is rejected if Z_{fixed}>Z_{1−α}, where Z_{1−α} is the (1−α) percentile of the standard normal distribution. From the Karlin-Rubin theorem, this test is the uniformly most powerful test of level α [10].
If Δ_{L} is not fixed, i.e., if Δ_{L} is a function of μ_{R}, then \(\text {Var}\{\bar {X}_{N}-\bar {X}_{R}+\Delta _{L}(\bar {X}_{R})\}\neq \text {Var}(\bar {X}_{N})+\text {Var}(\bar {X}_{R})\), and therefore \(\text {Var}(\bar {X}_{N})+\text {Var}(\bar {X}_{R})\) is not a valid variance of \(\bar {X}_{N}-\bar {X}_{R}+\Delta _{L}(\bar {X}_{R})\). Under the assumption that Δ_{L} is a continuously differentiable function, variance estimation was performed using the delta method discussed below.
Variance estimation using delta method
If Δ_{L}(.) is continuously differentiable such that Δ′_{L}(μ_{R})≠0 (Δ′_{L} being the first derivative of Δ_{L}), then a first-order Taylor expansion can be used in a neighborhood of μ_{R}:
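The first-order expansion around μ_{R}, reconstructed from the surrounding derivation, reads:

```latex
\Delta_L(\bar{X}_R) \approx \Delta_L(\mu_R) + \Delta'_L(\mu_R)\,(\bar{X}_R - \mu_R)
```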
Hence,
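Substituting the expansion into the statistic, the quantity of interest is, up to a constant, linear in the two sample means (a reconstruction of the missing display):

```latex
\bar{X}_N - \bar{X}_R + \Delta_L(\bar{X}_R)
\approx \bar{X}_N + \left(\Delta'_L(\mu_R) - 1\right)\bar{X}_R
+ \Delta_L(\mu_R) - \Delta'_L(\mu_R)\,\mu_R
```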
Thus, the variance estimate is:
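The delta-method variance, reconstructed so as to match the standard error appearing in the power expression later in this section, is:

```latex
\operatorname{Var}\{\bar{X}_N - \bar{X}_R + \Delta_L(\bar{X}_R)\}
\approx \frac{\sigma^2_N}{n_N}
+ \frac{\left(\Delta'_L(\mu_R) - 1\right)^2 \sigma^2_R}{n_R}
```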
The test statistic can then be expressed as:
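This is the statistic referenced later as Eq. (6); reconstructed with the delta-method variance in its denominator, it reads:

```latex
Z_{\text{flexible}} = \frac{\bar{X}_N - \bar{X}_R + \Delta_L(\bar{X}_R)}
{\sqrt{\frac{\sigma^2_N}{n_N} + \frac{\left(\Delta'_L(\mu_R) - 1\right)^2 \sigma^2_R}{n_R}}}
\tag{6}
```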
Asymptotic properties of the test statistic Z _{flexible}
From the central limit theorem, when n_{N} and n_{R} approach infinity, the random variable Z_{flexible} converges in distribution to \(\mathcal{N}(0,1)\) on the boundary of the null hypothesis.
μ_{R} is unknown, and \(\sigma ^{2}_{R}\) and \(\sigma ^{2}_{N}\) may also be unknown; these parameters need to be estimated. We used maximum likelihood estimation on the boundary of the null hypothesis (μ_{N}=μ_{R}−Δ_{L}(μ_{R})). The unknown parameters are estimated considering the cases where the variances \(\sigma ^{2}_{R}\) and \(\sigma ^{2}_{N}\) are known, unknown, equal, or unequal.
The maximum likelihood (ML) estimators \(\hat {\mu _{R}}, \hat {\sigma _{R}}^{2}\), and \(\hat {\sigma _{N}}^{2}\) of \(\mu _{R}, \sigma ^{2}_{R}\), and \(\sigma ^{2}_{N}\), respectively, are consistent. Moreover, since Δ′_{L} is assumed continuous, \(\Delta '_{L}(\hat {\mu _{R}})\) is a consistent estimator of Δ′_{L}(μ_{R}). The estimator \(\hat {Z}_{\text {flexible}}\) of the test statistic Z_{flexible} can be obtained by replacing the unknown parameters in (6) by their ML estimators. Therefore, the test of H0′ versus H_{1} (where H0′ is the boundary of H_{0}, i.e., μ_{N}=μ_{R}−Δ_{L}(μ_{R})) rejects if \(\hat {Z}_{\text {flexible}}>z_{1-\alpha }\), where α is the nominal type I error and z_{1−α} denotes the 1−α percentile of the standard normal distribution. The significance level of this test tends to α when n_{N} and n_{R} approach infinity.
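To make the procedure concrete, here is a minimal sketch of the plug-in test in Python (the paper's own simulations were done in R). It is illustrative only: `margin` is the flexible margin \(\Delta_L(\mu)=\mu^{1/4}\) used later in the simulation study, the estimates are simple plug-in means and population variances rather than the constrained boundary ML described above, and all function names are ours.

```python
import random
from statistics import NormalDist, fmean, pvariance

def margin(mu):
    """Flexible margin Delta_L(mu) = mu**(1/4), as in the simulation study."""
    return mu ** 0.25

def margin_deriv(mu):
    """First derivative Delta_L'(mu) = (1/4) * mu**(-3/4)."""
    return 0.25 * mu ** (-0.75)

def z_flexible(x_new, x_ref, alpha=0.05):
    """Plug-in flexible-margin statistic; reject H0 when z > z_{1-alpha}."""
    n_n, n_r = len(x_new), len(x_ref)
    mu_r_hat = fmean(x_ref)
    numerator = fmean(x_new) - mu_r_hat + margin(mu_r_hat)
    # Delta-method standard error of mean(X_N) - mean(X_R) + Delta_L(mean(X_R))
    se = (pvariance(x_new) / n_n
          + (margin_deriv(mu_r_hat) - 1.0) ** 2 * pvariance(x_ref) / n_r) ** 0.5
    z = numerator / se
    p_value = 1.0 - NormalDist().cdf(z)          # one-sided p-value
    return z, p_value, z > NormalDist().inv_cdf(1.0 - alpha)
```

At the equivalence point (μ_{N}=μ_{R}), the margin term alone drives the numerator, so rejection becomes near-certain as the samples grow.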
Assume that, under the alternative hypothesis H_{1}, μ_{N}−μ_{R}+Δ_{L}(μ_{R})=v with v>0. Hence, if η is the power of the test, it follows that:
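Writing SE for the delta-method standard error, the missing display can be reconstructed as:

```latex
\eta = P\!\left(\frac{\bar{X}_N - \bar{X}_R + \Delta_L(\bar{X}_R)}{\mathrm{SE}} > z_{1-\alpha}\right)
= 1 - \Phi\!\left(z_{1-\alpha} - \frac{v}{\mathrm{SE}}\right),
\qquad
\mathrm{SE} = \sqrt{\frac{\sigma^2_N}{n_N} + \frac{\left(\Delta'_L(\mu_R) - 1\right)^2\sigma^2_R}{n_R}}
```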
where, under the alternative hypothesis, \(\frac {\bar {X}_{N}-\bar {X}_{R}+\Delta _{L}(\bar {X}_{R})-v}{\sqrt {\frac {\sigma ^{2}_{N}} {n_{N}}+\frac {(\Delta '_{L}(\mu _{R})-1)^{2}\sigma ^{2}_{R}}{n_{R}}}} \sim N(0,1)\). Assuming equal variances in both groups (\(\sigma ^{2} = \sigma ^{2}_{R} =\sigma ^{2}_{N}\)) and writing δ=v/σ, the power, as a function of δ, n_{N}, n_{R}, and α, is:
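Under the equal-variance assumption, the display referenced as Eq. (8) can be reconstructed as:

```latex
\eta = \Phi\!\left(\frac{\delta}{\sqrt{\frac{1}{n_N} + \frac{\left(\Delta'_L(\mu_R) - 1\right)^2}{n_R}}} - z_{1-\alpha}\right)
\tag{8}
```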
where Φ is the cumulative distribution function of the standard normal distribution. For a fixed nominal type I error α, and for any fixed μ_{R} and μ_{N} such that v=μ_{N}−μ_{R}+Δ_{L}(μ_{R})>0, when n_{R}→∞ and n_{N}→∞, it follows that η→1; the test based on Z_{flexible} is therefore consistent. From Eq. (8), it is possible to find the sample size that achieves a fixed nominal power. Denoting the nominal type II error by β and assuming that n_{N}=rn_{R} with r>0, the sample size which allows the nominal power (1−β) is such that:
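Solving Eq. (8) for η=1−β with n_{N}=rn_{R} gives the following reconstruction (it reduces to the familiar fixed-margin formula when Δ′_{L}≡0):

```latex
n_R \ge \frac{\left(z_{1-\alpha} + z_{1-\beta}\right)^2
\left(\frac{1}{r} + \left(\Delta'_L(\mu_R) - 1\right)^2\right)}{\delta^2},
\qquad n_N = r\, n_R
```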
This formula is equivalent to the one found in [9] when the margin is fixed. In practice, δ plays the role of the standardized difference in a comparison of means; in this work, it is called the standardized noninferiority difference. In power and sample size calculations, one fixes δ (for example, δ=0.05 or δ=0.5 to detect small or large inferiority differences, respectively), and μ_{R} can be prespecified from historical studies with a similar treatment.
The proposed test statistic \(\hat {Z}_{\text {flexible}}\) is asymptotic; it works well for large sample sizes but is not adapted to datasets with small sample sizes, which are not uncommon in practical situations. In such cases, a nonparametric test based on the percentile bootstrap confidence interval, which does not require any assumption on the sample size or sample distribution, can be used [11].
Approach based on confidence intervals
For any test based on confidence intervals, the main question is the confidence level required to achieve a desired nominal type I error. Moreover, as discussed in [9] and [12], the type I error is a controversial issue in clinical trial tests. In the framework of noninferiority tests with a fixed margin, [13] recommended using 1−α and \(1-\frac {\alpha }{2}\) for two-sided and one-sided confidence interval levels, respectively, while [7] recommended 1−2α for two-sided and 1−α for one-sided confidence intervals. In [7], it is argued that the recommendation of [13] would lead to a conservative test, as the estimated type I error rate would be half the nominal one, and that there would be an approximately 10% loss of power. In this section, we propose a nonparametric procedure for constructing the confidence interval (one-sided and two-sided) when the noninferiority margin is flexible.
An intuitive procedure based on confidence intervals for the hypothesis test in Eq. (1) would be to check whether the confidence intervals of μ_{N}−μ_{R} and −Δ_{L}(μ_{R}) overlap: the null hypothesis would be rejected if the two confidence intervals do not overlap and not rejected otherwise. However, as illustrated in [14], the intervals may overlap even when the statistics are significantly different, so such a test has lower power. The proposed procedure instead compares the lower bound of the confidence interval (one- or two-sided, respectively) of level γ for μ_{N}−μ_{R}+Δ_{L}(μ_{R}) with 0. The null hypothesis H_{0} is rejected if the lower bound of the confidence interval for μ_{N}−μ_{R}+Δ_{L}(μ_{R}) is greater than 0.
Estimation of the type I error is performed using simulations and nonparametric estimation of confidence intervals on the boundary of the null hypothesis. The detailed steps are described below.
1. From a fixed μ_{R}, compute μ_{N}=μ_{R}−Δ_{L}(μ_{R}) (satisfying the null hypothesis H_{0}). We assume that the standard deviations σ_{N} and σ_{R} are known.
2. Let m denote the number of desired simulations. For i∈{1,⋯,m}, simulate m pairs of samples X_{N} and X_{R} of sizes n_{N} and n_{R} from the normal distributions \(\mathcal {N}(\mu _{N}, \sigma _{N})\) and \(\mathcal {N}(\mu _{R}, \sigma _{R})\), respectively.
3. Using the bootstrap, compute the empirical percentile confidence interval [a_{i},∞) (one-sided) or [a_{i},b_{i}] (two-sided) of level γ for μ_{N}−μ_{R}+Δ_{L}(μ_{R}), for i∈{1,⋯,m}.
4. For i∈{1,⋯,m}, H_{0} is rejected when a_{i}>0; thus, the level of significance is estimated by \(\alpha (\gamma)=\frac {1}{m}\sum ^{m}_{i=1}1_{a_{i}>0}\).
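The four steps above can be sketched compactly in code. This is not the authors' R implementation: far fewer replications than the paper's m=10,000 and B=1000 are used so it runs quickly, the margin \(\mu_R^{1/4}\) is the one from the simulation study, and all names are illustrative.

```python
import random
from statistics import fmean

def margin(mu):
    """Flexible margin used in the paper's simulation study."""
    return mu ** 0.25

def stat(x_new, x_ref):
    """Point estimate of mu_N - mu_R + Delta_L(mu_R)."""
    mu_r_hat = fmean(x_ref)
    return fmean(x_new) - mu_r_hat + margin(mu_r_hat)

def lower_bound(x_new, x_ref, gamma, n_boot, rng):
    """Lower bound a_i of the one-sided percentile bootstrap CI of level gamma."""
    boots = sorted(
        stat(rng.choices(x_new, k=len(x_new)), rng.choices(x_ref, k=len(x_ref)))
        for _ in range(n_boot)
    )
    return boots[int((1.0 - gamma) * n_boot)]

def type_one_error(mu_r, sigma, n, m, gamma=0.95, n_boot=200, seed=1):
    rng = random.Random(seed)
    mu_n = mu_r - margin(mu_r)              # step 1: boundary of H0
    rejections = 0
    for _ in range(m):                      # step 2: simulate m pairs of samples
        x_new = [rng.gauss(mu_n, sigma) for _ in range(n)]
        x_ref = [rng.gauss(mu_r, sigma) for _ in range(n)]
        # steps 3-4: reject H0 when the CI lower bound exceeds 0
        if lower_bound(x_new, x_ref, gamma, n_boot, rng) > 0:
            rejections += 1
    return rejections / m                   # estimated significance level alpha(gamma)
```

With a one-sided level of γ=0.95, the estimate should hover around 0.05, in line with the finding below that a one-sided level of 1−α yields a type I error of about α.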
As for any power estimation, the data are drawn under the alternative hypothesis, that is, μ_{N}>μ_{R}−Δ_{L}(μ_{R}). Since there is a wide range of possibilities under the alternative hypothesis, in practice one considers the equivalence point, that is, μ_{R}=μ_{N}. Therefore, similarly to the studies [5] and [15], the equivalence point (μ_{R}=μ_{N}) is used for drawing data for the power estimation.
1. Given μ_{R}, simulate m pairs of samples X_{N} and X_{R} of respective sizes n_{N} and n_{R} from the normal distributions \(\mathcal {N}(\mu _{R}, \sigma _{N})\) and \(\mathcal {N}(\mu _{R}, \sigma _{R})\), respectively.
2. Using the bootstrap, compute the empirical percentile confidence intervals [a_{i},b_{i}] of level γ for μ_{N}−μ_{R}+Δ_{L}(μ_{R}), for i∈{1,⋯,m}.
3. For i∈{1,⋯,m}, H_{0} is rejected when a_{i}>0; thus, the power is estimated by \(\eta (\gamma)=\frac {1}{m}\sum ^{m}_{i=1}1_{a_{i}>0}\).
Performance assessment
Simulations were done to evaluate the finite-sample performance of the asymptotic test and the confidence interval based test. The performance indicators used were the type I error and the statistical power, estimated with Monte Carlo simulation techniques. In the simulations, we considered the margin \(\Delta _{L}(\mu _{R})=\mu _{R}^{1/4}\) and unknown variances \(\sigma _{R}^{2}\) and \(\sigma _{N}^{2}\).
Both indicators were computed for the two proposed tests according to the reference treatment. For the type I error, data were drawn on the boundary of the null hypothesis: for a given μ_{R}, μ_{N} is obtained such that μ_{N}=μ_{R}−Δ_{L}(μ_{R}). For the power, data were drawn under the alternative hypothesis: for a given μ_{R}, μ_{N} is obtained such that μ_{N}>μ_{R}−Δ_{L}(μ_{R}); usually, one takes μ_{N}=μ_{R}. In all cases, μ_{R} is assumed to vary in [1,1000]. In the test based on the statistic, the power was estimated using formula (8), and two cases were considered: δ=0.05 and δ=0.5.
In the approach based on the asymptotic test, the nominal type I error was set at α=5%. For the confidence interval based test, we considered 95% one- and two-sided confidence interval levels; the purpose was to estimate the type I error rate for each confidence interval. In all the simulations, we considered balanced sample sizes (n=n_{N}=n_{R}), with n=30, 100, and 1000 for small, medium, and large sample sizes, respectively. The number of bootstrap samples with replacement was B=1000, and the number of simulation replications was m=10,000. The R software [16] was used to conduct the simulations, and the code is available in a separate file on request.
Application to the Stratall ANRS 12110 / ESTHER
This study was motivated by the randomized noninferiority “Stratall ANRS 12110 / ESTHER” trial [17]. The main purpose was to assess an exclusively clinical monitoring strategy compared with a clinical plus laboratory monitoring strategy in terms of effectiveness and safety in HIV-infected patients in Cameroon. The idea was to achieve the scaling-up of HIV care in rural districts, where most people living with HIV reside but local health facilities generally have low-grade equipment. A total of 459 HIV-infected patients were included in the study and randomly allocated to two groups, one receiving exclusively clinical monitoring (intervention group, N = 238) and the other receiving laboratory and clinical monitoring (active control group (reference), N = 221). All included patients initiated antiretroviral treatment and were followed up for 24 months. Clinical monitoring alone was compared to laboratory and clinical monitoring in a noninferiority design. The continuous primary endpoint was the increase in CD4 cell count from treatment initiation to the twenty-fourth month. Based on previous studies, the noninferiority margin was prespecified as a linear function (25%) of the mean CD4 cell increase (μ_{R}) after 24 months of antiretroviral treatment in the laboratory and clinical monitoring group, \(\Delta _{L}(\mu _{R})= \frac {25}{100} \mu _{R}\). Unlike other noninferiority studies [18, 19], the noninferiority margin in this study was variable (depending on the mean increase in CD4 in the active control group (reference)). However, the classical two-sided confidence interval based test with a 90% level was used to obtain a type I error (α) close to 5% [17]. Indeed, statistical test procedures for the noninferiority test for continuous data with variable margins were not available at the time of the original paper [17]. Moreover, as discussed in [12], the relationship between the confidence interval level and the type I error can be controversial.
More details about the background of the study and the clinical trial process can be found in [17]. Two analyses were done according to the type of data:

Firstly, the increase in CD4 cell count at 24 months from baseline was considered, which implies that patients with missing data or lost to follow-up before the end of the follow-up period were excluded from the analysis. In that case, the total number of patients in the analysis reduced to n=334, with n_{R}=169 and n_{N}=165. “Observed data” will refer to the case where data are analyzed by excluding participants with a missing observation at 24 months.

Secondly, an analysis was done with all participants who attended at least one follow-up visit, and the last observation carried forward (LOCF) imputation method was applied for participants whose CD4 data were missing at 24 months (in this case, the number of patients analyzed is the same as at baseline: n=459, with n_{R}=221 and n_{N}=238).
The classical parametric two-sided confidence interval based test with a 90% level was used in [17] to perform the noninferiority test. The final result was that CLIN was inferior to LAB.
Results
Simulations results
Test statistic based test
The results for the approach based on a statistic are summarized in Fig. 1 for the type I error rate estimates and in Figs. 2 and 3 for the power estimates. Regardless of the sample size, the type I error rate estimates were constant and did not depend on μ_{R}. For the small sample size, the type I error rate estimate was slightly above the nominal value, with a median estimate of 0.053 and an interquartile range (IQR) of [0.051, 0.054]. As the sample size increases, the type I error estimates get closer to the nominal value: for the medium sample size of n=100, the median estimate was 0.051 (IQR = [0.050, 0.052]); for the large sample size of n=1000, the estimate was more accurate and closer to the nominal value, with a median of 0.050 (IQR = [0.050, 0.050]).
The power estimates, summarized in Figs. 2 and 3, did not depend on μ_{R}. As expected, the power increased with the sample sizes for a fixed standardized noninferiority difference δ, and larger values of δ led to higher power estimates for a fixed sample size.
Confidence interval based test
The results for the approach based on confidence intervals are summarized in Figs. 4, 5, 6, and 7. For the 95% one- and two-sided confidence interval levels, the estimated type I error rates remained around 0.05 and 0.025, respectively, and became more concentrated around those values as the sample sizes got larger. Thus, for a given nominal type I error α, the suitable confidence interval levels would be 1−α and 1−2α for one- and two-sided confidence intervals, respectively. The power (at the equivalence point, μ_{R}=μ_{N}) increases with the sample sizes, but convergence to 1 seemed to require very large sample sizes, which is not the case for the test statistic based method. Therefore, in terms of power, the approach based on the test statistic performs better than the confidence interval based approach.
The Stratall ANRS 12110 / ESTHER trial
The proposed methods were also applied to the Stratall ANRS 12110 / ESTHER trial, based on Observed and LOCF data, with the linear margin \(\Delta _{L}(\mu _{R})= \frac {25}{100} \mu _{R}\). The results for the approach based on the test statistic are summarized in Table 1. The p-value is calculated from the test statistic in Eq. (6). The statistical power was computed using Eq. (8), based on the same inputs as in [17]: μ_{N}=μ_{R}=140 and σ_{N}=σ_{R}=130. For the Observed data, the p-value estimate was 0.02, and the null hypothesis that CLIN was inferior to LAB was rejected at the 0.05 level. For the LOCF data, the p-value was 0.09, and the null hypothesis that CLIN was inferior to LAB was not rejected at the 0.05 level.
For the confidence interval based approach, the test was performed considering both one- and two-sided confidence interval levels. The results are presented in Table 2. The null hypothesis that CLIN was inferior to LAB was not rejected for any of the confidence intervals used with the “LOCF data.” In contrast, with the “Observed data,” the null hypothesis of inferiority was rejected.
The two proposed methods produced consistent results on the Stratall ANRS 12110 / ESTHER trial. Moreover, based on the LOCF data, the obtained results are in line with those in [17]: clinical monitoring alone was inferior to laboratory plus clinical monitoring.
Discussion
In this study, we have proposed two noninferiority test approaches for continuous endpoints with flexible margins: a test based on a test statistic and a confidence interval based test. The confidence interval approach is more commonly used in the literature and is recommended by the international guideline [2]. For the noninferiority test with continuous endpoints and a fixed margin, studies such as [7] and [12] examined the confidence interval approach, which does not allow explicit sample size calculation. By contrast, our proposed test based on a statistic comes with explicit power and sample size formulas.
The simulation results for the confidence interval based test showed that the confidence interval level approximately determines the type I error rate. The test with 95% one- and two-sided confidence interval levels led to type I errors of approximately 0.05 and 0.025, respectively. Therefore, for a given nominal type I error α=0.05, the confidence interval based test should be performed with one- or two-sided confidence intervals at the 1−α or 1−2α levels, respectively; these findings are consistent with those in [7]. The noninferiority hypothesis test is a one-tailed test, so performing the procedure with the classical nominal type I error α yields an actual type I error of α/2. Therefore, to avoid a conservative test, the procedure should be performed with twice the desired nominal error. However, the debate on whether one- or two-sided confidence intervals should be used in noninferiority trials remains open, as discussed in [20].
The most important finding of this study was that the type I error did not vary with the value of the reference treatment, either for the test based on a statistic or for the test based on confidence intervals. This suggests that the variability and uncertainty around the margin were accounted for without affecting the properties of the proposed tests. The proposed methods can therefore be viewed as a generalization of the fixed-margin case for continuous endpoints.
Conclusions
In an active controlled noninferiority trial, the noninferiority margin should be a function of the reference treatment to account for the uncertainty surrounding its mean estimate. This paper provides a framework for performing the noninferiority hypothesis test with a flexible margin. Based on the type I error rate and power estimates, the proposed noninferiority hypothesis test procedures perform well and are applicable in practice, as illustrated by the application to clinical data.
Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author or the author named Christian Laurent (christian.laurent@ird.fr) on reasonable request.
Abbreviations
 CD4:

Cluster of differentiation 4
 CLIN:

Clinical monitoring alone
 HIV/AIDS:

Human immunodeficiency virus infection and acquired immune deficiency syndrome
 LAB:

Laboratory and clinical monitoring
 LOCF:

Last observation carried forward
References
Rothmann MD, Wiens BL, Chan IF. Design and analysis of noninferiority trials. Boca Raton: Taylor and Francis Group; 2012.
Food and Drug Administration. Noninferiority clinical trials to establish effectiveness: guidance for industry. US: Department of Health and Human Services; 2016.
Phillips KF. A new test of noninferiority for anti-infective trials. Stat Med. 2003; 22:201–12.
Kim MY, Xue X. Likelihood ratio and a Bayesian approach were superior to standard noninferiority analysis when the noninferiority margin varied with the control event rate. J Clin Epidemiol. 2004; 57:1253–61.
Zhang Z. Noninferiority testing with a variable margin. Biom J. 2006; 48:948–65.
Ng T. Noninferiority hypotheses and choice of noninferiority margin. Stat Med. 2008; 27:5392–406.
Elie C, Rycke YD, Jais JP, Marion-Gallois R, Landais P. Methodological and statistical aspects of equivalence and noninferiority trials. Rev Epidemiol Sante Publique. 2008; 56:267–77.
Tsong Y, Wang SJ, Hung HM, Cui L. Statistical issues on objectives, designs and analysis of noninferiority test active controlled clinical trials. J Biopharm Stat. 2003; 13:29–41.
Julious SA. Sample sizes for clinical trials with normal data. Stat Med. 2004; 23:1921–86.
Casella G, Berger RL. Statistical inference, 2nd ed. USA: Duxbury Advanced Series; 2002.
Good P. Permutation, parametric and bootstrap tests of hypothesis. New York: Springer; 2005.
Wellek S. Testing statistical hypotheses of equivalence and noninferiority, 2nd ed. Boca Raton: Taylor and Francis Group; 2010.
Committee for Proprietary Medicinal Products. Point to consider on switching between superiority and noninferiority: European Medicines Agency (EMEA); 2000. https://www.ema.europa.eu/en/documents/scientificguideline/pointsconsiderswitchingbetweensuperioritynoninferiority_en.pdf.
Knezevic A. Overlapping confidence intervals and statistical significance: Cornell Statistical Consulting Unit Newsletter; 2008. https://cscu.cornell.edu/wpcontent/uploads/73_ci.pdf.
Flight L, Julious SA. Practical guide to sample size calculations: noninferiority and equivalence trials. Pharm Stat. 2016; 15(9):80–9.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2021. https://www.Rproject.org/.
Laurent C, Kouanfack C, Laborde-Balen G, Aghokeng AF, Mbougua JB, Boyer S, et al. Monitoring of HIV viral loads, CD4 cell counts, and clinical assessments versus clinical monitoring alone for antiretroviral therapy in rural district hospitals in Cameroon (Stratall ANRS 12110/ESTHER): a randomised non-inferiority trial. Lancet Infect Dis. 2011; 11:825–33.
Mugyenyi P, Walker AS, Hakim J, Munderi P, Gibb DM, Kityo C, et al. Routine versus clinically driven laboratory monitoring of HIV antiretroviral therapy in Africa (DART): a randomised non-inferiority trial. Lancet. 2010; 375(9709):123–31.
Sanne I, Orrell C, Fox MP, Conradie F, Ive P, Zeinecker J, et al. Nurse versus doctor management of HIV-infected patients receiving antiretroviral therapy (CIPRA-SA): a randomised non-inferiority trial. Lancet. 2010; 376(9734):33–40.
Dunn DT, Copas AJ, Brocklehurst P. Superiority and non-inferiority: two sides of the same coin? Trials. 2018; 19:1–5.
Acknowledgements
ABS is grateful to the African Union; he was a recipient of a full scholarship for his doctoral studies.
Funding
No funding was obtained for this study.
Author information
Authors and Affiliations
Contributions
ABS, JBTM, and AW drafted the manuscript, proposed the methods, and analyzed the data. NM, CK, and CL produced the clinical data, read and edited the manuscript, and provided observations. The authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
This study involved an analysis of data that was already analyzed in a primary research work. A confidential agreement was done with the main investigators.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Sandie, A.B., Molinari, N., Wanjoya, A. et al. Noninferiority test for a continuous variable with a flexible margin in an active controlled trial: an application to the “Stratall ANRS 12110 / ESTHER” trial. Trials 23, 202 (2022). https://doi.org/10.1186/s1306302206118x
Keywords
 Asymptotic test
 Active controlled trial
 Confidence interval
 Flexible margin
 Noninferiority