
The effect of covariate adjustment for baseline severity in acute stroke clinical trials with responder analysis outcomes



Abstract

Background

Traditionally in acute stroke clinical trials, the primary clinical outcome employed is a dichotomized modified Rankin Scale (mRS). New statistical methods, such as responder analysis, are being used in stroke studies to address the concern that baseline prognostic variables, such as stroke severity, impact the likelihood of a successful outcome. Responder analysis allows the definition of success to vary according to baseline prognostic variables, producing a more clinically relevant insight into the actual effect of investigational treatments. It is unclear whether or not statistical analyses should adjust for prognostic variables when responder analysis is used, as the outcome already takes these prognostic variables into account. This research aims to investigate the effect of covariate adjustment in the responder analysis framework in order to determine the appropriate analytic method.


Methods

Using a current stroke clinical trial and its pilot studies to guide simulation parameters, 1,000 clinical trials were simulated at varying sample sizes under several treatment effects to assess power and type I error. Covariate-adjusted and unadjusted logistic regressions were used to estimate the treatment effect under each scenario. In the case of covariate-adjusted logistic regression, the trichotomized National Institutes of Health Stroke Scale (NIHSS) was used in adjustment.


Results

Under various treatment effect settings, the operating characteristics of the unadjusted and adjusted analyses do not substantially differ. Power and type I error are preserved for both the unadjusted and adjusted analyses.


Conclusions

Our results suggest that, under the given treatment effect scenarios, the decision whether or not to adjust for baseline severity when using a responder analysis outcome should be guided by the needs of the study, as type I error rates and power do not appear to vary largely between the methods. These findings are applicable to stroke trials which use the mRS for the primary outcome, but also provide a broader insight into the analysis of binary outcomes that are defined based on baseline prognostic variables.

Trial registration

This research is part of the Stroke Hyperglycemia Insulin Network Effort (SHINE) trial, Identification Number NCT01369069.



Background

Stroke is a potentially debilitating medical event that affects approximately 800,000 people in the United States each year, leaving as many as 30% of survivors permanently disabled [1]. Given this impact, there is great demand for treatments that significantly improve functional outcome following a stroke. To date, few clinical trials for the treatment of acute stroke have succeeded; of over 125 acute stroke clinical trials, only three successful treatment methods have been identified [2, 3].

One of the possible reasons for the excessive number of neutral or unsuccessful stroke trials is the definition of successful outcome utilized in the studies [4]. In clinical trials, stroke outcome is most commonly measured by the modified Rankin Scale (mRS) of global disability at 90 days. The mRS is a valid and reliable measure of functional outcome following a stroke [5]. Past trials have dichotomized mRS scores into “success” and “failure”: scores of 0 to 1 (or 0 to 2) were considered to be “successes” while scores greater than 1 (or 2) were considered to be “failures,” regardless of baseline stroke severity [6–9]. This method fails to take into account the understanding that baseline severity is highly correlated with outcome. New methods, such as the global statistic, shift analysis, permutation testing and responder analysis, are evolving to make better use of the outcome data with the hopes of providing higher sensitivity to detect true treatment effects [2, 4, 6, 9–17].

Responder analysis, also known as the sliding dichotomy, dichotomizes ordinal outcomes into “success” and “failure,” but addresses the drawbacks of traditional dichotomization by allowing the definition of success to vary by baseline prognostic variables. Various trials have implemented the responder analysis where baseline severity is defined by one or many baseline prognostic factors [18–20]. Those study subjects in a less severe prognosis group at baseline must achieve a better outcome to be considered a trial “success,” whereas a less stringent criterion for success is applied to subjects in a more severe baseline prognosis category. The currently enrolling Stroke Hyperglycemia Insulin Network Effort (SHINE) trial employs responder analysis for its primary efficacy outcome [18].

The SHINE trial is a large, multicenter, randomized clinical trial designed to determine the efficacy and safety of targeted glucose control in hyperglycemic acute ischemic stroke patients. While the methodological details of the SHINE trial are discussed elsewhere [18], it should be noted that the primary outcome for efficacy is the baseline severity adjusted 90-day mRS score dichotomized as “success” or “failure” according to a sliding dichotomy. Eligibility criteria for SHINE require that a subject’s baseline NIHSS score must be between 3 and 22, inclusively. Those with a “mild” prognosis, defined by a baseline NIHSS score of 3 to 7, must achieve a 90-day mRS of 0 to be classified as a “success.” Those with a “moderate” prognosis, defined by a baseline NIHSS score of 8 to 14, must achieve a 90-day mRS of 0 to 1 to be classified as a “success.” Finally, those subjects with a “severe” prognosis, defined by a baseline NIHSS score of 15 to 22, must achieve a 90-day mRS of 0 to 2 to be classified as a “success.” By using responder analysis with a trichotomized NIHSS, the threshold for success is stringent for the milder strokes, while the moderate to severe strokes are allowed to have more residual deficits in the threshold for success.
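Concretely, the criteria above amount to a severity-dependent threshold on the 90-day mRS. The following Python sketch encodes the SHINE sliding dichotomy (the trial's own analyses were performed in SAS; the function names here are ours, for illustration only):

```python
# SHINE sliding dichotomy (sketch; illustrative function names):
# baseline NIHSS 3-7   ("mild")     -> success iff 90-day mRS == 0
# baseline NIHSS 8-14  ("moderate") -> success iff 90-day mRS <= 1
# baseline NIHSS 15-22 ("severe")   -> success iff 90-day mRS <= 2

def severity_category(nihss: int) -> str:
    """Trichotomize the baseline NIHSS (SHINE eligibility: 3 to 22)."""
    if not 3 <= nihss <= 22:
        raise ValueError("baseline NIHSS must be between 3 and 22 for SHINE")
    if nihss <= 7:
        return "mild"
    if nihss <= 14:
        return "moderate"
    return "severe"

def is_success(nihss: int, mrs_90day: int) -> bool:
    """Responder-analysis outcome: the success threshold varies with severity."""
    threshold = {"mild": 0, "moderate": 1, "severe": 2}[severity_category(nihss)]
    return mrs_90day <= threshold
```

For example, a subject with a baseline NIHSS of 10 and a 90-day mRS of 1 is a "success," while the same 90-day mRS for a subject with a baseline NIHSS of 5 is a "failure."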

One of the questions that arose from the trial’s Data and Safety Monitoring Board was that of covariate adjustment. Statistical analyses often adjust for prognostic factors, or covariates, that may be predictive of the primary outcome, such as baseline severity [21, 22]; however, in the case of SHINE, this prognostic variable is also used to define the outcome. While the literature provides many resources on the design and implementation of responder analysis, as well as examples of trials which used responder analysis, there are no clear resources discussing whether or not statistical analyses should be adjusted for the prognostic variables used to define successful outcome.

This research aims to investigate the effect of covariate adjustment in the responder analysis framework, particularly when the covariate is involved in the definition of successful outcome. The cut-points for the SHINE trial are clinically, rather than statistically, defined, and so it is conceivable that adjustment for baseline severity in the statistical analysis may account for additional variation and increase the power to detect a true treatment effect. A simulation study is conducted to assess the operating characteristics (power and type I error) of categorically-adjusted and unadjusted analyses under several possible treatment effect scenarios. In addition, treatment effect estimates and their standard errors are examined across the various scenarios. Since the primary outcome for the SHINE trial is binary, we expect to see an increase in the standard error of the treatment effect estimates, consistent with the findings of Robinson and Jewell [23]. However, also consistent with Robinson and Jewell, we expect this increase in standard error to be balanced by a movement of the treatment effect estimate away from the null hypothesis.

By examining the effect of covariate adjustment in responder analysis, we aim to define the most appropriate statistical approach to identify true treatment effects. Our findings are not only applicable to the SHINE and other stroke trials which use the mRS for the primary outcome, but also provide insight into the appropriate use of categorical baseline prognostic variables in other trials which use an ordinal scale as a primary outcome measure.


Methods

Simulation studies were performed to examine the performance of logistic regression models that were unadjusted and adjusted by a trichotomized baseline severity category. Baseline severity category and criteria for successful outcome were defined as in the SHINE trial described above, and are summarized in Table 1. The type I error rate and power were calculated and compared for each method, as were the treatment effect estimates and their standard errors.

Table 1 Sliding dichotomy criteria for successful outcome in SHINE trial

The simulation parameters were guided by the SHINE trial design. A total of 1,000 clinical trials were simulated at sample sizes ranging from 498 to 1,958. This sample size range allowed us to cover the planned SHINE sample size of 1,400 while also examining model behavior at smaller and larger sample sizes. A 1:1 randomization scheme was assumed for the purposes of this investigation. All analyses were performed using SAS version 9.2 (SAS Institute, Cary, NC, USA).

The prevalence of each baseline severity category was guided by data from two prior pilot trials of hyperglycemia management in acute stroke, the Glucose Regulation in Acute Stroke Patients (GRASP) [24] and Treatment of Hyperglycemia in Ischemic Stroke (THIS) [25] pilot trials. In the simulations, 42% of subjects were classified as “mild” at baseline, 32% were classified as “moderate”, and the remaining 26% were classified as “severe”. This distribution of prognosis categories was imposed using a uniform (0, 1) random variable. In order to simulate 90-day mRS scores for the control group, we examined the distribution of 90-day mRS scores for the control groups in the GRASP and THIS pilot trials. Though the simulation of 90-day mRS scores was primarily driven by the results of the GRASP and THIS pilot trials, the National Institute of Neurological Disorders and Stroke tissue Plasminogen Activator (NINDS tPA) trial control data [26] were used to aid in the approximation of mRS outcome distributions within each of the baseline severity strata. The NINDS tPA control data helped smooth the distribution of mRS scores, as the GRASP and THIS pilot trials each had small sample sizes that resulted in several empty cells after baseline severity stratification. The exact control group distribution of 90-day mRS scores used in the simulation study is shown in Additional file 1: Table S1.
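The uniform (0, 1) assignment of severity categories described above can be sketched as follows. This is a Python illustration (the actual simulations were run in SAS), using the stated GRASP/THIS-based prevalences:

```python
import random

# Sketch of the baseline severity assignment: one uniform(0,1) draw per
# subject, mapped to the GRASP/THIS-based prevalences
# (42% mild, 32% moderate, 26% severe).
def draw_severity(rng):
    u = rng.random()
    if u < 0.42:
        return "mild"
    if u < 0.74:  # 0.42 + 0.32
        return "moderate"
    return "severe"

rng = random.Random(2013)  # arbitrary seed, for reproducibility of the sketch
draws = [draw_severity(rng) for _ in range(100_000)]
props = {cat: draws.count(cat) / len(draws) for cat in ("mild", "moderate", "severe")}
# with this many draws, props should be close to 0.42 / 0.32 / 0.26
```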

Type I error rates for each method of analysis were obtained by using the same proportion of success for both the control and intervention groups, simulating the null hypothesis of “no treatment effect”. In order to assess the power of each method, a treatment effect was simulated in the data by altering the success prevalence for the intervention group. A 7% treatment effect was used, as this was the minimal clinically relevant absolute difference in favorable outcome between the two treatment groups in the SHINE study plan. For these analyses, power was examined under several scenarios as illustrated in Table 2: (1) a “flat” scenario, in which the 7% treatment effect was held constant over the three baseline severity strata; (2) a “varying” scenario, in which the overall treatment effect is still 7%, but the magnitude within strata is varied, where the mild and moderate groups see the most benefit; (3) another “varying” scenario, in which the severe group sees the most benefit; (4) a “mild harm” scenario, where the mild group sees a harmful treatment effect; and (5) a “severe harm” scenario, in which the severe group sees a harmful treatment effect.

Table 2 Success prevalence for simulated treatment effect scenarios

In the first varying scenario, we applied an 8.6% treatment effect in the mild category, a 9% treatment effect in the moderate category and a 2% treatment effect in the severe category; that is, there was an 8.6% increase in the prevalence of an mRS of 0 for the mild stratum, a 9% increase in the prevalence of the 0 to 1 range of mRS scores for the moderate stratum, and a 2% increase in the prevalence of the 0 to 2 range of mRS scores for the severe stratum. This scenario is relevant to the SHINE trial; it is similar to what we may observe if the investigational treatment is largely beneficial to mild and moderate stroke victims, but only marginally beneficial to victims of severe stroke. The second varying treatment effect scenario applies an opposite effect in which the intensive glucose control intervention is largely beneficial to more severe strokes, but only slightly beneficial to those subjects having mild strokes. Additional file 1: Table S1 shows the exact distribution of 90-day mRS scores for the treatment groups under each of these treatment effect scenarios. These distributions were used to randomly assign 90-day mRS scores to each simulated subject in each simulated trial, with the proportions of success following the scenarios in Table 2. Given a subject’s simulated baseline severity stratum (mild, moderate or severe), an assignment of “success” or “failure” was made according to the sliding dichotomy definitions.
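A single simulated trial under one of these scenarios can be sketched in Python as follows (the paper used SAS). The per-stratum control success rates of 25%/35%/15% are those reported for the flat scenario in the Results; note that the actual simulations drew full 90-day mRS distributions (Additional file 1: Table S1) and then applied the sliding dichotomy, whereas this simplified sketch draws the dichotomized outcome directly, which is equivalent for assessing power on the binary endpoint:

```python
import random

# Simplified sketch of one simulated trial under the "flat" scenario:
# a constant 7% absolute treatment effect added to each stratum's
# control success rate.
SEVERITY_PREV = (("mild", 0.42), ("moderate", 0.32), ("severe", 0.26))
CONTROL_SUCCESS = {"mild": 0.25, "moderate": 0.35, "severe": 0.15}
FLAT_EFFECT = {"mild": 0.07, "moderate": 0.07, "severe": 0.07}

def simulate_trial(n_per_arm, effect, rng):
    """Return (treated, severity, success) triples for one 1:1-randomized trial."""
    subjects = []
    for treated in (0, 1):
        for _ in range(n_per_arm):
            u, cum, sev = rng.random(), 0.0, "severe"
            for cat, prev in SEVERITY_PREV:
                cum += prev
                if u < cum:
                    sev = cat
                    break
            p_success = CONTROL_SUCCESS[sev] + treated * effect[sev]
            subjects.append((treated, sev, int(rng.random() < p_success)))
    return subjects
```

Swapping in the stratum-specific effects (for example, 8.6%/9%/2%) reproduces the varying and harm scenarios.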

Logistic regression was used to investigate each of these scenarios. The unadjusted case models “success” as a function only of treatment group, while the categorically-adjusted case models “success” as a function of treatment group and severity category. Severity was defined as “mild,” “moderate” or “severe” based on the NIHSS prognosis groups discussed in the introduction. Power and type I error rate were based on the proportion of simulated trials at a given sample size which rejected the null hypothesis at a nominal level of 0.05. The treatment effect and its standard error were estimated for each trial.
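For the unadjusted model with a binary treatment indicator, the maximum-likelihood treatment coefficient is simply the sample log-odds ratio, with a closed-form Wald standard error; the adjusted model additionally includes severity indicators and requires iterative fitting. A Python sketch of the unadjusted Wald test (the counts in the example are hypothetical, chosen to roughly match a 26% versus 33% success rate at 700 subjects per arm; the paper's analyses were run in SAS):

```python
import math

# Unadjusted logistic regression of success on treatment reduces to the
# 2x2-table log-odds ratio with its standard Wald variance.
def unadjusted_logit_test(s0, f0, s1, f1, z_crit=1.959964):
    """s/f = successes/failures in the control (0) and treatment (1) arms.
    Returns (log-odds-ratio estimate, standard error, reject-null flag)."""
    beta = math.log((s1 / f1) / (s0 / f0))
    se = math.sqrt(1 / s1 + 1 / f1 + 1 / s0 + 1 / f0)
    return beta, se, abs(beta / se) > z_crit  # two-sided test at nominal 0.05

# hypothetical counts: 182/700 control successes vs 231/700 treatment successes
beta, se, reject = unadjusted_logit_test(s0=182, f0=518, s1=231, f1=469)
```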


Results

The type I error rate at each sample size for each analysis method is plotted in Figure 1. The nominal 5% reference line is shown, along with the upper and lower 95% confidence limits on this nominal level of significance. The confidence limits were calculated using the formula for binomial proportion 95% confidence intervals. The confidence limits remain the same at each sample size, as they are based on the number of trials at each sample size (1,000) rather than the sample size itself. The type I error rates for both the unadjusted and categorically-adjusted methods are within the 95% confidence limits for all the sample sizes, hovering close to the nominal 5% level.

Figure 1: Significance levels of unadjusted and categorically-adjusted methods.
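Since each empirical type I error rate is a binomial proportion over 1,000 simulated trials, the confidence band around the nominal level follows from the standard normal-approximation interval; a short sketch:

```python
import math

# 95% binomial-proportion limits around the nominal 0.05, based on the
# number of simulated trials per sample size (1,000), not the sample size.
n_trials = 1000
p = 0.05
half_width = 1.959964 * math.sqrt(p * (1 - p) / n_trials)
lower, upper = p - half_width, p + half_width
# roughly (0.0365, 0.0635): empirical rates inside this band are
# consistent with the nominal 5% level
```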

The first investigation of power was under a “flat” treatment effect of 7%, where the success rates in the control group were 25%, 35% and 15% in the mild, moderate and severe prognosis groups, respectively. The power estimates for this “flat” treatment effect scenario are plotted in Figure 2. The unadjusted and categorically-adjusted methods do not significantly differ, with the categorically-adjusted method having slightly greater power for most of the sample sizes. As planned by the SHINE study investigators, the 80% power threshold is crossed between 650 and 700 subjects per arm (1,300 to 1,400 subjects total).

Figure 2: Power of unadjusted and categorically-adjusted methods under a flat 7% treatment effect.

The next two scenarios varied the treatment effects across the mild, moderate and severe baseline categories as 8.6%, 9% and 2%, respectively, and 2%, 9% and 12.6%, respectively. The power results for these two scenarios are shown in Figure 3. As in the flat treatment effect scenario, there is no drastic difference between the unadjusted and categorically-adjusted methods with respect to power in these varying treatment effect scenarios.

Figure 3: Power of unadjusted and categorically-adjusted methods under the first and second varying 7% treatment effects.

As previously mentioned, it is conceivable that one of the prognosis groups may experience a slightly harmful treatment effect. When 2% harm is experienced in either the mild or the severe baseline prognosis category, the unadjusted and adjusted analyses still appear to have similar performance, as shown in Figure 4. In the mild harm scenario, the unadjusted and adjusted power curves are still nearly stacked upon one another, with the power curve for the adjusted analysis pulling slightly above that of the unadjusted analysis at a few points. A more noticeable difference can be seen in the severe harm scenario, where the adjusted analysis consistently has a slightly, though not dramatically, higher power than that of the unadjusted analysis.

Figure 4: Power of unadjusted and categorically-adjusted methods under mild and severe harm effects.

In addition to the plots in Figures 2, 3 and 4, we also observed the treatment coefficient estimates and their standard errors for the adjusted and unadjusted models under the various treatment effect scenarios at selected sample sizes. The sample sizes of 498, 722, 946, 1,170 and 1,394 were chosen because they are the closest sample sizes to those at which the planned interim and final analyses will take place for SHINE. In addition to model estimates, the true treatment effect coefficient was calculated by pooling the nominal log-odds ratios for each prognosis group. To visualize the bias of the estimate of each treatment effect parameter and its standard error, the simulation mean squared error (MSE) was plotted against the squared bias in Figure 5. The MSE quantifies the accuracy and precision of an estimate in terms of both the bias (the difference between the true and estimated treatment effect) and the variance of the estimate. By plotting the MSE against the squared bias, we can illustrate the adequacy of the estimator. In Figure 5, the squared bias is depicted on the x-axis and the MSE on the y-axis. While the bias decreases with increasing sample size, the adjusted estimates of the treatment effect parameter are consistently less biased than the unadjusted estimates. For smaller sample sizes, the MSEs for the adjusted analyses are negligibly larger than those for the unadjusted analyses. The treatment coefficients and standard errors are provided in Additional file 2: Table S2.

Figure 5: Mean squared error versus squared bias at selected sample sizes.
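The simulation MSE used in Figure 5 decomposes into squared bias plus variance of the coefficient estimates across the simulated trials. A minimal Python sketch (the estimates in the example are made-up numbers, purely for illustration):

```python
# MSE of a set of simulated coefficient estimates relative to the true value:
# MSE = (mean estimate - true value)^2 + variance of the estimates.
def mse_decomposition(estimates, true_beta):
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias_sq = (mean_est - true_beta) ** 2
    variance = sum((b - mean_est) ** 2 for b in estimates) / n
    return bias_sq, variance, bias_sq + variance

# illustrative (made-up) estimates around a true log-odds ratio of 0.35
bias_sq, variance, mse = mse_decomposition([0.3, 0.4, 0.5], 0.35)
```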


Discussion

Successful stroke treatments are desperately needed given stroke’s large and detrimental effect on the worldwide population. Consequently, statistical methods that offer high power to detect a true treatment effect are also needed. With this simulation study, we sought to determine whether adjustment for baseline severity within the responder analysis setting would be beneficial or harmful in terms of power and type I error rates when compared to an unadjusted analysis.

The type I error rates did not differ substantially between the two methods. The experimental type I error rates for both of the methods stayed within the 95% confidence bounds. This is a welcome result, as a test that is either too liberal or too conservative (rejecting the null hypothesis either more or less often than the nominal level, respectively) has implications for the power of the test. The oscillation around the nominal 5% level of significance is likely due to chance, and is to be expected in simulated data. Since neither method shows consistently larger type I error rates than the other, we can conclude that there is no meaningful difference between the two methods with respect to type I error.

The power appears to be approximately the same or slightly higher for the adjusted analyses in the selected scenarios. In the cases where the power is slightly higher, the magnitude is not remarkable and offers little evidence to suggest that adjusting by the single covariate leads to significantly more power. Although the simulation study presented is not exhaustive and, therefore, does not provide additional insight regarding this, the literature by Choi and Hernández suggests that an increase in power could occur as other important prognostic variables are added to the model [27, 28]. It is reassuring, however, that neither method appears to be detrimental to power under the given scenarios.

In terms of bias, the unadjusted analyses consistently underestimate the nominal treatment effect, while the adjusted analyses tend to be less biased, but often slightly overestimate the nominal treatment effect. Given the magnitude of the coefficient estimates and their standard errors, neither of these bias tendencies is substantial. In terms of MSE, the two methods do not differ greatly as the sample size increases. At the smaller sample sizes, the adjusted analyses have larger MSE values due to increased standard error; however, as the sample size increases, the MSE values for the two methods converge.

Though negligible differences were identified between the adjusted and unadjusted models, researchers should keep the randomization scheme of the study in mind when deciding whether or not to adjust for baseline severity. In general, it is advisable to “analyze as you randomize,” meaning that any variable used as a stratification variable during randomization should be included as a covariate in the analysis in order to preserve nominal type I error rates and power [22, 29]. Baseline severity is often used as a stratification variable in the randomization of acute stroke clinical trials, and should be included as a covariate in these cases.

It is important to note that these analyses adjust categorically for baseline severity. The categories - mild, moderate and severe - are defined by the NIHSS score, which is a larger scale ranging from 0 to 42 (limited to 3 through 22 in SHINE’s inclusion criteria). A one-unit change in the NIHSS cannot easily be interpreted, as this change may have different meanings depending on the combination of neurological impairments and location along the scale. Despite this issue, the NIHSS is sometimes used as a continuous measure in the literature [30, 31]. This is not necessarily straightforward and should be done with caution. It is possible that adjusting by the actual NIHSS score will provide additional information to the model and increase or maintain power in some treatment effect scenario(s). However, due to uncertainties in the clinical interpretation of a continuous NIHSS variable, adjustment by actual NIHSS score has been left as a topic for future research.

Adjustment for other baseline prognostic variables may also impact study power under the given scenarios. The inclusion of additional covariates that were not used in defining the primary outcome has not been examined in these scenarios, as it is outside the primary focus of this research. It is conceivable that the addition of multiple covariates could reduce overall power due to the increasing standard error on the treatment effect estimate, as studied by Robinson and Jewell [23] and discussed in the Background section of this paper.


Conclusions

Our results show negligible differences between analysis methods in the responder analysis setting, suggesting that in most treatment effect scenarios, adjustment for baseline severity in the primary analyses may best be guided by individual study needs rather than a blanket guideline for all studies. Though we have not shown the results here, we did examine other treatment effect scenarios which yield similar results. These scenarios included a flat and a varying 15% treatment effect (instead of the 7% specified in the SHINE study plan), as well as a scenario in which the mild group experienced 5% harm.

Overall, these results shed light on the important concept of adjustment in the context of responder analysis. Though this study only examined a single severity scale, its findings are not restricted to use in stroke studies; they can provide insight into the treatment of categorical baseline prognostic covariates in other studies which use responder analysis to define their primary outcome of interest.

Authors’ information

KG recently completed her master’s degree in biostatistics from the Medical University of South Carolina. This manuscript is a result of her master’s thesis and her work as a graduate student on the SHINE grant. The other authors were her committee members and VD was her primary mentor.



Abbreviations

GRASP:

Glucose Regulation in Acute Stroke Patients


mRS:

modified Rankin Scale


MSE:

mean square error


NIHSS:

National Institutes of Health Stroke Scale


NINDS tPA:

National Institute of Neurological Disorders and Stroke tissue Plasminogen Activator

SHINE trial:

Stroke Hyperglycemia Insulin Network Effort trial


THIS:

Treatment of Hyperglycemia in Ischemic Stroke


References

1. Sacco RL, Frieden TR, Blakeman DE, Jauch EC, Mohl S: What the Million Hearts Initiative means for stroke: a presidential advisory from the American Heart Association/American Stroke Association. Stroke. 2012, 43: 924-928. 10.1161/STR.0b013e318248f00e.

2. Saver JL: Optimal endpoints for acute stroke therapy trials: best ways to measure treatment effects of drugs and devices. Stroke. 2011, 42: 2356-2362. 10.1161/STROKEAHA.111.619122.

  3. Hong KS, Lee SJ, Hao Q, Liebeskind DS, Saver JL: Acute stroke trials in the 1st decade of the 21st century. Stroke. 2011, 42: e314-


4. Bath PM, Gray LJ, Collier T, Pocock S, Carpenter J, The Optimising Analysis of Stroke Trials (OAST) Collaboration: Can we improve the statistical analysis of stroke trials? Statistical reanalysis of functional outcomes in stroke trials. Stroke. 2007, 38: 1911-1915.

5. Banks JL, Marotta CA: Outcomes validity and reliability of the modified Rankin scale: implications for stroke clinical trials. Stroke. 2007, 38: 1091-1096. 10.1161/01.STR.0000258355.23810.c6.

6. Saver JL: Novel end point analytic techniques and interpreting shifts across the entire range of outcome scales in acute stroke trials. Stroke. 2007, 38: 3055-3062. 10.1161/STROKEAHA.107.488536.

  7. Young FB, Lees KR, Weir CJ: Strengthening acute stroke trials through optimal use of disability endpoints. Stroke. 2003, 34: 2676-2680. 10.1161/01.STR.0000096210.36741.E7.


8. Murray GD, Barer D, Choi S, Fernandes H, Gregson B, Lees KR, Maas AI, Marmarou A, Mendelow AD, Steyerberg EW, Taylor GS, Teasdale GM, Weir CJ: Design and analysis of phase III trials with ordered outcome scales: the concept of the sliding dichotomy. J Neurotrauma. 2005, 22: 511-517. 10.1089/neu.2005.22.511.

9. Savitz SI, Lew R, Bluhmki E, Hacke W, Fisher M: Shift analysis versus dichotomization of the modified Rankin scale outcome scores in the NINDS and ECASS-II trials. Stroke. 2007, 38: 3205-3212. 10.1161/STROKEAHA.107.489351.

  10. Kasner SE: Clinical interpretation and use of stroke scales. Lancet Neurol. 2006, 5: 603-612. 10.1016/S1474-4422(06)70495-1.


11. Tilley BC, Marler J, Geller NL, Lu M, Legler J, Brott T, Lyden P, Grotta J: Use of a global test for multiple outcomes in stroke trials with application to the National Institute of Neurological Disorders and Stroke t-PA Stroke Trial. Stroke. 1996, 27: 2136-2142. 10.1161/01.STR.27.11.2136.

  12. Cobo E, Secades JJ, Miras F, Gonzalez JA, Saver JL, Corchero C, Rius R, Dàvalos A: Boosting the chances to improve stroke treatment. Stroke. 2010, 41: e143-e150. 10.1161/STROKEAHA.109.567404.


13. Saver JL, Gornbein J: Treatment effects for which shift or binary analyses are advantageous in acute stroke trials. Neurology. 2009, 72: 1310-1315. 10.1212/01.wnl.0000341308.73506.b7.

14. McHugh GS, Butcher I, Steyerberg EW, Marmarou A, Lu J, Lingsma HF, Weir J, Maas AI, Murray GD: A simulation study evaluating approaches to the analysis of ordinal outcome data in randomized controlled trials in traumatic brain injury: results from the IMPACT Project. Clin Trials. 2010, 7: 44-57. 10.1177/1740774509356580.

15. Howard G, Waller JL, Voeks JH, Howard VJ, Jauch EC, Lees KR, Nichols FT, Rahlfs VW, Hess DC: A simple, assumption-free, and clinically interpretable approach for analysis of modified Rankin outcomes. Stroke. 2012, 43: 664-669. 10.1161/STROKEAHA.111.632935.

16. Saver JL, Yafeh B: Confirmation of tPA treatment effect by baseline severity-adjusted end point reanalysis of the NINDS-tPA stroke trials. Stroke. 2007, 38: 414-416. 10.1161/01.STR.0000254580.39297.3c.

  17. Young FB, Lees KR, Weir CJ: Improving trial power through use of prognosis-adjusted end points. Stroke. 2005, 36: 597-601. 10.1161/01.STR.0000154856.42135.85.


18. Bruno A, Durkalski VL, Hall CE, Juneja R, Barsan WG, Janis S, Meurer WJ, Fansler A, Johnston KC: The stroke hyperglycemia insulin network effort (SHINE) trial: design and methodology. Int J Stroke. In press.

19. den Hertog HM, van der Worp HB, van Gemert HM, Algra A, Kappelle LJ, van Gijn J, Koudstaal PJ, Dippel DW, PAIS Investigators: The Paracetamol (Acetaminophen) In Stroke (PAIS) trial: a multicentre, randomised, placebo-controlled, phase III trial. Lancet Neurol. 2009, 8: 434-440. 10.1016/S1474-4422(09)70051-1.

20. Adams HP, Effron MB, Torner J, Davalos A, Frayne J, Teal P, Leclerc J, Oemar B, Padgett L, Barnathan ES, Hacke W: Emergency administration of abciximab for treatment of patients with acute ischemic stroke: results of an international phase III trial. Stroke. 2008, 39: 87-99. 10.1161/STROKEAHA.106.476648.

21. Piantadosi S: Clinical Trials: A Methodological Perspective. 2005, Hoboken, New Jersey: John Wiley and Sons, 2nd edition, 470-473.

22. Harrell FE: The Role of Covariable Adjustment in the Analysis of Clinical Trials. [],

  23. Robinson LD, Jewell NP: Some surprising results about covariate adjustment in logistic regressionmodels. Int Stat Rev. 1991, 58: 227-240.


24. Johnston KC, Hall CE, Kissela BM, Bleck TP, Conaway MR, GRASP Investigators: Glucose Regulation in Acute Stroke Patients (GRASP) trial: a randomized pilot trial. Stroke. 2009, 40: 3804-3809. 10.1161/STROKEAHA.109.561498.

25. Bruno A, Kent TA, Coull BM, Shankar RR, Saha C, Becker KJ, Kissela BM, Williams LS: Treatment of Hyperglycemia in Ischemic Stroke (THIS): a randomized pilot trial. Stroke. 2008, 39: 384-389. 10.1161/STROKEAHA.107.493544.

26. The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group: Tissue plasminogen activator for acute ischemic stroke. N Engl J Med. 1995, 333: 1581-1588.

27. Choi SC: Sample size in clinical trials with dichotomous endpoints: use of covariables. J Biopharm Stat. 1998, 8: 367-375. 10.1080/10543409808835246.

28. Hernández AV, Steyerberg EW, Butcher I, Mushkudiani N, Taylor GS, Murray GD, Marmarou A, Choi SC, Lu J, Habbema DF, Maas AI: Adjustment for strong predictors of outcome in traumatic brain injury trials: 25% reduction in sample size requirements in the IMPACT study. J Neurotrauma. 2006, 23: 1295-1303. 10.1089/neu.2006.23.1295.

29. Committee for Proprietary Medicinal Products (CPMP): points to consider on adjustment for baseline covariates. Stat Med. 2004, 23: 701-709.

30. Lin HJ, Chang WL, Tseng MC: Readmission after stroke in a hospital based registry: risk, etiologies, and risk factors. Neurology. 2011, 76: 438-443. 10.1212/WNL.0b013e31820a0cd8.

31. Dai DF, Thajeb P, Tu CF, Chiang FT, Chen CH, Yang RB, Chen JJ: Plasma concentration of SCUBE1, a novel platelet protein, is elevated in patients with acute coronary syndrome and ischemic stroke. J Am Coll Cardiol. 2008, 51: 2173-2180. 10.1016/j.jacc.2008.01.060.



This work was funded by both the SHINE trial NIH/NINDS grant 1U01-NS069498 and the Neurological Emergencies Treatment Trial (NETT) Statistical and Data Management Center NIH/NINDS grant U01-NS059041.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Valerie L Durkalski.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

KG carried out the simulation programming and interpretation, manuscript drafting and finalization. SY helped conceive the study concept, assisted in simulation programming and interpretation, and aided in manuscript drafting and finalization. VR helped conceive the study concept, aided in statistical interpretation, as well as manuscript drafting and finalization. KJ and EJ assisted with study concept, design, clinical interpretation, manuscript drafting and finalization. VD helped conceive the study concept, aided in design, analysis and interpretation, as well as manuscript drafting and finalization. VD provided overall supervision as the primary mentor of the first author. All authors read and approved the final manuscript.

Electronic supplementary material

Additional file 1: Table S1: Distribution of 90-day mRS scores. (DOCX 14 KB)


Additional file 2: Table S2: Treatment coefficient estimates and their standard errors for unadjusted and adjusted methods under different treatment effect scenarios. (DOCX 14 KB)


Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Garofolo, K.M., Yeatts, S.D., Ramakrishnan, V. et al. The effect of covariate adjustment for baseline severity in acute stroke clinical trials with responder analysis outcomes. Trials 14, 98 (2013).
