Dealing with heterogeneity of treatment effects: is the literature up to the challenge?
© Gabler et al; licensee BioMed Central Ltd. 2009
Received: 03 October 2008
Accepted: 19 June 2009
Published: 19 June 2009
Some patients will experience more or less benefit from treatment than the averages reported from clinical trials; such variation in therapeutic outcome is termed heterogeneity of treatment effects (HTE). Identifying HTE is necessary to individualize treatment. The degree to which heterogeneity is sought and analyzed correctly in the general medical literature is unknown. We undertook this literature sample to track the use of HTE analyses over time, examine the appropriateness of the statistical methods used, and explore the predictors of such analyses.
Articles were selected through a probability sample of randomized controlled trials (RCTs) published in Annals of Internal Medicine, BMJ, JAMA, The Lancet, and NEJM during odd numbered months of 1994, 1999, and 2004. RCTs were independently reviewed and coded by two abstractors, with adjudication by a third. Studies were classified as reporting: (1) HTE analysis, utilizing a formal test for heterogeneity or treatment-by-covariate interaction, (2) subgroup analysis only, involving no formal test for heterogeneity or interaction; or (3) neither. Chi-square tests and multiple logistic regression were used to identify variables associated with HTE reporting.
319 studies were included. Ninety-two (29%) reported HTE analysis; another 88 (28%) reported subgroup analysis only, without examining HTE formally. Major covariates examined included individual risk factors associated with prognosis, responsiveness to treatment, or vulnerability to adverse effects of treatment (56%); gender (30%); age (29%); study site or center (29%); and race/ethnicity (7%). Journal of publication and sample size were significant independent predictors of HTE analysis (p < 0.05 and p < 0.001, respectively).
HTE is frequently ignored or incorrectly analyzed. An iterative process of exploratory analysis followed by confirmatory HTE analysis will generate the data needed to facilitate an individualized approach to evidence-based medicine.
Randomized controlled trials (RCTs) are the cornerstone of evidence-based medicine. Such trials rely on random assignment to alternative treatment groups to control for baseline patient factors that could affect outcomes. The resulting estimate of the average treatment effect is an average of the individual treatment effects (ITEs) for participants in the study. While estimates of the average treatment effect are generally useful, some treated individuals, both within and outside of clinical trials, will experience more or less benefit than the reported average. Such variation in treatment effect is termed heterogeneity of treatment effects (HTE) [1, 2].
HTE may be quantitative (subgroup effects in the same direction as the average effect but varying in magnitude) or qualitative (treatment effects in different directions in different subgroups, where treatment is beneficial in some subgroups and harmful in others). The prevalence of HTE is unknown and perhaps unknowable, but highly variable treatment response rates for many common conditions suggest it is substantial [3, 4]. For example, Allen Roses, the vice president of genetics at GlaxoSmithKline, has stated, "Our drugs don't work on most patients". Several empirical demonstrations of HTE have recently been published, including studies of ischemic stroke, risk reduction by carotid endarterectomy [7, 8], and diabetes. While qualitative HTE may be uncommon, quantitative HTE should not be dismissed, because even modest variations in the magnitude of net treatment benefits may have important implications for patient care and cost-effectiveness.
HTE can be assessed in several ways. The most direct approach is the n-of-1 clinical trial, which assigns individual patients to receive alternative treatment in a randomly predetermined sequence [10, 11]. Results from a series of such trials can be aggregated to assess heterogeneity in the population. However, n-of-1 trials are applicable to a relatively small subset of conditions and treatments [10, 11] and are subject to random within-patient variability (thus requiring a careful design and repeated crossovers) [12, 13]. A second approach is to stratify patients according to risk of disease-related adverse events [14, 15]. A third approach, typically performed for purposes of hypothesis generation rather than testing, entails a careful examination of subgroups within RCTs.
Subgroup analysis can be perilous. Real effects can be missed because of inadequate statistical power [16, 17], and reported effects may be spurious because multiple statistical tests are performed [13–16] or because of random intra-individual variability [12, 13]. Random intra-individual variability is especially problematic because it cannot be estimated in parallel group trials, the most common type of clinical trial design: participants are randomized to only one treatment and do not cross over to alternative treatments, so variation within a participant cannot be observed. In recognition of the drawbacks of subgroup analysis, the Consolidated Standards of Reporting Trials (CONSORT) statement warns that subgroup analyses, especially post hoc subgroup comparisons, "do not have great credibility".
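The multiple-testing concern can be made concrete with a small calculation. The sketch below gives the probability of at least one spuriously significant subgroup finding when several tests are each run at a significance level of 0.05 and no true subgroup effect exists. It assumes the tests are independent, which is a simplification, and the function name is illustrative only.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Chance of at least one false-positive subgroup finding as tests accumulate.
for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(k), 3))
```

Under these assumptions, ten subgroup tests carry roughly a 40% chance of producing at least one nominally significant result by chance alone.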
On the other hand, it has been claimed that nearly everything we have learned from epidemiology resulted from subgroup analysis. While this conclusion applies most obviously to observational studies, careful scrutiny of subgroup-specific effects in randomized trials has generated important new hypotheses and sometimes directly influenced practice. Nevertheless, subgroup analyses are not always performed correctly. Some studies (e.g. [20–23]) report results by subgroups, without any statistical testing or interval estimation for the difference across subgroups; these studies do not provide quantitative information on HTE per se. Other studies report p-values corresponding to each subgroup, subsequently claiming that the treatment effect differs across subgroups because it is statistically significant in one subgroup and not in another. However, both treatment effect and sample size influence the p-value, such that similar effect sizes within each subgroup might generate markedly different p-values. Instead of comparing the p-values across subgroups, the appropriate way to identify significant HTE is to make statistical comparisons for treatment effects across subgroups, using a test for heterogeneity or interaction [7, 17, 18, 25–27].
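The contrast between comparing subgroup p-values and testing the difference directly can be illustrated with a minimal sketch. It uses one common form of interaction test for two independent subgroup estimates: a z-test on the difference between the estimates. The effect estimates and standard errors are hypothetical log odds ratios chosen for illustration, not values from any trial in the sample.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def interaction_test(est_a: float, se_a: float, est_b: float, se_b: float):
    """z-test for the difference between two independent subgroup estimates."""
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff, two_sided_p(diff / se_diff)


# Hypothetical log odds ratios: similar effects, but subgroup B is smaller
# and therefore has a larger standard error.
est_a, se_a = -0.40, 0.15
est_b, se_b = -0.35, 0.30

p_a = two_sided_p(est_a / se_a)  # within-subgroup test, subgroup A
p_b = two_sided_p(est_b / se_b)  # within-subgroup test, subgroup B
diff, se_diff, p_interaction = interaction_test(est_a, se_a, est_b, se_b)

print(round(p_a, 3), round(p_b, 3), round(p_interaction, 3))
```

In this example the effect is nominally significant in subgroup A and not in subgroup B, yet the interaction test gives no evidence that the two effects differ, which is the pattern the paragraph above warns against misreading.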
The tension between needing to understand HTE and lacking the statistical power to properly examine it presents difficulties for researchers, clinicians, and patients. While some experts have offered general encouragement to perform more HTE analyses [28, 29], the literature is relatively silent on how to manage the risks of over- and under-testing. Kraemer et al. [25, 30, 31] have suggested a sequential approach that could shed light on possible HTE. Defining treatment modifiers as factors that influence the treatment effect size across subgroups, they propose that all RCTs use exploratory interaction analyses as a method to generate hypotheses regarding moderators of treatment effects. The presence of strong moderator effects would encourage future researchers to perform adequately powered confirmatory studies stratified prospectively on these moderators. While the proposal of Kraemer et al. makes sense, several small reviews, most published well before the revised CONSORT statement, suggest that testing for HTE is reported in only 25% to 50% of RCTs [18, 27, 32–35].
We undertook the current review of the prevalence of HTE analyses in a comparatively larger sample of articles published in the general medical literature in order to assess trends over time, examine the appropriateness of the statistical methods used, and explore the predictors of such analyses. A persistently low rate of appropriate HTE or interaction analysis would suggest missed opportunities for identifying HTE.
We conducted a literature sample of RCTs published in five prominent general medical journals during 1994, 1999, and 2004. The search strategy and abstraction forms incorporated input from a Project Advisory Committee. Human Subjects committee approval was not required. The study was funded by Pfizer, Inc., under a contract to the academic institutions involved. However, the investigators were solely responsible for all aspects of study design, data collection and analysis, and result reporting.
Data Sources and Searches
To be included in our sample, a trial had to meet the following criteria: (1) human study population; (2) parallel group RCT (including matched pair trials) or a crossover RCT (including n-of-1 trials); and (3) individual patient or time (treatment episode) within patient (for crossover trials) was the unit of randomization. We excluded trials that used cluster randomization because these trials generally focus on group- or organizational-level treatment effects.
Data Extraction, Measures, and Hypotheses
All data were abstracted independently by two trained abstractors. Any disagreements were adjudicated by a senior investigator. We used a standard protocol, form, and database that collected the following information: first author's name, article identification number, trial number (if more than one trial was reported in a particular article), condition under study (e.g. cardiovascular disease, cancer), country of first author's institution, continents from which the participants were derived, total number of participants randomized, number and percent of male participants, age of participants (mean, median, standard deviation, range, reported categories), race of participants (number and percent), and number of treatment arms. For each arm, the following information was collected: number of participants; description of treatment provided (i.e. drug, medical device, surgical procedure); and gender, age, and race of participants.
The use (or non-use) of HTE analysis was the primary outcome. We also examined the presence of subgroup-only analysis, and of either subgroup or HTE analysis, as secondary outcomes. Subgroup-only analyses represent missed opportunities on the path to understanding HTE; with minor effort, studies that reported subgroup-only analyses could have conducted formal HTE analyses and provided direct information on HTE. The additional step (of conducting a formal HTE analysis as opposed to a subgroup-only analysis) matters because it provides hypothesis-generating information for future studies. We therefore identified all trials as reporting either (1) HTE analysis, utilizing a formal test for heterogeneity or interaction; (2) subgroup-only analysis, with no formal statistical tests for heterogeneity or interaction; or (3) neither. For trials that reported HTE analysis, the covariates examined were also recorded. These covariates were later categorized by one investigator, with consultation from a second as needed, into the following categories: age, gender, race/ethnicity, center/trial site/country, individual clinical risk factor, multivariable risk index, co-occurring treatment, comorbidity, and socioeconomic status (income, marital status, and education). Individual clinical risk factors were further categorized as being related to prognosis, treatment responsiveness, or treatment vulnerability. We also coded whether the authors presented information using a Forest Plot (a graph depicting subgroup results as point estimates [boxes] and confidence intervals [lines]).
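The three-way coding rule can be written down compactly. The sketch below is only an illustration of the rule as described here; the function and argument names are hypothetical and do not come from the abstraction form.

```python
def classify_trial(reports_subgroup_results: bool, reports_formal_test: bool) -> str:
    """Assign one of the three mutually exclusive reporting categories."""
    if reports_formal_test:
        # A formal test for heterogeneity or interaction was reported.
        return "HTE analysis"
    if reports_subgroup_results:
        # Subgroup results were shown without any formal test.
        return "subgroup-only analysis"
    return "neither"


print(classify_trial(reports_subgroup_results=True, reports_formal_test=False))
```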
Potential predictors of HTE and subgroup-only analysis included journal name, year, condition studied, geographic region of the first author's home institution, trial design (either parallel or crossover RCT), and sample size (in quintiles). We expected that different journals might have different reporting policies, possibly influenced by prevailing norms of the country of publication, which in turn might be confounded with the first author's geographic region. We also hypothesized that HTE analysis might increase over time as CONSORT standards were disseminated, and that HTE analysis would increase with sample size. The revised CONSORT standards made a clear distinction between subgroup and HTE analysis, citing a test of interaction as the correct, and stronger, analytic technique. The Statement elaborated on the original 1996 guidelines, and emphasized the incorrectness of comparing subgroup-specific p-values as a method of inferring treatment heterogeneity. Thus, we expected to see an increase in HTE analysis subsequent to 2001, with a concomitant decrease in subgroup analyses. Finally, we hypothesized that trials of common conditions with well-defined prognostic indicators might be more likely to evaluate these indicators.
The relationships of HTE reporting with study characteristics were assessed in two-way contingency table analyses using Pearson χ² tests, or Fisher's exact test when sample size was small, while trends (where appropriate) were assessed using the Mantel-Haenszel χ² test for trend. Logistic regression was used to determine predictors of HTE analysis. Significance of study characteristics in relation to use of HTE analysis was assessed with a Wald χ² test. To further explore HTE reporting characteristics, we separately examined the association between study characteristics (other than sample size) and HTE reporting in articles above and below the median sample size (262 subjects). Wald χ² tests were used to determine significant differences among these categories. SAS version 9 was used for all computations.
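As an illustration of the contingency table analysis, the sketch below computes a Pearson χ² statistic for a 2 × 2 table in plain Python. It is not the SAS code used in the study. The counts are hypothetical, and the p-value shortcut applies only to one degree of freedom.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom, the upper tail probability equals
    # erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: rows are two groups of trials, columns are
# "reported HTE analysis" and "did not report HTE analysis".
chi2, p_value = pearson_chi2_2x2(40, 60, 20, 80)
print(round(chi2, 2), round(p_value, 4))
```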
We conducted two sensitivity analyses. First, to assess whether our results were sensitive to possible clustering effects arising from multiple trials per article, we re-ran the logistic regression analysis after randomly selecting only one trial from each article that reported on more than one trial. Second, we recalculated reporting of HTE and subgroup analysis after restricting our analysis to those trials meeting specific sample size criteria (overall sample size greater than 250 and at least 100 participants per arm, making reasonably-powered subgroup analyses feasible). Furthermore, we recalculated results for gender and race/ethnicity after restricting to trials with an overall sample size greater than 250, at least 100 participants per arm, and at least 25% participants in the second largest gender or racial/ethnic subgroup.
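The two restrictions used in the sensitivity analyses can be sketched as follows. The record layout (`article_id`, `n_total`, `arm_sizes`) and the example records are hypothetical; the thresholds follow the criteria stated above.

```python
import random
from collections import defaultdict


def one_trial_per_article(trials, seed=0):
    """Keep one randomly chosen trial from each article."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_article = defaultdict(list)
    for trial in trials:
        by_article[trial["article_id"]].append(trial)
    return [rng.choice(group) for group in by_article.values()]


def meets_size_criteria(trial):
    """Overall sample size above 250 and at least 100 participants per arm."""
    return trial["n_total"] > 250 and min(trial["arm_sizes"]) >= 100


# Hypothetical records, not data from the review.
trials = [
    {"article_id": "A1", "n_total": 600, "arm_sizes": [300, 300]},
    {"article_id": "A1", "n_total": 180, "arm_sizes": [90, 90]},
    {"article_id": "A2", "n_total": 260, "arm_sizes": [170, 90]},
    {"article_id": "A3", "n_total": 1200, "arm_sizes": [400, 400, 400]},
]

selected = one_trial_per_article(trials)
large_enough = [t for t in trials if meets_size_criteria(t)]
print(len(selected), len(large_enough))
```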
Out of the 379 articles identified by our search, 303 met our inclusion criteria and 76 did not (Figure 1). Some articles reported more than one trial, and abstraction occurred on the trial level. Thus, the 303 eligible articles represented 319 eligible RCTs. Twelve articles reported on more than one trial: ten articles reported on two trials each, and two articles reported on four trials each. Eighty-six trials were excluded, including 77 from the 76 ineligible articles, and 9 from the 303 eligible articles. (An eligible article could contain both eligible and ineligible trials; we kept the eligible trials and excluded the ineligible trials in these articles.) The most common reason for exclusion was a trial design other than RCT (n = 61 trials), typically a cohort or case control design. The next most common reason was unit of randomization other than the individual or treatment episode within an individual (n = 25 trials).
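The article and trial counts above can be reconciled with a few lines of arithmetic. The sketch assumes that the per-article trial counts (two or four) refer to eligible trials; the variable names are illustrative.

```python
eligible_articles = 303
multi_trial_articles = {2: 10, 4: 2}  # trials per article -> number of articles

# Articles that reported exactly one eligible trial.
single_trial_articles = eligible_articles - sum(multi_trial_articles.values())
eligible_trials = single_trial_articles + sum(
    per_article * count for per_article, count in multi_trial_articles.items()
)

excluded_by_reason = {"design other than RCT": 61, "unit of randomization": 25}
excluded_by_source = {"ineligible articles": 77, "eligible articles": 9}

print(eligible_trials)
print(sum(excluded_by_reason.values()), sum(excluded_by_source.values()))
```

Both breakdowns of the excluded trials sum to 86, and the eligible trials sum to 319, matching the totals reported in the text.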
Characteristics of included articles (n = 303) and RCTs described in these articles (n = 319)*
[Table body not recovered. Column: number of RCTs included. Rows: journal of publication; year of publication; medical condition under study; first author's study region; subgroup without statistical comparison; sample size, 262 (101 – 708).]
Among studies reporting HTE analysis on at least one named covariate (n = 91), 47% analyzed one covariate, 26% analyzed 2–4 covariates, 19% analyzed 5–10 covariates, and 8% analyzed more than 10 covariates. Individual risk factors for disease occurrence or progression (e.g. smoking status, creatinine level, CD4 count) were analyzed in 56% of studies, age in 29%, study site or region in 29%, concurrent treatment in 25%, and comorbid medical conditions in 21%. The 51 articles that reported HTE based on individual risk factors for disease occurrence or progression examined a total of 159 variables. Of these variables, 91% were prognostic risk factors, 25% were related to treatment responsiveness, and 4% were factors related to vulnerability to adverse outcomes (some RCTs examined multiple individual clinical risk factors). Treatment by gender interactions were evaluated in 30% of studies in which both genders participated; treatment by race/ethnicity interactions were assessed in 7% of studies involving more than one race/ethnicity. Despite increased recognition of the value of multivariable risk indices in HTE analyses [6, 9, 40–43] only three studies [44–46] evaluated outcomes of treatment stratified by multivariable risk. When examined by sample size quintile, we found that even studies in the smallest quintile (median = 37 participants) examined up to 9 covariates for HTE.
HTE reporting by study characteristics (n = 319 RCTs in 303 articles)
[Table body not recovered. Columns: No. (%) reporting HTE; No. (%) reporting either subgroup or HTE. Rows: journal of publication; year of publication; medical condition under study; first author's study region; sample size quintile 1 (median = 37), quintile 2 (median = 124), quintile 3 (median = 263), quintile 4 (median = 549), quintile 5 (median = 1560).]
Logistic regression results examining predictors of HTE or subgroup analysis
[Table body not recovered. Column: predict HTE analysis, OR (95% CI)*. Rows: journal of publication; year of publication; medical condition under study; first author's study region; sample size quintile 1 (median = 37), quintile 2 (median = 124), quintile 3 (median = 263), quintile 4 (median = 549), quintile 5 (median = 1560).]
In the sensitivity analysis, there were 153 studies with sample size of at least 250 with a minimum of 100 subjects per trial arm. Sixty-one of these trials (40%) reported HTE analysis, 47 (31%) reported subgroup analysis only, and 45 studies (29%) did not report either type of analysis. Among 104 trials with at least 25 male subjects and 25 female subjects per arm, 14 trials (13.5%) examined gender for HTE, compared with 3/33 (9%) among those not meeting these minimal sample size criteria. Only 5/34 (15%) of trials meeting minimal sample size criteria for race/ethnicity (see Methods) conducted an HTE analysis with respect to race/ethnicity.
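The percentages in this paragraph follow directly from the reported counts, as the short tabulation below shows. The helper name is illustrative.

```python
def percent(numerator: int, denominator: int, digits: int = 1) -> float:
    """Percentage rounded to the given number of decimal places."""
    return round(100.0 * numerator / denominator, digits)


large_trials = 153
reported = {"HTE analysis": 61, "subgroup analysis only": 47, "neither": 45}

# The three categories are mutually exclusive and cover all 153 trials.
category_total = sum(reported.values())

for label, count in reported.items():
    print(label, percent(count, large_trials))

print("gender, criteria met", percent(14, 104))
print("gender, criteria not met", percent(3, 33))
print("race/ethnicity, criteria met", percent(5, 34))
```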
This review of 319 RCTs published in five prominent general medical journals is the most comprehensive to date, and the only one that examines trends of HTE reporting over time. The results suggest that HTE analysis is reported in fewer than one-third of studies published in prominent general medical journals, a result consistent with previous, less comprehensive reviews, and that reporting was only marginally better in 2004 than in 1994. Another 28% reported subgroup-only analyses without formal statistical tests for HTE/interaction.
These trials are missed opportunities. With minimal additional effort, they could have added statistical tests for HTE or interaction in addition to the subgroup results that they reported, nearly doubling the proportion of HTE analyses and enriching the literature with much-needed HTE information. Such tests are critical for appropriate interpretation of results, as differences in subgroup-specific point estimates are meaningful only when evaluated alongside their corresponding confidence intervals.
Considering both HTE analyses and subgroup-only analyses, 57% of the trials in our review reported some kind of subgroup analysis, a proportion that increased to about three-quarters if we examine only the largest trials. Previous research reported equal or higher [27, 32, 33, 35] proportions, possibly because of restriction to trials of a minimum sample size, a specific discipline, or a specific journal.
Among trials that explored heterogeneity of treatment effects, clinical prognostic factors were evaluated frequently; age, gender, and site factors less often; and race/ethnicity rarely. Even among those trials with a sample size adequate for exploring HTE by race/ethnicity, only 15% of trials did so. The limited attention to race/ethnicity is puzzling for two reasons. First, the literature provides examples in which race/ethnicity influences baseline risk of a disease, responsiveness to treatment, and vulnerability to adverse outcomes. Second, growing interest in genomics might be expected to stimulate interest in the treatment-modifying effects of genetic proxies, including imperfect ones like race/ethnicity. Consistent with Parker et al., we found little use of multivariable risk stratification, an approach that may greatly increase statistical power for detecting HTE.
More frequent HTE analysis in North American journals may reflect differences in biostatistical perspectives or biomedical culture. The dominant norm tends to be more conservative in Europe and especially in Britain, perhaps as a result of public payment for care, which demands a higher standard of evidence before treatments are widely accepted and delivered [52–54]. The trend for increased HTE analysis between 1994 and 2004 may be attributable to the revised CONSORT recommendations and a growing awareness of the potential of such analyses. The relatively infrequent use of Forest Plots, even among studies reporting HTE analysis, is regrettable because these plots are a simple, compact, and readily understood method of presenting potential moderators.
Our results should be interpreted in light of several limitations. First, we reviewed a limited number of trials. It is possible that subgroup-specific trials were published in other journals, or during months or years that we did not sample. However, our sample included five large-circulation, high-impact journals with strict peer-review standards, so our sample should be representative of well-designed clinical trials that generalist clinicians are most likely to read and that major news media are most likely to publicize. Second, it is possible that some trials conducted HTE analysis but did not report it because the results were not statistically significant. However, even non-significant data are useful for the purposes of hypothesis generation and, arguably, authors should report any HTE analysis, significant or not, and especially when the analysis is pre-specified. Third, our standards for HTE reporting may not reflect journals' own standards for HTE reporting. A more conservative statistical review process may result in reduced HTE reporting, regardless of the analysis actually conducted in the study. Finally, our recommendations for authors and editors are based on an informal procedure, and should be interpreted in light of this limitation. Further refinement of the recommendations may be necessary before adoption by editors and authors.
Our findings indicate that HTE reporting in the general medical literature is neither rigorous nor routine. Given the increasing recognition of HTE [2, 3, 25], it may be time to develop standards for reporting of exploratory and confirmatory HTE analysis. In 1994, the National Institutes of Health mandated the inclusion of women and racial/ethnic minorities in research populations and, in 2000, supplemented that recommendation with guidelines regarding the reporting of subgroup-specific results of Phase III Clinical Trials. Although these guidelines included a recommendation that investigators report both significant and non-significant results, our data show only modest progress toward that goal. Highlighting variables that deserve further exploration is a first step in identifying groups that may or may not respond better to a given therapy. Because trials in more responsive subgroups have lower sample size requirements, identifying these groups through exploratory subgroup analysis could facilitate relatively cost-effective confirmatory trials. An iterative process of exploratory followed by confirmatory HTE analysis may not only quicken the cycle of discovery but also inform clinical judgment.
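The link between responsiveness and sample size can be illustrated with the usual normal-approximation formula for comparing two means. This is a sketch under stated assumptions: two equal arms, a two-sided significance level of 0.05, 80% power, and hypothetical standardized effect sizes of 0.3 and 0.5. It is not a calculation from the reviewed trials.

```python
import math
from statistics import NormalDist


def n_per_arm(standardized_effect: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate participants per arm for a two-arm comparison of means."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_beta) ** 2 / standardized_effect ** 2)


# A more responsive subgroup (larger standardized effect) needs fewer participants.
for effect in (0.3, 0.5):
    print(effect, n_per_arm(effect))
```

Under these assumptions the required size per arm falls from 175 to 63 as the standardized effect rises from 0.3 to 0.5, since the requirement scales with the inverse square of the effect.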
HTE: heterogeneity of treatment effects
RCTs: randomized controlled trials
ITE: individual treatment effect
CONSORT: Consolidated Standards of Reporting Trials
BMJ: British Medical Journal
JAMA: Journal of the American Medical Association
NEJM: New England Journal of Medicine.
We are grateful to Paul Shekelle, MD (RAND/UCLA, Los Angeles, CA) and Helena Kraemer, PhD (Stanford School of Medicine, Palo Alto, CA) for their helpful editorial input, to Irva Hertz-Picciotto, PhD (UC Davis, Davis, CA) for her helpful input while serving as a member of the Advisory Panel, and to Danielle Seiden, MPP (UCLA, Los Angeles, CA), Elizabeth Yakes, MS, RD (UC Davis), and Kiavash Nikkhou, BS (UCLA) for their contributions to data collection and abstraction.
The study was funded by Pfizer, Inc., under a contract to the academic institutions involved. However, the investigators were solely responsible for all aspects of study design, data collection and analysis, and result reporting.
Dr. Elmore is supported by a Public Health Research grant from the National Institutes of Health (#5 K05 CA104699-04). Dr. Kravitz holds a K24 Midcareer Research and Development Award (#5 K24 MH072756-03) from the National Institute of Mental Health.
This work was presented at the Society of General Internal Medicine meeting, April 25–28, 2007, Toronto, ON, Canada, and at AcademyHealth, June 3–5, 2007, Orlando, FL.
- Longford NT: Selection bias and treatment heterogeneity in clinical trials. Stat Med. 1999, 18: 1467-1474. 10.1002/(SICI)1097-0258(19990630)18:12<1467::AID-SIM149>3.0.CO;2-H.
- Kravitz RL, Duan N, Braslow J: Evidence-based medicine, heterogeneity of treatment effects, and the trouble with averages. Milbank Q. 2004, 82: 661-687. 10.1111/j.0887-378X.2004.00327.x.View ArticlePubMedPubMed CentralGoogle Scholar
- Greenfield S, Kravitz R, Duan N, Kaplan SH: Heterogeneity of treatment effects: implications for guidelines, payment, and quality assessment. Am J Med. 2007, 120: S3-9. 10.1016/j.amjmed.2007.02.002.View ArticlePubMedGoogle Scholar
- Spear BB, Heath-Chiozzi M, Huff J: Clinical application of pharmacogenetics. Trends Mol Med. 2001, 7: 201-204. 10.1016/S1471-4914(01)01986-4.View ArticlePubMedGoogle Scholar
- Connor S: Glaxo chief: Our drugs do not work on most patients. The Independent. 2003Google Scholar
- Kent DM, Ruthazer R, Selker HP: Are some patients likely to benefit from recombinant tissue-type plasminogen activator for acute ischemic stroke even beyond 3 hours from symptom onset?. Stroke. 2003, 34: 464-467. 10.1161/01.STR.0000051506.43212.8B.View ArticlePubMedGoogle Scholar
- Rothwell PM: Treating individuals 2. Subgroup analysis in randomised controlled trials: importance, indications, and interpretation. Lancet. 2005, 365: 176-186. 10.1016/S0140-6736(05)17709-5.View ArticlePubMedGoogle Scholar
- Rothwell PM, Mehta Z, Howard SC, Gutnikov SA, Warlow CP: Treating individuals 3: from subgroups to individuals: general principles and the example of carotid endarterectomy. Lancet. 2005, 365: 256-265.View ArticlePubMedGoogle Scholar
- Kent DM, Hayward RA: Limitations of applying summary results of clinical trials to individual patients: the need for risk stratification. JAMA. 2007, 298: 1209-1212. 10.1001/jama.298.10.1209.View ArticlePubMedGoogle Scholar
- Guyatt G, Sackett D, Adachi J, Roberts R, Chong J, Rosenbloom D, Keller J: A clinician's guide for conducting randomized trials in individual patients. Cmaj. 1988, 139: 497-503.PubMedPubMed CentralGoogle Scholar
- Guyatt G, Sackett D, Taylor DW, Chong J, Roberts R, Pugsley S: Determining optimal therapy – randomized trials in individual patients. N Engl J Med. 1986, 314: 889-892.View ArticlePubMedGoogle Scholar
- Senn S: Individual response to treatment: is it a valid assumption?. BMJ. 2004, 329: 966-968. 10.1136/bmj.329.7472.966.View ArticlePubMedPubMed CentralGoogle Scholar
- Senn S: Controversies concerning randomization and additivity in clinical trials. Stat Med. 2004, 23: 3729-3753. 10.1002/sim.2074.View ArticlePubMedGoogle Scholar
- Hayward RA, Kent DM, Vijan S, Hofer TP: Reporting clinical trial results to inform providers, payers, and consumers. Health Aff (Millwood). 2005, 24: 1571-1581. 10.1377/hlthaff.24.6.1571.View ArticleGoogle Scholar
- Kent DM, Hayward RA, Griffith JL, Vijan S, Beshansky JR, Califf RM, Selker HP: An independently derived and validated predictive model for selecting patients with myocardial infarction who are likely to benefit from tissue plasminogen activator compared with streptokinase. Am J Med. 2002, 113: 104-111. 10.1016/S0002-9343(02)01160-9.View ArticlePubMedGoogle Scholar
- Cook DI, Gebski VJ, Keech AC: Subgroup analysis in clinical trials. Med J Aust. 2004, 180: 289-291.PubMedGoogle Scholar
- Pocock SJ, Assmann SE, Enos LE, Kasten LE: Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med. 2002, 21: 2917-2930. 10.1002/sim.1296.View ArticlePubMedGoogle Scholar
- Altman DG, Schulz KF, Moher D, Egger M, Davidoff F, Elbourne D, Gotzsche PC, Lang T: The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001, 134: 663-694.View ArticlePubMedGoogle Scholar
- Stallones RA: The use and abuse of subgroup analysis in epidemiological research. Prev Med. 1987, 16: 183-194. 10.1016/0091-7435(87)90082-X.View ArticlePubMedGoogle Scholar
- Nissen SE, Tuzcu EM, Schoenhagen P, Brown BG, Ganz P, Vogel RA, Crowe T, Howard G, Cooper CJ, Brodie B: Effect of intensive compared with moderate lipid-lowering therapy on progression of coronary atherosclerosis: a randomized controlled trial. JAMA. 2004, 291: 1071-1080. 10.1001/jama.291.9.1071.View ArticlePubMedGoogle Scholar
- Norman PE, Jamrozik K, Lawrence-Brown MM, Le MT, Spencer CA, Tuohy RJ, Parsons RW, Dickinson JA: Population based randomised controlled trial on impact of screening on mortality from abdominal aortic aneurysm. BMJ. 2004, 329: 1259-10.1136/bmj.38272.478438.55.View ArticlePubMedPubMed CentralGoogle Scholar
- Ruggenenti P, Perna A, Gherardi G, Garini G, Zoccali C, Salvadori M, Scolari F, Schena FP, Remuzzi G: Renoprotective properties of ACE-inhibition in non-diabetic nephropathies with non-nephrotic proteinuria. Lancet. 1999, 354: 359-364. 10.1016/S0140-6736(98)10363-X.View ArticlePubMedGoogle Scholar
- Torriani FJ, Rodriguez-Torres M, Rockstroh JK, Lissen E, Gonzalez-Garcia J, Lazzarin A, Carosi G, Sasadeusz J, Katlama C, Montaner J: Peginterferon Alfa-2a plus ribavirin for chronic hepatitis C virus infection in HIV-infected patients. N Engl J Med. 2004, 351: 438-450. 10.1056/NEJMoa040842.View ArticlePubMedGoogle Scholar
- Matthews JN, Altman DG: Statistics notes. Interaction 2: Compare effect sizes not P values. BMJ. 1996, 313: 808-View ArticlePubMedPubMed CentralGoogle Scholar
- Kraemer HC, Frank E, Kupfer DJ: Moderators of treatment outcomes: clinical, research, and policy importance. JAMA. 2006, 296: 1286-1289. 10.1001/jama.296.10.1286.View ArticlePubMedGoogle Scholar
- Lu M, Lyden PD, Brott TG, Hamilton S, Broderick JP, Grotta JC: Beyond subgroup analysis: improving the clinical interpretation of treatment effects in stroke research. J Neurosci Methods. 2005, 143: 209-216. 10.1016/j.jneumeth.2004.10.002.View ArticlePubMedGoogle Scholar
- Parker AB, Naylor CD: Subgroups, treatment effects, and baseline risks: some lessons from major cardiovascular trials. Am Heart J. 2000, 139: 952-961. 10.1067/mhj.2000.106610.View ArticlePubMedGoogle Scholar
- Horwitz RI, Singer BH, Makuch RW, Viscoli CM: Clinical versus statistical considerations in the design and analysis of clinical research. J Clin Epidemiol. 1998, 51: 305-307. 10.1016/S0895-4356(98)00006-7.
- Feinstein AR: The problem of cogent subgroups: a clinicostatistical tragedy. J Clin Epidemiol. 1998, 51: 297-299. 10.1016/S0895-4356(98)00004-3.
- Kraemer HC, Stice E, Kazdin A, Offord D, Kupfer D: How do risk factors work together? Mediators, moderators, and independent, overlapping, and proxy risk factors. Am J Psychiatry. 2001, 158: 848-856.
- Kraemer HC, Wilson GT, Fairburn CG, Agras WS: Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry. 2002, 59: 877-883. 10.1001/archpsyc.59.10.877.
- Hernandez AV, Boersma E, Murray GD, Habbema JD, Steyerberg EW: Subgroup analyses in therapeutic cardiovascular clinical trials: are most of them misleading? Am Heart J. 2006, 151: 257-264. 10.1016/j.ahj.2005.04.020.
- Assmann SF, Pocock SJ, Enos LE, Kasten LE: Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet. 2000, 355: 1064-1069. 10.1016/S0140-6736(00)02039-0.
- Moreira ED, Stein Z, Susser E: Reporting on methods of subgroup analysis in clinical trials: a survey of four scientific journals. Braz J Med Biol Res. 2001, 34: 1441-1446.
- Wang R, Lagakos SW, Ware JH, Hunter DJ, Drazen JM: Statistics in medicine – reporting of subgroup analyses in clinical trials. N Engl J Med. 2007, 357: 2189-2194. 10.1056/NEJMsr077003.
- Chew M, Villanueva EV, Van Der Weyden MB: Life and times of the impact factor: retrospective analysis of trends for seven medical journals (1994–2005) and their Editors' views. J R Soc Med. 2007, 100: 142-150. 10.1258/jrsm.100.3.142.
- Garfield E: Which medical journals have the greatest impact? Ann Intern Med. 1986, 105: 313-320.
- Lewis S, Clarke M: Forest plots: trying to see the wood and the trees. BMJ. 2001, 322: 1479-1480. 10.1136/bmj.322.7300.1479.
- SAS: Cary, North Carolina, USA: SAS Institute, 9
- Antman EM, Cohen M, Bernink PJ, McCabe CH, Horacek T, Papuchis G, Mautner B, Corbalan R, Radley D, Braunwald E: The TIMI risk score for unstable angina/non-ST elevation MI: A method for prognostication and therapeutic decision making. JAMA. 2000, 284: 835-842. 10.1001/jama.284.7.835.
- Ioannidis JP, Lau J: Heterogeneity of the baseline risk within patient populations of clinical trials: a proposed evaluation algorithm. Am J Epidemiol. 1998, 148: 1117-1126.
- Morrow DA, Antman EM, Snapinn SM, McCabe CH, Theroux P, Braunwald E: An integrated clinical approach to predicting the benefit of tirofiban in non-ST elevation acute coronary syndromes. Application of the TIMI Risk Score for UA/NSTEMI in PRISM-PLUS. Eur Heart J. 2002, 23: 223-229. 10.1053/euhj.2001.2738.
- Rothwell PM, Warlow CP: Prediction of benefit from carotid endarterectomy in individual patients: a risk-modelling study. European Carotid Surgery Trialists' Collaborative Group. Lancet. 1999, 353: 2105-2110. 10.1016/S0140-6736(98)11415-0.
- The ESPRIM trial: short-term treatment of acute myocardial infarction with molsidomine. European Study of Prevention of Infarct with Molsidomine (ESPRIM) Group. Lancet. 1994, 344: 91-97.
- Bjornson CL, Klassen TP, Williamson J, Brant R, Mitton C, Plint A, Bulloch B, Evered L, Johnson DW: A randomized trial of a single dose of oral dexamethasone for mild croup. N Engl J Med. 2004, 351: 1306-1313. 10.1056/NEJMoa033534.
- Milpied N, Deconinck E, Gaillard F, Delwail V, Foussard C, Berthou C, Gressin R, Lucas V, Colombat P, Harousseau JL: Initial treatment of aggressive lymphoma with high-dose chemotherapy and autologous stem-cell support. N Engl J Med. 2004, 350: 1287-1295. 10.1056/NEJMoa031770.
- Cui L, Hung HM, Wang SJ, Tsong Y: Issues related to subgroup analysis in clinical trials. J Biopharm Stat. 2002, 12: 347-358. 10.1081/BIP-120014565.
- Hayward RA, Kent DM, Vijan S, Hofer TP: Multivariable risk prediction can greatly enhance the statistical power of clinical trial subgroup analysis. BMC Med Res Methodol. 2006, 6: 18. 10.1186/1471-2288-6-18.
- Haiman CA, Stram DO, Wilkens LR, Pike MC, Kolonel LN, Henderson BE, Le Marchand L: Ethnic and racial differences in the smoking-related risk of lung cancer. N Engl J Med. 2006, 354: 333-342. 10.1056/NEJMoa033250.
- Conjeevaram HS, Fried MW, Jeffers LJ, Terrault NA, Wiley-Lucas TE, Afdhal N, Brown RS, Belle SH, Hoofnagle JH, Kleiner DE, Howell CD: Peginterferon and ribavirin treatment in African American and Caucasian American patients with hepatitis C genotype 1. Gastroenterology. 2006, 131: 470-477. 10.1053/j.gastro.2006.06.008.
- McDowell SE, Coleman JJ, Ferner RE: Systematic review and meta-analysis of ethnic differences in risks of adverse reactions to drugs used in cardiovascular medicine. BMJ. 2006, 332: 1177-1181. 10.1136/bmj.38803.528113.55.
- Department of Health: A first class service: quality in the new NHS. 1998, London: Department of Health.
- Department of Health: "Faster Access to Modern Treatment": how NICE appraisal will work. 1999, Leeds: NHS Executive.
- Rawlins M: In pursuit of quality: the National Institute for Clinical Excellence. Lancet. 1999, 353: 1079-1082. 10.1016/S0140-6736(99)02381-8.
- NIH guidelines on the inclusion of women and minorities as subjects in clinical research. http://grants.nih.gov/grants/guide/notice-files/NOT-OD-00-048.html
- Trusheim MR, Berndt ER, Douglas FL: Stratified medicine: strategic and economic implications of combining drugs and clinical biomarkers. Nat Rev Drug Discov. 2007, 6: 287-293. 10.1038/nrd2251.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.