Figures in clinical trial reports: current practice & scope for improvement
© Pocock et al; licensee BioMed Central Ltd. 2007
- Received: 21 June 2007
- Accepted: 19 November 2007
- Published: 19 November 2007
Most clinical trial publications include figures, but there is little guidance on what results should be displayed as figures and how.
To evaluate the current use of figures in trial reports, and to make constructive suggestions for future practice.
We surveyed all 77 reports of randomised controlled trials in five general medical journals during November 2006 to January 2007. The numbers and types of figures were determined, and then each figure was assessed for its style, content, clarity and suitability. As a consequence, guidelines are developed for presenting figures, both in general and for each specific common type of figure.
Most trial reports contained one to three figures, mean 2.3 per article. The four main types were flow diagram, Kaplan Meier plot, Forest plot (for subgroup analyses) and repeated measures over time: these accounted for 92% of all figures published. For each type of figure there is a considerable diversity of practice in both style and content which we illustrate with selected examples of both good and bad practice. Some pointers on what to do, and what to avoid, are derived from our critical evaluation of these articles' use of figures.
There is considerable scope for authors to improve their use of figures in clinical trial reports, as regards which figures to choose, their style of presentation and labelling, and their specific content. Particular improvements are needed for the four main types of figures commonly used.
- Flow Diagram
- Statistical Uncertainty
- Forest Plot
- Individual Patient Data
- Trial Report
Much has been written about how to visually display quantitative information [1–4], and some attention has been paid to the specific constraints of including figures in medical journal articles [5–9]. In this article we focus on the use of figures in reports of randomised clinical trials, for which there is little specific guidance available at present.
In order to understand current practice, we undertook a survey of recent publications of randomised clinical trials in major general medical journals. This provides objective evidence as to the extent of use of figures in trial reports, including which types of figure are chosen by authors. This survey then facilitates constructive critical appraisal of both the good features and the limitations of how authors and journals are utilizing figures for visual display of results.
In formulating our recommendations for improved use of figures in future trial reports, we give particular attention to each main type of figure used and draw on published examples to illustrate the specific features of interest.
We identified all 77 reports of randomised clinical trials published in November 2006 to January 2007 in the following five major general medical journals: Annals of Internal Medicine, British Medical Journal (BMJ), Journal of American Medical Association (JAMA), Lancet and New England Journal of Medicine (NEJM). BMJ publishes a shorter version of each article in print, and a longer version online. We have studied the longer versions.
For each article we first noted the number and types of figure included, restricting attention to figures displaying data. There were nine other figures, mostly schema of trial protocols. From past guidance in the literature and our own previous surveys of trial reports in 1999 [11, 12] and 2005 we formulated a prior list of issues that we thought were pertinent to the style and content of both figures in general and the specific common types of figures used in trial reports. With this list in mind, we carefully inspected every figure included in our survey regarding the appropriateness of its presentation and content. This exercise led us to refine our list (presented as recommendations at the end of the Discussion) as to what constitutes good and bad practice in the use of figures, and to select specific examples to add practicality to our illustrative points.
Table: Figures in Reports of Clinical Trials in Five Medical Journals during November 2006 to January 2007. Rows: number of articles (total 77) by journal (Annals of Internal Medicine, British Medical Journal, Journal of American Medical Association, New England Journal of Medicine); number of figures in each article; types of figure in each article (including Kaplan Meier plot and individual patient data).
Flow diagram (66 articles) describing the flow of participants through the various stages of the trial.
Kaplan Meier plot (32 articles) comparing treatments for time-to-event (survival) outcomes.
Forest plot (21 articles) displaying several estimates of treatment effect, usually by subgroups of patients, but occasionally by other comparative features.
Repeated measures plot (20 articles) displaying mean outcomes at baseline and several follow-up times by treatment group.
These four types of plot accounted for 92% of figures in our survey. The remainder comprised bar chart (7 articles), individual patient data display (3 articles), box plot (2 articles) and cumulative distributions (1 article).
We now turn attention to the style and content of specific types of figure. From the survey, we have chosen four examples of each main type of figure, plus a few other examples (20 examples in all) to illustrate the main features to consider, include, and sometimes avoid, in one's use of figures.
The flow diagram is an integral part of the CONSORT guidelines [14, 15], adopted by most major journals. Hence it is meant to be a mandatory requirement for publication in all journals we surveyed, except NEJM which had flow diagrams for half its clinical trial articles. Its aim is to display the flow of participants through each stage, specifically for each randomized group reporting the numbers randomly assigned, receiving intended treatment, completing study protocol, and analysed for the primary outcome.
The Kaplan Meier plot is the routine method of displaying time-to-event (survival) data by treatment group. The event may be death, a non-fatal event (e.g. disease recurrence), a composite outcome (e.g. time to death, myocardial infarction or stroke, whichever occurs first) or occasionally a good outcome (e.g. time to recovery).
A superficial glance at Figure 5 takes in the fact that the PCI group has a slightly higher percentage of endpoints at all times, but this can readily be attributed to chance. Hence, it is good practice to include the hazard ratio, its 95% CI and the logrank P-value on the figure to clarify the (lack of) evidence concerning a treatment difference. Figure 5's footnote includes yet more details on treatment comparisons by year, which is perhaps more than is usually warranted.
Note the horizontal axis is on a log scale i.e. the distance from 0.5 to 1 is the same as the distance from 1 to 2. This makes sense in that a halving and a doubling of odds are of equal magnitude. This use of log scale also makes all CIs symmetric about the estimated effects. The plot usefully gives the overall estimated odds ratio for all patients, and its CI. Figure 9 also gives in tabular form i) the number of deaths and patients by treatment overall and by subgroup and ii) the consequent odds ratios and CIs that are already plotted. This duplication of information is useful or repetitious, depending on the tastes of authors and editors.
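The symmetry of CIs on a log scale can be checked numerically. The following is a minimal sketch, assuming a simple 2×2 table and the usual log-based (Woolf-type) 95% CI for an odds ratio; the counts are hypothetical and are not taken from any trial in the survey.

```python
import math

def odds_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Odds ratio (A vs B) with a 95% CI computed on the log scale."""
    a, b = events_a, n_a - events_a          # events / non-events, group A
    c, d = events_b, n_b - events_b          # events / non-events, group B
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(30, 200, 45, 200)

# On the log scale the two arms of the CI have equal length ...
left_log = math.log(or_) - math.log(lo)
right_log = math.log(hi) - math.log(or_)
# ... whereas on the linear scale the upper arm is longer.
left_lin = or_ - lo
right_lin = hi - or_

print(f"OR {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
print(f"log-scale arms: {left_log:.3f} and {right_log:.3f}")
print(f"linear arms:    {left_lin:.3f} and {right_lin:.3f}")
```

The same reasoning explains why 0.5 and 2 sit equally far from 1 on a log axis: log(2) and log(0.5) differ only in sign.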
The one exception is age. The two CIs for younger and older patients overlap only slightly, and the interaction test has P = 0.05. This might provoke some interest as an exploratory finding suggesting PCI may have more merit in older patients. However, the authors, aware of the dangers of false positive findings across multiple subgroup analyses, mention in the footnote that P < 0.01 was the tough pre-specified criterion for any claims of interaction. Figure 10 also tabulates four-year event rates by treatment and subgroup, which is a useful way of documenting absolute risk and how it varies by subgroup. For instance, event rates are higher for patients with ejection fraction < 50%. The figure's footnote is unduly long, and perhaps much of it should have been in the Methods section instead.
Figure 11 did not provide any interaction tests; instead the text includes the comment "there was no evidence of substantial heterogeneity...." The term HR is a little blunt; to state "hazard ratio" would be clearer. Again to give all HRs and CIs in both figure and tabular form is unnecessarily repetitious.
Since the main inference is about baseline adjusted mean changes, this would have been conveyed better with a plot of mean changes rather than means. From the rest of the article, one deduces that all three intervention groups did somewhat better than the control group at five years, a fact hard to decipher from the figure. Note the 30% drop-outs by five years, which is usefully made clear in Figure 16.
Bar charts are occasionally used to display summary statistics such as means or percentages by treatment groups. However, many authors correctly decide that such relatively simple results are best shown in a table or text rather than as a figure.
Figures are a key element of any trial report. They are often more likely to be noticed by readers than text or tables, and to be disseminated in conferences and discussions, since by their very nature figures catch the eye more readily, and hence have the potential to convey key results more fully and immediately. Figures can reveal unexpected (and expected) patterns in data, and graphs of model estimates can encapsulate the entire picture of what was learnt in a study, much more than can be done in a table. To date, little attention has been paid to what constitutes good practice for producing figures in trial reports. Each journal has its own approach to use of figures, but the key choice rests with what figures authors include in their submitted articles.
Our survey of three months of trial reports in five key journals illustrates on the one hand that the great majority of figures are of four main types (Flow diagram, Kaplan Meier plot, Forest plot and Repeated measures over time), but on the other hand there is a great diversity of style in the way those figures are presented.
If there is any overlap between the standard error bars then the difference is not statistically significant.
If there is a gap between the standard error bars and that gap itself exceeds one standard error then the difference is significant, at P < 0.035 in fact. Thus, a lesser gap may fall short of conventional significance.
If the 95% CIs do not overlap then we have strong evidence of a difference, P < 0.006 in fact. So, a slight overlap between two 95% CIs may still be statistically significant.
Of course, this guide should not substitute for the formal presentation of P-values for comparisons of key interest.
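The P-values quoted in this guide can be reproduced with a short calculation. The sketch below rests on assumptions not spelled out in the guide itself: two independent groups, the same standard error for each mean, and a normal approximation. Under these assumptions the standard error of the difference is √2 times the standard error of each mean.

```python
import math

def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_from_difference(diff_in_se_units):
    """P-value for a difference between two independent means, where the
    difference is given in units of the (assumed common) SE of each mean."""
    # SE of the difference is sqrt(2) times the SE of a single mean.
    return two_sided_p(diff_in_se_units / math.sqrt(2))

# SE bars just touch: the means differ by 2 SE.
p_touch = p_from_difference(2.0)
# Gap between the SE bars equals one SE: the means differ by 3 SE.
p_gap = p_from_difference(3.0)
# 95% CIs just touch: the means differ by 2 * 1.96 SE.
p_ci = p_from_difference(2 * 1.96)

print(f"SE bars touching:         P = {p_touch:.3f}")
print(f"gap of one SE:            P = {p_gap:.3f}")
print(f"95% CIs touching:         P = {p_ci:.4f}")
```

Under these assumptions the three boundary cases give P of roughly 0.157, 0.034 and 0.0056, in line with the thresholds stated above. With unequal standard errors the boundaries shift, so the guide remains a rough visual aid.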
A cynic might observe that i) authors lack imagination and are over-conservative in their use of figures and ii) authors are sloppy in the way they actually present figures. The first point may be unduly harsh, since clinical trials have a limited number of data types, and over time it has become evident which types of figure work in practice. Also, unconventional uses of figures, while having creative potential, may carry the risk that some readers struggle to understand and interpret them. Nevertheless, some types of figure may at present be underutilized, for instance appropriate displays of individual patient data.
We feel there is more justification in the second criticism above as regards sloppiness and inconsistencies in style. Accordingly, we devote the rest of this Discussion to a list of Recommendations for future practice.
One needs to decide which results merit a figure rather than a table. Some figures (e.g. Kaplan Meier plots) would be cumbersome as a table while others (e.g. a bar chart of percentages) may be better in tabular form or in the text.
Every figure needs a good legend, clear labelling and clarity of presentation, and should stand alone in its comprehensibility rather than needing explanation in the text.
Figures should clearly identify each treatment group, and require care in use of colours and background shading since many readers use black-and-white copies.
Figures should indicate the numbers of patients by treatment group in each analysis presented.
Figures should display appropriate measures of uncertainty, e.g. standard error bars or CIs.
Figures should often state the primary inferences to be derived from them, e.g. estimates of treatment effect, their CIs and P-values, since visual inspection alone could lead to misleading interpretations by the reader.
The following recommendations relate to the four main types of figure:
A) Flow Diagram
The flow diagram should include the numbered flow of participants throughout the trial, including the numbers screened and eligible prior to randomization.
It is particularly important to provide the numbers in each group lost to follow-up or excluded from analysis for other reasons.
Some flow diagrams can become indigestible with too many repeated words, especially with several treatment arms. These may be better displayed as a table without loss of information.
B) Kaplan Meier plot
Plots should include numbers at risk over time under the time axis.
The plot should not extend too far in time, to avoid the numbers at risk becoming unduly small.
Plots with relatively low event rates should be displayed going up (i.e. cumulative percent with event on the vertical axis) so that the detail is discernible.
Plots should often include standard error bars at appropriate time points to convey statistical uncertainty. To date this is rarely done.
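The quantities recommended for a Kaplan Meier plot (numbers at risk, cumulative percent with event, standard errors) can all be obtained from one pass through the data. The following is a minimal sketch using the product-limit estimate and Greenwood's formula for the standard error; the follow-up times and event indicators are hypothetical, and the function name is our own.

```python
import math

def kaplan_meier(times, events):
    """Return rows of (time, n_at_risk, n_events, survival, se).

    times  : follow-up time for each patient
    events : 1 if the event occurred at that time, 0 if censored
    se     : Greenwood's standard error of the survival estimate
    One row is produced for each distinct time at which an event occurs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for (u, e) in data if u == t]  # all patients leaving at t
        d = sum(same_time)                            # events at t
        if d > 0:
            survival *= 1 - d / n_at_risk
            if n_at_risk > d:
                greenwood_sum += d / (n_at_risk * (n_at_risk - d))
            se = survival * math.sqrt(greenwood_sum)
            rows.append((t, n_at_risk, d, survival, se))
        n_at_risk -= len(same_time)                   # events and censorings leave
        i += len(same_time)
    return rows

# Hypothetical data for one treatment group, for illustration only.
times = [2, 3, 3, 5, 6, 8, 9, 12, 12, 15]
events = [1, 0, 1, 1, 0, 1, 0, 1, 0, 0]
rows = kaplan_meier(times, events)

for t, n, d, s, se in rows:
    # Cumulative percent with event is the "going up" version of the plot.
    print(f"t={t:>2}  at risk={n:>2}  cumulative % with event={100 * (1 - s):5.1f}"
          f"  SE={100 * se:4.1f}")
```

Printing the number at risk alongside each estimate makes it easy to see where the curve rests on too few patients to be plotted reliably.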
C) Forest plot
In addition to estimates and 95% CIs for various subgroups, Forest plots should also include the overall estimate and its CI. Drawing a vertical dotted line at the overall estimate helps readers to spot any consistency (or otherwise) across subgroups.
One can usefully use varying sizes of square at each estimate to indicate which subgroups are based on a lot (or a little) data.
For plots of hazard ratio, odds ratio or relative risk a log scale is often preferable, leading to symmetric CIs.
The risk scale should provide an appropriate degree of detail, and make clear which direction indicates which treatment is better.
Forest plots can usefully tabulate for each subgroup some of the following: the numbers of patients and numbers with events by treatment, the estimate and its CI and the interaction test P-value. However, this should not result in excessively detailed information for what is an exploratory subgroup analysis.
Interaction tests should be reported rather than subgroup P-values. That is, the difference between "significant" and "non-significant" subgroups may not be statistically significant.
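The point about interaction tests can be illustrated with a small calculation. The sketch below assumes two independent subgroups, a normal approximation on the log hazard ratio scale, and standard errors recovered from the published 95% CIs; the hazard ratios and CIs are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of a log hazard ratio, recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def subgroup_p(hr, ci):
    """P-value for the treatment effect within a single subgroup."""
    return two_sided_p(math.log(hr) / se_from_ci(*ci))

def interaction_p(hr1, ci1, hr2, ci2):
    """P-value for the difference between two subgroup log hazard ratios."""
    diff = math.log(hr1) - math.log(hr2)
    se_diff = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    return two_sided_p(diff / se_diff)

# Hypothetical subgroup results, for illustration only.
p_a = subgroup_p(0.70, (0.50, 0.98))      # subgroup A: CI excludes 1
p_b = subgroup_p(0.90, (0.64, 1.27))      # subgroup B: CI includes 1
p_int = interaction_p(0.70, (0.50, 0.98), 0.90, (0.64, 1.27))

print(f"subgroup A: P = {p_a:.3f}")
print(f"subgroup B: P = {p_b:.3f}")
print(f"interaction: P = {p_int:.3f}")
```

In this made-up example one subgroup is "significant" and the other is not, yet the interaction test gives no evidence that the treatment effect differs between them.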
D) Repeated measures plot
The points for each estimate (usually means) at each time point should be clearly marked and joined by lines for each treatment in a clearly identified manner. With several treatment groups it may be clearer to identify groups by symbols rather than by lines.
Measures of uncertainty, usually standard error bars, should be attached to each point, and the number of patients included should be under the time axis.
It is useful to slightly stagger the groups so means and standard errors don't overlap confusingly.
It is often better to plot mean changes from baseline, rather than means, using analysis of covariance to present baseline adjusted mean changes.
The method of analysis used to make inferences from the repeated measures should be briefly stated on the plot, and it may be useful to add some overall estimate of treatment effect with CI and P-value.
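To show why baseline adjustment matters, the following sketch compares the raw difference in mean change with the baseline adjusted difference from an analysis of covariance. It assumes two groups, a single follow-up time and a common slope on baseline in both groups; the data are artificial and noise-free, constructed so that the true treatment effect is exactly -5.

```python
from statistics import mean

def ancova_adjusted_difference(base_t, follow_t, base_c, follow_c):
    """Return (raw, adjusted) differences in mean change, treatment minus
    control. The adjusted value is the least-squares ANCOVA estimate with
    baseline as covariate and a common slope in both groups."""
    def sxx_sxy(x, y):
        mx, my = mean(x), mean(y)
        sxx = sum((xi - mx) ** 2 for xi in x)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        return sxx, sxy

    change_t = [f - b for b, f in zip(base_t, follow_t)]
    change_c = [f - b for b, f in zip(base_c, follow_c)]
    sxx_t, sxy_t = sxx_sxy(base_t, change_t)
    sxx_c, sxy_c = sxx_sxy(base_c, change_c)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)      # pooled within-group slope
    raw = mean(change_t) - mean(change_c)
    adjusted = raw - slope * (mean(base_t) - mean(base_c))
    return raw, adjusted

# Artificial data: follow-up = 10 + 0.5 * baseline, minus 5 on treatment.
# The treatment group happens to start 10 units higher at baseline.
base_t = [140, 150, 160, 170]
base_c = [130, 140, 150, 160]
follow_t = [10 + 0.5 * b - 5 for b in base_t]
follow_c = [10 + 0.5 * b for b in base_c]

raw, adjusted = ancova_adjusted_difference(base_t, follow_t, base_c, follow_c)
print(f"raw difference in mean change:      {raw:.1f}")
print(f"baseline adjusted difference:       {adjusted:.1f}")
```

Here the chance imbalance at baseline doubles the apparent effect in the raw mean changes, while the adjusted estimate recovers the effect built into the data. A real analysis would also report a CI and P-value for the adjusted difference.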
In conclusion, we hope these pointers enhance the quality of clinical trial reports with respect to use of figures. A similar enquiry may be of value for other types of study, e.g. reports of observational studies in epidemiology, so that all journal articles pay appropriate attention to the informative use of figures.
- Tufte ER: The Visual Display of Quantitative Information. 1983, Cheshire, CT: Graphics Press.
- Cleveland WS: The Elements of Graphing Data. 1994, Summit, NJ: Hobart Press.
- Finney DJ: On presenting tables and diagrams. Scholarly Publishing. 1986, 17: 327-342.
- Gelman A, Pasarica C, Dodhia R: Let's practice what we preach: turning tables into graphs. Am Statistician. 2002, 56: 121-130. 10.1198/000313002317572790.
- Cooper RJ, Schriger DL, Close RJH: Graphical literacy: the quality of graphs in a large-circulation journal. Ann Emerg Med. 2002, 40: 317-322. 10.1067/mem.2002.127327.
- Schriger DL, Cooper RJ: Achieving graphical excellence: suggestions and methods for creating high-quality visual displays of experimental data. Ann Emerg Med. 2001, 37: 75-87. 10.1067/mem.2001.111570.
- Puhan MA, Riet G, Eichler K, Steurer J, Bachmann LM: More medical journals should inform their contributors about three key principles of graph construction. J Clin Epi. 2006, 59: 1017-22.
- Bryant TN: Presenting graphical information. Pediatr Allergy Immunol. 1999, 10: 4-13. 10.1034/j.1399-3038.1999.101004.x.
- Lang T, Secic M: Visual displays of data and statistics. Chapter 21 of How to Report Statistics in Medicine. 2006, Philadelphia: American College of Physicians, 2nd edition.
- Schriger DL, Sinha R, Schroter S, Liu PY, Altman DG: From Submission to Publication: A Retrospective Review of the Tables and Figures in a Cohort of Randomized Controlled Trials Submitted to the British Medical Journal. Ann Emerg Med. 2006.
- Assmann SF, Pocock SJ, Enos LE, Kasten LE: Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet. 2000, 355: 1064-69. 10.1016/S0140-6736(00)02039-0.
- Pocock SJ, Clayton TC, Altman DG: Survival plots of time-to-event outcomes in clinical trials: good practice and pitfalls. Lancet. 2002, 359: 1686-89. 10.1016/S0140-6736(02)08594-X.
- Pocock SJ, Travison T, Wruck L: Figures in clinical trials reports. PSI workshop "Imaginative Uses of Visual Displays". London, 14 Sept. 2005.
- Moher D, Schulz KF, Altman DG: The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Ann Intern Med. 2001, 134: 657-662.
- Altman DG, Schulz KF, Moher D: The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann Intern Med. 2001, 134: 663-694.
- Lewis S, Clarke M: Forest plots: trying to see the wood and the trees. Brit Med J. 2001, 322: 1479-80. 10.1136/bmj.322.7300.1479.
- Cuzick J: Forest plots and the interpretation of subgroups. Lancet. 2005, 365: 1308. 10.1016/S0140-6736(05)61026-4.
- Fitzmaurice GM, Laird NM, Ware JH: Adjustment for baseline response. Section 5.5 of Applied Longitudinal Analysis. 2004, New York: Wiley.
- Beunckens C, Molenberghs G, Kenward MG: Direct likelihood analysis versus simple forms of imputation for missing data in randomized clinical trials. Clinical Trials. 2005, 2: 379-386. 10.1191/1740774505cn119oa.
- van Belle G: Bargraphs waste ink: they do not illuminate complex relationships. Section 7.5 of Statistical Rules of Thumb. 2002, New York: Wiley.
- Cumming G, Fidler F, Vaux DL: Error bars in experimental biology. J Cell Biology. 2007, 177: 7-11. 10.1083/jcb.200611141.
- Gelman A, Stern H: The difference between "significant" and "not significant" is not itself statistically significant. Amer Statistician. 2006, 60: 328-31. 10.1198/000313006X152649.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.