Practical methods for incorporating summary time-to-event data into meta-analysis
Trials volume 8, Article number: 16 (2007)
Abstract
Background
In systematic reviews and meta-analyses, time-to-event outcomes are most appropriately analysed using hazard ratios (HRs). In the absence of individual patient data (IPD), methods are available to obtain HRs and/or associated statistics by carefully manipulating published or other summary data. Awareness and adoption of these methods is somewhat limited, perhaps because they are published in the statistical literature using statistical notation.
Methods
This paper aims to 'translate' the methods for estimating a HR and associated statistics from published time-to-event analyses into less statistical and more practical guidance, and to provide a corresponding, easy-to-use calculations spreadsheet to facilitate the computational aspects.
Results
A wider audience should be able to understand published time-to-event data in individual trial reports and use it more appropriately in meta-analysis. When faced with particular circumstances, readers can refer to the relevant sections of the paper. The spreadsheet can be used to assist them in carrying out the calculations.
Conclusion
The methods cannot circumvent the potential biases associated with relying on published data for systematic reviews and meta-analysis. However, this practical guide should improve the quality of the analysis and subsequent interpretation of systematic reviews and meta-analyses that include time-to-event outcomes.
Background
Time-to-event outcomes take account of whether an event takes place and also the time at which the event occurs, such that both the event and the timing of the event are important. For example, in cancer a cure may not be possible, but it is hoped that a new intervention will increase the duration of survival. Therefore, although the same or similar number of deaths may be observed, it is hoped that a new intervention will decrease the rate at which they take place. Other examples of outcomes where the timing of events may be vital in assessing the value of an intervention include: time free of seizures in epilepsy; time to conception in fertility treatment; time to resolution of symptoms of flu and time to fever in chickenpox.
Odds ratios (ORs) or relative risks (RRs) that measure only the number of events and take no account of when they occur are appropriate for measuring dichotomous outcomes, but less appropriate for analysing time-to-event outcomes. Using such dichotomous measures in a meta-analysis of time-to-event outcomes can pose additional problems. If the total number of events reported for each trial is used to calculate an OR or RR, this can involve combining trials reported at different stages of maturity, with variable follow-up, resulting in an estimate that is both unreliable and difficult to interpret. Alternatively, ORs or RRs can be calculated at specific points in time, making estimates comparable and easier to interpret, at least at those time points. However, interpretation is difficult, particularly if individual trials do not contribute data at each time point. Furthermore, bias could arise if the time points are subjectively chosen by the systematic reviewer or selectively reported by the trialist at times of maximal or minimal difference between intervention groups.
Time-to-event outcomes are most appropriately analysed using hazard ratios (HRs), which take account of the number and timing of events, and the time until last follow-up for each patient who has not experienced an event, i.e. has been censored. HRs can be estimated by carefully manipulating published or other summary data [1, 2], but currently such methods are underused in meta-analyses. For example, Issue 3, 2006 of the Cochrane Library contained 43 cancer meta-analyses based on published data that included an analysis of survival and were not conducted by the current authors. Only sixteen of these estimated HRs and the remainder calculated ORs or RRs. This may reflect that the trials included in these meta-analyses did not report the necessary statistical information [3, 4] to allow estimation of HRs. However, if there is sufficient data available to estimate an OR or RR, there is usually sufficient data to estimate a HR. Therefore, we suspect that use of the methods is limited because awareness is limited or because the statistical notation used to describe them may be difficult to follow for those with little formal statistical training. Furthermore, it is common for information on the effects of interventions to be presented in a number of different ways and it may not be clear which of the published methods is most appropriate.
Our aim in this paper is to provide step-by-step guidance on how to calculate a HR and the associated statistics for individual trials, according to the information presented in the trial report. To facilitate this we have translated the relevant equations (Appendix 1) from the previously reported statistical methods [1, 2] into more descriptive versions, using familiar terms and explaining all arithmetic manipulations as simply as possible. We illustrate their use with data extracted from two cancer trial reports [5, 6].
Basic requirements for a meta-analysis based on hazard ratios
A meta-analysis of HRs, in common with meta-analyses of other effect measures, such as the RR or OR, usually involves a 2-stage process. In the first stage, a HR is estimated for each trial and in the second stage, these HRs are pooled in a meta-analysis. A fixed-effect meta-analysis of HRs can use the method of Peto [7]:
ln(pooled HR) = ∑(O-E) / ∑V   (1)
where ∑ is the "sum of" the respective values for each trial and "ln" is the natural logarithm (log). The logrank Observed minus Expected events (O-E) and the logrank Variance (V) are derived from the number of events and the individual times to event on the research arm of each trial. Alternatively, the inverse variance approach can be used [1]:
ln(pooled HR) = ∑(lnHR / V*) / ∑(1 / V*)   (2)
which uses the Variance of the lnHR (V*) and the log Hazard Ratio (lnHR) for each trial.
If the HR and V or lnHR and V* are presented in a trial report, they can be used directly in a fixed-effect meta-analysis using (1) or (2) respectively. Similarly, if the coefficient of the treatment effect and the variance from a Cox model are provided, which correspond to the lnHR and V*, they too can be used directly in a fixed-effect meta-analysis using (2). These same statistics can be employed if a random-effects meta-analysis [8] is required. Where they are not reported, however, it is necessary to estimate the O-E and V or the lnHR and V* for each trial, in order to combine them in a meta-analysis.
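The two fixed-effect pooling approaches described in this section (the Peto method, which sums O-E and V across trials, and the inverse variance method, which weights each lnHR by 1/V*) can be sketched in a few lines of Python. This is a minimal illustration only: the function names are our own and the three pairs of O-E and V values are hypothetical, chosen purely to show the arithmetic.

```python
import math

def pool_peto(o_minus_e, v):
    """Peto method: ln(pooled HR) = sum of (O-E) / sum of V."""
    total_v = sum(v)
    return math.exp(sum(o_minus_e) / total_v), total_v

def pool_inverse_variance(ln_hr, v_star):
    """Inverse variance method: each lnHR is weighted by 1 / V*."""
    weights = [1.0 / vs for vs in v_star]
    pooled_ln_hr = sum(w * x for w, x in zip(weights, ln_hr)) / sum(weights)
    # The variance of the pooled lnHR is the reciprocal of the summed weights.
    return math.exp(pooled_ln_hr), 1.0 / sum(weights)

# Hypothetical (O-E, V) pairs for three trials; illustrative values only.
o_minus_e = [-5.0, -12.5, 3.0]
v = [20.0, 60.0, 15.0]

hr_peto, v_total = pool_peto(o_minus_e, v)

# Convert the same trials to lnHR and V* and pool them the other way.
ln_hr = [oe / vi for oe, vi in zip(o_minus_e, v)]
v_star = [1.0 / vi for vi in v]
hr_iv, v_star_pooled = pool_inverse_variance(ln_hr, v_star)

print(round(hr_peto, 3), round(hr_iv, 3))
```

When each trial's lnHR is taken as (O-E)/V and its V* as 1/V, the two routes give the same pooled HR, which is why either pair of statistics can be carried into the second stage.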
Generating the O-E, V, HR and lnHR from reported summary statistics
There are many ways to use the summary statistical data presented in trial reports to estimate the O-E, V, V*, HR and lnHR. Some methods use the reported information to directly calculate the HR or lnHR and V or V* and are described in sections 1–2. However, it is more likely that a trial report will only provide sufficient information to estimate some or all of the HR, lnHR, O-E, V and V* by indirect methods that make certain assumptions, and these indirect methods are described in sections 3–9. For some of these methods, it is necessary to estimate V and then derive V*, and for others the converse. Each is the reciprocal of the other:
V* = 1 / V   (3)
V = 1 / V*   (4)
V is used to denote the logrank Variance and V* to denote the variance of the lnHR.
If even these indirect methods cannot be applied, then it may be possible to generate the necessary statistics from published Kaplan-Meier curves (sections 10–11). For any set of trials, it is likely that a number of these methods will be required, and for any one trial, it may be possible to use more than one method.
Extraction of summary statistics from trial reports
At the outset, it is worthwhile extracting all the necessary descriptive and statistical information for the outcome of interest for each trial [9], using a standard form (e.g. Table 1). The term "research" is used to denote the research intervention and "control" to denote the standard or control arm. Numbers have been rounded to two decimal places for presentation, but not for the underlying calculations. Rounding should in fact be avoided when making these calculations.
1. Report presents O & E or hazard rates on research and control arm
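If both the observed (O) and logrank expected events (E) on the research and control arm are presented in a trial report, then the HR can be calculated directly as the ratio of the hazard rates:
HR = (Observed events research / logrank Expected events research) / (Observed events control / logrank Expected events control)   (5)
The associated V can also be calculated directly:
V = 1 / [(1 / logrank Expected events research) + (1 / logrank Expected events control)]   (6)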
These statistics were included in our example report of an ovarian cancer trial [5]:
Observed events research = 34 Expected events research = 28.0
Observed events control = 24 Expected events control = 29.9
Using these data and equations (5) and (6), the HR and V can be calculated directly:
HR = (34 / 28.0) / (24 / 29.9) = 1.51
V = 1 / [(1 / 28.0) + (1 / 29.9)] = 14.46
The O-E is the number of observed events minus the logrank expected events on the research arm.
O - E = 34 - 28.0 = 6.00
If a hazard rate for each of the research and control arms is presented in a trial report they can replace the top and bottom of equation (5). Based on the example above, the hazard rate on the research arm of 1.21 and on control of 0.80 would be used to obtain a HR of 1.51. Such hazard rates cannot be used to calculate directly the associated V, which would need to be estimated using an indirect method (see below).
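The direct calculation in this section can be reproduced with a short Python sketch. The function names are our own; the inputs are the observed and logrank expected events quoted above for the ovarian cancer trial.

```python
def hr_from_o_and_e(o_research, e_research, o_control, e_control):
    """HR as the ratio of the hazard rates (O / E) on research and control."""
    return (o_research / e_research) / (o_control / e_control)

def v_from_expected(e_research, e_control):
    """Logrank variance from the expected events on each arm."""
    return 1.0 / (1.0 / e_research + 1.0 / e_control)

# Values reported for the ovarian cancer trial.
hr = hr_from_o_and_e(34, 28.0, 24, 29.9)
v = v_from_expected(28.0, 29.9)
o_minus_e = 34 - 28.0

# Rounded for presentation only: about 1.51, 14.46 and 6.0.
print(round(hr, 2), round(v, 2), round(o_minus_e, 2))
```

Rounding is applied only when printing, in line with the advice to avoid rounding in the underlying calculations.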
2. Report presents O-E on research arm and logrank V
If a trial report presents the O-E on the research arm and V, the HR can be calculated directly:
HR = exp[(O-E) / V]   (7)
Note that "exp" represents the exponential or inverse of the natural log. HRs calculated using formula (7) will not differ markedly from the formal definition described previously (5), unless the event rate in a trial is low [1].
For illustration purposes, the data derived from the ovarian cancer trial report [5] are shown:
O-E = 6.00
V = 14.46
Using the calculated O-E and V in equation (7) gives a HR of 1.51:
HR = exp(6.00 / 14.46) = 1.51
Note that equation (7) can be rearranged by simple algebra thus:
V = (O-E) / ln(HR)   (8)
O-E = ln(HR) × V   (9)
If the HR and O-E are reported, you can calculate V. Alternatively, if the HR and V are reported, you can calculate the O-E. Equations (8) and (9) are useful for some of the indirect methods presented later.
Equation (5) is the preferred estimate for the HR, although it will only differ markedly from (7) when the total number of events in a trial is small [1].
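The relationship between the HR, the O-E and V, and its two rearrangements, can be written as three small Python helpers. This is a sketch with function names of our own choosing; the inputs are the O-E of 6.00 and V of 14.46 from the ovarian cancer example.

```python
import math

def hr_from_oe_and_v(o_minus_e, v):
    """HR = exp((O-E) / V)."""
    return math.exp(o_minus_e / v)

def v_from_hr_and_oe(hr, o_minus_e):
    """V = (O-E) / ln(HR); undefined when the HR is exactly 1."""
    return o_minus_e / math.log(hr)

def oe_from_hr_and_v(hr, v):
    """O-E = ln(HR) x V."""
    return math.log(hr) * v

hr = hr_from_oe_and_v(6.00, 14.46)
print(round(hr, 2))  # about 1.51, as in the text

# The rearrangements recover the starting values.
v_back = v_from_hr_and_oe(hr, 6.00)
oe_back = oe_from_hr_and_v(hr, 14.46)
```

The helper for V divides by ln(HR), so it cannot be used when a report gives a HR of exactly 1.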
3. Report presents HR and confidence intervals
Where the HR and its associated confidence interval (CI) are presented in a trial report, V* (the variance of the lnHR) and subsequently, if necessary, V, can be estimated from the CI, provided the CI is given to two significant figures:
V* = [(ln(upper CI) - ln(lower CI)) / (2 × z-score)]^2   (10)
The top half of the equation uses the log of the upper and lower CI and the bottom half the z-score for the upper boundary of the confidence interval. In the usual situation of a 95% CI being presented, the corresponding z-score is 1.96. Thus, whenever a trial reports a HR and an associated 95% CI, this version of equation (10) can be used to calculate V*:
V* = [(ln(upper 95% CI) - ln(lower 95% CI)) / (2 × 1.96)]^2
For a 99% CI, the z-score is 2.58 and for a 90% CI the z-score is 1.645.
To demonstrate this and the rest of the indirect methods we use a report of a trial of chemotherapy versus no chemotherapy for bladder cancer [6]. The data extracted from the trial report are shown in Table 1.
Inserting the 95% CI (0.71–1.02, Table 1) and the z-score of 1.96 into equation (10):
V* = [(ln(1.02) - ln(0.71)) / (2 × 1.96)]^2 = 0.00854
and using the estimated V* (without rounding) in equation (4):
V = 1 / V* = 117.07
gives an estimate of the logrank V of 117.07. With both the reported HR of 0.85 and the estimated V, equation (9) can be used to obtain an O-E of -19.03:
O - E = ln(0.85) × 117.07 = -19.03
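Note that if a HR of an event on control versus the research arm is reported rather than vice versa, then a HR of the research arm versus control is obtained by taking the reciprocal of the HR, i.e. 1/HR, and of each limit of the associated CI. The limits swap over, so that the reciprocal of the reported upper limit becomes the new lower limit.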
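The estimation of V* from a reported HR and CI can be sketched in Python as follows. The function name is our own, and the z-score is taken from the standard normal distribution in the standard library rather than typed in as 1.96, so the results may differ from hand calculations in the later decimal places. The HR of 0.85 and 95% CI of 0.71 to 1.02 are those quoted for the bladder cancer trial.

```python
import math
from statistics import NormalDist

def v_star_from_ci(lower, upper, level=0.95):
    """Variance of the lnHR from the limits of a two-sided confidence interval."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for a 95% CI
    return ((math.log(upper) - math.log(lower)) / (2 * z)) ** 2

v_star = v_star_from_ci(0.71, 1.02)
v = 1.0 / v_star                 # logrank variance is the reciprocal of V*
o_minus_e = math.log(0.85) * v   # O-E from the reported HR and estimated V

print(round(v, 2), round(o_minus_e, 2))
```

Passing `level=0.99` or `level=0.90` handles reports that give a 99% or 90% CI instead.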
4. Report presents HR and events in each arm (and the randomisation ratio is 1:1)
Where a HR is reported, without the associated CI, but with the numbers of events on each arm, and the randomisation ratio is 1:1, a reasonable approximation of V may be obtained using equation (11):
V = (Observed events research × Observed events control) / Total observed events   (11)
Using the relevant data from the bladder cancer trial (Table 1), equation (11) and then equation (9):
V = (229 × 256) / 485 = 120.87
O - E = ln(0.85) × 120.87 = -19.64
gives an estimate of 120.87 for V and -19.64 for the O-E.
5. Report presents HR and total events (and the randomisation ratio is 1:1)
If only the total number of events is reported along with the HR, the variance can be approximated simply using the total number of events, provided again that the randomisation ratio is 1:1:
V = Total observed events / 4   (12)
where the total observed events is the sum of the observed events on the research and control arms.
Using the total number of events from the bladder cancer trial report (Table 1) gives an estimate of 121.25 for V. Using this together with the reported HR and equation (9) gives a figure of -19.70 for the O-E:
V = 485 / 4 = 121.25
O - E = ln(0.85) × 121.25 = -19.70
This particular method of estimating V also provides a simple way of checking (approximately) the plausibility of estimates of V derived using other equations.
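The two approximations for V in sections 4 and 5 can be compared side by side in a short Python sketch. The function names are our own. Table 1 is not reproduced here, so the event counts of 229 (research) and 256 (control) are back-calculated from the V of 120.87 and the total of 485 events implied by the V of 121.25; they should be treated as an assumption.

```python
import math

def v_from_events_per_arm(events_research, events_control):
    """V approximated from the events on each arm (1:1 randomisation)."""
    return events_research * events_control / (events_research + events_control)

def v_from_total_events(total_events):
    """V approximated as a quarter of the total events (1:1 randomisation)."""
    return total_events / 4.0

ln_hr = math.log(0.85)  # reported HR for the bladder cancer trial

v_arms = v_from_events_per_arm(229, 256)
v_total = v_from_total_events(229 + 256)

oe_arms = ln_hr * v_arms
oe_total = ln_hr * v_total

print(round(v_arms, 2), round(v_total, 2))
```

The two estimates of V differ by less than half an event here, which illustrates how the total-events version can serve as a quick plausibility check on other estimates.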
6. Report presents HR, total events and the numbers randomised on each arm
If the randomisation ratio is not 1:1, methods 4 and 5 are not appropriate and one that accounts for the proportion of patients randomised to each arm is needed. If a report describes an analysis that is not based on all randomised patients, some patients having been excluded subsequent to randomisation, then the HR and V should be based on the numbers analysed in the report rather than the numbers randomised, otherwise the precision of the estimate will be exaggerated:
V = (Total observed events × Analysed research × Analysed control) / (Analysed research + Analysed control)^2   (13)
If more than one analysis is presented, for example, one based on eligible patients and one based on all randomised patients, it is preferable to use the analysis based on all randomised patients.
This method can also be used if the randomisation ratio is 1:1. In the bladder cancer trial report, all randomised patients were included in the analysis and so the number randomised in each arm equals the number analysed (Table 1). Equation (13) can be used to estimate V and equation (9) to estimate the O-E:
V = (485 × 491 × 485) / (491 + 485)^2 = 121.25
O - E = ln(0.85) × 121.25 = -19.70
For a trial that randomised patients according to a 1:1 ratio, but analysed unequal numbers of patients on each arm because, for example, patients were excluded differentially by arm, equation (13) is the preferred indirect method of estimating the variance.
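A sketch of the variance approximation that allows for unequal numbers on each arm is given below. The function name is our own. The first call uses the bladder cancer figures (485 events, 491 and 485 patients analysed); the second uses a purely hypothetical 2:1 allocation with the same number of events to show how far the simpler total/4 approximation would overstate V.

```python
def v_from_events_and_numbers(total_events, n_research, n_control):
    """V approximated from total events and the numbers analysed on each arm."""
    return total_events * n_research * n_control / (n_research + n_control) ** 2

# Bladder cancer trial: near 1:1, so V is close to 485 / 4 = 121.25.
v_balanced = v_from_events_and_numbers(485, 491, 485)

# Hypothetical 2:1 allocation with the same number of events.
v_unbalanced = v_from_events_and_numbers(485, 600, 300)

print(round(v_balanced, 2), round(v_unbalanced, 2))
```

With 2:1 allocation the estimate falls well below a quarter of the total events, so using the 1:1 shortcut there would make the trial look more precise than it is.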
7. Report presents p-value and events in each arm (and the randomisation ratio is 1:1)
If only the logrank, Mantel-Haenszel or even the Cox regression p-value, and numbers of events on each arm are reported and the randomisation ratio is 1:1, these data can be used to estimate the O-E using:
O-E = √(Observed events research × Observed events control / Total observed events) × z-score for (p-value / 2)   (14)
For reliability, it is probably wise to use this method only when the exact p-value is given to at least 2 significant figures [1, 2]. As well as the events on each arm and overall, a z-score for the 2-sided p-value divided by 2 is required. If a 1-sided p-value is reported it can be used directly to obtain the z-score. Such a z-score can be derived from either statistical tables or statistical or spreadsheet software (e.g. MS Excel).
A decision to assign a positive or negative value to the O-E is needed and this depends on whether the direction of the effect is in favour of the research or control arm. This in turn will depend on whether the outcome is positive or negative. For a positive outcome, such as time to pregnancy, more pregnancies and/or a shorter time to pregnancy on the research arm compared to the control arm will indicate that the effect is in favour of the research arm. For a negative outcome, such as time to death, fewer deaths and/or a longer time to death on the research compared to the control arm will indicate that the effect is in favour of the research arm. If the results are not statistically significantly in favour of either the research or control arm, or if the relative numbers of events on each arm are not provided, it is possible to look for other indicators of the direction of the results, such as the relative numbers of events on each arm, separation of Kaplan-Meier curves or textual descriptions of the results.
The logrank p-value of 0.075 gives a z-score of 1.78 and incorporating this with the number of events on each arm (Table 1) into equation (14):
O-E = √(229 × 256 / 485) × 1.78 = 19.57
gives an O-E of 19.57. It is clear from the report of the bladder cancer trial that survival favours the research treatment, with fewer deaths and a longer time to death in the research arm. Therefore, the O-E will be made negative (-19.57). Then, using equations (11) and (7):
V = (229 × 256) / 485 = 120.87
HR = exp(-19.57 / 120.87) = 0.85
V is estimated as 120.87 and the HR as 0.85.
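The p-value route can be sketched in Python using the standard library's normal distribution to obtain the z-score. The function names and the boolean flag for the direction of effect are our own; the event counts of 229 and 256 are back-calculated from the figures in the text because Table 1 is not reproduced here. The direction of the effect still has to be judged from the report, as described above.

```python
import math
from statistics import NormalDist

def z_from_two_sided_p(p_value):
    """z-score corresponding to half of a 2-sided p-value."""
    return NormalDist().inv_cdf(1 - p_value / 2)

def oe_from_p_and_events(p_value, events_research, events_control,
                         research_has_lower_event_rate):
    """O-E from a 2-sided p-value and the events on each arm (1:1 ratio)."""
    total = events_research + events_control
    magnitude = math.sqrt(events_research * events_control / total) \
        * z_from_two_sided_p(p_value)
    # O-E is negative when fewer events than expected occur on research.
    return -magnitude if research_has_lower_event_rate else magnitude

z = z_from_two_sided_p(0.075)
o_minus_e = oe_from_p_and_events(0.075, 229, 256, True)
v = 229 * 256 / (229 + 256)
hr = math.exp(o_minus_e / v)

print(round(z, 2), round(hr, 2))
```

Because the z-score is not rounded to 1.78 before use, the O-E comes out a few thousandths away from the hand-calculated 19.57.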
8. Report presents p-value and total events (and the randomisation ratio is 1:1)
A similar equation to (14) can be used if just the p-value and the total number of events are reported, provided the randomisation ratio (or the ratio of patients analysed) is 1:1:
O-E = 1/2 × √(Total observed events) × z-score for (p-value / 2)   (15)
Using equation (15):
O-E = 1/2 × √485 × 1.78 = 19.60
As before, a sign needs to be applied based on the direction of the results, giving -19.60. Then using (12) and (7):
V = 485 / 4 = 121.25
HR = exp(-19.60 / 121.25) = 0.85
gives estimates of 121.25 for V and 0.85 for the HR.
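9. Report presents p-value, total events and numbers randomised to each arm
Where the report presents the p-value, the total events and the numbers randomised on each arm, another equation similar to (14) allows estimation of the O-E for trials where the randomisation (or analysis) ratio is not 1:1:
O-E = [√(Total observed events × Analysed research × Analysed control) / (Analysed research + Analysed control)] × z-score for (p-value / 2)   (16)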
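Using (16):
O-E = [√(485 × 491 × 485) / (491 + 485)] × 1.78 = 19.60
Applying a negative sign on the basis of the direction of the results (-19.60) and equations (13) and (7):
V = (485 × 491 × 485) / (491 + 485)^2 = 121.25
HR = exp(-19.60 / 121.25) = 0.85
provides an estimate of 121.25 for V and 0.85 for the HR.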
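Sections 8 and 9 can be sketched together, since both start from the p-value and the total number of events. The function names are our own, the z-score again comes from the standard library, and the results are unsigned magnitudes to which the direction of effect must be applied by the reviewer.

```python
import math
from statistics import NormalDist

def z_from_two_sided_p(p_value):
    """z-score corresponding to half of a 2-sided p-value."""
    return NormalDist().inv_cdf(1 - p_value / 2)

def abs_oe_from_p_and_total(p_value, total_events):
    """Magnitude of O-E from the p-value and total events (1:1 ratio)."""
    return 0.5 * math.sqrt(total_events) * z_from_two_sided_p(p_value)

def abs_oe_from_p_total_and_numbers(p_value, total_events, n_research, n_control):
    """Magnitude of O-E allowing for unequal numbers analysed on each arm."""
    root = math.sqrt(total_events * n_research * n_control)
    return root / (n_research + n_control) * z_from_two_sided_p(p_value)

# Bladder cancer trial: survival favours research, so O-E is made negative.
oe_8 = -abs_oe_from_p_and_total(0.075, 485)
oe_9 = -abs_oe_from_p_total_and_numbers(0.075, 485, 491, 485)

v_9 = 485 * 491 * 485 / (491 + 485) ** 2
hr_9 = math.exp(oe_9 / v_9)

print(round(oe_8, 1), round(oe_9, 1), round(hr_9, 2))
```

For this nearly balanced trial the two versions agree closely; they would diverge as the allocation ratio moves away from 1:1.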
Generating the O-E, V, HR and lnHR from published Kaplan-Meier curves
Some time-to-event analyses are presented solely in the form of Kaplan-Meier curves [1, 10]. It is possible to estimate the HR, lnHR, O-E and V from a number of time intervals from such curves and pool across these time intervals within a trial to estimate a HR or lnHR that represents the whole curve (sections 10–11). Alongside this, the reported minimum and maximum follow-up times or the reported numbers at risk can be used to estimate the amount of censoring in a trial. Otherwise, the estimate of effect would be based on too many patients and so be erroneously precise. If a trial report does not present either the numbers at risk or the actual minimum and maximum follow-up, then it may be possible to estimate the level of follow-up from other information provided (Appendix 2).
Extraction of curve data from trial reports
A sufficiently large, clear copy of the curve needs to be divided up into a number of time intervals, which give a good representation of event rates over time, whilst limiting the number of events within any time interval. Parmar et al. [1] suggest that, as far as possible, the event rate within a time interval should be no more than 20% of those at the start of the time interval. If the curve starts to level off, then few (or no) events are taking place and there is little value in extracting data from this area of a curve. Also, the final interval should not extend beyond the actual or estimated maximum follow-up.
For example, in a trial of metastatic breast cancer, many events (deaths) will occur in the first 3 months, so the curve would need to be split into smaller intervals at the beginning then gradually larger time intervals (e.g. monthly for the first 12 months, 3-monthly to 24 months and then 6-monthly thereafter). However, the curve from the bladder cancer trial (Figure 1) shows an event (death) rate that is quite high in the earlier parts of the curve, but is subsequently fairly steady. Therefore, the curve was divided into 3-monthly intervals for the first 3 years and 6-monthly intervals thereafter (Figure 1). The percentage survival for each arm at the start of each time interval was then extracted into Table 2.
10. Report presents Kaplan-Meier curve and information on follow-up
For each time interval and for each arm a number of iterative calculations are required. It is necessary to estimate the number of patients who were: 1) event-free at the start of the interval, 2) censored during the interval and 3) at risk during the interval. Also, 4) the number of events during each interval needs to be estimated. Together these items are used to: 5) estimate the O-E, V and HR for each time interval. Finally, 6) the O-E, V and HR for the whole curve are derived from combining the estimates across time intervals.
The number of patients at risk at the start of the first time interval is simply the total number analysed on each arm, making step 1 redundant for the first time interval of any curve. Therefore, in the bladder cancer trial, at the start of the 0–3 month time period, there are 491 and 485 patients at risk on the research and control arms, respectively (Table 2).
Based on the median follow-up of 48 months and accrual period of 69 months (Table 1), the minimum follow-up is estimated (Appendix 2) to be 14 months for this trial, and so all patients have complete follow-up and no patients are censored in the 0–3, 3–6, 6–9 and 9–12 month intervals. Therefore, for these time intervals, estimating the number of patients censored (step 2) is not relevant. Beyond 14 months patients are censored and this must be taken into account. Going through the steps 1, 3, 4 and 5 for the prior time intervals, the following were estimated for the 12–15 month time interval:
Event-free at start of prior time interval (12–15 month), research = 382.98
Event-free at start of prior time interval (12–15 month), control = 363.75
Events in prior time interval (12–15 month), research = 24.55
Events in prior time interval (12–15 month), control = 24.25
Censored in prior time interval (12–15 month), research = 0.00
Censored in prior time interval (12–15 month), control = 0.00
Note that these estimated values differ somewhat from the actual reported numbers at risk at 12 months (Table 2), but they can be used to illustrate all the steps of the method, in the presence of censoring, for the 15–18 month interval:
Step 1. Numbers event-free at the start of the current interval
This is in fact the number of patients that were event-free at the end of the prior time interval:
Event-free at start of current interval = Event-free at start of prior interval - Events in prior interval - Censored during prior interval   (17)
Using the data from the 12–15 month time interval, the numbers of patients event-free at the start of the current 15–18 month time interval are estimated:
Event-free at start (15–18 month), research = 382.98 - 24.55 - 0 = 358.43
Event-free at start (15–18 month), control = 363.75 - 24.25 - 0 = 339.50
Step 2. Numbers censored during the current interval
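Assuming that censoring is non-informative and that patients are censored at a constant rate within a given time interval, a simple method can be used to estimate numbers censored [1]:
Censored during current interval = Event-free at start of current interval × 1/2 × (End of interval - Start of interval) / (Maximum follow-up - Start of interval)   (18)
Using the data from step 1, the estimated maximum follow-up of 82 months and equation (18):
Censored (15–18 month), research = 358.43 × 1/2 × (18 - 15) / (82 - 15) = 8.02
Censored (15–18 month), control = 339.50 × 1/2 × (18 - 15) / (82 - 15) = 7.60
approximately 8.02 patients in the research arm and 7.60 patients in the control arm were estimated to be censored during the 15–18 month time interval.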
Step 3. Numbers at risk during the current interval, adjusted for censoring
The numbers censored can be used to adjust (reduce) the numbers at risk during the time interval:
At risk during current interval, adjusted for censoring = Event-free at start of current interval - Censored during current interval   (19)
Based on the data from steps 1 and 2, the numbers at risk during the current 15–18 month time interval are:
At risk during, adjusted for censoring (15–18 month), research = 358.43 - 8.02 = 350.41
At risk during, adjusted for censoring (15–18 month), control = 339.50 - 7.60 = 331.90
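Steps 1 to 3 for a single interval can be sketched in Python as below. The function names are our own. The censoring helper encodes the simple constant-rate assumption as it is applied in the worked example, for an interval that starts after the estimated minimum follow-up; it is a sketch of that case only. The inputs are the 12–15 month estimates and the maximum follow-up of 82 months quoted above.

```python
def event_free_at_start(prior_event_free, prior_events, prior_censored):
    """Step 1: patients still event-free when the current interval begins."""
    return prior_event_free - prior_events - prior_censored

def censored_during(event_free_start, start, end, max_follow_up):
    """Step 2: censoring at a constant rate between `start` and `max_follow_up`."""
    return event_free_start * 0.5 * (end - start) / (max_follow_up - start)

def at_risk_during(event_free_start, censored):
    """Step 3: numbers at risk, reduced by those censored in the interval."""
    return event_free_start - censored

# 15-18 month interval of the bladder cancer example.
arms = {"research": (382.98, 24.55, 0.0), "control": (363.75, 24.25, 0.0)}
results = {}
for arm, (prior_free, prior_events, prior_censored) in arms.items():
    free = event_free_at_start(prior_free, prior_events, prior_censored)
    cens = censored_during(free, start=15, end=18, max_follow_up=82)
    results[arm] = (free, cens, at_risk_during(free, cens))

for arm, values in results.items():
    print(arm, [round(x, 2) for x in values])
```

Looping over the arms keeps the two sets of figures in step, and the same three helpers can be applied to each later interval in turn.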
Step 4. Number of events during the current interval
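The number of events during the interval is then estimated from the reduced numbers at risk:
Events during current interval = At risk during current interval, adjusted for censoring × (% event-free at start of interval - % event-free at end of interval) / % event-free at start of interval   (20)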
Using the numbers at risk during the interval from step 3 and the data extracted from the curve (Table 2) in equation (20), allows estimation of the number of events in the 15–18 month interval:
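Step 5. Estimate the HR, V and O-E for the current interval
As time to event and censoring have already been accounted for, the hazard ratio can be estimated by using the equation for calculating a relative risk:
HR = (Events research / At risk research) / (Events control / At risk control)   (21)
with associated V:
V = 1 / [(1 / Events research) - (1 / At risk research) + (1 / Events control) - (1 / At risk control)]   (22)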
Using the data from steps 3 and 4 and equations (21), (22) and (9) above, but without rounding:
gives estimates of the HR, V and O-E as 0.68, 15.17 and -5.74, respectively, for the 15–18 month time interval. Note that if censoring had not been taken into account, the estimate of the HR for this time interval would still have been 0.68, but the V would be slightly greater at 15.52.
These steps are repeated for all time intervals.
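Steps 4 and 5 can be sketched in Python for one interval. The function names are our own. Table 2 is not reproduced here, so the sketch uses the 12–15 month interval, where no censoring was applied, with survival percentages back-calculated from the event-free numbers quoted earlier (78% falling to 73% on research, 75% falling to 70% on control); these percentages are an inference, not extracted data.

```python
import math

def events_during(at_risk, pct_event_free_start, pct_event_free_end):
    """Step 4: events implied by the drop in the curve over the interval."""
    drop = (pct_event_free_start - pct_event_free_end) / pct_event_free_start
    return at_risk * drop

def interval_hr_and_v(events_res, at_risk_res, events_con, at_risk_con):
    """Step 5: HR as a relative risk, with its logrank-scale variance V."""
    hr = (events_res / at_risk_res) / (events_con / at_risk_con)
    v = 1.0 / (1.0 / events_res - 1.0 / at_risk_res
               + 1.0 / events_con - 1.0 / at_risk_con)
    return hr, v

# 12-15 month interval: numbers at risk equal numbers event-free at start.
events_res = events_during(382.98, 78.0, 73.0)
events_con = events_during(363.75, 75.0, 70.0)

hr, v = interval_hr_and_v(events_res, 382.98, events_con, 363.75)
o_minus_e = math.log(hr) * v  # O-E recovered from the HR and V

print(round(events_res, 2), round(events_con, 2), round(hr, 2), round(v, 2))
```

The estimated events of 24.55 and 24.25 match those listed for this interval, which supports the back-calculated percentages; the interval HR and V printed here are not quoted in the text.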
Step 6. Combining all time intervals
The final step is to calculate the overall HR for the trial using the formula for calculating a pooled HR shown previously (1). Taking all time intervals and accounting for censoring, a pooled HR of 0.88 and V of 128.81 (95% CI of 0.74–1.05) is obtained:
In this example, if the censoring model had not been applied, the same HR, a larger but similar V (136.23) and a similar CI (0.74–1.04) would have been estimated. This is probably because it is a large trial with good follow-up, making both estimates fairly precise. In contrast, the ovarian cancer trial [5] accrued far fewer patients and had poorer follow-up. Using the curve method and accounting for censoring gives a HR estimate of 1.21 (95% CI 0.62–2.36), but discounting censoring, the HR is slightly more extreme (1.26), with overly precise confidence intervals (95% CI 0.69–2.28). In other situations the differences may be more pronounced.
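The confidence intervals quoted with the pooled estimates are consistent with treating 1/V as the variance of the lnHR. The Python sketch below makes that step explicit; the function name and the default z-score of 1.96 are our own choices, and the inputs are the rounded HR and V values quoted above, so agreement is to two decimal places only.

```python
import math

def ci_from_hr_and_v(hr, v, z=1.96):
    """Confidence interval for a HR, taking 1 / V as the variance of the lnHR."""
    half_width = z / math.sqrt(v)
    ln_hr = math.log(hr)
    return math.exp(ln_hr - half_width), math.exp(ln_hr + half_width)

# Bladder cancer trial, curve method with and without the censoring model.
with_censoring = ci_from_hr_and_v(0.88, 128.81)
without_censoring = ci_from_hr_and_v(0.88, 136.23)

print([round(x, 2) for x in with_censoring])
print([round(x, 2) for x in without_censoring])
```

The larger V obtained when censoring is ignored gives the slightly narrower interval, which is the sense in which such an estimate is erroneously precise.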
11. Report presents Kaplan-Meier curve and the numbers at risk
The presentation of the numbers at risk at particular time points with a Kaplan-Meier curve offers a more direct means of assessing the level of censoring [2], which is taken into account when the HR, V and O-E are estimated. However, this necessarily limits the division of the curve to these time points, which may be relatively few. Further, this approach may be problematic when the event rate between time points is large, e.g. greater than 20% [1].
The number of patients event-free at each time point, i.e. the numbers of patients event-free at the start and end of each time interval, is known and so does not need to be estimated. For each time interval for each arm, assuming that the level of censoring is constant within each interval, it remains to calculate: 1) the number of patients at risk during the interval and 2) the number of events during the interval. These can be used to 4) estimate the O-E, V and HR for the time interval, and the data from all the intervals can be combined in 5) to obtain the O-E, V and HR for the complete curve. Although not required to estimate the HR, 3) the number of patients censored during the interval can also be calculated and is useful for comparison with the other curve method.
The bladder cancer trial report gave the numbers at risk annually until 5 years. These data, and the percentage survival (i.e. event-free) for each arm at the start of each time interval, are given in Table 2 and can be used to illustrate the steps of the method for the 0–12 month time period:
Step 1. Numbers at risk during the current interval
The same data can be used to quantify the numbers of patients at risk during an interval:
For the 0–12 month interval:
Step 2. Number of events during the current interval
Again, the same published data can be used to estimate the number of events in an interval:
For the 0–12 month interval:
There were approximately 106 events estimated on the research arm and 120 on the control arm.
Step 3. Numbers censored during the current interval
The numbers censored are obtained from the reported numbers at risk and the event rate at the start and end of an interval:
Using event rates extracted from the curve at 0 and 12 months and the associated numbers at risk:
approximately 12 and 10 patients were estimated to be censored on the research and control arm respectively. Note that in section 10, by estimating the minimum follow-up to be 14 months and using the censoring model, we failed to take accurate account of censoring in the 0–12 month period.
Step 4a. Estimate the HR and V for the current interval using the number of events and the numbers at risk during the current interval
The results from steps 1 and 2 can then be used to estimate the HR, V and O-E for the time interval using equations (21), (22) and (9), as in section 10.
Step 4b. Estimate the O-E, V and HR for the current interval using the numbers of events and the numbers at risk during the current interval
An alternative method estimates E and then the O-E within each interval:
Using the data for the 0–12 month interval gives the E as:
And the O-E:
Either equation (12) or (13), described earlier, can be used to estimate V. However, equation (13) is preferred if the randomisation ratio is not 1:1, or the numbers at risk during intervals are very different, e.g. because there is a big difference in effect between arms of the trial.
Using equation (7) we can estimate a HR of 0.88 for the interval.
Step 5. Combining all time intervals
Taking all time intervals and censoring into account and using equation (1) as in section 10 gives a pooled HR of 0.88 and V of 119.80 (95% CI of 0.74–1.05).
Interpreting the hazard ratio (HR)
Usually a HR calculated for a trial or a meta-analysis is interpreted as the relative risk of an event on the research arm compared to control. However, it can also be translated into an absolute difference in the proportion of patients who are event-free at a particular time point or for particular groups of patients, assuming proportional hazards:
exp [ln(proportion of patients event-free on control) × HR] - proportion event-free on control
Alternatively, it can be translated into an absolute difference in the median time event-free, assuming exponential distributions, by first calculating the median time event-free on the research arm:
Median time event-free on research = Median time event-free on control / HR
and then the difference between medians:
Median time event-free on research - Median time event-free on control
These measures require an estimate of the proportion of patients that are event-free in the control group or subgroup of interest and an estimate of the median time event-free in the control group, respectively. Such data may be obtained from a Kaplan-Meier curve of a representative trial or individual patient data meta-analysis, or even from epidemiological data. Alternatively, it may be possible to use 'typical' values from other literature.
Using the bladder cancer example, the HR of 0.85 and an estimated 2-year survival of 58% for patients on the control arm gives an absolute improvement:
exp [ln(0.58) × 0.85] - 0.58 = 0.05
in survival of 5% at 2 years, taking it from 58% to 63%.
The median survival on control was estimated to be 37 months and so the median survival on the research arm is:
37.0/0.85 = 43.6
43.6 months, giving an absolute improvement in median survival:
43.6 - 37.0
of 6.6 months with the research treatment.
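Both translations of the HR into absolute terms can be sketched in Python. The function names are our own; the inputs are the HR of 0.85, the 58% 2-year survival on control and the 37-month control median quoted above. With these rounded inputs the research median comes out at about 43.5 months, slightly below the 43.6 quoted, presumably because the published figure was calculated from unrounded values.

```python
import math

def absolute_difference_event_free(proportion_control, hr):
    """Absolute change in the proportion event-free, assuming proportional hazards."""
    return math.exp(math.log(proportion_control) * hr) - proportion_control

def median_on_research(median_control, hr):
    """Median time event-free on research, assuming exponential distributions."""
    return median_control / hr

difference = absolute_difference_event_free(0.58, 0.85)
median_research = median_on_research(37.0, 0.85)
median_gain = median_research - 37.0

print(round(difference, 2), round(median_research, 1), round(median_gain, 1))
```

A HR below 1 for a negative outcome such as death gives a positive difference in the proportion event-free and a longer median on research.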
Calculations spreadsheet
Some of the methods described are computationally more complex than others and performing all the calculations by hand for each and every trial can be laborious, lead to errors and require extra data checking. We have therefore developed a spreadsheet in Microsoft Excel that carries out the calculations for all of the methods described. The user enters all the reported summary statistics and the spreadsheet estimates the HR, 95% CI, lnHR, V and O-E by all possible methods. The user can also input data extracted from Kaplan-Meier curves and estimate censoring using the minimum and maximum follow-up or the reported numbers at risk, to obtain similar summary statistics. Graphical representations of the input data are produced for comparison with the published curves, to assist with data extraction or to highlight data entry errors. Results from all methods are provided in a single output screen, which facilitates comparison. The main features of the calculations spreadsheet are illustrated in Figure 2 and the spreadsheet itself is freely available to readers (see Additional file 1).
Discussion
We have translated the methods for calculating a HR and/or associated statistics from published time-to-event analyses [1, 2] into a practical, less statistical guide. A corresponding, easy-to-use calculations spreadsheet, to facilitate the computational aspects, is available from the authors. The resulting summary statistics can then be used in the meta-analysis procedures found in statistical and meta-analysis software.
There is a hierarchy in the methods described [1, 2]. The direct methods make no assumptions and are preferable, followed by the various indirect methods based on reported statistics. The curve methods are likely to be the least reliable and it is not yet clear which method of adjusting for censoring is most reliable. If both curve methods are possible, the choice between the two may be a pragmatic one, depending on whether the minimum and maximum follow-up are reported or need to be estimated, on how many time points the numbers at risk are reported for, and on the event rate between those time points. The development of a hybrid of the two curve methods might optimise use of available data. Also, it is not clear how different schemes for dividing up the Kaplan-Meier curves may impact on the resulting statistics. In fact, further research is required to assess how well all of the methods perform according to variations in, for example, trial size, levels of follow-up or event rates.
Although the methods provide a means of analysing time-to-event outcomes for individual trials, they cannot circumvent the other well-known problems of relying on only published data for systematic reviews and meta-analyses. For example, it may not be possible to include all relevant trials, either because trials are not published or because the trial report does not include the outcome of interest, situations which could lead to publication bias [11–13] or selective outcome reporting bias [14], respectively. Similarly, these methods cannot correct common problems with the original reported analyses, such as the exclusion of patients [15, 16], analyses which are not by intention-to-treat [17] or analyses confined to particular patient subgroups, which may also lead to bias [16]. Furthermore, if the time-to-event outcome of interest is a long-term outcome, such as survival, then any HR estimation for an individual trial or meta-analysis will be limited by the extent of follow-up at the time that trials are reported. Such issues are relevant to all trials, systematic reviews and meta-analyses and so they should always be taken into account in interpreting results of these studies. Their relative impact is likely to vary between outcomes, trials, meta-analyses and healthcare areas and some may be addressed by obtaining further or updated information direct from trial investigators.
While the methods described previously [1, 2] and elaborated here are not a substitute for the reanalysis of IPD from all randomised patients, they offer the most appropriate way of analysing time-to-event outcomes when IPD is not available or the IPD approach is infeasible. Thus, whenever possible, they should be used in preference to a pooled OR or RR or a series of ORs or RRs at fixed time points. This should improve the quality of the analysis and subsequent interpretation of systematic reviews and meta-analyses that include time-to-event outcomes.
Appendix 1: Previously published formulae for generating hazard ratios from published time-to-event data [1, 2]. The numbers in brackets link these to their descriptive equivalents in the text
1. Generating the O-E, V, HR and lnHR from reported summary statistics
For equations 1–16 and following the notation of Parmar et al. [1], for trial i:
$O_{ri}$ = observed number of events in the research group
$E_{ri}$ = logrank expected events in the research group
$O_{ci}$ = observed number of events in the control group
$E_{ci}$ = logrank expected events in the control group
$O_{ri} - E_{ri}$ = observed minus expected events in the research group
$O_i$ = total observed events ($O_{ri} + O_{ci}$)
$V_{ri}$ = logrank variance
$\ln(HR_i)$ = log HR
$\mathrm{var}[\ln(HR_i)]$ = variance of the log hazard ratio
$UPPCI_i$ = value for the upper end of the confidence interval
$LOWCI_i$ = value for the lower end of the confidence interval
$\Phi^{-1}(1 - \alpha_i/2)$ = z score for the upper end of the confidence interval
$R_{ri}$ = number randomised to the research group
$R_{ci}$ = number randomised to the control group
$p_i$ = reported two-sided p-value associated with the logrank or Mantel-Haenszel test (or Cox model)
Estimating a pooled lnHR from a series of trials
Estimating a pooled lnHR using the inverse variance method:
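As an illustration of inverse variance pooling, the following minimal Python sketch weights each trial's lnHR by the reciprocal of its variance (a fixed-effect analysis). The function name, the returned fields and the two example trials are hypothetical and are not taken from the paper or the spreadsheet.

```python
import math

def pool_ln_hr(ln_hrs, variances, z=1.96):
    """Fixed-effect inverse variance pooling of trial lnHRs.

    Each trial is weighted by 1/var[ln(HR_i)]; the pooled variance is
    the reciprocal of the summed weights.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, ln_hrs)) / total_weight
    pooled_var = 1.0 / total_weight
    se = math.sqrt(pooled_var)
    return {
        "ln_hr": pooled,
        "var": pooled_var,
        "hr": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
    }

# Two hypothetical trials with HRs of 0.80 and 0.90 and equal variances.
result = pool_ln_hr([math.log(0.80), math.log(0.90)], [0.04, 0.04])
print(result)
```

With equal variances the pooled lnHR is simply the mean of the trial lnHRs; unequal variances pull the estimate towards the more precise trial.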
Estimating the O-E, V, HR and lnHR from reported summary statistics
The reciprocal nature of the variance of the lnHR and the logrank variance:
Directly estimating the lnHR and associated variance using the formal definition:
Direct estimation of the lnHR using the alternative definition:
Indirect estimation of the variance of the lnHR from the confidence interval:
Indirect estimation of the variance of the lnHR from the number of events:
$V_{ri} = O_{ri} O_{ci} / O_i$
$V_{ri} = O_i / 4$
Indirect estimation of the variance of the lnHR from the number of events and the numbers randomised (analysed) on each arm:
Indirect estimation of the observed minus expected events from the observed events and the p-value:
Indirect estimation of the observed minus expected events from the observed events, the p-value and the numbers randomised (analysed) on each arm:
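The direct and indirect estimates listed above can be sketched in Python as follows. Only the two event-based expressions for V are printed in this appendix; the remaining functions use the forms in which these relationships are commonly stated (lnHR = (O-E)/V, a z score equal to (O-E)/√V, and the confidence interval width on the log scale), so they should be read as assumptions rather than as the authors' exact equations. All function names and the example trial are hypothetical.

```python
import math
from statistics import NormalDist

def z_score(two_sided):
    """z score for a two-sided p-value (or for alpha when using a CI)."""
    return NormalDist().inv_cdf(1.0 - two_sided / 2.0)

def ln_hr_from_o_minus_e(o_minus_e, v):
    """Direct estimate: lnHR = (O-E)/V, with variance 1/V."""
    return o_minus_e / v, 1.0 / v

def var_from_ci(low_ci, upp_ci, alpha=0.05):
    """Variance of lnHR from the reported confidence interval for the HR."""
    return ((math.log(upp_ci) - math.log(low_ci)) / (2.0 * z_score(alpha))) ** 2

def v_from_events(o_r, o_c):
    """V from the events on each arm: O_r * O_c / O."""
    return o_r * o_c / (o_r + o_c)

def v_from_total_events(o_total):
    """V from the total events, assuming equal allocation: O / 4."""
    return o_total / 4.0

def v_from_events_and_randomised(o_total, r_r, r_c):
    """V from total events and numbers randomised (assumed form)."""
    return o_total * r_r * r_c / (r_r + r_c) ** 2

def o_minus_e_from_p(v, p_two_sided, favours_research=True):
    """O-E from V and the p-value; the sign must come from the trial report."""
    magnitude = math.sqrt(v) * z_score(p_two_sided)
    return -magnitude if favours_research else magnitude

# Hypothetical trial: 60 research and 75 control events, logrank p = 0.08,
# with the report indicating that the result favours the research arm.
v = v_from_events(60, 75)
o_minus_e = o_minus_e_from_p(v, 0.08, favours_research=True)
ln_hr, var_ln_hr = ln_hr_from_o_minus_e(o_minus_e, v)
print(f"HR = {math.exp(ln_hr):.2f}, var(lnHR) = {var_ln_hr:.4f}")
```

Note that the p-value carries no information about direction, which is why the sketch takes `favours_research` as an explicit argument.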
2. Generating the HR and V from published Kaplan-Meier curves and follow-up
For equations 17–22, and following the notation of Parmar et al. [1], for trial i and T non-overlapping time points (t = 1, ..., T):
$t$ = whole time interval $(t-1, t)$
$t_s$ = start of the time interval $(t-1, t)$
$t_e$ = end of the time interval $(t-1, t)$
$R_{ri}(t)$ = effective number of patients at risk on the research arm during time interval $(t-1, t)$
$R_{ri}(t-1)$ = effective number of patients at risk on the research arm during time interval $(t-2, t-1)$
$D_{ri}(t)$ = effective number of events on the research arm during time interval $(t-1, t)$
$D_{ci}(t)$ = effective number of events on the control arm during time interval $(t-1, t)$
$D_{ri}(t-1)$ = effective number of events on the research arm during time interval $(t-2, t-1)$
$C_{ri}(t)$ = effective number of patients censored on the research arm during time interval $(t-1, t)$
$C_{ci}(t)$ = effective number of patients censored on the control arm during time interval $(t-1, t)$
$C_{ri}(t-1)$ = effective number of patients censored on the research arm during time interval $(t-2, t-1)$
$S_{ri}(t_s)$ = event-free probability on the research arm at the start of time interval $(t-1, t)$
$S_{ri}(t_e)$ = event-free probability on the research arm at the end of time interval $(t-1, t)$
$F_{min}$ = minimum follow-up
$F_{max}$ = maximum follow-up
Estimation of the numbers event-free at the start of a time interval:
$R_{ri}(t_s) = R_{ri}(t-1) - D_{ri}(t-1) - C_{ri}(t-1)$
Estimation of the numbers censored during a time interval
Estimation of the numbers at risk during a time interval, adjusted for censoring
$R_{ri}(t) = R_{ri}(t_s) - C_{ri}(t)$
Estimation of the number of events during a time interval
Note that equations 17–20 are also used for the control arm.
Estimation of the HR and V for a time interval from a Kaplan-Meier curve
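The interval-by-interval calculation can be sketched in Python as below. The two at-risk expressions follow the equations printed above; the censoring formula (uniform censoring between minimum and maximum follow-up, with half of those censored counted as lost from the interval), the events formula and the interval variance are not reproduced in this text, so the forms used here follow the way the Parmar et al. [1] method is commonly stated and should be treated as assumptions. Function names and the example curve readings are hypothetical, and every interval is assumed to contain events on both arms.

```python
import math

def censored(at_start, t_s, t_e, f_min, f_max):
    """Effective number censored in (t_s, t_e) under uniform censoring (assumed form)."""
    if t_e <= f_min:
        return 0.0  # nobody can be censored before minimum follow-up
    start = max(t_s, f_min)
    if f_max <= start:
        raise ValueError("interval lies beyond the maximum follow-up")
    return at_start * 0.5 * (t_e - start) / (f_max - start)

def arm_table(n_randomised, times, event_free, f_min, f_max):
    """Return (effective at risk, effective events) for each interval of one arm."""
    rows = []
    at_start = float(n_randomised)
    for k in range(1, len(times)):
        c = censored(at_start, times[k - 1], times[k], f_min, f_max)
        at_risk = at_start - c                      # R(t) = R(t_s) - C(t)
        events = at_risk * (event_free[k - 1] - event_free[k]) / event_free[k - 1]
        rows.append((at_risk, events))
        at_start = at_risk - events - c             # R(t_s) for the next interval
    return rows

def interval_ln_hr(research_row, control_row):
    """lnHR and its variance for one interval (assumed form)."""
    r_r, d_r = research_row
    r_c, d_c = control_row
    ln_hr = math.log((d_r / r_r) / (d_c / r_c))
    var = 1.0 / d_r - 1.0 / r_r + 1.0 / d_c - 1.0 / r_c
    return ln_hr, var

def pooled_curve_ln_hr(research_rows, control_rows):
    """Inverse variance pooling of the interval lnHRs across the curve."""
    pairs = [interval_ln_hr(r, c) for r, c in zip(research_rows, control_rows)]
    weights = [1.0 / var for _, var in pairs]
    pooled = sum(w * y for w, (y, _) in zip(weights, pairs)) / sum(weights)
    return pooled, 1.0 / sum(weights)

# Hypothetical readings at 0, 6 and 12 months, 100 patients per arm,
# minimum follow-up 12 months and maximum follow-up 24 months.
times = [0.0, 6.0, 12.0]
research = arm_table(100, times, [1.0, 0.90, 0.80], f_min=12.0, f_max=24.0)
control = arm_table(100, times, [1.0, 0.85, 0.70], f_min=12.0, f_max=24.0)
curve_ln_hr, curve_var = pooled_curve_ln_hr(research, control)
print(f"HR = {math.exp(curve_ln_hr):.2f}, V = {1.0 / curve_var:.1f}")
```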
3. Generating the HR and V from published Kaplan-Meier curves and the numbers at risk
For equations 23–26, and following the notation of [2], for time interval i:
$j$ = treatment group (where 1 = the control arm and 2 = the research arm)
$t_{i-1}$ = time at the start of the current interval
$t_{i-1}$ = time at the start of the prior interval
$n_{j,i}$ = number at risk at end of interval $[t_{i-1}, t_i)$ in group j
$n_{j,i-1}$ = number at risk at start of interval $[t_{i-1}, t_i)$ in group j
$n^*_{j,i}$ = number at risk during interval $[t_{i-1}, t_i)$ in group j
$d^*_{j,i}$ = number of events during interval $[t_{i-1}, t_i)$ in group j
$c^*_{j,i}$ = number censored during interval $[t_{i-1}, t_i)$ in group j
$s^*_{j,i}$ = event-free probability at end of interval $[t_{i-1}, t_i)$ in group j
$s^*_{j,i-1}$ = event-free probability at start of interval $[t_{i-1}, t_i)$ in group j
$e^*_{j,i}$ = logrank expected events during interval $[t_{i-1}, t_i)$ in group j = 2 (the research arm)
Estimation of the numbers at risk during a time interval from a Kaplan-Meier curve
Estimation of the number of events during a time interval from a Kaplan-Meier curve
Estimation of the numbers censored during a time interval from a Kaplan-Meier curve
Estimation of the number of logrank expected events during a time interval from a Kaplan-Meier curve
Appendix 2: Estimating or educated 'guesstimating' minimum and maximum follow-up
When the minimum and maximum follow-up are not explicitly reported, it may be possible to estimate them for a particular trial, provided that some indicators of extent of follow-up are provided. In descending order of preference, the following are some strategies that we have employed to estimate the minimum and maximum follow-up:
For minimum follow-up, if the trial report presents:
1. Censoring tick marks on the Kaplan-Meier curve: assume the first tick mark indicates the point of minimum follow-up.
2. Median follow-up and accrual period: assume minimum follow-up = median follow-up minus half the accrual period.
3. Date of analysis and accrual period: assume minimum follow-up = date of analysis minus final date of accrual.
4. Date of submission and accrual period: assume estimated date of analysis = date of submission minus 6 months, and minimum follow-up = estimated date of analysis minus final date of accrual.
For maximum follow-up, if the trial report presents:
1. Censoring tick marks on the Kaplan-Meier curve: assume the last tick mark indicates the point of maximum follow-up.
2. Median follow-up and accrual period: assume maximum follow-up = median follow-up plus half the accrual period.
3. Date of analysis and accrual period: assume maximum follow-up = date of analysis minus first date of accrual.
4. Date of submission and accrual period: assume estimated date of analysis = date of submission minus 6 months, and maximum follow-up = estimated date of analysis minus first date of accrual.
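Strategies 2 to 4 above are simple arithmetic and could be sketched in Python as follows. The function names are hypothetical, follow-up is expressed in months, and the conversion of dates to months uses an average month length, which is an approximation introduced here for illustration.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # average month length; an approximation

def months_between(start, end):
    """Approximate number of months from one date to another."""
    return (end - start).days / DAYS_PER_MONTH

def follow_up_from_median(median_follow_up, accrual_period):
    """Strategy 2: median follow-up minus/plus half the accrual period."""
    half = accrual_period / 2.0
    return median_follow_up - half, median_follow_up + half

def follow_up_from_dates(first_accrual, last_accrual,
                         analysis=None, submission=None, lag_months=6.0):
    """Strategies 3 and 4: use the date of analysis if reported, otherwise
    estimate it as the date of submission minus lag_months."""
    if analysis is not None:
        return (months_between(last_accrual, analysis),
                months_between(first_accrual, analysis))
    if submission is None:
        raise ValueError("need either a date of analysis or a date of submission")
    return (months_between(last_accrual, submission) - lag_months,
            months_between(first_accrual, submission) - lag_months)

# Hypothetical trial: accrual from January 2000 to January 2002,
# median follow-up of 36 months, analysed in January 2004.
f_min_median, f_max_median = follow_up_from_median(36.0, 24.0)
f_min_dates, f_max_dates = follow_up_from_dates(
    date(2000, 1, 1), date(2002, 1, 1), analysis=date(2004, 1, 1))
print(f_min_median, f_max_median)
print(round(f_min_dates, 1), round(f_max_dates, 1))
```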
References
1. Parmar MKB, Torri V, Stewart L: Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints. Statistics in Medicine. 1998, 17: 2815-34. 10.1002/(SICI)1097-0258(19981230)17:24<2815::AID-SIM110>3.0.CO;2-8.
2. Williamson PR, Tudur Smith C, Hutton JL, Marson AG: Aggregate data meta-analysis with time-to-event outcomes. Statistics in Medicine. 2002, 21: 3337-51. 10.1002/sim.1303.
3. Altman DG, De Stavola BL, Love SB, Stepniewska KA: Review of survival analyses published in cancer. British Journal of Cancer. 1995, 72: 511-8.
4. Pocock SJ, Clayton TC, Altman DG: Survival plots of time-to-event outcomes in clinical trials. Lancet. 2002, 359: 1686-9. 10.1016/S0140-6736(02)08594-X.
5. Mangioni C, Bolis G, Pecorelli S, Bragman K, Epis A, Favalli G, Gambino A, Landoni F, Presti M, Torri W, Vassena L, Zanaboni F, Marsoni S: Randomized trial in advanced ovarian cancer comparing cisplatin and carboplatin. Journal of the National Cancer Institute. 1989, 81: 1461-71. 10.1093/jnci/81.19.1464.
6. International Collaboration of Trialists on behalf of the Medical Research Council Advanced Bladder Cancer Working Party, EORTC Genito-urinary Group, Australian Bladder Cancer Study Group, National Cancer Institute of Canada Clinical Trials Group, Finnbladder, Norwegian Bladder Cancer Study Group and Club Urologico Espanol de Tratamiento Oncologico (CUETO) group: Neoadjuvant cisplatin, methotrexate, and vinblastine chemotherapy for muscle-invasive bladder cancer: a randomised controlled trial. Lancet. 1999, 354: 533-40. 10.1016/S0140-6736(99)02292-8.
7. Yusuf S, Peto R, Lewis JA, Collins R, Sleight P: Beta blockade during and after myocardial infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases. 1985, 27: 335-71. 10.1016/S0033-0620(85)80003-7.
8. DerSimonian R, Laird N: Meta-analysis in clinical trials. Controlled Clinical Trials. 1986, 7: 177-88. 10.1016/0197-2456(86)90046-2.
9. Tudur C, Williamson PR, Khan S, Best L: The value of the aggregate data approach in meta-analysis with time-to-event outcomes. Journal of the Royal Statistical Society A. 2001, 164: 357-70. 10.1111/1467-985X.00207.
10. Tierney JF, Burdett S, Stewart LA: Feasibility and reliability of using hazard ratios in meta-analyses of published time-to-event data. In preparation.
11. Dickersin K: The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association. 1990, 263: 1385-9. 10.1001/jama.263.10.1385.
12. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR: Publication bias in clinical research. Lancet. 1991, 337: 867-72. 10.1016/0140-6736(91)90201-Y.
13. Dickersin K, Min YI, Meinert CL: Factors influencing publication of research results. Journal of the American Medical Association. 1992, 267: 374-8. 10.1001/jama.267.3.374.
14. Chan AW, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG: Empirical evidence for selective reporting of outcomes in randomized trials. Journal of the American Medical Association. 2004, 291: 2457-65. 10.1001/jama.291.20.2457.
15. Schulz KF, Grimes DA, Altman DG, Hayes RJ: Blinding and exclusions after allocation in randomised controlled trials: survey of published parallel group trials in obstetrics and gynaecology. BMJ. 1996, 312: 742-4.
16. Tierney JF, Stewart LA: Investigating patient exclusion bias in meta-analysis. International Journal of Epidemiology. 2005, 34 (1): 79-87. 10.1093/ije/dyh300.
17. Hollis S, Campbell F: What is meant by intention-to-treat analysis? Survey of published randomised controlled trials. BMJ. 1999, 319: 670-4.
Acknowledgements
We are grateful to Mahesh Parmar for comments on an earlier draft of the manuscript and to both him and Paula Williamson for advice on the methods. Also, the calculations spreadsheet was based on ones initially developed by Sarah Simnett and Josie Sandercock for calculating hazard ratios from Kaplan-Meier curves. This work was funded by the UK Medical Research Council and the Australian Medical Research Council.
Additional information
Competing interests
The author(s) declare that they have no competing interests.
Authors' contributions
This manuscript is based on workshops demonstrating these methods to systematic reviewers. JT helped develop the workshops and the spreadsheet to carry out the calculations, and drafted the manuscript. LS had the idea for the workshops, helped develop the initial methods paper and workshops and helped draft this manuscript. DG had the idea for the workshops, helped develop them and commented on the manuscript. SB helped test the spreadsheet and run the workshops and commented on the manuscript. MS developed the spreadsheet and commented on the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
13063_2006_188_MOESM1_ESM.xls
Additional file 1: HR calculations spreadsheet. Spreadsheet to facilitate the estimation of hazard ratios from published summary statistics or data extracted from Kaplan-Meier curves. (XLS 185 KB)
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Tierney, J.F., Stewart, L.A., Ghersi, D. et al. Practical methods for incorporating summary time-to-event data into meta-analysis. Trials 8, 16 (2007). https://doi.org/10.1186/1745-6215-8-16
DOI: https://doi.org/10.1186/1745-6215-8-16
Keywords
 Trial Report
 Pool Hazard Ratio
 Current Interval
 Randomisation Ratio
 Report Summary Statistic