
# Practical methods for incorporating summary time-to-event data into meta-analysis

Jayne F Tierney^{1} (corresponding author), Lesley A Stewart^{2}, Davina Ghersi^{3}, Sarah Burdett^{1} and Matthew R Sydes^{4}

**8**:16

https://doi.org/10.1186/1745-6215-8-16

© Tierney et al; licensee BioMed Central Ltd. 2007

**Received:** 25 September 2006

**Accepted:** 07 June 2007

**Published:** 07 June 2007

## Abstract

### Background

In systematic reviews and meta-analyses, time-to-event outcomes are most appropriately analysed using hazard ratios (HRs). In the absence of individual patient data (IPD), methods are available to obtain HRs and/or associated statistics by carefully manipulating published or other summary data. Awareness and adoption of these methods is somewhat limited, perhaps because they are published in the statistical literature using statistical notation.

### Methods

This paper aims to 'translate' the methods for estimating a HR and associated statistics from published time-to-event-analyses into less statistical and more practical guidance and provide a corresponding, easy-to-use calculations spreadsheet, to facilitate the computational aspects.

### Results

A wider audience should be able to understand published time-to-event data in individual trial reports and use it more appropriately in meta-analysis. When faced with particular circumstances, readers can refer to the relevant sections of the paper. The spreadsheet can be used to assist them in carrying out the calculations.

### Conclusion

The methods cannot circumvent the potential biases associated with relying on published data for systematic reviews and meta-analysis. However, this practical guide should improve the quality of the analysis and subsequent interpretation of systematic reviews and meta-analyses that include time-to-event outcomes.

## Background

Time-to-event outcomes take account of whether an event takes place and also the time at which the event occurs, such that both the event and the timing of the event are important. For example, in cancer a cure may not be possible, but it is hoped that a new intervention will increase the duration of survival. Therefore, although the same or similar number of deaths may be observed, it is hoped that a new intervention will decrease the rate at which they take place. Other examples of outcomes where the timing of events may be vital in assessing the value of an intervention include: time free of seizures in epilepsy; time to conception in fertility treatment; time to resolution of symptoms of flu and time to fever in chickenpox.

Odds ratios (ORs) or relative risks (RRs) that measure only the number of events and take no account of when they occur are appropriate for measuring dichotomous outcomes, but less appropriate for analysing time-to-event outcomes. Using such dichotomous measures in a meta-analysis of time-to-event outcomes can pose additional problems. If the total number of events reported for each trial is used to calculate an OR or RR, this can involve combining trials reported at different stages of maturity, with variable follow up, resulting in an estimate that is both unreliable and difficult to interpret. Alternatively, ORs or RRs can be calculated at specific points in time making estimates comparable and easier to interpret, at least at those time-points. However, interpretation is difficult, particularly if individual trials do not contribute data at each time point. Furthermore, bias could arise if the time points are subjectively chosen by the systematic reviewer or selectively reported by the trialist at times of maximal or minimal difference between intervention groups.

Time-to-event outcomes are most appropriately analysed using hazard ratios (HRs), which take account of the number and timing of events, and the time until last follow-up for each patient who has not experienced an event, i.e. has been censored. HRs can be estimated by carefully manipulating published or other summary data [1, 2], but currently such methods are under-used in meta-analyses. For example, Issue 3, 2006 of the Cochrane Library contained 43 cancer meta-analyses based on published data that included an analysis of survival and were not conducted by the current authors. Only sixteen of these estimated HRs and the remainder calculated ORs or RRs. This may reflect that the trials included in these meta-analyses did not report the necessary statistical information [3, 4] to allow estimation of HRs. However, if there is sufficient data available to estimate an OR or RR, there is usually sufficient data to estimate a HR. Therefore, we suspect that use of the methods is limited because awareness is limited or because the statistical notation used to describe them may be difficult to follow for those with little formal statistical training. Furthermore, it is common for information on the effects of interventions to be presented in a number of different ways and it may not be clear which of the published methods is most appropriate.

Our aim in this paper is to provide step-by-step guidance on how to calculate a HR and the associated statistics for individual trials, according to the information presented in the trial report. To facilitate this we have translated the relevant equations (Appendix 1) from the previously reported statistical methods [1, 2] into more descriptive versions, using familiar terms and explaining all arithmetic manipulations as simply as possible. We illustrate their use with data extracted from two cancer trial reports [5, 6].

### Basic requirements for a meta-analysis based on hazard ratios

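A fixed effect meta-analysis can be based on the logrank approach:

$$\text{pooled HR} = \exp\left(\frac{\sum (O - E)}{\sum V}\right) \qquad (1)$$

where the *logrank Observed minus Expected events (O-E)* and the *logrank Variance (V)* are derived from the number of events and the individual times to event on the research arm of each trial. Alternatively, the inverse variance approach can be used [1]:

$$\text{pooled lnHR} = \frac{\sum (\text{lnHR} / V^{*})}{\sum (1 / V^{*})} \qquad (2)$$

which uses the *Variance of the lnHR* (*V**) and the *log Hazard Ratio (lnHR)* for each trial.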

If the HR and *V* or lnHR and *V** are presented in a trial report, they can be used directly in a fixed effect meta-analysis using (1) or (2) respectively. Similarly, if the coefficient of the treatment effect and the variance from a Cox model are provided, which correspond to the lnHR and *V**, they too can be used directly in a fixed effect meta-analysis using (2). These same statistics can be employed if a random effects meta-analysis [8] is required. Where they are not reported, however, it is necessary to estimate the *O-E* and *V* or the lnHR and *V** for each trial, in order to combine them in a meta-analysis.
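The two pooling approaches can be sketched in a few lines of Python. This is only an illustration, assuming the usual fixed effect forms: for the logrank approach the pooled lnHR is the sum of the *O-E* divided by the sum of the *V*, and for the inverse variance approach each lnHR is weighted by 1/*V**. The function names and the three sets of trial values are hypothetical.

```python
import math

def pool_logrank(trials):
    """Pool (O-E, V) pairs: pooled lnHR = sum(O-E) / sum(V)."""
    total_oe = sum(oe for oe, _ in trials)
    total_v = sum(v for _, v in trials)
    return math.exp(total_oe / total_v), total_v

def pool_inverse_variance(trials):
    """Pool (lnHR, V*) pairs, weighting each lnHR by 1 / V*."""
    weights = [1.0 / v_star for _, v_star in trials]
    weighted_sum = sum(w * ln_hr for w, (ln_hr, _) in zip(weights, trials))
    pooled_ln_hr = weighted_sum / sum(weights)
    # Second value is the variance of the pooled lnHR.
    return math.exp(pooled_ln_hr), 1.0 / sum(weights)

# Hypothetical (O-E, V) values for three trials; illustrative only.
logrank_inputs = [(-19.03, 117.07), (-4.20, 30.50), (2.10, 12.80)]
hr_logrank, v_pooled = pool_logrank(logrank_inputs)

# The same trials expressed as (lnHR, V*), using lnHR = (O-E)/V and V* = 1/V.
iv_inputs = [(oe / v, 1.0 / v) for oe, v in logrank_inputs]
hr_iv, v_star_pooled = pool_inverse_variance(iv_inputs)
```

When the lnHR and *V** are derived from the *O-E* and *V* in this way, both routes return the same pooled HR, which is a convenient check on data entry.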

### Generating the O-E, V, HR and lnHR from reported summary statistics

Trial reports vary in the information they provide for obtaining the *O-E*, *V*, *V**, HR and lnHR. Some methods use the reported information to directly calculate the HR or lnHR and *V* or *V** and are described in sections 1–2. However, it is more likely that a trial report will only provide sufficient information to estimate some or all of the HR, lnHR, *O-E*, *V* and *V** by indirect methods that make certain assumptions, and these indirect methods are described in sections 3–9. For some of these methods, it is necessary to estimate the *V* and then derive *V**, and for others the converse approach is needed. Each is the reciprocal of the other:

$$V = \frac{1}{V^{*}} \qquad (3)$$

$$V^{*} = \frac{1}{V} \qquad (4)$$

*V* is used to denote the *logrank Variance* and *V** to denote the *variance of the lnHR*.

If even these indirect methods cannot be applied, then it may be possible to generate the necessary statistics from published Kaplan-Meier curves (sections 10–11). For any set of trials, it is likely that a number of these methods will be required, and for any one trial, it may be possible to use more than one method.

### Extraction of summary statistics from trial reports

Table 1. Suggested data collection form completed with data extracted from the report of the example trial in bladder cancer [6]

Trial Reference: BA06 | Research (Chemotherapy) | Control (No chemotherapy) |
---|---|---|
Randomisation ratio (e.g. 1:1) | 1 | 1 |
Patients randomised | 491 | 485 |
Patients analysed | 491 | 485 |
Observed events | 229 | 256 |
Logrank expected events | Not reported | Not reported |
Hazard ratio, confidence interval (& level e.g. 95%) | 0.85, CI 0.71 to 1.02 (95%) | |
Logrank variance | Not reported | |
Logrank observed minus expected events | Not reported | |
Hazard ratio and confidence interval (& level e.g. 95%) or standard error or variance from adjusted or unadjusted Cox | Not reported | |
Test statistic, 2-sided p-value to 2 significant figures (& test used e.g. logrank, Mantel-Haenszel or Cox) | Not reported, 0.075 (logrank) | |
Advantage to research or control? | Research | |
Actuarial or Kaplan Meier curves reported? | Yes, Kaplan Meier | |
Numbers at risk reported | Yes | |
Follow-up details | Min = 14 months, Max = 82 months (estimated from recruitment of 69 months, 11/9 – 7/95, and median follow-up of 48 months) | |

#### 1. Report presents O & E or hazard rates on research and control arm

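If the observed events (*O*) and logrank expected events (*E*) on the research and control arm are presented in a trial report, then the HR can be calculated directly as the ratio of the hazard rates:

$$\text{HR} = \frac{\text{observed events research} \,/\, \text{logrank expected events research}}{\text{observed events control} \,/\, \text{logrank expected events control}} \qquad (5)$$

*V* can also be calculated directly:

$$V = \frac{1}{(1/\text{logrank expected events research}) + (1/\text{logrank expected events control})} \qquad (6)$$

These statistics were included in our example report of an ovarian cancer trial [5]:

Observed events research = 34, Expected events research = 28.0

Observed events control = 24, Expected events control = 29.9

Using equation (5), the HR is (34/28.0)/(24/29.9) = 1.21/0.80 = 1.51, and *V* can be calculated directly:

$$V = \frac{1}{(1/28.0) + (1/29.9)} = 14.46$$

The *O-E* is the number of observed events minus the logrank expected events on the research arm.

*O* - *E* = 34 - 28.0 = 6.00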

If a hazard rate for each of the research and control arms is presented in a trial report they can replace the top and bottom of equation (5). Based on the example above, the hazard rate on the research arm of 1.21 and on control of 0.80 would be used to obtain a HR of 1.51. Such hazard rates cannot be used to calculate directly the associated *V*, which would need to be estimated using an indirect method (see below).
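The direct calculation for the ovarian cancer example can be checked with a short Python sketch. It assumes that the HR is the ratio of the observed/expected ratios on the two arms and that *V* is the reciprocal of the summed reciprocals of the expected events. The function name is ours and purely illustrative.

```python
def hr_v_from_observed_and_expected(o_res, e_res, o_ctl, e_ctl):
    """Direct estimates from observed and logrank expected events on each arm."""
    hazard_rate_res = o_res / e_res
    hazard_rate_ctl = o_ctl / e_ctl
    hr = hazard_rate_res / hazard_rate_ctl
    # Assumed form of the logrank variance: 1 / (1/E_research + 1/E_control).
    v = 1.0 / (1.0 / e_res + 1.0 / e_ctl)
    o_minus_e = o_res - e_res  # always taken on the research arm
    return hr, v, o_minus_e

# Values reported for the ovarian cancer trial.
hr, v, o_minus_e = hr_v_from_observed_and_expected(34, 28.0, 24, 29.9)
print(round(hr, 2), round(v, 2), round(o_minus_e, 2))  # 1.51 14.46 6.0
```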

#### 2. Report presents O-E on research arm and logrank V

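If a trial report presents the logrank *O-E* events on the research arm and the logrank *V*, the HR can be calculated directly:

$$\text{HR} = \exp\left(\frac{O - E}{V}\right) \qquad (7)$$

Note that "exp" represents the exponential or inverse of the natural log. HRs calculated using formula (7) will not differ markedly from the formal definition described previously (5), unless the event rate in a trial is low [1].

For the ovarian cancer trial, using the *O-E* and *V* in equation (7) gives a HR of 1.51:

$$\text{HR} = \exp\left(\frac{6.00}{14.46}\right) = 1.51$$

If the HR and *O-E* are reported, you can calculate *V*. Alternatively, if the HR and *V* are reported, you can calculate the *O-E*:

$$V = \frac{O - E}{\ln(\text{HR})} \qquad (8)$$

$$O - E = \ln(\text{HR}) \times V \qquad (9)$$

Equations (8) and (9) are useful for some of the indirect methods presented later.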

Equation (5) is the preferred estimate for the HR, although it will only differ markedly from (7) when the total number of events in a trial is small [1].
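The relationship between the HR, *O-E* and *V* can be rearranged in three ways, sketched below in Python. The sketch assumes HR = exp((*O-E*)/*V*), which is the form consistent with the worked figures in this paper. The function names are illustrative.

```python
import math

def hr_from_oe_and_v(o_minus_e, v):
    """HR from the logrank O-E and V."""
    return math.exp(o_minus_e / v)

def v_from_hr_and_oe(hr, o_minus_e):
    """V from the HR and O-E (undefined when HR is exactly 1)."""
    return o_minus_e / math.log(hr)

def oe_from_hr_and_v(hr, v):
    """O-E from the HR and V."""
    return math.log(hr) * v

hr_ovarian = hr_from_oe_and_v(6.00, 14.46)   # ovarian cancer trial
oe_bladder = oe_from_hr_and_v(0.85, 117.07)  # bladder cancer trial
print(round(hr_ovarian, 2), round(oe_bladder, 2))  # 1.51 -19.03
```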

#### 3. Report presents HR and confidence intervals

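If a trial report presents the HR and its confidence interval, the *V** (variance of the ln(HR)) and subsequently, if necessary, *V*, can be estimated from the confidence interval (CI) provided the CI is given to two significant figures:

$$V^{*} = \left[\frac{\ln(\text{upper CI}) - \ln(\text{lower CI})}{2 \times z\text{-score for the upper CI boundary}}\right]^{2} \qquad (10)$$

For a 95% CI, the z-score is 1.96, which gives *V**:

$$V^{*} = \left[\frac{\ln(\text{upper CI}) - \ln(\text{lower CI})}{2 \times 1.96}\right]^{2}$$

For a 99% CI, the z-score is 2.58 and for a 90% CI the z-score is 1.64.

To demonstrate this and the rest of the indirect methods we use a report of a trial of chemotherapy versus no chemotherapy for bladder cancer [6]. The data extracted from the trial report are shown in Table 1.

Applying equation (10) to the reported HR of 0.85 (95% CI 0.71 to 1.02):

$$V^{*} = \left[\frac{\ln(1.02) - \ln(0.71)}{2 \times 1.96}\right]^{2} = 0.00854$$

and taking the reciprocal of the unrounded *V** gives an estimate of the logrank *V* of 117.07. Having both the reported HR of 0.85 and the estimated *V*, equation (9) can be used to obtain an *O-E* of -19.03:

*O* - *E* = ln(0.85) × 117.07 = -19.03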

Note that if a HR of an event on control versus the research arm is reported rather than vice versa, then a HR of the research arm versus control is obtained by taking the reciprocal of the HR i.e. 1/HR and associated CI.
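As a rough check on the confidence interval method, the following Python sketch recovers *V* from the bladder cancer CI. It assumes the CI is symmetric on the log scale, so that its width is 2 × z × the standard error of the lnHR. The z-score is taken from the standard library rather than rounded to 1.96, so results may differ slightly in the second decimal place. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def v_from_ci(lower, upper, level=0.95):
    """Return (V, V*) estimated from a reported CI for the HR."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    se_ln_hr = (math.log(upper) - math.log(lower)) / (2.0 * z)
    v_star = se_ln_hr ** 2      # variance of the lnHR
    return 1.0 / v_star, v_star

v_bladder, v_star_bladder = v_from_ci(0.71, 1.02)
oe_bladder = math.log(0.85) * v_bladder  # O-E = ln(HR) x V
```

If the report gives the HR for control versus research, the reciprocal of the HR and of each CI limit would be taken before calling the function, with the limits swapped.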

#### 4. Report presents HR and events in each arm (and the randomisation ratio is 1:1)

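If a trial report presents the HR and the number of events on each arm, and the randomisation ratio is 1:1, *V* may be obtained using equation (11):

$$V = \frac{\text{observed events research} \times \text{observed events control}}{\text{total observed events}} \qquad (11)$$

For the bladder cancer trial (Table 1), equation (11) followed by equation (9):

$$V = \frac{229 \times 256}{485} = 120.87$$

*O* - *E* = ln(0.85) × 120.87 = -19.64

gives an estimate of 120.87 for *V* and -19.64 for the *O-E*.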

#### 5. Report presents HR and total events (and the randomisation ratio is 1:1)

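If only the HR and the total number of events are reported, and the randomisation ratio is 1:1, *V* can be approximated by:

$$V = \frac{\text{total observed events}}{4} \qquad (12)$$

where the total observed events is the sum of the observed events on the research and control arms.

For the bladder cancer trial, equation (12) gives 485/4 = 121.25 for *V*. Using this together with the reported HR and equation (9) gives a figure of -19.70 for the *O-E*.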

This particular method of estimating *V* also provides a simple way of checking (approximately) the plausibility of estimates of *V* derived using other equations.

#### 6. Report presents HR, total events and the numbers randomised on each arm

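If the numbers on each arm are unequal, *V* can be estimated using equation (13). The estimate of *V* should be based on the numbers analysed in the report rather than the numbers randomised, otherwise the precision of the estimate will be exaggerated:

$$V = \frac{\text{total observed events} \times \text{analysed research} \times \text{analysed control}}{(\text{analysed research} + \text{analysed control})^{2}} \qquad (13)$$

If more than one analysis is presented, for example, one based on eligible patients and one based on all randomised patients, it is preferable to use the analysis based on all randomised patients.

For the bladder cancer trial (Table 1), equation (13) gives:

$$V = \frac{485 \times 491 \times 485}{(491 + 485)^{2}} = 121.25$$

This *V* can then be used with the reported HR and equation (9) to estimate the *O-E*.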

For a trial that randomised patients according to a 1:1 ratio, but analysed unequal numbers of patients on each arm because, for example, patients were excluded differentially by arm, equation (13) is the preferred indirect method of estimating the variance.
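The three indirect estimates of *V* in sections 4–6 can be compared side by side, as suggested above for plausibility checking. The Python sketch below assumes the standard forms: events on research × events on control / total events; total events / 4; and total events × analysed research × analysed control / (analysed research + analysed control) squared. The function names are ours.

```python
def v_from_events_per_arm(o_res, o_ctl):
    """V from the events on each arm (1:1 randomisation)."""
    return o_res * o_ctl / (o_res + o_ctl)

def v_from_total_events(total_events):
    """V from the total events only (1:1 randomisation)."""
    return total_events / 4.0

def v_from_events_and_numbers(total_events, n_res, n_ctl):
    """V from total events and numbers analysed on each arm (any ratio)."""
    return total_events * n_res * n_ctl / (n_res + n_ctl) ** 2

# Bladder cancer trial: 229 and 256 deaths, 491 and 485 patients analysed.
v_per_arm = v_from_events_per_arm(229, 256)
v_total = v_from_total_events(229 + 256)
v_numbers = v_from_events_and_numbers(229 + 256, 491, 485)
print(round(v_per_arm, 2), round(v_total, 2), round(v_numbers, 2))
```

All three values lie within about half a unit of each other for this trial, as would be expected when the arms are nearly balanced.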

#### 7. Report presents p-value and events in each arm (and the randomisation ratio is 1:1)

For reliability, it is probably wise to use this method only when the exact p-value is given to at least 2 significant figures [1, 2]. As well as the events on each arm and overall, a z-score for the 2-sided p-value divided by 2 is required. If a 1-sided p-value is reported it can be used directly to obtain the z-score. Such a z-score can be derived from either statistical tables or statistical or spreadsheet software (e.g. MS Excel). The *O-E* is then estimated as:

$$O - E = \sqrt{\frac{\text{observed events research} \times \text{observed events control}}{\text{total observed events}}} \times (z\text{-score for } p\text{-value}/2) \qquad (14)$$

A decision to assign a positive or negative value to *O-E* is needed and this depends on whether the direction of the effect is in favour of the research or control arm. This in turn will depend on whether the outcome is positive or negative. For a positive outcome, such as time to pregnancy, more pregnancies and/or a shorter time to pregnancy on the research arm compared to the control arm, will indicate that the effect is in favour of the research arm. For a negative outcome, such as time to death, fewer deaths and/or a longer time to death on the research compared to the control arm will indicate that the effect is in favour of the research arm. If the results are not statistically significantly in favour of either the research or control arm or if the relative numbers of events on each arm are not provided, it is possible to look for other indicators of the direction of the results, such as the relative numbers of events on each arm, separation of Kaplan-Meier curves or textual descriptions of the results.

For the bladder cancer trial, the reported 2-sided p-value of 0.075 corresponds to a z-score of 1.78, and equation (14) gives an *O-E* of 19.57. It is clear from the report of the bladder cancer trial that survival favours the research treatment, with fewer deaths and a longer time to death in the research arm. Therefore, the *O-E* will be made negative (-19.57). Then, using equations (11) and (7), *V* is estimated as 120.87 and the HR as 0.85.

#### 8. Report presents p-value and total events (and the randomisation ratio is 1:1)

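If only the p-value and the total number of events are reported, and the randomisation ratio is 1:1, the *O-E* can be estimated from:

$$O - E = \tfrac{1}{2} \times \sqrt{\text{total observed events}} \times (z\text{-score for } p\text{-value}/2) \qquad (15)$$

For the bladder cancer trial, equation (15) followed by equations (12) and (7) give estimates of 121.25 for *V* and 0.85 for the HR.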

#### 9. Report presents p-value, total events and numbers randomised to each arm

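Equation (16) can be used to estimate the *O-E* for trials where the randomisation (or analysis) ratio is not 1:1:

$$O - E = \frac{\sqrt{\text{total observed events} \times \text{analysed research} \times \text{analysed control}}}{\text{analysed research} + \text{analysed control}} \times (z\text{-score for } p\text{-value}/2) \qquad (16)$$

For the bladder cancer trial, this, together with equations (13) and (7), provides an estimate of 121.25 for the *V* and 0.85 for the HR.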
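The p-value methods in sections 7–9 share one idea: the size of the *O-E* is the z-score for half the 2-sided p-value multiplied by the square root of *V*. The Python sketch below makes that assumption explicit and leaves the sign to the user, since the direction of effect has to be judged from the report. Function and parameter names are illustrative, and the z-score comes from the standard library rather than tables, so the *O-E* may differ from hand calculation in the second decimal place.

```python
import math
from statistics import NormalDist

def z_from_two_sided_p(p_two_sided):
    """z-score for a 2-sided p-value divided by 2."""
    return NormalDist().inv_cdf(1.0 - p_two_sided / 2.0)

def oe_from_p_and_v(p_two_sided, v, sign):
    """O-E = sign x z x sqrt(V); sign is -1 or +1, chosen from the report."""
    return sign * z_from_two_sided_p(p_two_sided) * math.sqrt(v)

# Bladder cancer trial: p = 0.075, fewer deaths on research, so sign = -1.
p, o_res, o_ctl, n_res, n_ctl = 0.075, 229, 256, 491, 485
total = o_res + o_ctl

v_per_arm = o_res * o_ctl / total                      # events on each arm
v_total = total / 4.0                                  # total events only
v_numbers = total * n_res * n_ctl / (n_res + n_ctl) ** 2  # unequal arms

results = {}
for label, v in (("per arm", v_per_arm), ("total", v_total), ("numbers", v_numbers)):
    o_minus_e = oe_from_p_and_v(p, v, sign=-1)
    results[label] = (o_minus_e, math.exp(o_minus_e / v))  # (O-E, HR)
```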

### Generating the O-E, V, HR and lnHR from published Kaplan-Meier curves

Some time-to-event analyses are presented solely in the form of Kaplan-Meier curves [1, 10]. It is possible to estimate the HR, lnHR, *O-E* and *V* from a number of time intervals from such curves and pool across these time intervals within a trial to estimate a HR or lnHR that represents the whole curve (sections 10–11). Alongside this, the reported minimum and maximum follow-up times, or the reported numbers at risk, can be used to estimate the amount of censoring in a trial. Otherwise, the estimate of effect would be based on too many patients and so be erroneously precise. If a trial report does not present either the numbers at risk or the actual minimum and maximum follow-up, then it may be possible to estimate the level of follow-up from other information provided (Appendix 2).

### Extraction of curve data from trial reports

A sufficiently large, clear copy of the curve needs to be divided up into a number of time intervals, which give a good representation of event rates over time, whilst limiting the number of events within any time interval. Parmar *et al.* [1] suggest that, as far as possible, the event rate within a time interval should be no more than 20% of those at the start of the time interval. If the curve starts to level off, then few (or no) events are taking place and there is little value in extracting data from this area of a curve. Also, the final interval should not extend beyond the actual or estimated maximum follow-up.

Table 2. Example data extraction form with data extracted from bladder cancer Kaplan-Meier plot in Figure 1.

Time at start of interval (months) | % Event-free on research | % Event-free on control | Reported numbers at risk on research | Reported numbers at risk on control |
---|---|---|---|---|
0 | 100 | 100 | 491 | 485 |
3 | 97 | 97 | - | - |
6 | 92 | 92 | - | - |
9 | 86 | 84 | - | - |
12 | 78 | 75 | 372 | 355 |
15 | 73 | 70 | - | - |
18 | 68 | 63 | - | - |
21 | 65 | 60 | - | - |
24 | 62 | 58 | 283 | 257 |
27 | 60 | 56 | - | - |
30 | 58 | 54 | - | - |
33 | 56 | 52 | - | - |
36 | 54 | 51 | 200 | 187 |
42 | 52 | 49 | - | - |
48 | 51 | 46 | 139 | 132 |
54 | 49 | 44 | - | - |
60 | 49 | 43 | 93 | 80 |

### 10. Report presents Kaplan-Meier curve and information on follow-up

For each time interval and for each arm a number of iterative calculations are required. It is necessary to estimate the number of patients who were: 1) event-free at the start of the interval, 2) censored during the interval and 3) at risk during the interval. Also, 4) the number of events during each interval needs to be estimated. Together these items are used to: 5) estimate the *O-E*, *V* and HR for each time interval. Finally, 6) the *O-E*, *V* and HR for the whole curve are derived from combining the estimates across time intervals.

The number of patients at risk at the start of the first time interval is simply the total number analysed on each arm, making step 1 redundant for the first time interval of any curve. Therefore, in the bladder cancer trial, at the start of the 0–3 month time period, there are 491 and 485 patients at risk on the research and control arms, respectively (Table 2).

Based on the median follow-up of 48 months and accrual period of 69 months (Table 1), the minimum follow-up is estimated (Appendix 2) to be 14 months for this trial, and so all patients have complete follow-up and no patients are censored in the 0–3, 3–6, 6–9 and 9–12 month intervals. Therefore, for these time intervals, estimating the number of patients censored (step 2) is not relevant. Beyond 14 months patients are censored and this must be taken into account. Going through the steps 1, 3, 4 and 5 for the prior time intervals, the following were estimated for the 12–15 month time interval:

Event-free at start of prior time interval (12–15 month), research = 382.98

Event-free at start of prior time interval (12–15 month), control = 363.75

Events in prior time interval (12–15 month), research = 24.55

Events in prior time interval (12–15 month), control = 24.25

Censored in prior time interval (12–15 month), research = 0.00

Censored in prior time interval (12–15 month), control = 0.00

Note that these estimated values differ somewhat from the actual reported numbers at risk at 12 months (Table 2), but they can be used to illustrate all the steps of the method, in the presence of censoring, for the 15–18 month interval:

#### Step 1. Numbers event-free at the start of the current interval

This is in fact the number of patients that were event-free at the end of the prior time interval:

*Event free at start of current interval* = *Event free at start of prior interval* - *Events in prior interval* - *Censored during prior interval*

Using the data from 12–15 month time interval, the numbers of patients event-free in the current 15–18 month time interval are estimated:

*Event free at start (15–18 month), research* = 382.98 - 24.55 - 0 = 358.43

*Event free at start (15–18 month), control* = 363.75 - 24.25 - 0 = 339.5

#### Step 2. Numbers censored during the current interval

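Assuming that patients are censored at a constant rate between the start of the interval and the maximum follow-up, the number censored during the interval is estimated as:

$$\text{Censored during current interval} = \text{Event free at start of current interval} \times \frac{1}{2} \times \frac{\text{time at end of interval} - \text{time at start of interval}}{\text{maximum follow-up} - \text{time at start of interval}}$$

With a maximum follow-up of 82 months (Table 1):

*Censored (15–18 month), research* = 358.43 × ½ × (18 - 15)/(82 - 15) = 8.02

*Censored (15–18 month), control* = 339.50 × ½ × (18 - 15)/(82 - 15) = 7.60

Therefore, around 8 patients in the research arm and 7 patients in the control arm were estimated to be censored during the 15–18 month time interval.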

#### Step 3. Numbers at risk during the current interval, adjusted for censoring

The numbers censored can be used to adjust (reduce) the numbers at risk during the time interval:

*At risk during current interval, adjusted for censoring* = *Event free at start of current interval* - *Censored during current interval*

Based on the data from step 1 and 2, the numbers at risk during the current 15–18 month time interval are:

*At risk during, adjusted for censoring (15–18 month), research* = 358.43 - 8.02 = 350.41

*At risk during, adjusted for censoring (15–18 month), control* = 339.50 - 7.60 = 331.90

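#### Step 4. Number of events during the current interval

The number of events is estimated by applying the proportional fall in the percentage event-free over the interval to the numbers at risk:

$$\text{Events during current interval} = \text{At risk during current interval, adjusted for censoring} \times \frac{\text{\% event-free at start} - \text{\% event-free at end}}{\text{\% event-free at start}}$$

Using the data from step 3 and Table 2:

*Events (15–18 month), research* = 350.41 × (73 - 68)/73 = 24.00

*Events (15–18 month), control* = 331.90 × (70 - 63)/70 = 33.19

#### Step 5. Estimate the HR, V and O-E for the current interval

The HR for the interval is the ratio of the event rates on the two arms:

$$\text{HR} = \frac{\text{events research} \,/\, \text{at risk research}}{\text{events control} \,/\, \text{at risk control}} \qquad (21)$$

and the variance of the lnHR, which is the reciprocal of *V*:

$$V^{*} = \frac{1}{\text{events research}} - \frac{1}{\text{at risk research}} + \frac{1}{\text{events control}} - \frac{1}{\text{at risk control}} \qquad (22)$$

These give estimates of the HR, *V* and *O-E* as 0.68, 15.17 and -5.74, respectively, for the 15–18 month time interval. Note that if censoring had not been taken into account, the estimate of the HR for this time interval would still have been 0.68, but the *V* would be slightly greater at 15.52.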

These steps are repeated for all time intervals.
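Steps 2 to 5 for a single interval can be sketched in Python as follows. This is a minimal illustration under stated assumptions: censoring is applied only to intervals that start at or after the minimum follow-up (which matches the zero censoring used for the 12–15 month interval above), events are taken as the numbers at risk multiplied by the proportional fall in the percentage event-free, and the *O-E* is obtained as ln(HR) × *V*. Function names are ours, and an interval with no events on an arm would need special handling that is not shown.

```python
import math

def censored_in_interval(event_free, t_start, t_end, f_min, f_max):
    """Step 2: numbers censored, assuming a constant censoring rate."""
    if t_start < f_min:
        return 0.0  # assumed: no censoring before the minimum follow-up
    return event_free * 0.5 * (t_end - t_start) / (f_max - t_start)

def interval_estimate(free_res, free_ctl, pct_res, pct_ctl,
                      t_start, t_end, f_min, f_max):
    """Steps 3-5; pct_res and pct_ctl are (start, end) % event-free."""
    arms = []
    for free, (pct_start, pct_end) in ((free_res, pct_res), (free_ctl, pct_ctl)):
        at_risk = free - censored_in_interval(free, t_start, t_end, f_min, f_max)
        events = at_risk * (pct_start - pct_end) / pct_start
        arms.append((events, at_risk))
    (d_res, r_res), (d_ctl, r_ctl) = arms
    hr = (d_res / r_res) / (d_ctl / r_ctl)
    v_star = 1 / d_res - 1 / r_res + 1 / d_ctl - 1 / r_ctl  # variance of lnHR
    v = 1 / v_star
    return hr, v, math.log(hr) * v

# 15-18 month interval of the bladder cancer trial (Tables 1 and 2).
hr, v, o_minus_e = interval_estimate(358.43, 339.50, (73, 68), (70, 63),
                                     t_start=15, t_end=18, f_min=14, f_max=82)
print(round(hr, 2), round(v, 2), round(o_minus_e, 2))  # 0.68 15.17 -5.74
```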

#### Step 6, combining all time intervals

Combining the estimates from all time intervals using equation (1), a pooled HR of 0.88 and *V* of 128.81 (95% CI of 0.74–1.05) is obtained.

In this example, if the censoring model had not been applied the same HR, a larger, but similar *V* (136.23) and a similar CI (0.74–1.04) would have been estimated. This is probably because it is a large trial with good follow-up, making both estimates fairly precise. In contrast, the ovarian cancer trial [5] accrued far fewer patients and had poorer follow-up. Using the curve method and accounting for censoring, gives a HR estimate of 1.21 (95% CI 0.62–2.36), but discounting censoring, the HR is slightly more extreme (1.26), with overly precise confidence intervals (95% CI 0.69–2.28). In other situations the differences may be more pronounced.

### 11. Report presents Kaplan-Meier curve and the numbers at risk

The presentation of the numbers at risk at particular time points with a Kaplan-Meier curve offers a more direct means of assessing the level of censoring [2], which is taken into account when the HR, *V* and *O-E* are estimated. However, this necessarily limits the division of the curve to these time points, which may be relatively few. Further, this approach may be problematic when the event rate between time points is large, e.g. greater than 20% [1].

The number of patients event-free at each time point, i.e. the numbers of patients event-free at the start and end of each time interval, is known and so does not need to be estimated. For each time interval and each arm, assuming that the level of censoring is constant within each interval, it remains to calculate: 1) the number of patients at risk during the interval and 2) the number of events during the interval. These can be used to 4) estimate the *O-E*, *V* and HR for the time interval, and the data from all the intervals can be combined in 5) to obtain the *O-E*, *V* and HR for the complete curve. Although not required to estimate the HR, 3) the number of patients censored during the interval can also be calculated and is useful for comparison with the other curve method.

The bladder cancer trial report gave the numbers at risk annually until 5 years. These data, and the percentage survival (i.e. event-free) for each arm at the start of each time interval, are given in Table 2 and can be used to illustrate the steps of the method for the 0–12 month time period:

#### Step 1. Numbers at risk during the current interval

#### Step 2. Number of events during the current interval

There were approximately 106 events estimated on the research arm and 120 on the control arm.

#### Step 3. Numbers censored during the current interval

Approximately 12 and 10 patients were estimated to be censored on the research and control arm respectively. Note that in section 10, by estimating the minimum follow-up to be 14 months and using the censoring model, we failed to take accurate account of censoring in the 0–12 month period.
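One way to arrive at figures close to those quoted for the 0–12 month interval is sketched below in Python. The algebra is an assumption of this sketch rather than a transcription of the published equations: censoring is taken to be spread evenly over the interval, so the effective number at risk is the number at the start minus half the number censored, and events plus censored patients must account for the fall in the reported numbers at risk. The function name is hypothetical.

```python
def at_risk_events_censored(n_start, n_end, pct_start, pct_end):
    """Estimate (at risk, events, censored) for one arm and one interval.

    n_start, n_end: reported numbers at risk at the start and end.
    pct_start, pct_end: % event-free at the start and end, read off the curve.
    """
    q = pct_end / pct_start  # proportion remaining event-free over the interval
    # Solving n_start - events - censored = n_end with
    # events = (n_start - censored / 2) * (1 - q):
    censored = 2.0 * (n_start * q - n_end) / (1.0 + q)
    at_risk = n_start - censored / 2.0
    events = at_risk * (1.0 - q)
    return at_risk, events, censored

# 0-12 month interval of the bladder cancer trial (Table 2).
risk_res, events_res, cens_res = at_risk_events_censored(491, 372, 100, 78)
risk_ctl, events_ctl, cens_ctl = at_risk_events_censored(485, 355, 100, 75)
```

Under these assumptions the research arm has roughly 107 events and 12 censored patients, and the control arm 120 events and 10 censored, which is in line with the approximate figures given in steps 2 and 3.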

#### Step 4a. Estimate the HR and V for the current interval using the number of events and the numbers at risk during the current interval

The results from steps 1 and 2 can then be used to estimate the HR, *V* and *O-E* for the time interval using equations (21), (22) and (8), as in section 10.

#### Step 4b. Estimate the O-E and V and HR for the current interval using the numbers of events and the numbers at risk during the current interval

*E* and then *O-E* within each interval:

*E* as:

And the *O-E*:

*V*. However, equation (13) is preferred if the randomisation ratio is not 1:1, or the numbers at risk during intervals are very different, e.g. because there is a big difference in effect between arms of the trial.

#### Step 5, combining all time intervals

Taking all time intervals and censoring into account and using equation (1) as in section 10, gives a pooled HR of 0.88 and *V* of 119.80 (95%CI of 0.74–1.05).
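The confidence interval quoted with a pooled HR follows from *V*, because the variance of the lnHR is the reciprocal of *V*. A minimal Python sketch, with an illustrative function name, is:

```python
import math
from statistics import NormalDist

def ci_from_hr_and_v(hr, v, level=0.95):
    """CI for a HR, taking the standard error of the lnHR as 1 / sqrt(V)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z / math.sqrt(v)
    ln_hr = math.log(hr)
    return math.exp(ln_hr - half_width), math.exp(ln_hr + half_width)

# Pooled result from the numbers-at-risk curve method for the bladder trial.
lower, upper = ci_from_hr_and_v(0.88, 119.80)
print(round(lower, 2), round(upper, 2))  # 0.74 1.05
```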

### Interpreting the hazard ratio (HR)

Usually a HR calculated for a trial or a meta-analysis is interpreted as the relative risk of an event on the research arm compared to control. However, it can also be translated into an absolute difference in the proportion of patients who are event-free at a particular time point or for particular groups of patients, assuming proportional hazards:

exp [ln(proportion of patients event-free) × HR] - proportion event-free

or into an absolute difference in median time event-free, where the median on the research arm is the median on control divided by the HR:

Median time event free on research - Median time event free on control

These measures require an estimate of the proportion of patients that are event-free in the control group or subgroup of interest and an estimate of the median time event-free in the control group, respectively. Such data may be obtained from a Kaplan-Meier curve of a representative trial or individual patient data meta-analysis, or even from epidemiological data. Alternatively, it may be possible to use 'typical' values from other literature.

Using the bladder cancer example, the HR of 0.85 and an estimated 2-year survival of 58% for patients on the control arm, gives an absolute improvement:

exp [ln(0.58) × 0.85] - 0.58 = 0.05

in survival of 5% at 2 years, taking it from 58% to 63%.

The median survival on control was estimated to be 37 months and so the median survival on the research arm is:

37.0/0.85 = 43.5

43.5 months, giving an absolute improvement in median survival:

43.5 - 37.0

of 6.5 months with the research treatment.
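Both translations of the HR into absolute terms are simple enough to script. The Python sketch below uses illustrative function names. The survival difference assumes proportional hazards, and dividing the control median by the HR additionally assumes that survival is roughly exponential, which is a simplification.

```python
import math

def absolute_difference_event_free(control_event_free, hr):
    """Absolute change in the proportion event-free at a given time point."""
    research_event_free = math.exp(math.log(control_event_free) * hr)
    return research_event_free - control_event_free

def research_median(control_median, hr):
    """Median time event-free on research, given the control median."""
    return control_median / hr

# Bladder cancer example: HR 0.85, 58% alive at 2 years on control,
# median survival on control of 37 months.
gain_2yr = absolute_difference_event_free(0.58, 0.85)
median_res = research_median(37.0, 0.85)
median_gain = median_res - 37.0
```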

### Calculations spreadsheet

The calculations spreadsheet allows the user to enter reported summary statistics and obtain the HR, lnHR, *V*, and *O-E* by all possible methods. The user can also input data extracted from Kaplan-Meier curves and estimate censoring using the minimum and maximum follow-up or the reported numbers at risk, to obtain similar summary statistics. Graphical representations of the input data are produced for comparison with the published curves, to assist with data extraction or to highlight data entry errors. Results from all methods are provided in a single output screen, which facilitates comparison. The main features of the calculations spreadsheet are illustrated in Figure 2 and the spreadsheet itself is freely available to readers (see Additional file 1).

## Discussion

We have translated methods for calculating a HR and/or associated statistics from published time-to-event-analyses [1, 2] into a practical, less statistical guide. A corresponding, easy-to-use calculations spreadsheet, to facilitate the computational aspects, is available from the authors. The resulting summary statistics can then be used in the meta-analysis procedures found in statistical and meta-analysis software.

There is a hierarchy in the methods described [1, 2]. The direct methods make no assumptions and are preferable, followed by the various indirect methods based on reported statistics. The curve methods are likely to be the least reliable and it is not yet clear which method of adjusting for censoring is most reliable. If both curve methods are possible, the choice between the two may be a pragmatic one, depending on whether the minimum and maximum follow-up are reported or need to be estimated, and how many time points the number at risk are reported for and the event rate between those time points. The development of a hybrid of the two curve methods might optimise use of available data. Also, it is not clear how different schemes for dividing up the Kaplan Meier curves may impact on the resulting statistics. In fact, further research is required to assess how well all of the methods perform according to variations in, for example, trial size, levels of follow-up or event rates.

Although the methods provide a means of analysing time-to-event outcomes for individual trials, they cannot circumvent the other well-known problems of relying on only published data for systematic reviews and meta-analyses. For example, it may not be possible to include all relevant trials, either because trials are not published or because the trial report does not include the outcome of interest, situations which could lead to publication bias [11–13] or selective outcome reporting bias [14], respectively. Similarly, these methods cannot correct common problems with the original reported analyses, such as the exclusion of patients [15, 16], analyses which are not by intention-to-treat [17] or analyses confined to particular patient subgroups, which may also lead to bias [16]. Furthermore, if the time-to-event outcome of interest is a long-term outcome, such as survival, then any HR estimation for an individual trial or meta-analysis will be limited by the extent of follow-up at the time that trials are reported. Such issues are relevant to all trials, systematic reviews and meta-analyses and so they should always be taken into account in interpreting results of these studies. Their relative impact is likely to vary between outcomes, trials, meta-analyses and healthcare areas and some may be addressed by obtaining further or updated information direct from trial investigators.

While the methods described previously [1, 2] and elaborated here are not a substitute for the re-analysis of IPD from all randomised patients, they offer the most appropriate way of analysing time-to-event outcomes when IPD is not available or the approach is infeasible. Thus, whenever possible they should be used in preference to using a pooled OR or RR or a series of ORs or RRs at fixed time points. This should improve the quality of the analysis and subsequent interpretation of systematic reviews and meta-analyses that include time-to-event outcomes.

## Appendix 1: Previously published formulae for generating hazard ratios from published time-to-event data [1, 2]. The numbers in brackets link these to their descriptive equivalents in the text

*1. Generating the O-E, V, HR and lnHR from reported summary statistics*

For equations 1–16 and following the notation of Parmar *et al.* [1], for trial *i*:

$O_{ri}$ = observed number of events in the research group

$E_{ri}$ = logrank expected events in the research group

$O_{ci}$ = observed number of events in the control group

$E_{ci}$ = logrank expected events in the control group

$O_{ri} - E_{ri}$ = observed minus expected events in the research group

$O_{i}$ = total observed events ($O_{ri} + O_{ci}$)

$V_{ri}$ = logrank variance

$\ln(\text{HR}_{i})$ = log HR

$\text{var}[\ln(\text{HR}_{i})]$ = variance of the log hazard ratio

$\text{UPPCI}_{i}$ = value for the upper end of the confidence interval

$\text{LOWCI}_{i}$ = value for the lower end of the confidence interval

$\Phi^{-1}(1 - \alpha_{i}/2)$ = z score for the upper end of the confidence interval

$R_{ri}$ = number randomised to the research group

$R_{ci}$ = number randomised to the control group

$p_{i}$ = reported two-sided p-value associated with the logrank or Mantel-Haenszel test (or Cox model)

*Estimating a pooled lnHR from a series of trials*
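As a rough illustration of fixed-effect pooling, the sketch below combines per-trial O-E and V using the Peto-style relationship ln(HR) = Σ(O-E) / ΣV, which is the same as inverse-variance weighting when ln(HR_i) = (O-E)_i / V_i and var[ln(HR_i)] = 1 / V_i. The function names and the example numbers are hypothetical and are not taken from the calculations spreadsheet.

```python
import math


def pool_log_hr(o_minus_e, v):
    """Fixed-effect pooled ln(HR) and its variance from per-trial O-E and V."""
    if len(o_minus_e) != len(v) or not v:
        raise ValueError("need one O-E and one V per trial")
    total_v = sum(v)
    ln_hr = sum(o_minus_e) / total_v   # pooled ln(HR) = sum(O-E) / sum(V)
    var_ln_hr = 1.0 / total_v          # variance of the pooled ln(HR)
    return ln_hr, var_ln_hr


def hr_with_ci(ln_hr, var_ln_hr, z=1.959964):
    """Back-transform a ln(HR) to the HR scale with a 95% CI (z is an assumption)."""
    se = math.sqrt(var_ln_hr)
    return (math.exp(ln_hr),
            math.exp(ln_hr - z * se),
            math.exp(ln_hr + z * se))


# Hypothetical data for two trials
ln_hr, var_ln_hr = pool_log_hr([-5.0, -3.0], [10.0, 6.0])
hr, lower, upper = hr_with_ci(ln_hr, var_ln_hr)
```

A random-effects analysis [8] would need the between-trial variance as well, which this sketch does not attempt.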

*Estimating the O-E, V, HR and lnHR from reported summary statistics*
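The notation above can be tied together with a short sketch. It assumes the standard relationships HR = exp((O-E)/V) and var[ln(HR)] = 1/V, and recovers V from a reported confidence interval by dividing the width of the interval on the log scale by twice the z score. The function names are hypothetical, and the default of a 95% interval is an assumption that should be changed if a trial reports a different level.

```python
import math
from statistics import NormalDist


def hr_from_oe_and_v(o_minus_e, v):
    """Hazard ratio from O-E and V: HR = exp((O-E)/V)."""
    return math.exp(o_minus_e / v)


def v_from_ci(lower, upper, alpha=0.05):
    """V from a reported CI for the HR, using var[ln(HR)] = 1/V."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z score for the upper CI limit
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return 1.0 / se ** 2


def oe_from_hr_and_v(hr, v):
    """O-E recovered from a reported HR and V: O-E = ln(HR) * V."""
    return math.log(hr) * v


# Hypothetical report: HR 0.80 with 95% CI 0.60 to 1.07
v = v_from_ci(0.60, 1.07)
o_minus_e = oe_from_hr_and_v(0.80, v)
```

Working on the log scale matters here: the reported interval is symmetric around ln(HR), not around the HR itself.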

Indirect estimation of the variance of the lnHR from the number of events:

$$V_{ri} = \frac{O_{ri}\,O_{ci}}{O_{i}}$$

$$V_{ri} = \frac{O_{i}}{4}$$

*2. Generating the HR and V from published Kaplan-Meier curves and follow-up*

For equations 17–22, and following the notation of Parmar *et al.* [1], for trial *i* and *T* non-overlapping time points (*t* = 1, ..., *T*):

$t$ = whole time interval $(t-1, t)$

$t_{s}$ = start of the time interval $(t-1, t)$

$t_{e}$ = end of the time interval $(t-1, t)$

$R_{ri}(t)$ = effective number of patients at risk on the research arm during time interval $(t-1, t)$

$R_{ri}(t-1)$ = effective number of patients at risk on the research arm during time interval $(t-2, t-1)$

$D_{ri}(t)$ = effective number of events on the research arm during time interval $(t-1, t)$

$D_{ci}(t)$ = effective number of events on the control arm during time interval $(t-1, t)$

$D_{ri}(t-1)$ = effective number of events on the research arm during time interval $(t-2, t-1)$

$C_{ri}(t)$ = effective number of patients censored on the research arm during time interval $(t-1, t)$

$C_{ci}(t)$ = effective number of patients censored on the control arm during time interval $(t-1, t)$

$C_{ri}(t-1)$ = effective number of patients censored on the research arm during time interval $(t-2, t-1)$

$S_{ri}(t_{s})$ = event-free probability on the research arm at the start of time interval $(t-1, t)$

$S_{ri}(t_{e})$ = event-free probability on the research arm at the end of time interval $(t-1, t)$

$F_{min}$ = minimum follow-up

$F_{max}$ = maximum follow-up

Estimation of the numbers event-free at the start of a time interval:

$$R_{ri}(t_{s}) = R_{ri}(t-1) - D_{ri}(t-1) - C_{ri}(t-1)$$

Estimation of the numbers at risk during a time interval, adjusted for censoring:

$$R_{ri}(t) = R_{ri}(t_{s}) - C_{ri}(t)$$
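The two relationships for the numbers event-free and at risk can be chained across intervals, as in the sketch below for a single arm. The effective number censored in each interval is taken as an input rather than derived from the follow-up. The calculation of events from the proportional drop in the curve, R(t) × (S(t_s) − S(t_e)) / S(t_s), is an assumption added for illustration, and the function name and the numbers are hypothetical.

```python
def km_interval_table(r_first, survival, censored):
    """Effective numbers at risk and events per interval for one arm.

    r_first  : number event-free at the start of the first interval
    survival : event-free probabilities read off the curve at each interval
               boundary (length T + 1, normally starting at 1.0)
    censored : effective number censored in each interval, C(t) (length T),
               supplied by the caller
    """
    if len(survival) != len(censored) + 1:
        raise ValueError("need one more survival value than intervals")
    rows = []
    r_start = float(r_first)
    for t, c in enumerate(censored):
        r_risk = r_start - c                       # R(t) = R(t_s) - C(t)
        s_start, s_end = survival[t], survival[t + 1]
        # Assumption: events follow the proportional drop in the curve
        d = r_risk * (s_start - s_end) / s_start
        rows.append({"start": r_start, "at_risk": r_risk,
                     "events": d, "censored": c})
        r_start = r_risk - d - c                   # R(t_s) of the next interval
    return rows


# Hypothetical arm: 100 patients, curve read at 0, 1 and 2 years
rows = km_interval_table(100, [1.0, 0.9, 0.8], [0.0, 5.0])
```

Running the same function on the control arm gives the matching quantities needed to compare the two arms interval by interval.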

Note that equations 17–20 are also used for the control arm.

*V* for a time interval from a Kaplan-Meier curve

*3. Generating the HR and V from published Kaplan-Meier curves and the numbers at risk*

For equations 23–26, and following the notation of Williamson *et al.* [2], for time interval *i*:

$j$ = treatment group (where 1 = the control arm and 2 = the research arm)

$t_{i-1}$ = time at the start of the current interval

$t_{i-2}$ = time at the start of the prior interval

$n_{j,i}$ = number at risk at end of interval $[t_{i-1}, t_{i})$ in group $j$

$n_{j,i-1}$ = number at risk at start of interval $[t_{i-1}, t_{i})$ in group $j$

$n^{*}_{j,i}$ = number at risk during interval $[t_{i-1}, t_{i})$ in group $j$

$d^{*}_{j,i}$ = number of events during interval $[t_{i-1}, t_{i})$ in group $j$

$c^{*}_{j,i}$ = number censored during interval $[t_{i-1}, t_{i})$ in group $j$

$s^{*}_{j,i}$ = event-free probability at end of interval $[t_{i-1}, t_{i})$ in group $j$

$s^{*}_{j,i-1}$ = event-free probability at start of interval $[t_{i-1}, t_{i})$ in group $j$

$e^{*}_{j,i}$ = logrank expected events during interval $[t_{i-1}, t_{i})$ in group $j = 2$ (the research arm)
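When numbers at risk are printed beneath a curve, the quantities defined above can be approximated for each interval and arm along the following lines. This is only a sketch under stated assumptions: patients censored in an interval are taken to be at risk for half of it on average, and events are taken to be the effective number at risk multiplied by the proportional drop in the curve. Solving those two conditions together gives the closed form used in the code. The function name and example numbers are hypothetical, and the result may differ in detail from the exact formulae of [2].

```python
def interval_counts(n_start, n_end, s_start, s_end):
    """Events, censorings and effective number at risk for one interval, one arm.

    Assumed relationships (illustrative):
      censored = n_start - n_end - events
      at_risk  = n_start - censored / 2
      events   = at_risk * (s_start - s_end) / s_start
    which together give
      events = (n_start + n_end) * (s_start - s_end) / (s_start + s_end)
    """
    if not 0 < s_end <= s_start:
        raise ValueError("event-free probabilities must be positive and non-increasing")
    events = (n_start + n_end) * (s_start - s_end) / (s_start + s_end)
    censored = n_start - n_end - events
    at_risk = n_start - censored / 2
    return events, censored, at_risk


# Hypothetical interval: 100 at risk at the start, 80 at the end,
# event-free probability falling from 1.0 to 0.9
events, censored, at_risk = interval_counts(100, 80, 1.0, 0.9)
```

A negative value for the number censored signals that the numbers at risk and the curve readings are inconsistent, which usually means the curve has been read imprecisely.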

## Appendix 2: Estimating or educated 'guesstimating' minimum and maximum follow-up

When the minimum and maximum follow-up are not explicitly reported, it may be possible to estimate them for a particular trial, provided that some indicators of extent of follow-up are provided. In descending order of preference, the following are some strategies that we have employed to estimate the minimum and maximum follow-up:

### For minimum follow-up, if the trial report presents

1. Censoring tick marks on the Kaplan-Meier curve: assume the first tick mark indicates the point of minimum follow-up.
2. Median follow-up and accrual period: assume minimum follow-up = median follow-up minus half the accrual period.
3. Date of analysis and accrual period: assume minimum follow-up = date of analysis minus final date of accrual.
4. Date of submission and accrual period: assume estimated date of analysis = date of submission minus 6 months, then assume minimum follow-up = estimated date of analysis minus final date of accrual.

### For maximum follow-up, if the trial report presents

1. Censoring tick marks on the Kaplan-Meier curve: assume the last tick mark indicates the point of maximum follow-up.
2. Median follow-up and accrual period: assume maximum follow-up = median follow-up plus half the accrual period.
3. Date of analysis and accrual period: assume maximum follow-up = date of analysis minus first date of accrual.
4. Date of submission and accrual period: assume estimated date of analysis = date of submission minus 6 months, then assume maximum follow-up = estimated date of analysis minus first date of accrual.
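The strategies based on median follow-up and on dates lend themselves to a small helper, sketched below. Follow-up is expressed in months. The function names are hypothetical, as are the choices to floor a negative minimum follow-up at zero, to use an average month length of 30.4375 days for part-months, and to cap the day of the month at 28 when stepping back 6 months from the submission date.

```python
from datetime import date


def follow_up_from_median(median_fu, accrual_period):
    """Minimum and maximum follow-up from median follow-up and accrual period."""
    half = accrual_period / 2
    # Flooring at zero is an added assumption for short follow-up
    return max(0.0, median_fu - half), median_fu + half


def months_between(start, end):
    """Approximate number of months from start to end."""
    return ((end.year - start.year) * 12 + (end.month - start.month)
            + (end.day - start.day) / 30.4375)


def shift_months(d, months):
    """Move a date by whole months, capping the day at 28 to keep it valid."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    return date(year, month0 + 1, min(d.day, 28))


def follow_up_from_dates(first_accrual, last_accrual, analysis=None, submission=None):
    """Minimum and maximum follow-up from accrual dates and an analysis date.

    If no analysis date is reported, it is estimated as submission minus 6 months.
    """
    if analysis is None:
        if submission is None:
            raise ValueError("need a date of analysis or a date of submission")
        analysis = shift_months(submission, -6)
    return months_between(last_accrual, analysis), months_between(first_accrual, analysis)


# Hypothetical trial: accrual January 2000 to January 2002, submitted July 2004
fu_min, fu_max = follow_up_from_dates(date(2000, 1, 1), date(2002, 1, 1),
                                      submission=date(2004, 7, 1))
```

Estimates obtained this way are approximate, so it is worth checking that the resulting minimum and maximum are plausible against the extent of the published Kaplan-Meier curve.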

## Declarations

### Acknowledgements

We are grateful to Mahesh Parmar for comments on an earlier draft of the manuscript and to both him and Paula Williamson for advice on the methods. Also, the calculations spreadsheet was based on ones initially developed by Sarah Simnett and Josie Sandercock for calculating hazard ratios from Kaplan-Meier curves. This work was funded by the UK Medical Research Council and the Australian Medical Research Council.

## References

1. Parmar MKB, Torri V, Stewart L: Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints. Statistics in Medicine. 1998, 17: 2815-34. 10.1002/(SICI)1097-0258(19981230)17:24<2815::AID-SIM110>3.0.CO;2-8.
2. Williamson PR, Tudur Smith C, Hutton JL, Marson AG: Aggregate data meta-analysis with time-to-event outcomes. Statistics in Medicine. 2002, 21: 3337-51. 10.1002/sim.1303.
3. Altman DG, De Stavola BL, Love SB, Stepniewska KA: Review of survival analyses published in cancer. British Journal of Cancer. 1995, 72: 511-8.
4. Pocock SJ, Clayton TC, Altman DG: Survival plots of time-to-event outcomes in clinical trials. Lancet. 2002, 359: 1686-9. 10.1016/S0140-6736(02)08594-X.
5. Mangioni C, Bolis G, Pecorelli S, Bragman K, Epis A, Favalli G, Gambino A, Landoni F, Presti M, Torri W, Vassena L, Zanaboni F, Marsoni S: Randomized trial in advanced ovarian cancer comparing cisplatin and carboplatin. Journal of the National Cancer Institute. 1989, 81: 1461-71. 10.1093/jnci/81.19.1464.
6. International Collaboration of Trialists on behalf of the Medical Research Council Advanced Bladder Cancer Working Party, EORTC Genito-urinary Group Australian Bladder Cancer Study Group, National Cancer Institute of Canada Clinical Trials Group, Finnbladder, Norwegian Bladder Cancer Study Group and Club Urologico Espanol de Tratamiento Oncologico (CUETO) group: Neoadjuvant cisplatin, methotrexate, and vinblastine chemotherapy for muscle-invasive bladder cancer: a randomised controlled trial. Lancet. 1999, 354: 533-40. 10.1016/S0140-6736(99)02292-8.
7. Yusuf S, Peto R, Lewis JA, Collins R, Sleight P: Beta blockade during and after myocardial infarction: an overview of the randomized trials. Progress in Cardiovascular Diseases. 1985, 27: 335-71. 10.1016/S0033-0620(85)80003-7.
8. DerSimonian R, Laird N: Meta-analysis in clinical trials. Controlled Clinical Trials. 1986, 7: 177-88. 10.1016/0197-2456(86)90046-2.
9. Tudur C, Williamson PR, Khan S, Best L: The value of the aggregate data approach in meta-analysis with time-to-event outcomes. Journal of the Royal Statistical Society A. 2001, 164: 357-70. 10.1111/1467-985X.00207.
10. Tierney JF, Burdett S, Stewart LA: Feasibility and reliability of using hazard ratios in meta-analyses of published time-to-event data. In preparation.
11. Dickersin K: The existence of publication bias and risk factors for its occurrence. Journal of the American Medical Association. 1990, 263: 1385-9. 10.1001/jama.263.10.1385.
12. Easterbrook PJ, Berlin JA, Gopalan R, Matthews DR: Publication bias in clinical research. Lancet. 1991, 337: 867-72. 10.1016/0140-6736(91)90201-Y.
13. Dickersin K, Min Y-I, Meinert CL: Factors influencing publication of research results. Journal of the American Medical Association. 1992, 267: 374-8. 10.1001/jama.267.3.374.
14. Chan A-W, Hróbjartsson A, Haahr MT, Gøtzsche PC, Altman DG: Empirical evidence for selective reporting of outcomes in randomized trials. Journal of the American Medical Association. 2004, 291: 2457-65. 10.1001/jama.291.20.2457.
15. Schulz KF, Grimes DA, Altman DG, Hayes RJ: Blinding and exclusions after allocation in randomised controlled trials: survey of published parallel group trials in obstetrics and gynaecology. BMJ. 1996, 312: 742-4.
16. Tierney JF, Stewart LA: Investigating patient exclusion bias in meta-analysis. International Journal of Epidemiology. 2005, 34 (1): 79-87. 10.1093/ije/dyh300.
17. Hollis S, Campbell F: What is meant by intention-to-treat analysis? Survey of published randomised controlled trials. BMJ. 1999, 319: 670-4.

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.