Using a patient-centred composite endpoint in a secondary analysis of the Control of Hypertension in Pregnancy Study (CHIPS) Trial
Trials volume 24, Article number: 99 (2023)
Clinical trials commonly use multiple endpoints to measure the impact of an intervention. While this improves the comprehensiveness of outcomes, it can make trial results difficult to interpret. We examined the impact of integrating patient weights into a composite endpoint on the interpretation of Control of Hypertension in Pregnancy Study (CHIPS) Trial results.
Outcome weights were extracted from a previous patient preferences study in pregnancy hypertension (N = 183 women) which identified (i) seven outcomes most important to women (taking medication, severe hypertension, pre-eclampsia, blood transfusion, Caesarean, delivery < 34 weeks, and baby born smaller-than-expected) and (ii) three preference subgroups: (1) ‘equal prioritizers’, 62%; (2) ‘early delivery avoiders’, 23%; and (3) ‘medication minimizers’, 14%.
Outcome weights from the preference subgroups were integrated with CHIPS data for the seven outcomes identified in the preference study. A weighted composite score was derived for each participant by multiplying the preference weight for each outcome by the binary outcome if it occurred. Analyses considered equal weights and those from the preference subgroups. The mean composite scores were compared between trial arms (t-tests).
Composite scores were similar between trial arms with the use of equal weights or those of subgroup (1) (95% confidence intervals [CIs]: − 0.03, 0.02; p > 0.50 for each). ‘Tight’ control was superior when using subgroup (2) weights (95% CIs: 0.002, 0.07; p = 0.03), and ‘less-tight’ control was superior when using subgroup (3) weights (95% CIs: − 0.11, − 0.04; p < 0.01).
Evidence-based recommendations for ‘tight’ control are consistent with most women’s preferences, but for about one in seven women (14%), ‘less-tight’ control is more preference-consistent. Depending on patient preferences, a single trial may support different interventions. Future trials should specify component weights to improve interpretation.
Clinical trials in cardiovascular medicine routinely use primary, secondary, and other endpoints to capture the breadth of an intervention’s effects. However, this can make the interpretation of trial results challenging, as an intervention’s effects can vary by outcome, including benefits and harms. Composite endpoints are often used to overcome these challenges, particularly in pregnancy, “… to circumvent a contrived prioritization of one-half of the mother–infant pair and acknowledge the interconnectedness of mothers and babies at the time of childbirth.” Composites are typically dichotomous, and treat outcomes as equal, which may not be the case. To apply trial results in practice, clinicians and patients must consider which endpoints are important to them and to what degree.
The international Control of Hypertension in Pregnancy Study (CHIPS; ClinicalTrials.gov NCT01192412) randomized controlled trial (RCT) compared ‘less-tight’ with ‘tight’ control of blood pressure (BP) for management of chronic or gestational hypertension; women who progressed to preeclampsia remained in their allocated group. ‘Less-tight’ control aimed to minimize antihypertensive therapy (target diastolic BP of 100 mmHg), while ‘tight’ control aimed to normalize BP (target diastolic BP of 85 mmHg). While ‘tight’ (vs. ‘less-tight’) control did not change the incidence of the primary foetal/newborn and secondary maternal composite outcomes (with equally valued components), ‘tight’ control has been recommended by many guidelines based on a decrease in severe maternal hypertension and some preeclampsia-related complications [5,6,7,8]. These findings were recently replicated in a separate trial. However, recommendations did not integrate women’s preferences or concerns, like taking medications during pregnancy.
In a secondary analysis of CHIPS Trial data, we explored whether weighting outcomes to reflect patient preferences would change the interpretation of trial results.
Outcome data from the 981 women enrolled in CHIPS were included (Table 1). The inclusion criteria were as follows: 14+0–33+6 weeks’ gestation, nonproteinuric chronic or gestational hypertension, office diastolic BP of 90–105 mmHg (or 85–105 mmHg if the women were taking antihypertensive medication), and a live foetus. On average, participants were ≈ 34 years of age and enrolled at ≈ 24 weeks. Most (75%) women had chronic hypertension. Roughly half were taking antihypertensives.
Preferences were obtained from a separate study, in which 183 pregnant women in Canada prioritized CHIPS Trial outcomes, including the primary perinatal (pregnancy loss and/or neonatal care unit admission > 48 h) and secondary maternal outcomes (serious maternal complications). In semi-structured focus groups and individual interviews, participants identified five maternal and two foetal/newborn outcomes as important and sufficiently different between treatment arms to influence their preferred BP control (Table 1). Preference subgroup weights were derived from a best-worst scaling (BWS) task. In this task, participants were shown a series of choice sets, each comprising four of the seven prioritized outcomes, and asked to select the outcome that was most important to them to avoid and the outcome that was least important to them to avoid. As the BWS used a balanced-incomplete block design, all outcomes were presented the same number of times and compared to all other attributes once. BWS analyses used Latent Gold 5.1. Conditional logit models of BWS responses quantified the relative value of each prioritized outcome (where each outcome’s relative importance was expressed as a proportion, and all components summed to 100%). Latent class analysis identified three preference subgroups and their respective weights (Table 1): (1) equal prioritizers (62%) who placed fairly equal weight on each outcome, (2) early delivery avoiders (23%) who prioritized avoiding delivery before 34 weeks (weight of 42%), and (3) medication minimizers (14%) who prioritized avoiding antihypertensive medication (weight of 58%).
We considered equal weights (as assumed in conventional analysis) and the three preference subgroup weights. For each approach, a composite score was derived for each CHIPS trial participant by multiplying the patient preference weight for each outcome by the binary indicator of its occurrence and summing across the seven outcomes. Thus, higher composite scores indicated worse outcomes (more highly weighted events occurred). The mean composite scores between interventions were compared using t-tests, with the significance threshold set a priori at p < 0.05. Analyses were conducted using RStudio.
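The scoring step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' R code: the outcome keys, the two example participants, and the helper names `composite_score` and `compare_arms` are hypothetical, and the comparison uses a Welch standard error with a normal approximation for the p-value in place of the t-test reported in the paper (a simplification that is only reasonable for large arms).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# The seven outcomes prioritized in the preference study (key names are illustrative).
OUTCOMES = [
    "medication", "severe_hypertension", "pre_eclampsia", "blood_transfusion",
    "caesarean", "delivery_lt_34_weeks", "small_baby",
]

# Equal weighting, as assumed in a conventional composite; weights sum to 1.
EQUAL_WEIGHTS = {name: 1 / len(OUTCOMES) for name in OUTCOMES}


def composite_score(events, weights):
    """Sum of weight x binary indicator over all outcomes; higher is worse."""
    return sum(weights[name] * events[name] for name in weights)


def compare_arms(scores_a, scores_b):
    """Return (difference in means a - b, Welch standard error, approximate p).

    ASSUMPTION: the p-value uses a normal approximation rather than the
    t distribution, so it is only a rough guide for large samples.
    """
    diff = mean(scores_a) - mean(scores_b)
    se = sqrt(variance(scores_a) / len(scores_a)
              + variance(scores_b) / len(scores_b))
    if se == 0:
        raise ValueError("no variation in scores; the comparison is undefined")
    p = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, se, p


# Two made-up participants (NOT CHIPS data): 1 = outcome occurred, 0 = did not.
participant_1 = {**dict.fromkeys(OUTCOMES, 0), "medication": 1, "caesarean": 1}
participant_2 = {**dict.fromkeys(OUTCOMES, 0), "severe_hypertension": 1}

print(composite_score(participant_1, EQUAL_WEIGHTS))
print(composite_score(participant_2, EQUAL_WEIGHTS))
```

Swapping `EQUAL_WEIGHTS` for a subgroup's weight profile changes only the dictionary passed in, which is why the same trial data can yield different between-arm comparisons.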
A threshold analysis for preference subgroups that supported ‘less-tight’ over ‘tight’ control was conducted to determine the extent to which preferences would need to shift to yield a finding congruent with current clinical guidance. The threshold analysis systematically reduced the weight assigned to the most highly weighted composite component and distributed the removed weight across the six other components in proportion to the weight assigned to each in the preference profile. The proportional weight for a given outcome was calculated as the weight assigned to that outcome divided by the sum of the weights assigned to all of the outcomes except the highest weighted outcome. For example, using subgroup (3) weights, a reduction of 0.01 in the highest weight would increase the weight assigned to severe hypertension by approximately 0.0048, which is equal to 0.01 (the weight reduction) multiplied by the weight assigned to severe hypertension (0.20) and divided by 1 minus the subgroup (3) weight assigned to minimizing antihypertensive medication, the highest weighted outcome (1 − 0.58 = 0.42). These redistributed weights were calculated for each one-point (0.01) reduction in the highest weighted composite component. The primary analysis (t-test) was then repeated for each set of redistributed weights.
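The redistribution rule can be checked with a short Python sketch. Only the 0.58 (medication) and 0.20 (severe hypertension) weights come from the text; the other five weights below are placeholders chosen so that the profile sums to 1, and the function name `redistribute` is hypothetical.

```python
def redistribute(weights, reduction):
    """Lower the highest weight by `reduction` and share the removed amount
    across the other outcomes in proportion to their original weights."""
    top = max(weights, key=weights.get)
    rest_total = sum(w for name, w in weights.items() if name != top)
    return {
        name: (w - reduction) if name == top else w + reduction * w / rest_total
        for name, w in weights.items()
    }


# Subgroup (3) profile. ASSUMPTION: only the first two weights are taken from
# the text; the remaining five are illustrative placeholders, not Table 1 values.
subgroup_3 = {
    "medication": 0.58,
    "severe_hypertension": 0.20,
    "pre_eclampsia": 0.044,
    "blood_transfusion": 0.044,
    "caesarean": 0.044,
    "delivery_lt_34_weeks": 0.044,
    "small_baby": 0.044,
}

one_step = redistribute(subgroup_3, 0.01)
increase = one_step["severe_hypertension"] - subgroup_3["severe_hypertension"]
print(round(increase, 4))  # 0.01 * 0.20 / 0.42, i.e. 0.0048

# One profile per 0.01 reduction, taking the medication weight from 0.57 to 0.41.
profiles = [redistribute(subgroup_3, k * 0.01) for k in range(1, 18)]
```

Because the removed weight is shared proportionally, each redistributed profile still sums to 1, so composite scores remain on the same scale when the between-arm comparison is repeated for each profile.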
This study was reviewed and approved by the Behavioural Research Ethics Board (H17-01194) at the University of British Columbia.
Table 2 shows that using equal weights in the composite score produced no difference in score between treatment arms; the significantly higher frequency of antihypertensive medication use in ‘tight’ control was offset by the significantly higher frequency of severe hypertension in ‘less-tight’ control. Similar results were found using subgroup (1) weights (equal prioritizers).
Using subgroup (2) weights (early delivery avoiders), the apparently lower rate of early delivery (and significantly lower incidence of severe hypertension) in the ‘tight’ control arm resulted in a lower (better) composite outcome score for ‘tight’ (vs ‘less-tight’) control. The use of significantly more antihypertensive therapy in ‘tight’ control contributed little given the low weighting of this outcome (Table 2).
Using subgroup (3) weights (medication minimizers), the significantly lower frequency of antihypertensive medication use in ‘less-tight’ (vs. ‘tight’) control, combined with a high weighting (58%) resulted in a significantly lower (better) composite outcome score, despite significantly more severe hypertension (20% weight) (Table 2).
The threshold analysis conducted for subgroup (3) showed that once the weight applied to avoiding antihypertensive medication was reduced to 0.41 (from 0.58), ‘less-tight’ control was no longer the preferred treatment (Fig. 1).
This re-analysis of CHIPS trial outcomes incorporated patient views and demonstrated that integrating patient preferences for outcomes and their associated weights into trial analyses is feasible and can identify different management approaches based on the results of a single trial. Our findings suggest that while almost two-thirds of women prioritize adverse outcomes equally, as assumed in the primary CHIPS analyses, about one-quarter prioritize avoiding delivery before 34 weeks, which clearly favours ‘tight’ control. A distinct minority prioritize minimizing antihypertensive medication above other adverse outcomes, making ‘less-tight’ control the most value-congruent BP management for them.
Recent clinical practice guidelines have recommended ‘tight’ control of pregnancy hypertension [5,6,7,8], based on the findings of a significant reduction in the development of severe hypertension and some preeclampsia-related complications, without an increase in perinatal risk, from CHIPS and other RCTs. Our findings suggest that ‘tight’ control is appropriate for the vast majority (≈ 85%) of pregnant women.
While integrating preferences into composites has been considered in cardiology [3, 14] and other fields, this is the first study to integrate patient weights with individual event data from a high-quality RCT in pregnancy. Our findings show that specifying outcome weights may change the interpretation of trial results when applied to individual women. Importantly, our methods are easily adapted to other trial and non-trial approaches and can be used with other statistical methods that accommodate confounders and covariates (e.g. linear regression; ANCOVA).
Limitations of our work include the use of preference weights that reflect women’s values in Canada; despite its multiethnic population, values may differ elsewhere. Preferences were identified after CHIPS was completed; consequently, different composite components may have been identified a priori. However, CHIPS evaluated the standard obstetric outcomes that cover most of the subsequently published relevant core outcome set. These results are statistically significant at the group level, but clinical significance likely depends on individual preferences. Additionally, our method of preference elicitation may have been too cognitively burdensome for some participants. BWS was chosen because it can provide cardinal importance values on an additive scale. As a result, weights can be directly compared to one another and the magnitude of the difference in the importance of outcomes is known. Alternative methods of analysis which use ranks, rather than weights, to incorporate the importance of composite components were considered [18,19,20]. These approaches have advantages in that ranks may be easier to ascertain and more intuitive to use, but they also pose some challenges. For example, ranking approaches that compare intervention and control participant outcomes in order of composite component importance often stop at the first difference in component outcomes (e.g., win ratio). These approaches risk excluding information on lower ranked components that are still important to patients. Other methods that use all components (e.g., O’Brien’s global rank method) can lack specificity on how to rank components (e.g., rank all outcomes or rank hierarchically) and on how to address ties between participants. While a weighting approach seemed most appropriate for our analysis, there are potential benefits of a rank-based analysis in other contexts which should be considered in future composite analyses.
Finally, our approach presents challenges for statistical power (e.g., power calculations), although these come with the benefit of improved interpretability.
This study illustrates that integrating patient values into trial analyses can change the interpretation of trial results for clinical decision-making. Future trials with composite or multiple outcomes should seek patient preference weights to improve the interpretation of trial results and support patient-centred care.
Availability of data and materials
The datasets used and/or analysed during the current study are not publicly posted as participant consent was not obtained for the open distribution of data. However, data are available from the corresponding author upon reasonable request. Aggregate data are available as part of the Supplementary Appendix of the initial publication of the CHIPS trial results (10.1056/NEJMoa1404595).
Abbreviations
ANCOVA: Analysis of covariance
CHIPS: Control of Hypertension in Pregnancy Study
RCT: Randomized controlled trial
References
1. Cordoba G, Schwartz L, Woloshin S, Bae H, Gøtzsche PC. Definition, reporting, and interpretation of composite outcomes in clinical trials: systematic review. BMJ. 2010;341:c3920.
2. Panariello N, Jurczak A, Spector J, Kumar V, Semrau K. Coherence in measurement and programming in maternal and newborn health: experience from the BetterBirth trial. J Clin Epidemiol. 2019;113:83–5.
3. Stolker JM, et al. Rethinking composite end points in clinical trials: insights from patients and trialists. Circulation. 2014;130:1254–61.
4. Magee LA, et al. Less-tight versus tight control of hypertension in pregnancy. N Engl J Med. 2015;372:407–17.
5. Butalia S, et al. Hypertension Canada’s 2018 Guidelines for the management of hypertension in pregnancy. Can J Cardiol. 2018;34:526–31.
6. World Health Organization. WHO recommendations on drug treatment for non-severe hypertension in pregnancy. (2020).
7. National Institute for Health and Care Excellence. Hypertension in pregnancy: diagnosis and management, vol. 55 https://www.nice.org.uk/guidance/ng133; 2019.
8. Magee LA, et al. The hypertensive disorders of pregnancy: the 2021 International Society for the Study of Hypertension in Pregnancy Classification, Diagnosis & Management Recommendations for International Practice. Pregnancy Hypertens. 2021. https://doi.org/10.1016/j.preghy.2021.09.008.
9. Tita AT, et al. Treatment for mild chronic hypertension during pregnancy. N Engl J Med. 2022;386:1781–92.
10. Sinclair M, Lagan BM, Dolk H, McCullough JEM. An assessment of pregnant women’s knowledge and use of the Internet for medication safety information and purchase. J Adv Nurs. 2018;74:137–47.
11. Metcalfe RK, et al. Patient preferences and decisional needs when choosing a treatment approach for pregnancy hypertension: a stated preference study. Can J Cardiol. 2020;36:775–9.
12. Mühlbacher AC, Zweifel P, Kaczynski A, Johnson FR. Experimental measurement of preferences in health care using best-worst scaling (BWS): theoretical and statistical issues. Health Econ Rev. 2016;6. https://doi.org/10.1186/s13561-015-0077-z.
13. Statistical Innovations. Latent Gold 5.1; 2016.
14. Ahmad Y, et al. A new method of applying randomised control study data to the individual patient: a novel quantitative patient-centred approach to interpreting composite end points. Int J Cardiol. 2015;195:216–24.
15. RStudio Team. RStudio: integrated development for R. RStudio, PBC; 2020.
16. Abalos E, Duley L, Steyn DW, Gialdini C. Antihypertensive drug therapy for mild to moderate hypertension during pregnancy. Cochrane Database Syst Rev. 2018. https://doi.org/10.1002/14651858.CD002252.pub4.
17. Udogwu UN, et al. A patient-centered composite endpoint weighting technique for orthopaedic trauma research. BMC Med Res Methodol. 2019;19:242.
18. Pocock SJ, Ariti CA, Collier TJ, Wang D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur Heart J. 2012;33:176–82.
19. Buyse M. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Stat Med. 2010;29:3245–57.
20. O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40:1079–87.
21. Felker GM, Maisel AS. A global rank end point for clinical trials in acute heart failure. Circ Heart Fail. 2010;3:643–6.
We would like to acknowledge the time and contributions of the 981 participants in the CHIPS trial and the 183 participants in the preferences study that made this work possible, as well as the members of the CHIPS Study Group (Additional file 1: Table S1).
This study was funded by peer-reviewed grants from two government entities: the Canadian Institutes of Health Research (MCT 87522) and the BC SUPPORT Unit (RWCT-001). Funders had no involvement in the design of the study; the collection, analysis and interpretation of the data; or the presentation of findings.
Ethics approval and consent to participate
This study was reviewed and approved by the Behavioural Research Ethics Board (H17-01194) at the University of British Columbia.
Competing interests
The authors declare that they have no competing interests.
Cite this article
Metcalfe, R.K., Harrison, M., Singer, J. et al. Using a patient-centred composite endpoint in a secondary analysis of the Control of Hypertension in Pregnancy Study (CHIPS) Trial. Trials 24, 99 (2023). https://doi.org/10.1186/s13063-023-07118-1