
Using a patient-centred composite endpoint in a secondary analysis of the Control of Hypertension in Pregnancy Study (CHIPS) Trial



Clinical trials commonly use multiple endpoints to measure the impact of an intervention. While this improves the comprehensiveness of outcomes, it can make trial results difficult to interpret. We examined the impact of integrating patient weights into a composite endpoint on the interpretation of Control of Hypertension in Pregnancy Study (CHIPS) Trial results.


Outcome weights were extracted from a previous patient preferences study in pregnancy hypertension (N = 183 women) which identified (i) seven outcomes most important to women (taking medication, severe hypertension, pre-eclampsia, blood transfusion, Caesarean, delivery < 34 weeks, and baby born smaller-than-expected) and (ii) three preference subgroups: (1) ‘equal prioritizers’, 62%; (2) ‘early delivery avoiders’, 23%; and (3) ‘medication minimizers’, 14%.

Outcome weights from the preference subgroups were integrated with CHIPS data for the seven outcomes identified in the preference study. A weighted composite score was derived for each participant by multiplying the preference weight for each outcome by the binary outcome if it occurred. Analyses considered equal weights and those from the preference subgroups. The mean composite scores were compared between trial arms (t-tests).


Composite scores were similar between trial arms when using equal weights or subgroup (1) weights (95% confidence interval [CI]: −0.03, 0.02; p > 0.50 for each). ‘Tight’ control was superior with subgroup (2) weights (95% CI: 0.002, 0.07; p = 0.03), and ‘less-tight’ control was superior with subgroup (3) weights (95% CI: −0.11, −0.04; p < 0.01).


Evidence-based recommendations for ‘tight’ control are consistent with most women’s preferences, but for about one in seven women (14%), ‘less-tight’ control is more preference consistent. Depending on patient preferences, a single trial may support different interventions. Future trials should specify component weights to improve interpretation.

Trial registration: NCT01192412



Clinical trials in cardiovascular medicine routinely use primary, secondary, and other endpoints to capture the breadth of an intervention’s effects. However, this can make the interpretation of trial results challenging, as an intervention’s effects can vary by outcome, including benefits and harms [1]. Composite endpoints are often used to overcome these challenges, particularly in pregnancy, “… to circumvent a contrived prioritization of one-half of the mother–infant pair and acknowledge the interconnectedness of mothers and babies at the time of childbirth.” [2] Composites are typically dichotomous and treat all component outcomes as equally important, which may not reflect how patients value them. To apply trial results in practice, clinicians and patients must consider which endpoints are important to them and to what degree [3].

The international Control of Hypertension in Pregnancy Study (CHIPS; NCT01192412) [4] randomized controlled trial (RCT) compared ‘less-tight’ with ‘tight’ control of blood pressure (BP) for management of chronic or gestational hypertension; women who progressed to preeclampsia remained in their allocated group. ‘Less-tight’ control aimed to minimize antihypertensive therapy (target diastolic BP of 100 mmHg), while ‘tight’ control aimed to normalize BP (target diastolic BP of 85 mmHg). While ‘tight’ (vs. ‘less-tight’) control did not change the incidence of the primary foetal/newborn and secondary maternal composite outcomes (with equally valued components) [4], ‘tight’ control has been recommended by many guidelines based on a decrease in severe maternal hypertension and some preeclampsia-related complications [5,6,7,8]. These findings were recently replicated in a separate trial [9]. However, recommendations did not integrate women’s preferences or concerns, like taking medications during pregnancy [10].

In a secondary analysis of CHIPS Trial data, we explored whether weighting outcomes to reflect patient preferences would change the interpretation of trial results.


We integrated pregnant women’s preferences for the management of pregnancy hypertension [11] with individual event data from the CHIPS Trial [4].

Outcome data from the 981 women enrolled in CHIPS were included (Table 1). The inclusion criteria were as follows: 14+0–33+6 weeks’ gestation, nonproteinuric chronic or gestational hypertension, office diastolic BP of 90–105 mmHg (or 85–105 mmHg if the women were taking antihypertensive medication), and a live foetus [4]. On average, participants were ≈ 34 years of age and enrolled at ≈ 24 weeks. Most (75%) women had chronic hypertension. Roughly half were taking antihypertensives.

Table 1 CHIPS trial event rates of the seven outcomes women prioritized, overall and by trial arm

Preferences were obtained from a separate study [11], in which 183 pregnant women in Canada prioritized CHIPS Trial outcomes, including the primary perinatal (pregnancy loss and/or neonatal care unit admission > 48 h) and secondary maternal outcomes (serious maternal complications). In semi-structured focus groups and individual interviews, participants identified five maternal and two foetal/newborn outcomes as important and sufficiently different between treatment arms to influence their preferred BP control (Table 1) [4]. Preference subgroup weights were derived from a best-worst scaling (BWS) task. In this task, participants were shown a series of choice sets, each comprising four of the seven prioritized outcomes, and asked to select the outcome that was most important to them to avoid and the outcome that was least important to them to avoid [12]. As the BWS used a balanced-incomplete block design, all outcomes were presented the same number of times and compared to all other attributes once. BWS analyses used Latent Gold 5.1 [13]. Conditional logit models of BWS responses quantified the relative value of each prioritized outcome (where each outcome’s relative importance was expressed as a proportion, and all components summed to 100%) [11]. Latent class analysis identified three preference subgroups and their respective weights (Table 1): (1) equal prioritizers (62%), who placed fairly equal weight on each outcome; (2) early delivery avoiders (23%), who prioritized avoiding delivery before 34 weeks (weight of 42%); and (3) medication minimizers (14%), who prioritized avoiding antihypertensive medication (weight of 58%).

We considered equal weights (as assumed in conventional analysis) and the three preference subgroup weights. For each approach, a composite score was derived for each CHIPS trial participant by multiplying the patient preference weight for each outcome by the binary outcome of its occurrence and summing across the seven outcomes [14]. Thus, higher composite scores indicated worse outcomes (more highly weighted events occurred). The mean composite scores were compared between trial arms using t-tests, with the significance level set a priori at p < 0.05. Analyses were conducted in R using RStudio [15].
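The scoring step can be sketched in a few lines. This is an illustration, not the authors' R code: the outcome labels, the made-up participants, and the choice of Welch's unequal-variance t statistic are all assumptions (the text says only "t-tests"), and no p-value is computed.

```python
from math import sqrt
from statistics import mean, variance

# The seven outcomes prioritized in the preference study (short labels are ours).
OUTCOMES = [
    "medication", "severe_hypertension", "pre_eclampsia", "transfusion",
    "caesarean", "delivery_lt_34wk", "small_baby",
]

# Equal weights, as assumed in a conventional composite.
EQUAL_WEIGHTS = {name: 1 / len(OUTCOMES) for name in OUTCOMES}


def composite_score(events, weights):
    """Weighted sum of binary outcome indicators (higher = worse)."""
    return sum(weights[name] * int(name in events) for name in weights)


def welch_t(a, b):
    """Welch t statistic and approximate degrees of freedom for mean(a) - mean(b)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up participants (NOT trial data): each set lists the outcomes that occurred.
less_tight = [{"severe_hypertension"}, {"caesarean", "small_baby"}, set(), {"medication"}]
tight = [{"medication"}, {"medication", "caesarean"}, {"medication"}, set()]

scores_lt = [composite_score(p, EQUAL_WEIGHTS) for p in less_tight]
scores_t = [composite_score(p, EQUAL_WEIGHTS) for p in tight]
t_stat, df = welch_t(scores_lt, scores_t)
print(f"less-tight mean={mean(scores_lt):.3f}, tight mean={mean(scores_t):.3f}, t={t_stat:.2f}")
```

Swapping `EQUAL_WEIGHTS` for a subgroup's weight dictionary reproduces the other analyses in the same way. Only the weights change, not the event data.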

A threshold analysis for preference subgroups that supported ‘less-tight’ over ‘tight’ control was conducted to determine the extent to which preferences would need to shift to yield a finding congruent with current clinical guidance. The threshold analysis systematically reduced the weight assigned to the most highly weighted composite component and distributed the removed weight across the six other components proportionately to the weight assigned in the preference profile. The proportional weight for a given outcome was calculated as the weight assigned to that outcome divided by the sum of the weights assigned to all of the outcomes except the highest weighted outcome. For example, using subgroup (3) weights, a 0.01 reduction in the weight for avoiding antihypertensive medication would increase the weight assigned to severe hypertension by approximately 0.0048: 0.01 (the weight reduction) multiplied by the weight assigned to severe hypertension (0.20) and divided by the sum of the other six weights, that is, 1 minus the subgroup (3) weight assigned to the highest weighted outcome (1 − 0.58 = 0.42). These redistributed weights were calculated for each one-point (0.01) reduction in the highest weighted composite component. The primary analysis (t-test) was then repeated for each set of redistributed weights.
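The redistribution rule can be checked numerically. In the sketch below, only the weights for medication (0.58) and severe hypertension (0.20) come from the text; the split of the remaining 0.22 across the other five outcomes is hypothetical, as is the function name.

```python
def redistribute(weights, top, reduction):
    """Lower weights[top] by `reduction` and spread the removed weight over
    the other outcomes in proportion to their current weights."""
    rest_total = sum(w for name, w in weights.items() if name != top)
    return {
        name: (w - reduction) if name == top else w + reduction * w / rest_total
        for name, w in weights.items()
    }


SUBGROUP_3 = {
    "medication": 0.58,           # from the text
    "severe_hypertension": 0.20,  # from the text
    # Hypothetical split of the remaining 0.22 (not from the source):
    "pre_eclampsia": 0.06,
    "transfusion": 0.05,
    "caesarean": 0.03,
    "delivery_lt_34wk": 0.05,
    "small_baby": 0.03,
}

step = redistribute(SUBGROUP_3, "medication", 0.01)
increase = step["severe_hypertension"] - SUBGROUP_3["severe_hypertension"]
print(round(increase, 4))  # 0.0048, i.e. 0.01 * 0.20 / 0.42
```

Because the removed weight is shared proportionally, the ratios among the six other weights never change. Applying ten 0.01 steps in sequence therefore gives the same weights as a single 0.10 reduction.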

This study was reviewed and approved by the Behavioural Research Ethics Board (H17-01194) at the University of British Columbia.


Table 2 shows that using equal weights in the composite score produced no difference in score between treatment arms; the significantly higher frequency of antihypertensive medication use in ‘tight’ control was offset by the significantly higher frequency of severe hypertension in ‘less-tight’ control. Similar results were found using subgroup (1) weights (equal prioritizers).

Table 2 Mean weighted composite outcome score by blood pressure control, and t scores for each analysis

Using subgroup (2) weights (early delivery avoiders), the apparently lower rate of early delivery (and significantly lower incidence of severe hypertension) in the ‘tight’ control arm resulted in a lower (better) composite outcome score for ‘tight’ (vs. ‘less-tight’) control. The use of significantly more antihypertensive therapy in ‘tight’ control contributed little given the low weighting of this outcome (Table 2).

Using subgroup (3) weights (medication minimizers), the significantly lower frequency of antihypertensive medication use in ‘less-tight’ (vs. ‘tight’) control, combined with its high weighting (58%), resulted in a significantly lower (better) composite outcome score for ‘less-tight’ control, despite significantly more severe hypertension (20% weight) (Table 2).

The threshold analysis conducted for subgroup (3) showed that once the weight applied to avoiding antihypertensive medication was reduced to 0.41 (from 0.58), ‘less-tight’ control was no longer the preferred treatment (Fig. 1).
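Because the removed weight is redistributed proportionally, the point at which the mean difference changes sign has a simple closed form. The derivation below is a sketch in our own notation, not the authors' published method: $w_0$ is the original weight on the highest weighted outcome, $w$ its reduced value, $w_k$ the original weights of the other six outcomes, and $d_k$ the between-arm difference in event rates for outcome $k$ (‘less-tight’ minus ‘tight’). The source repeated the t-test at each step rather than solving for a sign change, so the reported threshold need not coincide with $w^*$.

```latex
% Redistributed weights after lowering the top weight from w_0 to w
w_k(w) = w_k \, \frac{1 - w}{1 - w_0}, \qquad k \neq \text{top}

% Difference in mean composite score (less-tight minus tight),
% using the fact that a mean composite score is \sum_k w_k \times (\text{event rate of } k)
\Delta(w) = w \, d_{\text{top}} + \frac{1 - w}{1 - w_0} \sum_{k \neq \text{top}} w_k \, d_k

% Setting \Delta(w^*) = 0, with S = \frac{1}{1 - w_0} \sum_{k \neq \text{top}} w_k \, d_k
w^* = \frac{S}{S - d_{\text{top}}}
```

For medication minimizers, $d_{\text{top}} < 0$ (less medication use under ‘less-tight’ control). If $S > 0$, $w^*$ lies strictly between 0 and 1.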

Fig. 1 Threshold analysis results for subgroup (3) medication minimizers


This re-analysis of CHIPS trial outcomes incorporated patient views and demonstrated that integrating patient preferences for outcomes and their associated weights into trial analyses is feasible and can identify different management approaches based on the results of a single trial. Our findings suggest that while almost two-thirds of women prioritize adverse outcomes equally, as assumed in the primary CHIPS analyses, about one-quarter prioritize avoiding delivery before 34 weeks, a preference that favours ‘tight’ control. A distinct minority (14%) prioritize minimizing antihypertensive medication above other adverse outcomes, making ‘less-tight’ control the most value-congruent BP management for them.

Recent clinical practice guidelines have recommended ‘tight’ control of pregnancy hypertension [5,6,7,8], based on the findings of a significant reduction in the development of severe hypertension and some preeclampsia-related complications, without an increase in perinatal risk, from CHIPS [4] and other RCTs [16]. Our findings suggest that ‘tight’ control is appropriate for the vast majority (≈ 85%) of pregnant women.

While integrating preferences into composites has been considered in cardiology [3, 14] and other fields [17], this is the first study to integrate patient weights with individual event data from a high-quality RCT in pregnancy. Our findings show that specifying outcome weights may change the interpretation of trial results when applied to individual women. Importantly, our methods are easily adapted to other trial and non-trial approaches and can be used with other statistical methods that accommodate confounders and covariates (e.g. linear regression; ANCOVA).

Limitations of our work include the use of preference weights that reflect women’s values in Canada; despite its multiethnic population, values may differ elsewhere. Preferences were identified after CHIPS was completed; consequently, different composite components may have been identified a priori. However, CHIPS evaluated the standard obstetric outcomes that cover most of the subsequently published relevant core outcome set. These results are statistically significant at the group level, but clinical significance likely depends on individual preferences. Additionally, our method of preference elicitation may have been too cognitively burdensome for some participants. BWS was chosen because it can provide cardinal importance values on an additive scale. As a result, weights can be directly compared to one another and the magnitude of the difference in the importance of outcomes is known.

Alternative methods of analysis which use ranks, rather than weights, to incorporate the importance of composite components were considered [18,19,20]. These approaches have advantages in that ranks may be easier to ascertain and more intuitive to use, but they also pose some challenges. For example, ranking approaches that compare intervention and control participant outcomes in order of composite component importance often stop at the first difference in component outcomes (e.g., win ratio [18]). These approaches risk excluding information on lower ranked components that are still important to patients. Other methods that use all components (e.g., O’Brien’s global rank method [20]) can lack specificity on how to rank components (e.g., rank all outcomes [20] or rank hierarchically [21]) and on how to address ties between participants. While a weighting approach seemed most appropriate for our analysis, there are potential benefits of a rank-based analysis in other contexts which should be considered in future composite analyses.

Finally, our approach presents challenges for statistical power (e.g., power calculations), although these come with the benefit of improved interpretability.
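To make the contrast with rank-based methods concrete, the following is a simplified sketch of a win-ratio-style comparison. It assumes binary outcomes only and compares every participant in one arm with every participant in the other (an unmatched, all-pairs version). The priority order and the four participants are invented for illustration.

```python
from itertools import product


def compare_pair(a, b, priority):
    """Walk outcomes in priority order and stop at the first difference.
    Returns 1 if `a` fares better (no event where `b` has one), -1 if worse, 0 if tied."""
    for name in priority:
        if a[name] != b[name]:
            return 1 if a[name] < b[name] else -1
    return 0


def win_ratio(treatment, control, priority):
    """Count wins and losses over all treatment-control pairs."""
    wins = losses = 0
    for a, b in product(treatment, control):
        result = compare_pair(a, b, priority)
        wins += result == 1
        losses += result == -1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio


# Hypothetical priority order and participants (1 = event occurred).
PRIORITY = ["delivery_lt_34wk", "severe_hypertension", "medication"]
treatment = [
    {"delivery_lt_34wk": 0, "severe_hypertension": 0, "medication": 1},
    {"delivery_lt_34wk": 1, "severe_hypertension": 0, "medication": 1},
]
control = [
    {"delivery_lt_34wk": 0, "severe_hypertension": 1, "medication": 0},
    {"delivery_lt_34wk": 0, "severe_hypertension": 0, "medication": 0},
]

wins, losses, ratio = win_ratio(treatment, control, PRIORITY)
print(wins, losses)  # 1 3
```

In the first pair, the comparison is settled by severe hypertension, so the difference in medication use never enters the result. This is the loss of lower-ranked information described above, and a weighted sum avoids it.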


This study illustrates that integrating patient values into trial analyses can change the interpretation of trial results for clinical decision-making. Future trials with composite or multiple outcomes should seek patient preference weights to improve the interpretation of trial results and support patient-centred care.

Availability of data and materials

The datasets used and/or analysed during the current study are not publicly posted as participant consent was not obtained for the open distribution of data. However, data are available from the corresponding author upon reasonable request. Aggregate data are available as part of the Supplementary Appendix of the initial publication of the CHIPS trial results (10.1056/NEJMoa1404595).



ANCOVA: Analysis of covariance
BP: Blood pressure
BWS: Best-worst scaling
CHIPS: Control of Hypertension in Pregnancy Study
RCT: Randomized controlled trial
Birth weight


  1. Cordoba G, Schwartz L, Woloshin S, Bae H, Gøtzsche PC. Definition, reporting, and interpretation of composite outcomes in clinical trials: systematic review. BMJ. 2010;341:c3920.
  2. Panariello N, Jurczak A, Spector J, Kumar V, Semrau K. Coherence in measurement and programming in maternal and newborn health: experience from the BetterBirth trial. J Clin Epidemiol. 2019;113:83–5.
  3. Stolker JM, et al. Rethinking composite end points in clinical trials: insights from patients and trialists. Circulation. 2014;130:1254–61.
  4. Magee LA, et al. Less-tight versus tight control of hypertension in pregnancy. N Engl J Med. 2015;372:407–17.
  5. Butalia S, et al. Hypertension Canada’s 2018 Guidelines for the management of hypertension in pregnancy. Can J Cardiol. 2018;34:526–31.
  6. World Health Organization. WHO recommendations on drug treatment for non-severe hypertension in pregnancy. 2020.
  7. National Institute for Health and Care Excellence. Hypertension in pregnancy: diagnosis and management, vol. 55; 2019.
  8. Magee LA, et al. The hypertensive disorders of pregnancy: the 2021 International Society for the Study of Hypertension in Pregnancy Classification, Diagnosis & Management Recommendations for International Practice. Pregnancy Hypertens. 2021.
  9. Tita AT, et al. Treatment for mild chronic hypertension during pregnancy. N Engl J Med. 2022;386:1781–92.
  10. Sinclair M, Lagan BM, Dolk H, McCullough JEM. An assessment of pregnant women’s knowledge and use of the Internet for medication safety information and purchase. J Adv Nurs. 2018;74:137–47.
  11. Metcalfe RK, et al. Patient preferences and decisional needs when choosing a treatment approach for pregnancy hypertension: a stated preference study. Can J Cardiol. 2020;36:775–9.
  12. Mühlbacher AC, Zweifel P, Kaczynski A, Johnson FR. Experimental measurement of preferences in health care using best-worst scaling (BWS): theoretical and statistical issues. Health Econ Rev. 2016;6.
  13. Statistical Innovations. Latent Gold 5.1; 2016.
  14. Ahmad Y, et al. A new method of applying randomised control study data to the individual patient: a novel quantitative patient-centred approach to interpreting composite end points. Int J Cardiol. 2015;195:216–24.
  15. RStudio Team. RStudio: integrated development for R. RStudio, PBC; 2020.
  16. Abalos E, Duley L, Steyn DW, Gialdini C. Antihypertensive drug therapy for mild to moderate hypertension during pregnancy. Cochrane Database Syst Rev. 2018.
  17. Udogwu UN, et al. A patient-centered composite endpoint weighting technique for orthopaedic trauma research. BMC Med Res Methodol. 2019;19:242.
  18. Pocock SJ, Ariti CA, Collier TJ, Wang D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur Heart J. 2012;33:176–82.
  19. Buyse M. Generalized pairwise comparisons of prioritized outcomes in the two-sample problem. Stat Med. 2010;29:3245–57.
  20. O’Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40:1079–87.
  21. Felker GM, Maisel AS. A global rank end point for clinical trials in acute heart failure. Circ Heart Fail. 2010;3:643–6.


We would like to acknowledge the time and contributions of the 981 participants in the CHIPS trial and the 183 participants in the preferences study that made this work possible, as well as the members of the CHIPS Study Group (Additional file 1: Table S1).


This study was funded by peer-reviewed grants from two government entities: the Canadian Institutes of Health Research (MCT 87522) and the BC SUPPORT Unit (RWCT-001). Funders had no involvement in the design of the study; the collection, analysis and interpretation of the data; or the presentation of findings.

Author information


All authors contributed to the conceptualization and design of the study, have approved the submitted final version and have agreed to be accountable for their own contributions and the work as a whole. In addition, RKM contributed to the acquisition, analysis, and interpretation of the data and drafted the initial manuscript. MH, JS, and TL contributed to the analysis and interpretation of the data and provided feedback on the manuscript drafts. ML contributed to the acquisition and interpretation of the data. PvD and LAM contributed to the acquisition and interpretation of the data and provided feedback on the manuscript drafts. NB contributed to the acquisition, analysis, and interpretation of the data and contributed to the initial draft manuscript.

Corresponding author

Correspondence to Nick Bansback.

Ethics declarations

Ethics approval and consent to participate

This study was reviewed and approved by the Behavioural Research Ethics Board (H17-01194) at the University of British Columbia.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.


Supplementary Information

Additional file 1: Table S1.

CHIPS Study Group.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Metcalfe, R.K., Harrison, M., Singer, J. et al. Using a patient-centred composite endpoint in a secondary analysis of the Control of Hypertension in Pregnancy Study (CHIPS) Trial. Trials 24, 99 (2023).
