Outcome reporting in neonates experiencing withdrawal following opioid exposure in pregnancy: a systematic review

Background Neonatal withdrawal secondary to in utero opioid exposure is a growing global concern stressing the psychosocial well-being of affected families and scarce hospital resources. In the ongoing search for the most effective treatment, randomized controlled trials are indispensable. Consistent outcome selection and measurement across randomized controlled trials enables synthesis of results, fostering the translation of research into practice. Currently, there is no core outcome set to standardize outcome selection, definition and reporting. This study identifies the outcomes currently reported in the literature for neonates experiencing withdrawal following opioid exposure during pregnancy. Methods A comprehensive literature search of MEDLINE, EMBASE and Cochrane Central was conducted to identify all primary research studies (randomized controlled trials, clinical trials, case-controlled studies, uncontrolled trials, observational cohort studies, clinical practice guidelines and case reports) reporting outcomes for interventions used to manage neonatal abstinence syndrome between July 2007 and July 2017. All “primary” and “secondary” neonatal outcomes were extracted by two independent reviewers and were assigned to one of OMERACT’s core areas of “pathophysiological manifestation”, “life impact”, “resource use”, “adverse events”, or “death”. Results Forty-seven primary research articles reporting 107 “primary” and 127 “secondary” outcomes were included. The most frequently reported outcomes were “duration of pharmacotherapy” (68% of studies, N = 32), “duration of hospital stay” (66% of studies, N = 31) and “withdrawal symptoms” (51% of studies, N = 24). The discrepancy between the number of times an outcome was reported and the number of articles was secondary to the use of composite outcomes. Frequently reported outcomes had heterogeneous definitions or were not defined by the study and were measured at different times. 
Outcomes reported in the literature to date were mainly assigned to the core areas “pathophysiologic manifestations” or “resource use”. No included articles reported parent or former patient involvement in outcome selection. Conclusions Inconsistent selection and definition of primary and secondary outcomes exist in the present literature on pharmacologic and nonpharmacologic interventions for managing opioid withdrawal in neonates. No studies involved parents in the process of outcome selection. These findings hinder evidence synthesis to generate clinically meaningful practice guidelines. The development of a specific core outcome set is imperative.


Background
Neonatal abstinence syndrome (NAS) is a postnatal withdrawal syndrome that occurs after fetal exposure to substances (for example, opioids, antidepressants and stimulants) in utero [1]. Among neonates exposed to opioids in utero, 55-94% [2] will demonstrate the clinical manifestations of NAS, which predominantly involve central nervous system irritability, autonomic dysregulation and gastrointestinal dysfunction [1]. Typically, opioid-exposed neonates are observed in hospital for 3 to 7 days for the development of NAS symptoms prior to discharge [2]. Approximately 50-80% of these infants exhibit moderate to severe withdrawal symptoms requiring pharmacologic management in addition to supportive nonpharmacologic interventions [3,4]. The use of pharmacotherapy depends on whether the neonate meets the "treatment threshold" per the diagnostic criteria utilized (e.g., the Finnegan Scale, Lipsitz score, or the Eat Sleep Console model), which vary between centers. The average length of stay for a neonate experiencing NAS is 16-23 days, accounting for monitoring and stabilization of symptoms with pharmacotherapy [5].
In recent years, NAS has become a global concern with increasing prevalence rates ranging between 2.7 and 5.8 per 1000 live births [5,6,7]. In the USA between 2009 and 2012, the aggregate hospital charges for NAS increased from $732 million to $1.5 billion [5]. A similar trend is documented in Canada, with tripling of the daily hospital beds occupied by neonates with NAS from 19.7 beds in 2003 to 69.4 beds in 2014 [8]. The reported burden of disease is likely inaccurate given the complexity of exposure, genetic factors that may increase the risk for withdrawal, patterns of recognition and reporting, along with extensive variability in regional substance misuse and diagnostic criteria. The extent to which prevalence rates vary by hospital because of differences in recognition or recording requires further investigation. Robust evidence is lacking on the long-term consequences (i.e., medical, neurodevelopmental or psychosocial) for the individual who has experienced NAS [3]. The current increase in NAS diagnosis is believed to be multifactorial and driven by improvements in screening, awareness and practice guidelines, and by the increasing rates of both illicit drug use and prescription of opioids/psychotropic medications during pregnancy [9,10]. Historically, NAS presentation was primarily due to in utero opioid exposure [3]. Neonatal opioid withdrawal syndrome (NOWS) is a subset of NAS reflecting neonatal withdrawal following exposure to opioids in pregnancy. The latter, more specific definition has recently been adopted and may not be reflected in previously published reports. As polysubstance exposures during pregnancy are becoming more prevalent, reportedly as high as 65% in one study [11] and even higher with the inclusion of alcohol and tobacco, NAS is becoming an increasingly complex syndrome with less predictable time of onset, severity and response to pharmacologic therapy [12,13].
From the clinical perspective, there has been a paradigm shift in infant assessment and treatment initiation (from thresholds based on subjective features of withdrawal to the Eat, Sleep, Console method) [14] and the emergence of novel pharmacokinetic- and pharmacodynamic-based dosing protocols [15]. The ever-increasing burden of disease, evolving complexity in presentation and changing pharmacotherapy initiation models have led to a growing interest in NAS research from scientists, clinicians and policy makers.
In 2013, the World Health Organization evaluated all available evidence on identifying and managing neonates withdrawing from in utero substance exposure. The quality of evidence behind the recommendations was determined to be "very low" per the Grading of Recommendations, Assessment, Development and Evaluations framework [16,17]. This paucity of evidence is reflected in the heterogeneity of clinical care for infants at risk for NAS. National studies in Canada [18], the USA [19], the UK and Ireland [20] demonstrated significant variability in NAS/NOWS assessment tools, types and doses of opioids utilized, and addition of adjuvant agents across neonatal intensive care units. Current economic assessments continue to show a substantial rise in NAS-related morbidity and costs [21,22,23]. With stakeholders examining the same topic through various lenses from bench-to-bedside to policy development, it is imperative to establish clinically relevant and standardized outcome measures. At present, there is no consensus on what to measure, nor are there consistent units of measurement, in research and quality improvement initiatives on neonates with NAS/NOWS. The purpose of this study is to evaluate consistency in outcomes reported in all observational and interventional studies of neonates exposed to opioids during pregnancy who develop withdrawal (NAS/NOWS). We will assess neonatal outcomes reported in all studies investigating pharmacological and nonpharmacological interventions for infants who were exposed to opioids (such as methadone, buprenorphine, oxycodone, or prescription opioids with or without concomitant use of other illicit substances) in utero and who are diagnosed with NAS or NOWS in the postnatal period.

Eligibility criteria
Primary research studies including randomized controlled trials, clinical trials, case-controlled studies, uncontrolled trials, observational cohort studies, clinical practice guidelines, and case reports of any interventions used to manage NAS/NOWS were analyzed. For the purpose of this review, studies of pharmacological and nonpharmacological management of neonates exposed to opioids (including methadone, buprenorphine, oxycodone, prescription opioids) in utero who are diagnosed with NAS were included. We included all co-exposures based on the rationale that they commonly occur and that ICD-9 diagnostic coding does not separate withdrawals related only to opioids. All publications identified from the published search strategy between January 2007 and June 2017 were included. The rationale behind the time restriction is twofold: a Cochrane review published in 2009 encompassed studies reported prior to 2007, and clinical practice has shifted away from ubiquitous use of deodorized dilution of opium [26]. The steering committee felt that there was limited value in evaluating what outcome measures were used before widespread use of the Finnegan assessment tools and oral morphine treatment weaning.

Search strategy
The search strategy (Additional file 1) was developed in conjunction with a reference librarian at the Hospital for Sick Children on Ovid MEDLINE 1946 to present with daily update, Ovid MEDLINE in-process and other nonindexed citations. The search was also applied to EMBASE and Cochrane Central on 6 June 2019. The latest Cochrane review was published in 2010. Reference lists of four recently published systematic reviews of NAS were evaluated. In addition to the electronic search strategy described above, ClinicalTrials.gov was reviewed, which identified 37 ongoing studies related to NAS. Bibliographies of all included studies and systematic reviews were reviewed to identify relevant articles not generated in the search. In addition, the steering committee reviewed the list of articles to ensure comprehensiveness and to provide related articles not identified by the search strategy.

Study selection
Two independent reviewers (LEK and SM) screened titles and abstracts resulting from all the search strategies in EndNote X6. For studies that were deemed eligible by title and abstract, full-text articles were obtained. Full-text articles were critically reviewed independently (LEK, SM and FS) to assess eligibility. Studies published prior to 2007 were excluded as most studies focused on the use of tincture of opium, which is no longer utilized for the management of NAS. Reasons for exclusions were documented. Any disagreement in study eligibility was resolved through discussion and consensus or by consulting the principal investigator (LEK). Studies were excluded if they did not describe NAS health outcomes or if the full text was not available in a language mastered by our team (English, French, Spanish or Dutch). Only infants with an NAS diagnosis (irrespective of the diagnostic tool utilized) following known opioid exposure in utero were included regardless of concomitant substance exposure.

Data extraction
Data were extracted independently and in duplicate by two reviewers (SM and FS). Disagreements were resolved by consultation with the principal investigator (LEK). A standardized table was utilized for data extraction which included the following information: year of publication, corresponding author and contact information, study design (randomized controlled trial, cohort study, quality improvement, case series, case report), NAS intervention type (pharmacologic or nonpharmacologic), intervention group, control group, randomization, sample size, study objective in full text, method of NAS diagnosis, frequency of monitoring NAS symptoms, duration of exposure, type of maternal exposure (methadone, suboxone, buprenorphine, illicit opioids, benzodiazepines, cocaine, and so forth), study inclusion criteria, study exclusion criteria, primary outcomes, secondary outcomes, and justification for outcome choice. An outcome was recorded as reported if it appeared in the methods, results or discussion sections. The outcome was placed under "primary outcome" if it was explicitly stated as the primary outcome in the study, it was the only outcome reported, or it was implicit in their data reporting (i.e., used in the sample size calculation). Composite outcome measures were separated to gauge the full breadth of definitions utilized in primary research studies.
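The three rules for labeling an outcome as primary can be illustrated with a minimal sketch. The class `ReportedOutcome`, its field names and the function `label_outcomes` are hypothetical names introduced here for illustration; treating every other outcome as secondary is also an assumption, not a description of the extraction table actually used.

```python
from dataclasses import dataclass

@dataclass
class ReportedOutcome:
    # Hypothetical record for one extracted outcome
    name: str
    stated_as_primary: bool = False      # explicitly labeled primary by the study
    used_for_sample_size: bool = False   # implicit: used in the sample size calculation

def label_outcomes(outcomes: list[ReportedOutcome]) -> dict[str, str]:
    """Label each outcome 'primary' or 'secondary' using the three rules in the text."""
    only_outcome = len(outcomes) == 1    # a study's sole outcome counts as primary
    labels = {}
    for o in outcomes:
        is_primary = o.stated_as_primary or only_outcome or o.used_for_sample_size
        # Assumption: anything not meeting a rule is treated as secondary
        labels[o.name] = "primary" if is_primary else "secondary"
    return labels
```

Under these assumptions, a study reporting a single outcome always yields one primary outcome, and a study with several outcomes can yield more than one.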

Categorizing similar outcomes
Considerable heterogeneity in the terminology used to report outcomes was noted during the data extraction. Outcomes of similar themes were grouped during the data analysis process. For instance, "days of infant opioid treatment", "length of infant methadone therapy" and "duration of infant oral morphine therapy" were all included in the outcome category "duration of pharmacotherapy for NAS".
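The grouping step can be sketched as a simple lookup. The table below contains only the three example terms quoted above; the function name `categorize` and the fallback label are assumptions for illustration and do not represent the tool used in this review.

```python
# Hypothetical lookup: reported term -> outcome category.
# Only the three terms quoted in the text are included.
TERM_TO_CATEGORY = {
    "days of infant opioid treatment": "duration of pharmacotherapy for NAS",
    "length of infant methadone therapy": "duration of pharmacotherapy for NAS",
    "duration of infant oral morphine therapy": "duration of pharmacotherapy for NAS",
}

def categorize(term: str) -> str:
    """Return the outcome category for a reported term, or a flag for manual review."""
    key = " ".join(term.lower().split())  # normalize case and whitespace
    return TERM_TO_CATEGORY.get(key, "unmapped: needs manual review")
```

Terms that do not match the lookup are flagged rather than guessed, which mirrors the need for reviewer judgment when terminology is ambiguous.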

Assignment of outcome category to core areas
Outcomes reported as primary and/or secondary were assigned to one of the four core areas plus adverse events as defined in OMERACT Filter 2.0 [27]. OMERACT is a conceptual framework to ensure that a comprehensive set of outcomes is selected to formulate a core outcome set (COS). Its core areas encompass content that is measurable in a trial and include both patient-centered and intervention-specific information. These four core areas are "death", "life impact", "resource use" and "pathophysiologic manifestations". OMERACT recommends that "adverse events" should be measured within the core areas [27].

Results
The search strategy (Additional file 1) identified 2935 unique articles for screening (Fig. 1); a total of 47 original research publications met the specified inclusion criteria and were included in this review. Reference lists for three recently published NAS reviews and systematic reviews were evaluated and no additional relevant articles were identified [1,3,28]. The characteristics of the included studies are outlined in Table 1. The 47 articles published outcomes from ten randomized controlled trials, 21 retrospective cohort studies, five prospective cohort studies, one qualitative analysis, three case series, and one case report. The remaining studies utilized quality improvement methodologies (four studies), combined retrospective and prospective cohort analysis (one study), and a prospective within-subject analysis (one study). There were no disagreements in the study inclusion process that could not be resolved through discussion.

Description of outcomes reported in primary research studies
A total of 107 primary outcomes and 127 secondary outcomes were reported in the included 47 studies. Of these outcomes, 67 primary and 94 secondary outcomes, respectively, were unique 'terms' that were not used in any other study. The individual outcomes were mapped into outcome categories as illustrated in Table 2. For ambiguously reported outcomes, the full text of the article was reviewed to determine how the outcome was measured so that it could be classified into an outcome category. For instance, "effectiveness of methadone for the treatment of NAS" was the primary outcome in one article [52]. The surrogate markers of "effectiveness of methadone for the treatment of NAS" were "time since need for pharmacological treatment was established", "methadone dose in mg/kg/day", "methadone dosing interval" and "Lipsitz scores". These surrogate markers were separated as individual outcomes to accurately reflect specific outcome measures. Fifteen outcomes were mapped into the outcome category of "miscellaneous single outcomes". The most frequently reported outcome category was "duration of hospital stay", which was reported in 30 primary research studies (63.8%) as a primary or secondary outcome. Twenty-one of these primary research studies reported "duration of hospital stay" as a primary outcome measure. As noted in Table 2, there is heterogeneity in how the most commonly reported outcome (duration of hospital stay) is measured in research studies, with six different definitions in use. Figure 2 demonstrates the distribution of primary and secondary outcome terms across individual research studies. There is also a wide range in the number of outcomes reported (1 to 12) per study. For instance, only seven studies (15%) reported all three of the most frequently reported outcome categories (duration of hospital stay, duration of pharmacotherapy, and presence of NAS symptoms).
Heterogeneity of outcome selection is reflected by the large number of primary (53/107) and secondary (76/127) outcomes that were only reported in a single study. The timing of outcome measurement was poorly reported across studies and was inconsistent. For example, neurobehavior was measured in two studies: one study reported Bayley Scale scores at 1 year and the second reported the NICU Network Neurobehavioral Scale (NNNS) at 5-7 days old and at 44 weeks postmenstrual age. Withdrawal severity scoring outcomes were reported after 1 week of treatment, six times daily, and "according to hospital policies". Training of outcome assessors was only mentioned for seven primary outcomes and six secondary outcomes. Training protocols were only provided in one study. Outcome assessors were described for 12 primary and 14 secondary outcomes, and were mostly clinicians (physicians and/or nurses, n = 9/12 and n = 12/14, respectively). The study design may affect the outcomes reported, as demonstrated in Table 3. For the most commonly utilized study designs such as retrospective cohort studies (n = 21), randomized controlled trials (n = 10) and prospective cohort studies (n = 5) there was less variability in outcomes selected compared to quality improvement studies and case reports. Cohort studies and clinical trials reported "duration of hospital stay" and "duration of pharmacotherapy" most frequently. Quality improvement studies also reported length of stay, which reflects the nature of the study design, targeted to reduce adverse events and resource consumption. Case series/reports, classically written to report unusual or novel occurrences, tended to report "miscellaneous single outcomes" that were not studied at a larger scale in cohort studies or randomized controlled trials.

Assignment of primary outcomes to OMERACT core areas
One hundred and seven primary outcomes and 127 secondary outcomes were mapped to core domains defined by OMERACT filter 2.0 in Figure 3 [27]. The core area most commonly studied was "Resource Use/Economical Impact" through 72 primary outcomes and 67 secondary outcomes. This was followed by "pathophysiological manifestations" covered by 23 primary outcomes and 26 secondary outcomes and "life impact" examined by nine primary outcomes and 15 secondary outcomes. "Adverse events" were reported as three primary outcomes and 16 secondary outcomes. "Death" was not reported as a primary outcome in any study and was reported as three secondary outcomes.
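As a consistency check on the counts above, the following minimal sketch tallies the reported figures per core area and computes each area's share of all outcomes. The counts are taken from the text; the dictionary layout and the function names `totals` and `share` are assumptions made for illustration.

```python
# (primary, secondary) outcome counts per OMERACT core area, as reported in the text
CORE_AREA_COUNTS = {
    "resource use": (72, 67),
    "pathophysiological manifestations": (23, 26),
    "life impact": (9, 15),
    "adverse events": (3, 16),
    "death": (0, 3),
}

def totals(counts):
    """Return (total primary, total secondary) across all core areas."""
    primary = sum(p for p, _ in counts.values())
    secondary = sum(s for _, s in counts.values())
    return primary, secondary

def share(counts, area):
    """Percentage of all outcomes (primary + secondary) in one core area, 1 decimal."""
    p, s = counts[area]
    total_primary, total_secondary = totals(counts)
    return round(100 * (p + s) / (total_primary + total_secondary), 1)
```

The per-area counts sum to the 107 primary and 127 secondary outcomes reported, and "resource use" accounts for roughly 59% of all outcomes compared with about 10% for "life impact".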

Discussion
This review summarizes the primary and secondary outcomes that are reported in current primary research studies (randomized controlled trials, cohort studies, qualitative analysis, case series, case reports, quality improvement studies) of neonates with NAS. We found substantial heterogeneity in outcomes selected and poor overall standardization of definitions and timing of measurement across commonly reported outcomes. These findings have important implications for clinical practice and for research as inconsistency in outcome selection, definition and measurement yields results that cannot be combined in meta-analyses. For example, "weight" is used as a proxy for NAS-related feeding concerns in seven studies, but it was defined as "weight gain", "discharge weight", "weight loss greater than 10% in the first week of life", "poor weight gain" and "time to regain birthweight", jeopardizing comparison and synthesis of study results. This finding further substantiates the challenge in data synthesis of quantitative data if a range of time points are utilized to measure one outcome.
Table footnote: n represents the number of times an outcome category was reported as either a primary or secondary outcome; difference between number of times an outcome category was reported and the number of articles reporting an outcome category is secondary to the use of composite outcomes. DTO diluted tincture of opium, ICU intensive care unit, NAS neonatal abstinence syndrome, NICU neonatal intensive care unit, NOWS neonatal opioid withdrawal syndrome, PO per oral, mL milliliters, kg kilograms
In our systematic review, 47 original research articles reported 107 primary outcome measures resulting from the use of composite outcomes. These findings are consistent with a recent overview of Cochrane systematic reviews in neonatology which found that over half of recent reviews were inconclusive secondary to heterogeneity of the literature and poor methodologic quality of the studies [76]. Poor outcome definitions may contribute to reporting bias. Chan et al. noted that 62% of trials had primary outcomes changed, introduced or omitted in the process of protocol to publication [77]. Inconsistently selected and defined outcomes lead to results that are not reproducible, transparent or comparable [78]. Timing of outcome measurement was poorly reported across studies, which may be understandable for outcomes which measure time (for example, duration of time in hospital or duration of treatment) but creates challenges for interpreting time-dependent variables such as neurodevelopment and withdrawal severity scoring. Training and descriptions of outcome assessors were poor and, given the subjective nature of some withdrawal scoring criteria, this creates barriers for interpretation and replication. Not only does this contribute to research waste, but it can also create misleading conclusions that clinicians use to inform patient care. COSs have been proposed as a method of standardizing outcome selection, measurement and reporting to optimize research data in terms of transparency, reproducibility and clinical utility [27,79,80]. The absence of a COS results in: 1) meaningful outcomes (i.e., patient preferences) being overlooked or lost in study design; 2) inconsistent definitions or measurement tools used across similar studies; and 3) choosing outcomes for publication based on the results of individual studies (reporting bias) [27,79,80].
Thus, establishment of a COS representing a minimum set of outcomes that must be measured and reported in all research on NAS interventions and prognosis is critical to improving research quality and evidence-informed care. This review is the first step in a series of methods to develop consensus on what should be measured in all studies on neonatal withdrawal. Consistency in reporting would enhance the value of NAS/NOWS literature by reducing reporting bias of prespecified outcomes and ensuring that research efforts contribute clinically relevant information. In addition, standardization of outcomes would allow trial results to be compared, contrasted and combined in systematic reviews and meta-analyses to create a more robust foundation of knowledge for clinical decision-making and funding allocation [79].
Furthermore, this study found that outcomes related to the core area "life impact" are scarce in existing NAS literature, suggesting that public stakeholder involvement in outcome selection is limited to date. With the rise of substance use disorders comes an increase in neonates with in utero exposure who are experiencing NAS/NOWS. Key stakeholders, the parents and caregivers, are often in a situation of insufficient community support due to lack of knowledge and awareness about the life impact of NAS for that infant and family. Most outcomes reported are related to "resource utilization" reflecting the increasingly high economic burden of NAS, the main concern from the policy-maker point of view. While important, these outcomes do not provide evidence for clinicians on the optimal care for the infant, the mother and/or dyad involved. By estimating the burden of NAS based on nearsighted benchmarks such as "length of hospital stay" or "duration of pharmacotherapy", while simplifying the methodology of studies, the cost to society may be grossly underestimated. There are still principal variables such as the impact of in utero opioid exposures on physical health, neurodevelopment, psychiatric health and psychosocial interactions of the growing child that have yet to be explored. How do these long-term sequelae for the child affect parents, families and support workers?
Footnote: See Table 2 for definitions of adverse events and treatment failure. ER emergency room, NAS neonatal abstinence syndrome, NICU neonatal intensive care unit

Strengths and limitations
The strengths of our study are the inclusion of all primary research studies, adherence to a preregistered protocol, and methods following the PRISMA guidelines [81]. A comprehensive search of the bibliographies for the studies included and recent systematic reviews was undertaken to identify any articles missed in our search strategy. Furthermore, the steering committee reviewed the list of included articles and was invited to suggest articles that were not identified using our search strategy. The main limitations to our review are the time boundaries of 2007-2017 for the search strategy and the fact that most included studies reported data from western countries, which may limit external validity. This is, however, a limitation of the literature available and we do not feel this is related to our search or selection process. In this review, we did not contact the research groups for missing information. The aim of our study was to abstract primary and secondary outcomes and, as such, we did not intend to assess the rigor with which the studies reported this information as missing data are unlikely to affect the conclusions formulated here.
The conclusions from this systematic review advocate for the formation of a COS for NAS/NOWS to establish consistently reported outcomes with standardized definitions. Other COSs and studies have demonstrated that public involvement leads to research that is comprehensive and relevant to the patients receiving evidence-based interventions [82]. Since 2010, the COMET initiative has set out to establish standardized sets of outcomes for intervention clinical trials and clinical audits [83]. Their overarching objective is to limit outcome heterogeneity, include relevant outcomes, and reduce outcome reporting bias through the development of COSs. Trials evaluating the effectiveness of a treatment should measure and report at least each outcome in the COS.
However, the primary outcome should be chosen and powered to answer the research question. COS development should be comprehensive and include outcomes relevant for multiple stakeholders, including patients/families, clinicians and scientists.
Recent expert reviews in the Journal of the American Medical Association [28] and the New England Journal of Medicine [3] identified several areas of uncertainty that require further research, including standardized and validated assessment tools, optimal medication treatment regimen, optimal location for weaning, long-term neurodevelopment and family outcomes in NAS/NOWS. Given the heterogeneity in outcome selection, measurement and reporting highlighted by this study, it is imperative and timely to devise a COS for NAS/NOWS. A NAS/NOWS COS will standardize research methodology to reduce research waste and enhance transparency for clinical decision-making. With a scarcity of knowledge on the long-term patient and family impact of NAS/NOWS, the COS should reflect the needs and concerns of the stakeholders that are most affected by this condition. Development of a consensus- and evidence-based NAS/NOWS COS is underway, led by a multidisciplinary international steering committee. The development of this COS includes family interviews to ensure that future studies will evaluate outcome measures that are meaningful to the population it affects the most.

Conclusion
Inconsistent selection and definition of primary and secondary outcomes exist in the literature on pharmacologic and nonpharmacologic interventions for managing opioid withdrawal in neonates. No studies involved parents in the process of outcome selection. These findings hinder evidence synthesis to generate clinically meaningful practice guidelines. The development of a specific COS reflecting the needs of stakeholders, including families, is required to improve the quality of clinical practice guidelines.
Additional file 1. MEDLINE search strategy.