Moderators of the effectiveness of an intervention to increase colorectal cancer screening through mailed fecal immunochemical test kits: results from a pragmatic randomized trial

Background Colorectal cancer (CRC) screening rates remain suboptimal, particularly in low-income and underserved populations. Mailed fecal immunochemical testing (FIT) may overcome common barriers to screening; however, the effect of mailed FIT kits may differ across important subpopulations. The goal of the current study was to examine sociodemographic and health-related factors that moderate the effect of an intervention of automated direct mail of FIT kits at health clinics serving low-income populations.
Methods This study is a secondary analysis of the Strategies and Opportunities to Stop Colon Cancer in Priority Populations (STOP CRC) study, a cluster-randomized pragmatic trial to increase uptake of CRC screening in patients seen at federally qualified health centers. The intervention involved tools embedded in the electronic medical records to enable participating clinics to mail FIT kits and related materials to eligible participants. We examined the rate of FIT completion by potential moderating characteristics using electronic health record data supplemented by the American Community Survey and the Centers for Medicare & Medicaid Services Geographic Variation datasets, linked via geocoding to patients’ addresses. All patients aged 50–75 seen in participating health clinics who were eligible for CRC screening were included.
Results Although not always statistically significant, we saw a consistent pattern of increased FIT return rates among intervention participants compared to control participants across all subgroups studied, with incidence rate ratios (IRRs) generally ranging from 1.25 to 1.50. FIT completion in the intervention group ranged from 15 to 20% across subpopulations, typically three to six percentage points higher than in the control group. The only moderator with a statistically significant interaction was race: persons of Asian descent showed a twofold response to the intervention (adjusted incidence rate ratio [aIRR] = 2.06, 95% confidence interval 1.41 to 3.00).
Conclusions Response to a mailed FIT intervention was generally consistent across a wide range of individual and neighborhood-level patient characteristics, including typically underserved patients and those in low-resource communities.
Trial registration ClinicalTrials.gov, NCT01742065. Registered on 5 December 2012.


Background
Colorectal cancer (CRC) is one of the leading causes of cancer mortality [1,2]. The US Preventive Services Task Force (USPSTF) gives CRC screening an A-level recommendation for adults aged 50 to 75 [3], and this service is among the highest rated clinical preventive services in the USPSTF's portfolio for its potential to avoid morbidity and mortality and also save costs [4]. A microsimulation model estimated that annual fecal immunochemical testing (FIT) among adults aged 50 to 75 would result in 244 life-years gained per 1000 persons, and other CRC screening methods (e.g., periodic sigmoidoscopy and colonoscopy) showed similar levels of benefit [5]. Despite this, CRC screening is well below targets set by both Healthy People 2020 [6] and the National Colorectal Cancer Roundtable [7].
In addition, there are disparities in CRC screening rates. According to the National Health Interview Survey, CRC screening rates are lower among people who have low income, lack health insurance, have low education levels, lack a regular source of medical care, or are recent immigrants [8]. Rates are also lower in several race/ethnicity subgroups, including patients who are Hispanic, Native Hawaiian or other Pacific Islander, and American Indian/Alaska Native [9]. CRC screening is also associated with a number of health-related factors, such as the presence of medical conditions [10][11][12][13] and utilization of other preventive health services [10,12].
CRC screening is typically initiated at a medical visit, but there are important known barriers to this approach, such as cost, lack of health insurance, and difficulty attending medical appointments. A mail-based intervention may boost CRC screening rates and reduce disparities in underserved populations by reducing these barriers. A number of studies have shown that mailing FIT kits directly to patients can substantially increase screening rates in low-income, minority, and racially diverse settings [14][15][16][17][18][19][20]. Screening rates were variable in these studies, ranging from 2 to 37% at baseline; with the introduction of a FIT kit mailing program, rates increased by a factor of two to six, with absolute changes typically ranging from 21 to 29 percentage points. Two trials found that, after 1 year, mailing FIT kits to patients who were unscreened was more effective in increasing CRC screening than phoning people to schedule colonoscopy appointments [16,20], although this effect did not hold up at 3-year follow-up [21,22]. Further, a recent study in a health maintenance organization (HMO) setting demonstrated that, among patients who had completed one FIT, 75-86% completed two additional rounds of screening within 4 years, suggesting good acceptability of this screening method among those who had used it [23]. Similarly, in a study of veterans who had completed a FIT, 89% found it easy to use and convenient, and 97% reported that they were likely to complete a FIT by mail annually [24]. In this group of veterans, 79% completed a second annual FIT by mail [24].
Understanding whether mailed FIT interventions are broadly effective could assure health systems administrators that this approach would benefit a wide swath of patients and be unlikely to exacerbate or introduce disparities. This study explores whether sociodemographic and health-related factors moderate the effect of an automated direct mail of FIT kit program delivered to patients receiving care at health clinics serving primarily low-income populations.

Materials and methods
This study is a secondary analysis of data from the Strategies and Opportunities to Stop Colon Cancer in Priority Populations (STOP CRC) study, a cluster-randomized pragmatic trial to increase uptake of CRC screening [25]. The study was approved by the Institutional Review Board of Kaiser Permanente Northwest (Protocol # 4364), with ceding agreements from Group Health Research Institute and OCHIN (formerly Oregon Community Health Information Network), and is registered at ClinicalTrials.gov (NCT01742065).

Study design and randomization
The design of the parent trial is described elsewhere in detail [25,26] and is only summarized here. Primary attention here is focused on methods unique to this secondary analysis. Twenty-six clinics from eight federally qualified health centers serving low-income populations were randomized in a one-to-one ratio using a computer-generated randomization strategy prepared by a statistician. Neither clinic staff nor research staff had access to the allocation schedule prior to randomization. Allocation assignments were stratified by health center and blocked to assure maximum balance within health centers. Clinics were required to have a minimum of 450 patients aged 50-75 years as well as the necessary clinical and laboratory capacity and electronic health record (EHR) infrastructure to comply with the study's requirements. Randomization occurred in February 2014. Due to startup delays triggered by a scheduled upgrade to the EHR, clinics were unable to begin intervention activities until May 2014. As a result, we developed a secondary, lagged dataset that effectively did not begin recruiting intervention or control participants until May 2014 [27]. Sensitivity analyses using this lagged dataset were conducted for the cohort overall to provide what we believe is a more accurate estimate of the true intervention effect [25]. For similar reasons, and to maximize power to observe subgroup and interaction effects, it is this cohort that was used for the present analysis as well (Fig. 1).
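The allocation scheme described above (one-to-one, stratified by health center, balanced within health center) can be illustrated with a minimal Python sketch. This is not the trial's actual randomization code, which is not described in detail here; the function name `randomize_clinics`, the seed, the clinic labels, and the rule of alternating arms through a shuffled list are all assumptions made for illustration.

```python
import random


def randomize_clinics(clinics_by_center, seed=2014):
    """Assign clinics to arms 1:1, stratified by health center.

    clinics_by_center: dict mapping a health-center label to a list of
    clinic labels (hypothetical labels, not the trial's clinics).
    Returns a dict mapping clinic label -> "intervention" or "usual care".
    """
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    assignment = {}
    for center in sorted(clinics_by_center):
        clinics = list(clinics_by_center[center])
        rng.shuffle(clinics)  # random order of clinics within the stratum
        arms = ["intervention", "usual care"]
        rng.shuffle(arms)  # random choice of which arm gets any odd clinic
        for i, clinic in enumerate(clinics):
            # Alternating arms keeps the two arms within one clinic of each
            # other inside every health center.
            assignment[clinic] = arms[i % 2]
    return assignment


if __name__ == "__main__":
    example = {"A": ["A1", "A2", "A3", "A4"], "B": ["B1", "B2", "B3"]}
    for clinic, arm in sorted(randomize_clinics(example).items()):
        print(clinic, arm)
```

Under this sketch, a health center with an odd number of clinics contributes one extra clinic to a randomly chosen arm, so overall balance across all clinics is approximate rather than guaranteed.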

Participants
Patients from both intervention and control clinics were included in the primary analysis sample if, at any time during the first 12 months post randomization, they were aged 50-75 years, did not already have CRC or other exclusionary diagnoses for CRC screening, and were not compliant with current USPSTF guidelines for screening [3]. The date this occurred defined the starting point for follow-up assessment for each individual.

Intervention
Tools were developed to enable clinics to use the EHR to generate mailing lists and materials for a series of three mailings: (1) an introductory letter, (2) a FIT kit with a specially designed instructional insert appropriate for use in low-literacy and non-English-speaking populations, and (3) a reminder postcard. Clinics used their own staff to access tools that had been developed collaboratively by clinic administrators, researchers, and the EHR provider and were embedded in the EHR. Staff used these tools to print materials and assemble mailings periodically (typically monthly or quarterly, but the timing of the mailings was determined by the clinics). Research staff provided additional implementation support by facilitating Plan-Do-Study-Act cycles carried out by staff at each health center.

Main measures
Outcome
The primary outcome for this analysis was completion of a FIT, as identified through EHR laboratory data, after becoming eligible for the intervention. As noted elsewhere, we defined the follow-up interval for outcome assessment for each individual as the earlier of 12 months post initial accrual or August 2015, when intervention activities were initiated in the control clinics [25,27]. Follow-up windows therefore ranged from 6 to 12 months but were comparably distributed for intervention and control participants. All participants in the lagged dataset were included in this analysis; those with no evidence of having a returned FIT in their medical record were counted as not completing a FIT. We did not attempt to identify or remove patients who had moved away from the area or had moved their care to another health system.
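The follow-up rule above (the earlier of 12 months after accrual or the start of intervention activities in control clinics) can be sketched in Python as follows. The exact calendar day in August 2015 is not given in the text, so 1 August 2015 is an assumption, as are the function names and the choice to count a FIT dated on the accrual date or on the last day of the window as completed.

```python
from datetime import date

# Assumption: the text says only "August 2015"; the first of the month is used here.
CONTROL_START = date(2015, 8, 1)


def add_months(d, months):
    """Shift a date forward by whole months, clamping to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    if month == 12:
        next_month_start = date(year + 1, 1, 1)
    else:
        next_month_start = date(year, month + 1, 1)
    last_day = (next_month_start - date(year, month, 1)).days
    return date(year, month, min(d.day, last_day))


def follow_up_end(accrual_date):
    """Earlier of 12 months post accrual or the control-clinic start date."""
    return min(add_months(accrual_date, 12), CONTROL_START)


def completed_fit(accrual_date, fit_result_dates):
    """True if any FIT result falls inside the follow-up window.

    An empty list (no FIT in the record) counts as not completed, matching
    the handling of participants with no evidence of a returned FIT.
    """
    end = follow_up_end(accrual_date)
    return any(accrual_date <= d <= end for d in fit_result_dates)
```

For example, a participant accrued on 15 May 2014 would be followed to 15 May 2015, whereas one accrued on 10 January 2015 would be followed only to the assumed cutoff in August 2015.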

Moderators
We primarily explored moderators related to socioeconomic status, healthcare access, language (as a proxy for recency of immigration), and demographics such as race/ethnicity, which are known to be associated with screening rate disparities. In addition, we explored individual characteristics, identified from the EHR and related administrative data, including age, gender, race, Hispanic ancestry, primary language, federal poverty level category, insurance status, body mass index (BMI), smoking status, whether the participant had a flu shot in the year prior to randomization, whether the participant was current on Pap test and mammography screening (females only), number of Charlson comorbidities [28], and whether they had a visit for diabetes, depression, or a chronic pulmonary condition in the year prior to their enrollment date. The last values entered in the EHR prior to each person's enrollment date were used for employment status, poverty level, insurance status, and BMI. Neighborhood characteristics were defined based on the participant's address at the time of enrollment, which was linked via geocoding to variables from the American Community Survey census data [29] and the Centers for Medicare & Medicaid (CMS) Geographic Variation database [30]. These characteristics included emergency department (ED) visits per 1000 CMS enrollees, Generalized Gini Inequality Index [31], median household income, percentage of college graduates, population density, percentage of residents who are at or below the poverty level, and unemployment rate. These neighborhood-level variables were dichotomized based on associated figures in the USA, as close to the year 2014 as possible (for consistency with the timeline of the study). Exact cut-points are shown in Table 1. 
Dichotomized measures were used to enhance interpretability of the findings, although sensitivity analyses were also conducted using the original, continuous measures to ensure that dichotomizing did not substantially affect the results.

Statistical analysis
Primary analysis
The analytic methods used here are a direct extension of the primary outcome analysis used in the main outcomes paper [25], with the addition of the relevant subgroup variable main effect and treatment interaction terms to permit subgroup-specific treatment estimates and formal estimates of subgroup by treatment interaction. In addition, we used a Poisson rather than logistic link function for the generalized estimating equation (GEE) models and weighted all patients equally to reflect our focus on patient rather than clinic-level effects for this analysis. Finally, we summarized the treatment effects as risk ratios (RRs) rather than as absolute differences or odds ratios for improved comparison with other trials and ease of interpretability. The GEE models used robust variance estimators and specified clinic as a clustering variable to account for intra-clinic correlation. The analysis was conducted in 2017 and 2018.
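The model form implied by this description can be written out as a sketch for the simplest case of a single binary moderator. The notation is ours and is not taken from the source: $Y_{ij}$ indicates FIT completion for patient $i$ in clinic $j$, $T_j$ is the clinic's treatment assignment, $M_{ij}$ is the moderator, and $\mathbf{x}_{ij}$ stands for whatever adjustment covariates were used, which are not listed in this section. A log link is assumed, as is usual for a Poisson working model.

```latex
\begin{align*}
\log \mathrm{E}[Y_{ij}] &= \beta_0 + \beta_1 T_j + \beta_2 M_{ij}
  + \beta_3\, T_j M_{ij} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} \\
\mathrm{aIRR}_{M=0} &= \exp(\beta_1) \\
\mathrm{aIRR}_{M=1} &= \exp(\beta_1 + \beta_3) \\
\frac{\mathrm{aIRR}_{M=1}}{\mathrm{aIRR}_{M=0}} &= \exp(\beta_3),
  \qquad H_0 \colon \beta_3 = 0
\end{align*}
```

Under this formulation, the subgroup-specific treatment estimates are the two exponentiated sums, and the test of subgroup by treatment interaction is the test of $\beta_3 = 0$. A moderator with more than two categories would need one interaction coefficient per non-reference category and a joint test.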

Results
We included 30,667 individuals from 26 clinics who were aged 50-74 and were not current on CRC screening. The intervention and usual care groups showed very similar distributions on baseline characteristics, generally within one to four percentage points of each other (Table 2).
Most of the persons in the sample were aged 50-64 (81.5%), White (88.6%), and non-Hispanic (86.4%), and more than half were female (55.4%). Most of the participants had household incomes that were below 200% of the federal poverty level (82.3%), and the most common form of health coverage was Medicaid (38.5%), followed by Medicare (24.5%), and no insurance coverage (22.3%). Records suggested relatively low completion of preventive services: 24.6% had a flu shot in the past year, 30.9% of women had a mammogram in the past 2 years, and 38.7% of age-eligible women had a recent Pap smear. Tables 3 and 4 show the percentage of patients who had completed a FIT in the subgroups of interest, by intervention group. Although not always statistically significant, we saw a consistent pattern of increased FIT return rates among intervention participants compared to control participants across all subgroups studied, with incidence rate ratios (IRRs) generally ranging from 1.25 to 1.50. FIT completion in the intervention group ranged from 15 to 25% for most subgroups, typically three to six percentage points higher than in the control group. Also shown in Tables 3 and 4 are the relative risks for having completed a FIT (vs. not) in each subgroup and the P value for the treatment*moderator interaction. The only moderator with a statistically significant interaction was race; persons of Asian descent showed a twofold response to the intervention (adjusted incidence rate ratio [aIRR] = 2.06, 95% confidence interval [CI] 1.41 to 3.00). Intervention response was in the more typical range for participants who were White (aIRR = 1.32, 95% CI 0.99 to 1.76) and Black (aIRR = 1.28, 95% CI 0.85 to 1.92). Among persons of Asian descent, 18.9% in the usual care group completed a FIT, compared with 37.7% in the intervention group. 
In contrast, usual care completion rates among White and Black persons were 12.9 and 14.9%, respectively, compared to 15.8 and 20.2% for the intervention group participants.
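As a reading aid, the arithmetic behind these race-specific figures can be reproduced from the percentages and intervals quoted in the text. The Python sketch below computes crude (unadjusted) ratios and percentage-point differences, and checks that each reported aIRR sits near the geometric midpoint of its confidence interval, as expected for an interval that is symmetric on the log scale. The crude ratios ignore clustering and covariate adjustment, so they are not expected to match the reported aIRRs exactly. The function and variable names are ours, and the 1.96 multiplier assumes a normal-approximation interval.

```python
import math

# Figures quoted in the Results text: percent completing a FIT in each arm,
# the reported adjusted IRR, and its 95% confidence interval.
reported = {
    "Asian": {"usual": 18.9, "intervention": 37.7, "airr": 2.06, "ci": (1.41, 3.00)},
    "White": {"usual": 12.9, "intervention": 15.8, "airr": 1.32, "ci": (0.99, 1.76)},
    "Black": {"usual": 14.9, "intervention": 20.2, "airr": 1.28, "ci": (0.85, 1.92)},
}


def crude_summary(row):
    """Crude ratio, absolute difference, and log-scale CI checks for one subgroup."""
    lower, upper = row["ci"]
    return {
        # Unadjusted ratio of completion percentages (ignores clustering).
        "crude_ratio": round(row["intervention"] / row["usual"], 2),
        # Absolute difference in percentage points.
        "difference_pp": round(row["intervention"] - row["usual"], 1),
        # Geometric midpoint of the CI; should be close to the reported aIRR.
        "ci_midpoint": round(math.sqrt(lower * upper), 2),
        # Implied standard error of log(aIRR), assuming a normal-approximation CI.
        "log_se": round((math.log(upper) - math.log(lower)) / (2 * 1.96), 3),
    }


if __name__ == "__main__":
    for group, row in reported.items():
        print(group, crude_summary(row))
```

The crude ratio for persons of Asian descent (about 1.99) is close to the reported aIRR of 2.06, and the absolute difference of roughly 19 percentage points is several times the 3 to 5 point differences seen in the White and Black subgroups.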
Although no other interaction tests were statistically significant, a few other interactions had P values below 0.10. Specifically, we found larger effects for those with non-obese range BMIs than for participants with BMI ≥ 30.0 (aIRR = 1.44 vs. 1.28, P = 0.08), for those without vs. with a visit for diabetes in the past year (aIRR = 1.42 vs. 1.23, P = 0.06), and for those living in lower poverty vs. higher poverty neighborhoods (aIRR = 1.47 vs. 1.30, P = 0.06). However, the preponderance of evidence suggests that intervention effects were fairly consistent across patient subpopulations. We reran the analyses of BMI and the neighborhood-level characteristics that we dichotomized, keeping the moderators as continuous variables (data not shown). None of the interaction terms were statistically significant in these analyses (P > 0.14 in all cases), supporting the robustness of these findings.

Discussion
In this population, drawn from safety net clinics in Oregon, Washington, and California serving low-income patients, a wide range of patient subpopulations generally showed fairly comparable responses to the mailed FIT intervention. However, the intervention effect was largest among persons of Asian descent, with a statistically significant incidence rate ratio of 2.06 (95% CI 1.41 to 3.00). It is unclear why this subgroup showed large effects, and this result needs replication. One possible explanation we explored was that 77% of persons of Asian descent in the study population reported that English was not their preferred language, so it was possible that the wordless FIT instructions developed for this trial were particularly helpful for the Asian subpopulation. However, we did not find a greater benefit of the intervention among non-English speakers in general, nor was there a parallel effect in persons of Hispanic descent, who had a similar proportion of non-English speakers (76%) as the Asian subpopulation.
We found two other trials of mailed FIT interventions that reported on moderators of treatment effect [14,16], although these trials did not report specifically on differential effects in persons of Asian descent compared to other race/ethnic groups. One of these mailed FIT trials found that the intervention effect was comparable across age, gender, race/ethnicity (Hispanic vs. other), preferred language (English vs. Spanish), and insurance status, but did find a larger treatment effect among persons with no visits during the follow-up period than among those with three or more visits, a variable we did not explore [14]. In their study, among persons with no visits during follow-up, 3% of the control group participants and 59% of the intervention group participants had completed a FIT within 6 months, a 56 percentage point difference between groups. Among those with three or more visits during follow-up, 58% of the control group and 86% of the intervention group completed a FIT within 6 months, a 28 percentage point difference. The other trial of mailed FITs that reported effect moderators found no differences in intervention response by gender or race/ethnicity (comparing non-Hispanic white, black, and Hispanic subgroups) [16].
We adhered to most recommendations outlined by the checklist for the appraisal of moderators and predictors (CHAMP) [37]. First, we examined characteristics related to those that have been shown to be related to CRC screening rates, including demographics, socioeconomic factors, health status, and use of preventive services. The broad factors were selected a priori; however, the specific fields were restricted to those available in the EHR and in the databases of neighborhood-level data used by this study. We used measures taken prior to the start of the interventions, employed statistical interaction testing, and presented results for all moderators examined. 
In addition, the setting and study population were comparable to the settings and populations in which the mailed FIT would be used clinically. Because of the large number of moderators we examined, the relatively small number of participants of Asian descent, and the lack of an effect related to non-English language preference (a construct related to Asian ethnicity), we view the finding of a positive moderating effect in Asian patients as exploratory and in need of replication. We also believe the overall pattern of consistent benefit across a range of patient characteristics in this setting is plausible.
One of the main limitations of this study is related to our reliance on the EHR for capture of moderator variables. Patients in this low-income population may be more mobile than typical, both in terms of where they live and where they receive their healthcare. As such, the neighborhood-level characteristics may not be current for people who struggle with homelessness or insecure housing, and healthcare-related services may be received at non-study clinics. However, low-income patients' mobility is likely primarily between neighborhoods with similar economic profiles, so we believe the information on neighborhood-level characteristics will often remain reasonably similar when patients have moved. Moreover, EHRs are not always complete and accurate, so some patients will have been dropped from some analyses due to missing moderator data and some will have been misclassified in the EHR. In addition, some participants may have completed a FIT within a health system that was not covered by the OCHIN collaborative, so they would be misclassified as not completing a FIT.
Another limitation of our study is that we tested a large number of potential moderators without adjusting our analyses to maintain a type I error rate of 5%. Thus, even though we did find one statistically significant interaction indicating a larger benefit for patients of Asian descent, this finding may be due to chance and not be robust to replication. An additional limitation is that we did not conduct power calculations specifically for the moderator analyses, given the wide range of subgroup sizes, and analyses for some subgroups may be underpowered.
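To give a sense of scale for the multiplicity concern, the Python sketch below computes the chance of at least one false-positive interaction when m tests are each run at the 5% level, along with the per-test threshold that a Bonferroni correction would require. The value m = 20 is purely illustrative and is not the number of moderators tested in this study. The calculation also assumes independent tests, which is unlikely to hold for correlated patient characteristics, so it should be read as a rough guide only.

```python
def familywise_error(m, alpha=0.05):
    """Probability of at least one false positive among m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_threshold(m, alpha=0.05):
    """Per-test significance threshold that keeps the familywise rate at alpha."""
    return alpha / m


if __name__ == "__main__":
    m = 20  # hypothetical number of interaction tests, for illustration only
    print(f"familywise error with {m} tests: {familywise_error(m):.3f}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(m):.4f}")
```

With 20 independent tests, at least one nominally significant interaction would be expected by chance alone in roughly two out of three such analyses, which supports treating the single significant interaction as exploratory.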
Despite these limitations, our study has a number of important strengths. Our sample included more than 30,000 patients, and so had substantial numbers of patients across a variety of patient subgroups. In addition, these clinics are part of a collaborative that uses a common EHR system, meaning differences in data storage and capture were minimized across the clinics and that data on study participants seen at other clinics under the umbrella EHR provider would be captured. Another very important strength is that this was a pragmatic effectiveness trial, conducted in real-world safety net clinics, using the existing staff and infrastructure. While the overall effect of this intervention was not as large as that seen in some other trials of mailed FITs, the effect was robust across patient subpopulations and was implemented within the constraints of real-world, low-resourced clinics.
The relatively modest effects of an automated FIT mailing intervention were generally consistent across a

Conclusions
Response to a mailed FIT intervention was generally consistent across a wide range of individual and neighborhood-level patient characteristics, including typically underserved patients and those in low-resource communities.