Evaluation of alternative school feeding models on nutrition, education, agriculture and other social outcomes in Ghana: rationale, randomised design and baseline data

Background ‘Home-grown’ school feeding programmes are complex interventions with the potential to link the increased demand for school feeding goods and services to community-based stakeholders, including smallholder farmers and women’s groups. There is limited rigorous evidence, however, that this is the case in practice. This evaluation will examine explicitly, and from a holistic perspective, the simultaneous impact of a national school meals programme on micronutrient status, alongside outcomes in nutrition, education and agriculture domains. The 3-year study involves a cluster-randomised control trial designed around the scale-up of the national school feeding programme, including 116 primary schools in 58 districts in Ghana. The randomly assigned interventions are: 1) a school feeding programme group, including schools and communities where the standard government programme is implemented; 2) ‘home-grown’ school feeding, including schools and communities where the standard programme is implemented alongside an innovative pilot project aimed at enhancing nutrition and agriculture; and 3) a control group, including schools and households from communities where the intervention will be delayed by at least 3 years, preferably without informing schools and households. Primary outcomes include child health and nutritional status, school participation and learning, and smallholder farmer income. Intermediate outcomes along the agriculture and nutrition pathways will also be measured. The evaluation will follow a mixed-method approach, including child-, household-, school- and community-level surveys as well as focus group discussions with project stakeholders. The baseline survey was completed in August 2013 and the endline survey is planned for November 2015. Results The tests of balance show significant differences in the means of a number of outcome and control variables across the intervention groups. 
Important differences across groups include marketed surplus, livestock income, per capita food consumption and intake, school attendance, and anthropometric status in the 2–5 and 5–15 years age groups. In addition, approximately 19 % of children in the target age group received some form of free school meals at baseline. Conclusion Designing and implementing the evaluation of complex interventions is in itself a complex undertaking, involving a multi-disciplinary research team working in close collaboration with programme- and policy-level stakeholders. Managing the complexity from an analytical and operational perspective is an important challenge. The analysis of the baseline data indicates that the random allocation process did not achieve statistically comparable treatment groups. Differences in outcomes and control variables across groups will be controlled for when estimating treatment effects. Trial registration number ISRCTN66918874 (registered on 5 March 2015).


Keywords: School feeding, Impact evaluation, Education, Nutrition, Agriculture

Background
School feeding programmes have been a key response to the recent food and economic crises and function to some degree in nearly every country in the world [1]. School feeding is a multi-sectoral intervention with effects across education, health and nutrition, and with the potential for benefits across a life course. Rigorous studies have shown that school feeding programmes can improve school attendance and learning, as well as a child's physical and psycho-social health (see [2] for a recent review). These effects are heterogeneous and context-specific, depending also on the quality of programme implementation. There is no rigorous evidence on the impact of providing a reliable market for smallholder farmers through 'home-grown' school feeding (HGSF) approaches [1,2]. In HGSF, the demand for food and services from school feeding is channelled explicitly to smallholder farmers and other stakeholders involved in the school feeding supply chain. As most of the studies in the scientific literature in low-income settings involve humanitarian aid, there is also a paucity of evidence on government-led programmes operating at scale in low- and middle-income countries [1]. This study is aimed at addressing these research gaps by evaluating the full cost and impacts of alternative school feeding implementation approaches, across education, health and nutrition, and agriculture domains in Ghana.

Country context
Ghana is a lower-middle income country with a population of 25 million people, over 40 % of whom are under 15 years of age [3]. Despite the high rates of economic growth over the past two decades, Ghana is ranked 138th in the 2014 Human Development Index table, with a life expectancy at birth of 61 years, 7 mean years of schooling for adults and a Gross National Income (GNI) per capita, based on purchasing power parity (PPP), of US$3532 [4]. The domestic economy is centred on subsistence farming, which accounts for nearly 40 % of the GDP and employs over 50 % of the workforce [5]. Around 25 % of the country's population live in poverty based on the national-level poverty line, with this percentage increasing to 38 % in rural areas in contrast to 10 % in urban ones [6]. Food security in the marginal agricultural and arid areas varies with the seasons. The peak hunger season in the south of Ghana runs from May to August, whereas in the north it falls between July and October. The incidence of malnutrition in Ghana has been assessed through the Ghana Demographic and Health Surveys (GDHS) conducted every 5 years since 1988. From 1993 to 2008 there was some progress in reducing the rate of chronic malnutrition, with rates of stunting decreasing from 34 % to 29 % [6]. According to the 2003 and 2008 GDHS, the prevalence of anaemia among children of 6-59 months of age increased marginally from 76 % in 2003 to 78 % in 2008. In 2008, the prevalence of anaemia among rural children aged under 5 years (84 %) was higher than in urban areas (68 %). The overall prevalence of stunting among school-aged children was 17 %, ranging from 13 % in the forest-savannah transitional zone to 21 % in the northern savannah [6]. The same study estimated that the prevalence of anaemia among school-aged children was 39 %. This, however, varied widely across ecological zones. Anaemia rates were highest in the northern savannah (65 %) and the coastal savannah zones (59 %) and least prevalent in the transitional zone (16 %).

Complex intervention
This evaluation focusses on the Government of Ghana school feeding programme. As of 2011, the Ghana School Feeding Programme (GSFP) reached over 1.6 million primary school children in all 170 districts of Ghana. The programme is directly funded by the Government of Ghana, with a 4-year programme budget of over US$200 million. The GSFP was piloted in 10 schools in late 2005. By the end of 2009, GSFP had progressively grown to serve 1695 public schools with 656,624 pupils across the country. The GSFP is a complex intervention and was designed as a strategy to increase domestic food production, household incomes and food security in deprived communities [7]. The objectives of the strategy combined child-level education and nutrition, alongside household food production. GSFP co-ordination and implementation are undertaken by a national secretariat, with programme oversight provided by the Ministry of Local Government and Rural Development (MoLGRD). Line ministries offer technical support through the programme steering committee, although a number of NGOs and bilateral agencies are also involved with that support. The GSFP service delivery is provided through private caterers who are awarded contracts by the GSFP to procure, prepare and serve food to pupils in the targeted schools. Each caterer is responsible for procuring food items from the market, preparing school meals and distributing food to pupils. Cash transfers are made from the district assemblies, under the supervision of the District Implementation Committees (DICs), to caterers based on 40 Ghana pesewas (circa US$0.33) per child per day. Caterers are not permitted to serve more than three schools each, and profit is derived from savings made after food has been procured, prepared and distributed. Supervision at the school level is by the School Implementation Committee (SIC) and funds are intended to be released to caterers every 2 weeks. 
Storage is the responsibility of caterers and no rigid tendering process is enforced. The caterers are not restricted or guided in their procurement and are able to procure on a competitive basis without commitment to purchasing from small-scale farmers. The GSFP project document prioritises procurement from the community surrounding the assisted schools, broadening the focus to the district and national levels when food items are not available.
A recent supply chain analysis describes how caterer procurement decisions depend on costs (of food, transport, preparation) and on cash availability [8]. According to this study, the way and the extent to which caterers store food varies from district to district, but most have access to storage facilities (small household storage, school storage, or private storage). Caterers generally hire cooks to prepare food for students either in their homes or at school facilities. The main challenges faced by caterers include managing changes in food prices, hampered by the inability to mitigate price fluctuations due to delays in payments from the GSFP. Caterers reported that seasonal price variations between harvest and lean periods included price increases of up to 400 % [8]. The GSFP payments are received after the meals are served, resulting in caterers not having the resources to buy in bulk and guarantee a better and stable price to smallholder producers. Caterers were also reported to buy on credit from traders known as 'market queens' in Ghana, weakening their overall negotiation position. In addition, caterers also reported that payments often do not reflect the real number of pupils since enrolment often increases during the school term, which could possibly lead to either less food being served per child or higher costs faced by the caterers [9]. In practice, caterers often adapt to these challenges by reducing the quantity of food provided or by adjusting the quality of the food and adapting the menus. According to the supply chain study, procurement of food from smallholder farmers could help to mitigate the price volatility challenge. The study found that caterers were willing to procure their food from local farmers and that by buying from farmers, caterers could benefit from lower and more stable prices than those offered by traders on the market. Nonetheless, the reality is that almost all the food is still bought from markets [8].
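The payment arithmetic described above can be sketched as follows. This is a minimal illustration only: the rate of 40 pesewas per child per day comes from the text, whereas the enrolment figures and the 10 feeding days per payment period are assumptions chosen for the example.

```python
# Illustrative arithmetic; enrolment figures and the number of feeding days
# per payment period are assumptions, not values reported in the text.
RATE_PESEWAS = 40  # transfer per child per day, as stated in the text


def period_transfer(paid_enrolment: int, feeding_days: int = 10) -> int:
    """Transfer to a caterer, in pesewas, for one payment period."""
    return RATE_PESEWAS * paid_enrolment * feeding_days


def effective_rate(paid_enrolment: int, actual_enrolment: int) -> float:
    """Pesewas available per child per day when actual enrolment differs
    from the enrolment figure used to calculate the payment."""
    return RATE_PESEWAS * paid_enrolment / actual_enrolment


transfer = period_transfer(paid_enrolment=300)
rate = effective_rate(paid_enrolment=300, actual_enrolment=330)
print(f"Transfer: {transfer / 100:.2f} cedis")
print(f"Effective rate: {rate:.1f} pesewas per child per day")
```

In this hypothetical case a 10 % rise in enrolment during the term leaves the caterer with roughly 36 rather than 40 pesewas per child per day, which is the mechanism behind the reported reductions in food quantity or quality.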

Challenges in linking agriculture
The most recent evaluation of the GSFP undertaken in 2012 identified the need for 'a more strategic approach in linking farmers to the programme' [10]. This gap between the food production side and the caterers has been documented in other studies as well, including a recent supply chain analysis that highlighted a number of key constraints in the current model (Fig. 1), including:
• Mismatch of cash flow: farmers need money as soon as they harvest. Caterers receive money after serving children
• Lack of trust between farmers and caterers (especially for future payments): farmers do not trust caterers to advance food for later payment. Inconsistent payment from government worsens their perceptions
• Difficult for caterers to access farmers: no contact information, difficult to reach, widely spread out, a lot of interaction necessary
• No structure in place to facilitate caterer and farmer negotiations

The HGSF pilot
An innovative capacity-building component is being integrated alongside the GSFP and constitutes one of the treatment arms of the experiment. The details of the pilot were developed by a multi-disciplinary working group composed of in-country stakeholders under government leadership. This pilot involves the development of an integrated package of community-level activities aimed at enhancing the impact of the GSFP on poverty and food insecurity and involves two main components [11].
• Agriculture: this component is designed to stimulate the economy at community level by purchasing food from smallholder farmers. The component aims to bring the actors of the school feeding supply chain and GSFP community programme together to discuss the demand and supply needs of the school feeding market. Farmers and caterers would then be able to negotiate a price and payment agreement to address the issue of mistrust. This agreement will be backed by a master contract
• Nutrition: this component will include activities to improve the nutritional quality of the school meals (e.g. menu planning), promotion of improved health, nutrition and hygiene behaviours (e.g. behaviour change campaigns), and the provision of multiple micronutrient fortification

Methods
Programme theory of the intervention
School feeding interventions linked to smallholder agriculture can have multiple goals in the following areas:
• Education: increasing school enrolment, attendance and reducing drop-out, and improving cognition and learning achievement
• Health: improving nutritional status of school age children
• Agriculture: supporting incomes of recipient households (those consuming food) and farmer households (those providing the food)
• Small enterprise development: supporting incomes of caterers and cooks involved in the food service provision

Figure 2 illustrates in very broad terms the impact theory of school feeding on agriculture, education, and health. School feeding affects educational outcomes directly by increasing enrolment, attendance and completion (line 'a' in the figure). It affects health directly by improving nutritional status (line 'b'); this in turn has an indirect impact on education, as improving nutritional status has a positive impact on learning outcomes (line 'd'). The intervention can also affect income directly by increasing households' food security (line 'c'). In addition, the intervention can benefit the small enterprises involved in the school food service provision. Finally, there are effects running through increased income and health and nutrition and vice versa, as richer families are investing more in human capital and more educated and healthier adults are more economically productive (lines 'e'). However, these latter effects (represented as dotted lines in Fig. 2) only occur in the long term and certainly not before children have left school: therefore, we will not discuss them in the following design. Whilst the evidence base on the effects on child education, health and nutrition is generally well-established (see [12] for a recent systematic review) this evaluation is the first to also examine the effects on agriculture and enterprise development.
It must be emphasised that the ability of the school feeding intervention to deliver the effects depicted in Fig. 2 critically depends on the appropriate implementation of the programme. The management and implementation of the intervention involves several actors, and there is evidence that in Ghana there are several problems of information flow, supervision and monitoring between these different stakeholders. Programme success will also depend on the ability of communities  [8] to actively engage in the programme and in the strengthening of the public institutions involved.

Main hypotheses and outcome indicators
We summarise here the expected impact of the intervention on education, nutrition and agriculture as captured in the programme theory. The detailed programme theory for the different domains is captured in [13].

Education
• The intervention will have a positive impact on enrolment, attendance and drop-out rates
• The intervention will have an impact on cognitive abilities and class behaviour including attention
• The impact on learning (test scores) will be moderate as school quality is unlikely to change in the short term

Nutrition and health
• The intervention will have a limited impact on physical growth of children because of the increase in physical activity levels (PAL), substitution effects and the age range (5-15 years) of the targeted population. An impact on siblings of school-going children is possible if substitution effects are strong
• The intervention will have a moderate impact on the diet because on the one hand, food purchases by caterers do not follow nutritional guidelines, and on the other nutrition education will be a component of the school-level trainings
• The intervention will have some impact on micronutrient status where the food provision is fortified, and only moderate effects on diet diversity are expected

Agriculture and community development
• The intervention will have an impact on a small number of farmers in the intervention communities. Other persons in the community may benefit either directly or indirectly via an increase in income
• The programme will have an impact on a small number of caterers involved in the school feeding service provision

In addition to examining the potential effects in the different domains, the evaluation will also assess the pathways through which these effects are mediated. Table 1 includes a list of the main outcome indicators of the study. The data collection section below describes how data will be collected using different survey instruments. All the main study outcomes, including school enrolment, attendance and test scores, will be obtained through the household- and child-level interviews.
For the pathways analysis, in addition to the outcome indicators in Table 1 we will also observe the programme impact on intermediate indicators, particularly for those outcomes that are more difficult to observe directly. In the case of farmer income, we will look at several intermediate outcomes such as input use (labour, land, seeds and fertiliser), investments (farm capital such as tools and machinery), and market access (marketed surplus, prices and markets). In terms of other intermediate indicators in the nutrition and health pathway, we will observe the effect of the programme on knowledge and practices of caterers and school management members, and on the quantity, quality, and timeliness of the preparation and delivery of the school meals.

Design of the randomised evaluation
The impact evaluation will be an integral component of the monitoring and evaluation activities of the GSFP. Two rounds of surveys are envisioned, with the baseline planned in the intervention and control sites in June 2013 and a follow-up planned in November 2015. After the follow-up survey, the control schools and communities will be fully integrated in the intervention. We will consider the possibility of conducting further surveys in the following years, building matched control groups in order to detect long-term effects of the intervention on smallholder agriculture.
The GSFP will be expanded across the 10 regions of the country. The GSFP has set clear criteria for the selection of the intervention areas as captured in the retargeting exercise conducted in 2012. Poverty rankings were developed using the Ghana Living Standards Survey and Core Welfare Indicators Questionnaire carried out in 2005/2006 and 2003 respectively. Food consumption scores were calculated using the Comprehensive Food Security and Vulnerability Assessment 2008/2009 and spatial data variables computed by the World Food Programme (WFP). The data were then used to generate district-level composites for share of national poverty and food insecurity that were then used to allocate programme resources.

Random assignment and manipulation of treatments
Households and schools were randomly assigned to three treatment arms:
1. Control group: these are schools and households from communities where the intervention will not be implemented. The intervention will be delayed by at least 3 years in these communities, preferably without informing schools and households. After the 3-year period, these schools will be covered by the GSFP.
2. Regular GSFP group: these are schools and communities where the standard GSFP is implemented, with caterers responsible for the food procurement and preparation.
3. HGSF+ group: these are schools and communities where the programme is implemented in addition to a pilot capacity-building component, including training of community-based organisations and other stakeholders, on food procurement, nutrition education, and feedback monitoring. This group will be randomly divided into two sub-groups (HGSF+ and HGSF++) as part of a study focussing on anaemia.
Note that the HGSF+ intervention will be conducted at the district level. Training and monitoring systems involve caterers and exert their effects at the district level, affecting outcomes in schools where the HGSF+ programme is not implemented. On the other hand, the number of districts where the programme is implemented is rather small, which reduces the statistical power of the analysis, and the effects of the school feeding intervention against the control group are best observed at the school level. Hence, we opted for a design that compares the outcomes of the school feeding and control groups at the school level, and that compares outcomes of HGSF+ and regular school feeding (GSFP) at the district level.
The GSFP selected 58 districts in which the programme will be implemented. In each of these districts, two candidate schools were selected and each school was randomly assigned to the treatment or to the control. A protocol was designed in order to ensure that the schools were comparable based on data from the Education Management Information System (EMIS) and that contamination between the two schools in each district will be minimised. This will allow comparison of outcomes of the intervention against the control group at the school level in 58 districts. The 58 schools assigned to the intervention were then randomly assigned to regular GSFP and HGSF+. In this way the randomisation of the HGSF+ intervention occurs at the district level. The number of 58 schools is based on power calculations (see Appendix 1) determined with the objective of achieving statistical validity and representativeness for the main outcomes of interest.

Anaemia sub-study
The impact evaluation includes a sub-study focussing on nutrition in school feeding with and without micronutrient fortification. A sub-group of 14 of the 29 HGSF+ groups was randomly assigned to receive food fortification (the HGSF++ group) in addition to training and sensitisation activities that are part of the HGSF+ pilot (see Fig. 3). Data will be collected from children aged 5-15 years in the HGSF++, HGSF+, GSFP and control communities. Targeted schools were surveyed as part of the broader impact evaluation baseline.
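The staged assignment described above can be sketched as follows. This is an illustrative sketch rather than the study's procedure: the district and school identifiers, the seed and the function name are hypothetical, and a plain shuffle stands in for the district-level allocation step, which the study protocol carries out with a constrained procedure described under site selection.

```python
import random


def assign_arms(n_districts=58, n_hgsf=29, n_fortified=14, seed=2013):
    """Return (school_arm, district_arm) dictionaries for a staged lottery.

    All identifiers and the seed are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    districts = [f"district_{d:02d}" for d in range(1, n_districts + 1)]

    # Stage 1: within each matched pair of schools, a lottery selects the
    # school that receives school feeding; the other school is the control.
    school_arm = {}
    for district in districts:
        pair = [f"{district}_school_A", f"{district}_school_B"]
        treated = rng.choice(pair)
        for school in pair:
            school_arm[school] = "intervention" if school == treated else "control"

    # Stage 2: districts are split between HGSF+ and regular GSFP
    # (simple shuffle used here as a stand-in, see lead-in).
    shuffled = districts[:]
    rng.shuffle(shuffled)
    hgsf = shuffled[:n_hgsf]
    hgsf_set = set(hgsf)
    district_arm = {d: ("HGSF+" if d in hgsf_set else "GSFP") for d in districts}

    # Stage 3: a random sub-group of HGSF+ districts also receives
    # micronutrient fortification (HGSF++).
    for district in rng.sample(hgsf, n_fortified):
        district_arm[district] = "HGSF++"

    return school_arm, district_arm


school_arm, district_arm = assign_arms()
```

With the default arguments the sketch yields 58 intervention and 58 control schools, and 29 GSFP, 15 HGSF+ and 14 HGSF++ districts, matching the counts given in the text.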

Sample sizes
For the impact evaluation, power calculations and resource availability suggested the adoption of a sample of 25 households from the communities in the areas of the 58 schools receiving the intervention and of 20 households in the communities of the 58 control schools. Households were randomly selected from household listings in the catchment areas of the selected schools for the survey interviews. The household listings were stratified into farmer/non-farmer households, based on agriculture classification data from the national census. Farmer households were sampled in both areas in the following way: 10 out of the 25 households in the 60 intervention communities were farmer households and 5 out of the 20 households in the 60 control communities were farmer households. Non-farmer households with children in the 5-15 years age group were randomly selected from the household listings. This distribution of the sample between farmer and non-farmer households and between project and control groups allows the construction of comparable samples (see Table 2).
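The stratified draw described above can be sketched as follows. The household listing, identifiers and seeds are hypothetical, and the sketch omits the requirement that non-farmer households have children aged 5-15 years; the planned total is simply the arithmetic implied by 25 and 20 households across the 58 intervention and 58 control schools.

```python
import random


def sample_households(listing, n_total, n_farmers, seed=0):
    """Draw a stratified sample from a listing of (household_id, is_farmer)."""
    rng = random.Random(seed)
    farmers = [hh for hh, is_farmer in listing if is_farmer]
    others = [hh for hh, is_farmer in listing if not is_farmer]
    # Farmer and non-farmer strata are sampled separately, without replacement.
    return rng.sample(farmers, n_farmers) + rng.sample(others, n_total - n_farmers)


# Hypothetical listing of 200 households, 40 % of them farmer households.
listing = [(f"hh_{i:03d}", i % 5 < 2) for i in range(200)]

intervention_sample = sample_households(listing, n_total=25, n_farmers=10)
control_sample = sample_households(listing, n_total=20, n_farmers=5, seed=1)

# Planned household sample implied by the per-community targets in the text.
planned_total = 58 * 25 + 58 * 20
print(planned_total)
```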
In each household, all children aged between 5 and 15 years were asked education outcome-related questions (enrolment, attendance, drop-out) and were tested in literacy, maths, forward and backward digit span and Raven-like matrices. Anthropometry and haemoglobin level measurements were administered to children aged 5-15 years. Anthropometry indicators were also measured for children aged 2-5 years. As each school is assigned a caterer by the GSFP programme, the sample also included 58 caterers who were interviewed using a semi-structured questionnaire.

Threats to validity
The main potential threats to the internal validity of the study, including contamination, spill-over effects and Hawthorne-like effects, were examined for each of the outcome indicators. From Table 3 it seems that most threats could be avoided by: i. assigning treatments to districts rather than to communities within districts in order to avoid contamination effects; ii. not informing teachers and households of the control communities that the programme will be implemented after 3 years in order to avoid expectancy effects; iii. adopting strategies in conducting cognitive and achievement tests that prevent teachers and children from over-performing.
Given the panel structure of the data there is a potential risk of differential attrition. However, it is difficult to predict why households or farmers from the control groups should respond to the interviews in different ways. Refusal to take part in the interview by households not benefiting from the project seems to be the main threat. However, as shown in Table 3, the project has limited impact on households' expectations in both project and control groups and, therefore, should have limited impact on response rates.

Fig. 3 Schematic view of the design of the randomisation

Study area and site selection
Selection of the target areas involved three key steps: 1) the first step involved selecting 58 districts at random within Ghana from a sample frame including all districts in the country. The sample frame was stratified by region, and district inclusion was prioritised using data from the GSFP retargeting exercise including data on the prevalence of poverty and food insecurity; 2) the second step involved identifying 2 comparable schools within each of the 58 selected districts. A list was obtained through the GSFP secretariat including schools not currently covered by the GSFP in each district. Data from the annual school census from 2011 to 2012 were then used to match schools not receiving the GSFP and identify the 'best matched' pair. The allocation of school feeding and control was then randomised (lottery style) within each pair; 3) the third step in the site selection protocol involved the random allocation of districts to the HGSF+/GSFP groups by modelling pilot selection using a set of community- and district-level variables and selecting the permutation of allocation that minimises the R² for the predicted selection [13].
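The third step can be illustrated with a small sketch: draw many candidate allocations, regress each allocation on the covariates and keep the one for which the covariates predict assignment least well. The text does not specify the selection model or the number of permutations examined, so the linear probability model (OLS), the 500 draws, the simulated covariates and all function names below are assumptions.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, covariates):
    """R² from an OLS regression of y on the covariates plus an intercept."""
    X = [[1.0] + list(row) for row in covariates]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def best_allocation(covariates, n_treated, n_draws=500, seed=0):
    """Among random allocations, return the one with the lowest R²."""
    rng = random.Random(seed)
    base = [1] * n_treated + [0] * (len(covariates) - n_treated)
    best, best_r2 = None, float("inf")
    for _ in range(n_draws):
        candidate = base[:]
        rng.shuffle(candidate)
        r2 = r_squared(candidate, covariates)
        if r2 < best_r2:
            best, best_r2 = candidate, r2
    return best, best_r2


# Simulated district-level covariates (placeholders, not study data).
sim = random.Random(42)
covariates = [[sim.gauss(0, 1) for _ in range(3)] for _ in range(58)]
allocation, r2 = best_allocation(covariates, n_treated=29)
```

A low R² means the covariates carry little information about which districts were allocated to HGSF+, which is the sense in which the chosen permutation is balanced on observables.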

Survey instruments
The impact evaluation includes child-, household-, school-, caterer- and community-level data collection as shown in Table 4.

Methods of analysis
The randomised design allows for the identification of causal impacts of interventions using comparisons of mean outcomes between the randomised treatment arms at endline. The analysis will follow the intention-to-treat approach as protocol and as treated, using econometric analysis for all the relevant outcomes of the intervention. Following Bruhn and McKenzie, impact will be assessed for the different treatment arms using both a 'difference-in-difference' (DID) estimator and a single difference analysis of covariance (ANCOVA) model [14]. The DID estimate is calculated as the average change in the outcome of interest (Y) in the treatment arm (T) minus the change in outcome in the control group (C), or:

$$\mathrm{DID} = (\bar{Y}_{T1} - \bar{Y}_{T0}) - (\bar{Y}_{C1} - \bar{Y}_{C0})$$

where the subscripts 0 and 1 denote baseline and endline, respectively. A difficulty of DID analysis is serial correlation [15] resulting from unobserved factors affecting the outcomes that are themselves correlated over time and that produce auto-correlated errors and invalid standard errors. Serial correlation affects estimated standard errors and can lead to erroneous acceptance or rejection of null hypotheses but not the estimation of the effect size of the intervention. Thus, it may lead to erroneously finding or not finding a statistically significant impact of the intervention. Angrist and Pischke illustrate how this problem can be addressed by calculating clustered standard errors [16], a procedure that is easily implemented using Stata software. Clustered standard errors will also be employed in all cases in which correlated outcomes are observed within the same unit of analysis, for example, when the impact of the intervention is analysed at the school level and test scores within a school are correlated. Similarly, clustered standard errors will be used at the household level when the project affects more than one child within the same family, as in the case of impact on younger siblings.
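The DID estimate described above can be sketched in a few lines. The records are hypothetical attendance rates, not study data, and the function name is an assumption; in a balanced panel the mean of individual changes equals the change in group means, which is what the sketch relies on. Standard errors, clustered or otherwise, are not computed here.

```python
from statistics import mean


def did_estimate(records):
    """Difference-in-difference from (arm, baseline, endline) records.

    arm is 'T' for the treatment arm and 'C' for the control group.
    """
    change = {"T": [], "C": []}
    for arm, baseline, endline in records:
        change[arm].append(endline - baseline)
    # Average change in the treatment arm minus average change in control.
    return mean(change["T"]) - mean(change["C"])


# Hypothetical attendance rates (%) at baseline and endline.
records = [
    ("T", 70, 82), ("T", 65, 75), ("T", 80, 88),
    ("C", 72, 76), ("C", 68, 70), ("C", 78, 81),
]
estimate = did_estimate(records)
print(estimate)
```

Here the treatment arm improves by 10 points on average and the control group by 3, so the DID estimate is 7 percentage points.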
The single difference model specification has the following form:

$$Y_{i1} = \alpha + \beta T_i + \gamma Y_{i0} + \varepsilon_i$$

where $Y_{i0}$ is the outcome variable at baseline, $Y_{i1}$ is the outcome variable at endline, $T_i$ is a dummy variable for the treatment, $\beta$ is the estimated programme impact and $\varepsilon_i$ is the error term. The ANCOVA estimator has been shown to provide a more efficient estimate of programme impact when auto-correlation of outcomes is low [14].
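As a minimal sketch of the single difference (ANCOVA) approach, the endline outcome can be regressed on an intercept, the treatment dummy and the baseline outcome by ordinary least squares. The data below are constructed, not study data: the endline outcome is generated without noise with a treatment effect of 5, so that the regression recovers it. Function names are assumptions, and no standard errors are computed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ancova(y1, t, y0):
    """OLS of endline outcome on intercept, treatment dummy and baseline.

    Returns (intercept, treatment_effect, baseline_coefficient).
    """
    X = [[1.0, float(ti), float(bi)] for ti, bi in zip(t, y0)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y1)) for i in range(k)]
    return tuple(solve(xtx, xty))


# Constructed example: endline = 2 + 5 * treatment + 0.5 * baseline.
y0 = [10, 20, 30, 40, 50, 60]
t = [1, 0, 1, 0, 1, 0]
y1 = [2 + 5 * ti + 0.5 * bi for ti, bi in zip(t, y0)]

intercept, effect, baseline_coef = ancova(y1, t, y0)
print(round(effect, 6))
```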
As additional robustness checks, depending on the level of clustering of the outcome under analysis, we will employ multi-level regression models that account for the hierarchical nature of the data [17]. Multi-level models, also known as mixed-effects models, use both fixed effects (covariates) and random effects at school and household level.
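One way such a model might be written, as an illustration only, is a three-level random-intercept specification for child $i$ in household $h$ attached to school $s$. The notation is an assumption introduced here and is not taken from the study protocol:

```latex
Y_{ihs} = \alpha + \beta T_{s} + \mathbf{x}_{ihs}'\boldsymbol{\delta} + u_{s} + v_{hs} + \varepsilon_{ihs},
\qquad
u_{s} \sim N(0, \sigma_u^2), \quad
v_{hs} \sim N(0, \sigma_v^2), \quad
\varepsilon_{ihs} \sim N(0, \sigma_\varepsilon^2)
```

In this sketch $T_s$ is the treatment dummy for school $s$, $\mathbf{x}_{ihs}$ collects the covariates entering as fixed effects (which could include the baseline outcome), and $u_s$ and $v_{hs}$ are the school- and household-level random effects.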

Markets
Early studies of food prices in Ghana found negligible price differences across the country [18]. Regional equality of consumer prices, however, does not imply the equality of producer prices at a more localised level. The ability of market interventions to influence local price dynamics depends on the level of spatial market integration between local markets. Abdulai [19] analysed the maize market in Ghana and found a high level of integration, meaning a quick transmission of prices from one locality to the other. In these circumstances large purchases of staple food in localised markets are unlikely to produce price changes. Cudjoe et al. tested for market integration for several staple foods in Ghana and found a high level of integration for rice and maize but much less for tubers such as cassava and yam [20]. Prices of the latter items may be strongly localised and transmission between markets may not be easy. It should also be noted that the studies quoted above looked at market integration across large wholesale markets that are well-connected by roads and communication flows. Differences in prices might emerge in more remote and isolated areas even for more commercial crops like maize and rice. We therefore considered studying the impact of the intervention on local market prices, particularly when the food purchased consists of food items that are not highly commercialised such as cassava and yam.

Table 4
• Education (school enrolment, attendance, education of all household members, time spent in class and working, distance and transport to school, meals while in school, parents' aspirations, PTA membership and involvement)
• Household assets and farm assets (household facilities and durables including land and livestock holdings)
• Economic activities (simple income questionnaire on time spent working by household members in wage work, own business and own farm)
• Expenditure (monetary expenditure and own production of food, education, health, durables, and non-food expenditure)
• Anthropometry (height and weight of parents and children above 6 months of age; parents' measurements are taken to assess the genetic potential)
• Micronutrient status (haemoglobin levels, anaemia prevalence)
• Cognitive and literacy and maths achievement tests (test scores on maths, literacy, Raven's matrices and digit span test)
• Farm income (agricultural production and revenues, input expenditure and depreciation of farm assets)
• Other income (a simplified income questionnaire for other income sources like microenterprises, transfers, remittances, gifts, etc.)
School questionnaire
• School facilities (school characteristics including boards, toilets, furniture, books and all school-feeding related characteristics: kitchen, storage room, etc.)
• School participation (school-level data on enrolment, attendance and drop-out)
• School management and food procurement
• Teachers (qualifications, living conditions and aspirations)
• Training and monitoring activities
PTA: Parent-teacher Association
The impact on prices could, in principle, be observed through the household-level questionnaires. The farm gate price could be observed at the household level by including in the questionnaire questions on prices received and time of sales. This, however, would complicate the income section of the farmer questionnaire. Consumer prices are more difficult to observe in a standard household survey because the recall time is 7 or 30 days and there is only one survey per year. As part of the programme monitoring activities, price data will be collected, on a monthly basis, for main staple crops in the local market next to each of the selected schools for a sub-sample of farmer households. Collecting prices does not even require visits to markets if stable contacts can be established with collectors in each of the markets, who can then communicate prices by phone.

Heterogeneity of impact
The large dataset will allow for extensive sub-group analysis, including by gender, age and geographic characteristics. The impacts of school feeding are quite heterogeneous and context-specific [12]. School feeding, for instance, has been associated with marked improvements in school participation by girls in rural areas with large gender disparities in access to education [21]. Smallholder farmers targeted by the programme will, in large proportions, be women. From the educational perspective, school feeding impact has also been found to vary with pupil age, as household schooling decisions are also affected by the opportunity costs of education, which tend to increase with age and vary by gender.
The programme is targeted to disadvantaged groups. The main beneficiaries are located in poor, rural districts of the country and the programme has a potential poverty and inequality reduction impact at the national level. At the local level, the programme has a potential poverty reduction impact, but the inequality reduction impact will depend on whether:
• The project will increase enrolment. Children going to school are likely to be from a richer background and from more accessible areas.
• The project will involve small farmers. The programme might rely on large farmers or traders for the provision of food.

Cost-effectiveness
Cost data will be collected retrospectively following an ingredients approach using a semi-structured questionnaire. The survey will be based on a standardised costing framework capturing capital (fixed) and recurrent costs incurred at the school level. The questionnaire will also cover both cash and in-kind contributions and will be used to estimate both financial and economic costs. Financial costs capture actual expenditures on programme implementation on an annual basis. Economic costs include the opportunity costs of community members, teaching staff and other school-level stakeholders involved in the school feeding and school health and nutrition (SHN) service provision. Opportunity costs of school staff and community members will be calculated using local pay scales. Capital costs will be annuitised over the useful life of all relevant school-level assets using a discount rate of 3 % as per World Bank recommendations. Annuitisation enables an equivalent annual cost to be estimated and reflects the value in-use of capital items, rather than reflecting when the item was purchased [22].
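As a minimal sketch of the annuitisation step, the Python snippet below converts a capital outlay into an equivalent annual cost using the standard annuity factor, (1 − (1 + r)^−n) / r. The 3 % discount rate is taken from the text; the asset cost, the 10-year useful life and the function name are hypothetical and chosen only for illustration, and the costing framework used in the study may follow a slightly different convention.

```python
def equivalent_annual_cost(capital_cost, useful_life_years, discount_rate=0.03):
    """Spread a one-off capital cost over an asset's useful life.

    Uses the standard annuity factor a = (1 - (1 + r) ** -n) / r, so the
    equivalent annual cost is capital_cost / a. With r = 0 this reduces to
    straight-line division by the useful life.
    """
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    if discount_rate == 0:
        return capital_cost / useful_life_years
    annuity_factor = (1 - (1 + discount_rate) ** -useful_life_years) / discount_rate
    return capital_cost / annuity_factor


# Hypothetical example: a school kitchen costing 1000 currency units,
# assumed to last 10 years, annuitised at the 3 % rate quoted in the text.
annual_cost = equivalent_annual_cost(1000, 10)
print(round(annual_cost, 2))
```

Under these illustrative inputs the equivalent annual cost is about 117 currency units, somewhat above the 100 units given by straight-line division, because the discount rate puts a price on capital tied up in the asset.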
Process and output data covering the adequacy of the service delivery will be collected from monitoring visits on a quarterly basis using standardised data collection forms. Output data will be combined with the costs to provide estimates of cost-efficiency metrics, including the cost per beneficiary and per unit of kilocalories, iron and vitamin A delivered. Sensitivity analysis will be undertaken to account for uncertainties in the economic evaluation. The figures obtained in this way will then be compared to figures calculated for other interventions.
Of particular interest is the cost-effectiveness of the community-level component of the intervention. The comparison between the HGSF+ and the regular GSFP groups is roughly equivalent to the comparison between a 'home-grown' school feeding project and a standard school feeding project. Many would expect HGSF to be cheaper and more cost-effective because of lower transport costs. However, the alternative procurement source, its distance and its affordability are unknown, and hence the difference in costs between the two programmes is an empirical question.

Data collection
The enumerators were recruited from the Noguchi Memorial Institute for Medical Research (NMIMR) and the Institute of Statistical, Social and Economic Research (ISSER), and trained for the baseline survey. Each team, led by a supervisor and assisted by community leaders, conducted household listings and sampling in each enumeration area (EA). Maps were obtained for most of the EAs from the Ghana Statistical Service. The EA maps made it possible to identify all dwelling structures within a geographical space with a well-defined boundary. All dwelling/housing structures within each EA were serially numbered to facilitate the complete listing of households. The list of households in each EA constituted the sampling frame from which participating households were selected at random for the interview. A total of 2626 households in 116 communities were surveyed (see Table 5 for the data collection coverage) between 22 June and 2 September 2013.
In each household, all children aged between 5 and 15 years were asked education-related questions (enrolment, attendance, drop-out) and were tested in literacy, maths, forward and backward digit span and Raven's matrices. Anthropometric measurements were taken for children aged 2-15 years. Tests and measurements were made at the household level because not all the children in the targeted schools resided in the selected localities where the schools were situated. Height was measured with Leicester Height Measures, which allow measurements of up to 2 m 10 cm to the nearest 1 mm, and weight was measured using Tanita WB-100A/WB-110A electronic scales (Remote Display Version). The height and weight measures were assembled and placed on a level surface. In the absence of level ground in the household, a suitable place was identified for the measurement in the community. A sub-set of children aged 5-15 years was randomly selected for haemoglobin and parasitology measurements. Haemoglobin levels were measured using a HemoCue Hb 201+ analyser, with standard control reagents (Hemotrols) used to verify appropriate device function on a daily basis.

Data management and analysis
All questionnaires were checked in the field for consistency and completeness by field supervisors before data entry. Data were entered in CSPro and later transferred to Stata 12 for data cleaning and analysis. Simple frequency tables of variables from each module in the questionnaire were generated from the database and examined for inconsistencies. Errors related to wrong entries were verified from the specific questionnaire and corrected appropriately.

Ethical approval
Ethical clearance was obtained from the Institutional Review Board of the Noguchi Memorial Institute for Medical Research of the University of Ghana and was also sought from the Imperial College Research Ethics Committee. Meetings were held from the early stages of the study development with relevant government ministries, at both central and decentralised levels, to discuss the purpose, procedures and risks involved in the study. Informed consent was obtained from parents/guardians of children through written and verbal information provided before interviews.

Results
Table 6 summarises the characteristics for key variables of interest in the study population and by study group. We also report the main evaluation comparisons, including school feeding (combined GSFP and HGSF) versus control (no school feeding), regular school feeding (GSFP) versus HGSF (combined HGSF+ and HGSF++) and HGSF with micronutrient sprinkles (HGSF++) versus HGSF without sprinkles (HGSF+). The tests of balance show evidence of small differences across the treatment arms for several variables across education, nutrition, agriculture and other socio-economic domains. In addition, approximately 19 % of children in the target age group (5-15 years) received some form of free school meals at baseline. Of the total 8407 children aged 15 years or younger, 48 % were girls.
In the education domain, 92 % of children aged 5-15 years were enrolled in school, and mean enrolment rates were marginally lower in the control population (0.91, SD 0.29) compared to the school feeding group (0.93, SD 0.26). Significant differences were also found for age at first enrolment, the number of times that a year was repeated, and across all four test scores.
In the nutrition domain for children aged 5-15 years, the mean z-scores for the anthropometric measures of height for age and BMI for age were −0.925 (SD 1.35) and −0.592 (SD 0.924) respectively, with significant differences across the GSFP versus HGSF comparison groups. Iron status, as measured through haemoglobin levels, for the sub-sample of children (n = 714) who were assessed, was on average 11.3 g/dL (SD 1.34), just below the 11.5 g/dL cut-off for non-anaemia in the 5-11 years age group.
In terms of household socio-economic characteristics, there were no significant differences among the treatment groups in the mean education levels of mothers and household heads, or in household size. There was, however, a significant difference in per-capita household expenditure quintiles between households in the GSFP and HGSF groups, but no other substantive differences with regard to household expenditure were observed.
In the agriculture domain, across the survey population the mean production of maize over the previous 12 months was 787 kg (SD 1751), with average household sales of maize during the same period of 393 kg (SD 1196). Mean household production of rice was 141 kg (SD 625), with average annual sale volumes of 84 kg (SD 484). Significant differences were found across treatment arms in terms of maize production and sales.

Discussion
School feeding interventions are implemented in nearly every country in the world, with the potential to support the education, health and nutrition of school children from low-income households [23]. To date, however, there is little evidence on their potential to support agriculture and community development. This paper describes the design and baseline results for a randomised evaluation of school meals interventions linked to smallholder agriculture. As far as we are aware, it is the first to examine explicitly, from a holistic perspective, the simultaneous impact of a national school meals programme on micronutrient status, alongside outcomes in nutrition, education and agriculture domains. The evaluation builds on a trial design taking place in Mali that includes an extensive analysis of the programme theory for the intervention. As the intervention is complex, the scope of this evaluation is also very broad and includes measurement of a range of outcome indicators across multiple traditional disciplines. Designing and implementing such an evaluation is in itself a complex undertaking, involving a multi-disciplinary research team working in close collaboration with programme- and policy-level stakeholders. The survey also required a range of different expertise in the enumeration teams in order to collect data including anthropometry, haemoglobin levels, and educational tests, alongside expenditure, income and other socio-economic modules. The use of the survey tools required to capture the data was inevitably fairly time-intensive. Extensive analysis of the rich baseline data is currently underway.
A number of important considerations can be drawn from the baseline data analysis. Firstly, the tests of balance showed evidence of small differences across the treatment arms for several variables across education, nutrition, agriculture and other socio-economic domains.
The randomisation of treatment across the arms of the cluster-randomised trial is aimed at minimising the systematic differences in the outcomes between the intervention groups. In practice, differences between the intervention groups can arise due to sampling error in moderate sample sizes. When estimating programme impact it is important to control for these differences where they exist.
In addition, approximately 19 % of children in the target age group (5-15 years) received some form of free school meals at baseline. Comparable findings were reported in a similar study in Mali in 2013 by Masset and Gelli [13] where, because of information flow constraints, the original list of schools used in the randomisation included schools with school feeding. This finding has important implications for the evaluation design, considerably reducing the sample sizes available for comparisons after the follow-up survey. The small sample sizes for the HGSF+ versus HGSF++ comparison are a particular concern, and power calculations using the baseline data suggest folding these two arms into one, adding micronutrient sprinkles to the HGSF+ intervention.
Significant differences were found in the means of a number of outcome and control variables across the intervention groups. It appears, therefore, that at baseline the random allocation process did not achieve statistically comparable treatment groups. In particular, important differences across groups include marketed surplus, livestock income, per capita food consumption and intake, school attendance, and anthropometric status in the 2-5 and 5-15 years age groups. Differences in outcome and control variables across groups will be controlled for when estimating treatment effects. More in-depth analyses of the very rich baseline dataset, examining the associations between key outcomes and variables along the complex agriculture-nutrition pathways, are also underway.

Conclusions
Assessing the simultaneous impact of 'home-grown' school feeding on micronutrient status, health, education and agriculture is a complex undertaking, involving coordination across policy, programme and research stakeholders. This study is the first to examine the effects of alternative implementation modalities of school meals on nutrition, health, education and agriculture in Ghana. The findings of this evaluation will provide important evidence to support policymakers in the scale-up of the national programme.

Power calculations

School attendance
We used the rural sample of the 2008 GDHS data to estimate attendance rates of children aged 6 to 14 years and found rates of 79 % for boys and 81 % for girls. The chart below plots power against the number of clusters, assuming a project impact of 5 percentage points on attendance rates in primary school. A sample of just 60 clusters, with data collected on 40 children per cluster, is sufficient to detect such an impact with 80 % statistical power (Fig. 4).

Cognitive tests
We obtained data on outcomes of cognitive tests from a sample of rural children tested in 2003 using Raven's matrices. The average score on the test was 15.3 out of 36 questions with a SD of 5.9 and an intracluster correlation coefficient (ICC) of 0.14. The chart below plots the minimum detectable difference against the number of clusters. We assumed 40 children per cluster, considering 20 households interviewed in each cluster and an average of 2.3 children in the relevant age group per household with children (Fig. 5).
The table below summarises the standardised detectable differences and corresponding absolute values of the tests for different study designs (Table 7).
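The relationship between the number of clusters and the minimum detectable difference can be sketched with the textbook formula for a two-arm cluster-randomised design with equal-sized clusters. The snippet below is an illustration only: the SD (5.9), ICC (0.14) and 40 children per cluster come from the text, whereas the two-sided 5 % significance level, 80 % power, equal allocation to the two groups, the normal approximation and the cluster counts shown are assumptions, so the output need not reproduce Table 7 exactly.

```python
from math import sqrt
from statistics import NormalDist


def mde_standardised(n_clusters, cluster_size, icc,
                     alpha=0.05, power=0.80, share_treated=0.5):
    """Minimum detectable difference in SD units for a two-arm cluster RCT.

    Normal-approximation formula:
    (z_{1-alpha/2} + z_{power}) * sqrt((icc + (1 - icc) / m) / (P * (1 - P) * J)),
    with J clusters of m units each and a share P of clusters treated.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    variance_term = (icc + (1 - icc) / cluster_size) / (
        share_treated * (1 - share_treated) * n_clusters)
    return multiplier * sqrt(variance_term)


RAVEN_SD = 5.9             # SD of Raven's test score, from the text
RAVEN_ICC = 0.14           # intracluster correlation, from the text
CHILDREN_PER_CLUSTER = 40  # assumed cluster size, from the text

# Illustrative designs; the cluster counts are assumptions.
for n_clusters in (30, 60, 120):
    d = mde_standardised(n_clusters, CHILDREN_PER_CLUSTER, RAVEN_ICC)
    print(n_clusters, round(d, 2), round(d * RAVEN_SD, 2))
```

Under these assumptions, 60 clusters give a detectable difference of roughly 0.29 SD, or about 1.7 points on the 36-item test. Multiplying the standardised value by the SD is how standardised differences map to absolute values.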

Anaemia
Data for power calculations were obtained from the 2008 GDHS. We calculated means, SDs, and ICCs for rural children aged 6 months to 5 years and rural mothers aged 15-49 years. See the tables below for the level of haemoglobin and prevalence rates of any anaemia (including severe, moderate and mild) (Table 8).
The chart below plots the minimum detectable difference in terms of SDs from the mean for children (ICC = 0.13) and mothers (ICC = 0.08). In both cases it is assumed that the size of the sample in each cluster is 20. This is consistent with 20 household interviews per community and considering that several children may end up not being tested. In any case, only a marginal gain can be obtained by expanding the sample beyond 20 as power is mainly driven by the number of clusters (Fig. 6). The table below reports the standardised detectable differences and their equivalent level values for 3 different designs: 30 clusters, 60 clusters and 120 clusters. In each case 50 % of the sample is allocated to the project group. Differences between groups of mothers can be estimated more precisely because the ICC is lower for mothers though the sample variance is slightly larger (Table 9).
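The point that enlarging clusters beyond 20 adds little can be illustrated with the standard design effect for equal-sized clusters, 1 + (m − 1) × ICC. The sketch below uses the two ICCs quoted in the text; the alternative cluster sizes and the function names are assumptions introduced for illustration.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc


def effective_cluster_size(cluster_size, icc):
    """Number of independent observations one cluster of size m is worth."""
    return cluster_size / design_effect(cluster_size, icc)


# ICCs from the text; the cluster sizes other than 20 are illustrative.
for label, icc in (("children", 0.13), ("mothers", 0.08)):
    for m in (10, 20, 40, 80):
        print(label, m, round(effective_cluster_size(m, icc), 1))
```

With an ICC of 0.13, doubling the cluster size from 20 to 40 raises the effective number of independent observations per cluster only from about 5.8 to about 6.6, and it can never exceed 1/ICC (about 7.7). Adding clusters, by contrast, increases the effective sample proportionally.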

Child nutrition
We used data from the GDHS 2008 to estimate mean and SD of height-for-age z-scores of rural children and we found these to be −1.03 and 1.57 respectively. The ICC is 0.08. The chart below plots the standardised minimum detectable difference against the number of clusters assuming a sample of 30 children measured in each community (Fig. 7).
The table below summarises the values of the standardised and equivalent absolute values of the detectable differences (Table 10).

Farm income
We used data from GLSS4 of 1998/1999 to estimate the average farm income of rural households (1200 cedis) and the corresponding SD (1400 cedis). We found an extremely high ICC. Income is the most difficult outcome to estimate with sufficient precision. The chart below plots the standardised minimum detectable difference against the number of clusters, assuming 20 farmers interviewed in each community. Since the SD is roughly similar to the mean, the vertical axis can be interpreted as a percentage difference (Fig. 8).
The table below summarises the standardised differences and the corresponding percentage changes in income that can be estimated with different study designs (Table 11).