Does theory influence the effectiveness of health behavior interventions? Meta-analysis.

Objective: To systematically investigate the extent and type of theory use in physical activity and dietary interventions, as well as associations of the extent and type of theory use with intervention effectiveness. Methods: An in-depth analysis of studies included in two systematic reviews of physical activity and healthy eating interventions (k = 190). Extent and type of theory use were assessed using the Theory Coding Scheme (TCS), and intervention effectiveness was calculated using Hedges's g. Meta-regressions assessed the relationships between these measures. Results: Fifty-six percent of interventions reported a theory base. Of these, 90% did not link all of their behavior change techniques (BCTs) to specific theoretical constructs, and 91% did not link all of the specified constructs to BCTs. Associations between a composite score, or specific items, on the TCS and intervention effectiveness were inconsistent. Interventions based on Social Cognitive Theory or the Transtheoretical Model were similarly effective, and neither was more effective than interventions not reporting a theory base. Conclusions: The coding of theory in these studies suggested that theory was not often used extensively in the development of interventions. Moreover, the relationships of the type and extent of theory use with effectiveness were generally weak. The findings suggest that attempts to apply the two theories commonly used in this review more extensively are unlikely to increase intervention effectiveness.

Applying theory to the design and evaluation of complex behavior change interventions is viewed as good practice (Glanz & Rimer, 1995; MRC, 2008). Although there is some evidence of an increasing trend for interventions to refer to a theoretical basis (Noar, Palmgreen, Chabot, Dobransky, & Zimmerman, 2009), a substantial proportion of studies do not, as noted in a variety of reviews and commentaries (e.g., Albarracin et al., 2005; Davies, Walker, & Grimshaw, 2010; Hardeman, Johnston, Johnston, Bonetti, Wareham, & Kinmonth, 2002; Molloy, 2010). These reviews and commentaries have not included a detailed examination of how theory has been used in the development and evaluation of interventions.
However, other reviews have detected small or no associations between reported theory use in intervention design and intervention effectiveness (e.g., Albarracin et al., 2005; Roe et al., 1997; Stephenson et al., 2000), while one review indicated that interventions reported to be based on theory were less effective than those not reporting a theory basis (Gardner et al., 2011). These inconsistencies between theory application and intervention effects require an in-depth examination of how theory has been used and whether using theory in different ways is associated with larger behavior change effects.
One possible reason for the inconsistent associations between theory use and intervention effectiveness is that earlier reviews reported theory use in simple categorical terms (yes/no) (e.g., Ammerman et al., 2002). More in-depth assessments of how theory has been applied may lead to different and/or more consistent findings. Recent methodological developments enable such an approach. One approach to specifying theory use in health behavior research is a general coding frame of four items: informed by theory; applied theory; tested theory; built theory (Painter, Borba, Hynes, Mays, & Glanz, 2008). A more detailed method of specifying reported theory use is the 19-item Theory Coding Scheme (TCS). The TCS specifies whether theory is mentioned, how theory is used directly in intervention design, how theory influences interventions indirectly via the selection of participants and via delivery to different groups of participants, how theory explains intervention effects on outcomes, and the implications of the results for future theory development. Theory can be used to inform interventions by highlighting the constructs, or types of individuals, that should be targeted by the intervention, or to inform the selection and sequencing of intervention strategies (Wingood & DiClemente, 1996).
Consequently, there are at least three major pathways through which basing an intervention on a specific theory can influence intervention effectiveness: (a) via the selection of specific behavior change techniques (BCTs), or combinations of these techniques, which prove effective or ineffective; (b) via the inclusion in the study of participants who are likely to benefit from the intervention (and the exclusion of participants unlikely to benefit); and (c) via the tailoring of BCTs to individuals based on their theory-relevant characteristics. These pathways are not necessarily related; for example, researchers could use theory to decide which participants are eligible for their study but not use theory to select the intervention's BCTs or to tailor the intervention. As a consequence, interventions that apply theory extensively enough to address each of these pathways could be more effective than studies that apply theory less extensively.
An important consideration when examining whether interventions using theory are more effective than those that do not is to assess how theory has been applied in the comparison condition (see Michie, Prestwich, & De Bruijn, 2010; Williams, 2010). For example, studies that use theory to tailor the intervention delivered to both the intervention and comparison conditions need to be differentiated from studies that use theory to tailor an intervention only in the intervention condition. This is necessary because the intervention effect reflects the difference between the intervention and control groups. Reviews of the impact of theory use on intervention effectiveness therefore need to consider whether theory has been used to develop the intervention in both the intervention and comparison conditions.

Aims
The current review had two aims: first, to use the TCS to assess the extent to which studies have reported using theory to develop interventions (Aim 1); second, to investigate whether differential theory use was associated with intervention effectiveness (Aim 2). In relation to this second aim, we investigated the extent to which specific types of theory use (measured through individual items on the TCS), as well as the overall extent of theory use (measured through composite scores of multiple TCS items), were associated with intervention effectiveness.
These key analyses were re-run within five sets of sensitivity analyses. First, given that intervention effects reflect the difference between the intervention and comparison groups, we coded theory use in both conditions. Consequently, within sensitivity analyses, we tested whether the associations between theory use in the intervention group and intervention effectiveness changed when statistically controlling for theory use in the comparison condition.
Second, we took into account the type of control (i.e., any active control vs. waitlist, no or minimal intervention). Third, because just over a third of studies were conducted using participants with, or at risk of, chronic diseases, we included this factor in the analyses. Fourth, due to concerns that studies that use theory to develop their interventions may be methodologically more rigorous, we coded risk of bias and statistically controlled for this in an additional set of analyses. Finally, we identified and removed statistical outliers from the analyses.

Studies
Studies included in the current review were drawn from two recent systematic reviews: a review of the association between BCTs and physical activity and diet (Michie, Abraham, Whittington, McAteer, & Gupta, 2009) and a review investigating BCTs in obese adults with, or at risk of, obesity-related co-morbidities (Dombrowski, Sniehotta, Avenell, MacLennan, & Araujo-Soares, 2012). The current review covered both physical activity and healthy eating because a number of studies targeted both behaviors, and in order to maximize the scope of the review and the power of the analyses. The dataset for the review included 140 separate studies comprising 190 comparisons of interventions.

Inclusion/Exclusion Criteria
In the Michie et al. (2009) review, the inclusion criteria were: 1. adults aged 18 or over; 2. interventions targeting physical activity and/or healthy eating; 3. use of experimental or quasi-experimental designs; 4. incorporation of objective, standardized, or validated outcome measures; and 5. use of cognitive or behavioral change strategies beyond simple provision of information. Their review excluded interventions sampling specific populations (i.e., pregnant or recently postnatal women, athletes, individuals already engaged in another health program, individuals not living in the free population, and individuals with physical or mental health problems). In the Dombrowski et al. (2012) review, the inclusion criteria were: 1. adults with a mean/median age above 40, a mean/median BMI above 30, and at least one other risk factor for morbidity or an already present co-morbidity; 2. behavioral interventions targeting physical activity and/or healthy eating; 3. randomized controlled trials with follow-up data at 12 weeks or later; and 4. reported behavior change data for diet and physical activity, by self-report or objective measures, at baseline and follow-up. All studies included in this analysis, therefore, were evaluations of behavioral interventions aimed at increasing physical activity and/or healthy eating, and reported measures of these behaviors along with statistical information from which effect sizes could be calculated.

Data Extraction
The following information was extracted from each study: (a) bibliographic information (author, year of publication, associated papers); (b) behavioral outcomes (physical activity and/or healthy eating); (c) duration of the intervention period; and (d) intervention/control group information. Each item within the TCS requires a yes/no response, and the scheme has shown good inter-rater reliability. The specific theory (e.g., SCT, TTM) upon which the intervention was reported to be based was also coded (item 5).
While items on the TCS were treated individually in some of the analyses, some of the TCS items were also combined to reflect the extent of theory use (see Table 1 for TCS items).
Specifically, we created three composite measures reflecting: -(a) the extent to which each BCT reported by the authors was linked to a theory-relevant construct [BCTs → theory-relevant constructs]. Three items on the TCS relate to this aspect of theory use (items 7-9), so the composite score was determined using these items. Studies coded 'yes' for item 7 ('All intervention techniques are explicitly linked to at least one theory-relevant construct') reflect the most extensive use of theory and thus were weighted +2 in the composite measure. Studies coded 'yes' for either item 8 ('At least one, but not all, of the intervention techniques are explicitly linked to at least one theory-relevant construct') and/or item 9 ('Group of techniques are linked to a group of constructs') were weighted +1 to reflect some (but not extensive) use of theory. Studies coded 'no' on items 7, 8, and 9 were weighted 0 to reflect no theory use. The 'BCTs → theory-relevant constructs' measure thus ranged from 0 (no theory use) to +2 (optimal theory use).
-(b) the extent to which the constructs within the underlying theory were specifically targeted by the BCTs [theory-relevant constructs → BCTs]. Three items on the TCS relate to this aspect of theory use (items 9-11), so the composite score was determined using these items. Studies coded 'yes' for item 10 ('All theory-relevant constructs are explicitly linked to at least one intervention technique') reflect the most extensive use of theory and thus were weighted +2 in the composite measure. Studies coded 'yes' for either item 11 ('At least one, but not all, of the theory-relevant constructs are explicitly linked to at least one intervention technique') and/or item 9 ('Group of techniques are linked to a group of constructs') were weighted +1 to reflect some (but not extensive) use of theory. Studies coded 'no' on items 9, 10, and 11 were weighted 0 to reflect no theory use. The 'theory-relevant constructs → BCTs' measure thus also ranged from 0 (no theory use) to +2 (optimal theory use).
-(c) an 'overall theory score' generated from all of the TCS items that relate to using theory to develop the intervention (items 3-11). Studies received a +1 weight for each of the following: the intervention was based on a single theory (item 3); theory was used to select recipients for the intervention (item 4); the intervention was explicitly based on a specific theory or combination of theories (item 5); and theory was used to tailor intervention techniques to recipients (item 6). The sum of these items was added to the 'BCTs → theory-relevant constructs' (see (a) above) and 'theory-relevant constructs → BCTs' (see (b) above) composite scores, creating a scale from 0 (no theory use) to +8 (most extensive theory use). 2 Because TCS items 3, 4, and 6 may be dependent on the type of theory application, a sensitivity analysis tested the impact of removing these items from the overall composite score, providing an alternative overall theory score.
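The scoring rules above can be sketched as follows. This is a minimal illustration, not the authors' code: the representation of a study as a dictionary mapping TCS item numbers to yes/no codes is a hypothetical convenience, and the example study is invented.

```python
def composite_bct_to_construct(items):
    """Score the 'BCTs -> theory-relevant constructs' composite (0-2).

    `items` maps TCS item numbers to True ('yes') / False ('no') codes.
    Item 7 (all BCTs linked to a construct) is the optimal case (+2);
    items 8 and/or 9 reflect partial linkage (+1); otherwise 0.
    """
    if items.get(7):
        return 2
    if items.get(8) or items.get(9):
        return 1
    return 0


def composite_construct_to_bct(items):
    """Score the 'theory-relevant constructs -> BCTs' composite (0-2),
    using items 10 (+2), and 11 and/or 9 (+1)."""
    if items.get(10):
        return 2
    if items.get(11) or items.get(9):
        return 1
    return 0


def overall_theory_score(items):
    """Overall theory score (0-8): +1 for each of items 3, 4, 5, and 6,
    plus the two directional composites above."""
    base = sum(1 for i in (3, 4, 5, 6) if items.get(i))
    return base + composite_bct_to_construct(items) + composite_construct_to_bct(items)


# Hypothetical study: based on a single, explicitly named theory
# (items 3 and 5), all BCTs linked to constructs (item 7), but only
# some constructs linked to BCTs (item 11)
study = {3: True, 5: True, 7: True, 11: True}
print(overall_theory_score(study))  # -> 5
```

Note that a study coded 'yes' on item 9 alone scores +1 on both directional composites, mirroring the scheme's treatment of group-level linkage as partial theory use in each direction.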
The TCS, where appropriate, was applied separately both to the intervention and comparison conditions to take into account the use of theory in the study (see 'Data Analysis' and 'Online supplement table'). This allowed the examination of whether theory use in the intervention condition was associated with intervention effectiveness when controlling for theory use in the comparison condition (see Williams, 2010).
For the TCS, pairs of coders independently coded the theory items from 42 studies. A Cohen's kappa value between .61 and .80 reflects substantial agreement, while a value above .80 reflects almost perfect agreement (Landis & Koch, 1977). On this basis, levels of agreement were typically almost perfect (mean kappa = .88; median kappa = .89) and at least substantial (kappa > .71) for all theory items.
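For reference, Cohen's kappa for two coders' binary (yes/no) codes can be computed as below. The coding vectors are hypothetical, not reliability data from the review.

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' binary (1 = yes, 0 = no) codes.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement rate and p_e is the agreement expected by chance from
    each coder's marginal 'yes' rate.
    """
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    pa = sum(coder_a) / n  # coder A's 'yes' rate
    pb = sum(coder_b) / n  # coder B's 'yes' rate
    p_e = pa * pb + (1 - pa) * (1 - pb)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for one TCS item across ten studies
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(a, b), 2))  # -> 0.78
```

Here the coders agree on 9 of 10 studies (p_o = .90), but because both say 'yes' often, chance agreement is high (p_e = .54), so kappa (.78) is lower than raw agreement.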

Risk of bias.
Based on the Cochrane Collaboration's tool for assessing risk of bias (Higgins, Altman, Gøtzsche, Jüni, Moher, Oxman, et al., 2011), an assessment was made using the following items, each coded yes/no: (i) does the study report randomization?; (ii) was the allocation sequence concealed?; (iii) was there any blinding?; (iv) were incomplete data adequately addressed?; (v) are reports of the study free of suggestion of selective outcome reporting?; and (vi) is the study free from any other bias? 3

Data Analysis
To assess the extent to which studies have used theory to develop and evaluate interventions, the percentage of studies coded 'yes' for each item on the TCS 4 was calculated (see Table 1). Percentages were calculated both across all studies and across the subset of studies that explicitly stated a theory base.
To examine whether items from the TCS predicted intervention effects, a series of meta-regressions (using a random-effects model with restricted maximum likelihood estimation, computed with the metareg command in Stata version 12.1; StataCorp, 2011) was conducted.
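The general logic of such a model can be sketched with an inverse-variance weighted regression of effect sizes on a study-level covariate. This is a simplified fixed-effect analogue of the random-effects REML meta-regression run in Stata, and the effect sizes, variances, and covariate values below are hypothetical, not data from the review.

```python
import numpy as np

# Hypothetical effect sizes (Hedges's g), their sampling variances, and
# one binary TCS covariate (e.g., theory base reported: yes = 1)
g = np.array([0.2, 0.5, 0.3, 0.6, 0.1])
v = np.array([0.02, 0.05, 0.03, 0.04, 0.02])
x = np.array([0, 1, 0, 1, 0])

# Inverse-variance weighted least squares: each study contributes in
# proportion to its precision (1 / variance)
w = 1.0 / v
X = np.column_stack([np.ones_like(g), x])
beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * g))

# beta[1] is the estimated change in g per unit increase in the covariate
print(beta)
```

With a single dummy covariate, the slope reduces to the difference between the precision-weighted mean effects of the two groups of studies; the random-effects version additionally estimates between-study variance, which REML handles in Stata's metareg.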
In our analyses, the regression coefficients are the estimated increase in the effect size per unit increase in the covariate(s). To assess the proportion of between-study variance explained by each covariate, the adjusted R² value is reported. Secondary meta-regression analyses were conducted to control for potential moderating factors by adding the following factors to the meta-regression models: theory use in each control group (i.e., as assessed with the TCS), type of control group (i.e., any active control vs. waitlist, no, or minimal intervention), disease chronicity (i.e., chronic or at risk vs. non-chronic), and factors associated with the risk of bias (i.e., randomization, allocation concealment, blinding, missing outcome data, selective outcome reporting, and any other problems). Sensitivity analyses were conducted to assess the impact of removing outlying effect sizes (determined using the Sample-Adjusted Meta-Analytic Deviancy statistic; Huffcutt & Arthur, 1995); these are reported as supplementary online material (see online supplement table). Intervention effect size calculations followed the approach used in the original reviews, indexing effects as Hedges's g: the difference between two means divided by their pooled standard deviation, with a correction for small sample sizes (Hedges & Olkin, 1985).
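The effect-size calculation can be sketched as follows, using the standard Hedges and Olkin (1985) small-sample correction; the group statistics in the example are hypothetical.

```python
import math


def hedges_g(m1, m2, sd1, sd2, n1, n2):
    """Hedges's g: the difference between two means divided by the
    pooled standard deviation, multiplied by the small-sample
    correction J = 1 - 3 / (4*df - 1), with df = n1 + n2 - 2
    (Hedges & Olkin, 1985)."""
    df = n1 + n2 - 2
    s_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / s_pooled   # uncorrected standardized mean difference
    j = 1 - 3 / (4 * df - 1)   # correction factor for small-sample bias
    return j * d


# Hypothetical intervention vs. control group statistics
print(round(hedges_g(m1=5.0, m2=4.0, sd1=2.0, sd2=2.0, n1=30, n2=30), 3))  # -> 0.494
```

With 30 participants per group the correction is small (J ≈ .987, shrinking d = .50 to g ≈ .494); it matters more for the smaller trials in a review like this one.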

Studies Included in the Review
The studies included in this review were published between 1990 and 2008. Across the 190 comparisons, interventions were typically delivered by non-healthcare professionals (46.8%), directly to individuals (51.1%), within community-based settings (54.2%), over a mean period of 8 months. The mean final follow-up was at 10 months. Many of the outcomes were self-reported, but most of the studies (65.3%) used behavioral measures that had been previously validated. The total number of participants across the studies was 61,649.

Risk of Bias
In regard to risk of bias, the vast majority of studies clearly reported that they randomized participants to condition (94.2%). However, few clearly reported that the allocation sequence was concealed (15.8%) or that any form of blinding was used (25.3%). Across the 190 comparisons, few reported an adequate method to blind their participants (1.6%), the intervener (1.1%), or the outcome assessors (10.0%). Incomplete data were judged to be adequately addressed in most comparisons (56.8%). The comparisons were typically judged to be free from selective reporting (96.3%) and free from other problems that could put them at high risk of bias (73.2%). Table 1 outlines the 19 items within the TCS; these numbered items are referenced while addressing the study aims below.

Findings
Aim 1: The extent to which studies reported using theory to develop and evaluate their interventions. Table 1 illustrates how theory was used across all 190 comparisons of interventions. Out of 190 interventions, 107 (56.3%) explicitly reported that they were based on theory (i.e., coded yes to item 5). Of the 107 interventions reporting a theory base, 51 (47.7%) were reported to be based on a single theory (item 3), 8 (7.5%) reported using theory to recruit study participants (item 4) and 42 (39.3%) reported using theory to tailor BCTs to recipients (item 6). Of these same 107 interventions, 11 (10.3%) reported explicit links between all BCTs within the intervention and the targeted theoretical constructs (item 7) while 10 (9.3%) interventions reported targeting all the constructs within a specified theory with specific BCTs (item 10).
Fifty-two (48.6%) tests of interventions reported measuring theoretical constructs post-intervention and 45 (42%) measured constructs both pre- and post-intervention. However, only 4 (3.7%) tests of interventions reported statistically significant mediated effects (item 16d). A similarly small number (3, 2.8%) reported suggestions for theoretical refinement on the basis of their findings (item 19).
Insert Table 1 about here

Aim 2: Is reported theory use associated with intervention effectiveness?
The relationship between reported specific theory use and intervention effectiveness is reported in Table 2 (see also the Supplement Table).
Insert Table 2 about here
Interventions reported to be based on a single theory were associated with larger effects (g = .33) than those reporting multiple theories or a combination of theory and additional theory-based predictors for intervention development (g = .23) (see model 2). Of the TCS items, using theory to determine which participants should be recruited into the study was associated with the largest increase in effectiveness (g = .51 vs. .29; see model 3), although this variable did not explain any of the between-study variance. Studies that reported using theory to tailor the intervention yielded smaller intervention effects than those that did not report this tailoring (g = .21 vs. .33; see model 5).
Interventions referring to a theory base were not more effective than those not explicitly referring to a theory base (see model 4). Interventions reporting links between BCTs and theory-relevant constructs were not more effective than others (see models 6 to 10). Interventions reported to be based on the TTM or SCT did not differ in effectiveness, nor were they more effective than interventions not reported to be based on these theories (see model 11).
With regard to the extent of theory use (see Table 3, model 12), there was little evidence that the following were meaningfully associated with intervention effectiveness: the extent to which each BCT was linked to a theoretical construct ('BCTs → theory-relevant constructs'; adjusted R² = 1.19%); the extent to which the constructs within the underlying theory were specifically targeted by the BCTs ('theory-relevant constructs → BCTs'; adjusted R² = 0.48%); and the 'overall theory score' (adjusted R² = 1.21%). It should be noted that the seven studies that scored most highly on the overall theory score produced, on average, a larger effect size than the 83 studies with the lowest score. However, when all studies were taken into account, there was no evidence that this represents a real effect. The same results were found in the subsets of studies reporting theory use (model 13), using SCT (model 14), and using TTM (model 15).
Sensitivity analyses suggested that these results were relatively stable with respect to outliers (studies judged to be outliers were excluded from the analyses). In addition, the results were similar when controlling for disease chronicity, use of theory in the control group, and type of control group (these were included as factors in the meta-regression analyses) (see online supplement table). When variables relating to the risk of bias were entered individually into a meta-regression model, randomization appeared to be the most important factor associated with intervention effectiveness (i.e., studies reporting randomization produced, on average, larger effects than non-randomized studies). 5 Therefore, all models were re-run controlling for randomization (see online supplement table) and yielded similar effects. A sensitivity analysis using the alternative overall theory score produced similar results (B = -.02, adjusted R² = 1.46%).

Discussion
This review systematically investigated the extent and type of theory use in interventions to increase physical activity and healthy eating, as well as associations between theory use and intervention effectiveness. Just over half of the interventions reviewed were reported to be explicitly based on theory. Of these, theory was rarely used extensively to develop or evaluate the intervention, as defined by the TCS: few targeted all theoretical constructs, linked all BCTs to theoretical constructs, used theory to select recipients of their intervention, used theory to tailor their intervention, used theory as an explicit basis for their intervention, or based their intervention on a single theory. This limits the accumulation of evidence and the capacity of studies to experimentally evaluate specific theories and, therefore, to refine them on the basis of evidence.
The majority of the analyses revealed no association between theory use (assessed through individual TCS items or combinations of TCS items) and intervention effectiveness.
Where there were significant associations, these tended to be weak; thus, inferences from these findings should be drawn cautiously. The finding that interventions reporting the use of theory to select recipients yielded the largest intervention effects (albeit still small) is consistent with the results of a recent review of internet-based interventions (Webb et al., 2010). However, this may be due to the selection of participants most likely to respond to interventions (e.g., because they are more motivated to change). Basing an intervention on multiple theories appeared to be somewhat less effective than basing it on a single theory. Basing an intervention on two theories that provide contradictory accounts of how behavior changes may explain this (Dombrowski, Sniehotta, Avenell, & Coyne, 2007). A multi-theory approach without a clear rationale, described by Bandura (1998) as "cafeteria style research" (p. 628), may also fail to capitalize on the potentially beneficial impact that a coherent theory base may offer.
We attempted to enhance confidence in our findings by co-varying a number of potential moderators in our analyses. Given the possibility that studies making greater use of theory to design their interventions would adopt more stringent methodological controls, these moderators included the type of control group (active control vs. delayed or no intervention control) and risk of bias. The results appeared generally robust within these sensitivity analyses, and also in relation to similar analyses taking into account the use of theory in the control group (see Williams, 2010), outliers, and disease chronicity within the target population. As a consequence, the findings of this review are not in line with the findings of some earlier reviews arguing that basing interventions on theory should increase effectiveness (e.g., Albada et al., 2009; Fisher & Fisher, 2000; Glanz & Bishop, 2010; Kim et al., 1997; Swann et al., 2003), although these reviews typically assessed a wider range of theories and did not use a quantitative measure such as the Theory Coding Scheme. Specifically, our findings suggest that applying the two theories commonly used in this review more extensively is unlikely to increase intervention effectiveness. However, we note particular caveats.
First, our findings apply to the extent of theory use as measured by the TCS and to the two theories with sufficient data to analyze, SCT and TTM. Second, the overall theory score was based on the summation of all relevant items; particular combinations of items, reflecting certain elements of theory use, may increase intervention effects. Third, caution should be taken in generalizing results on the basis of a null finding (i.e., no association between the extent of theory use and intervention effectiveness). Fourth, we were only able to investigate theory use as reported in published articles, which is likely to underestimate actual practice (Lorencatto, Michie, West, & Stavri, 2011). Ideally, full study and intervention protocols would be publicly available to supplement this information, as they are often difficult to obtain from authors. Some journals now make this a requirement for publication (e.g., Addiction, Implementation Science), and wider adoption of this practice would greatly help advance our science. Fifth, previous research has suggested considerable discrepancies between protocol and delivery in practice (e.g., Borrelli, 2011; Hardeman et al., 2008), which has implications for any investigation of associations between theoretical underpinning and intervention content and effects. Fidelity thus needs to be taken into account when considering our own and other researchers' findings in this area; a precise estimate of the relationship between theory use and intervention effectiveness can only be obtained from studies with high fidelity of delivery. Sixth, a more comprehensive search strategy may have identified a greater number of eligible studies. This is a common problem with any review; however, we synthesized evidence from two substantial reviews, leading to the inclusion of 140 studies.
As a consequence, the results should be reasonably robust against the omission of studies not identified through our search strategy.
This review demonstrates an in-depth analytic method for investigating how theories are used and how they relate to intervention effectiveness. Using this method, we found that theories (particularly SCT and TTM) are not typically used as extensively as they could be in the development of interventions, and that applying SCT and TTM more extensively may not increase effectiveness. However, developing more explicit links between type of theory, possible mediating pathways (including the selection of recipients, tailoring, and the mechanisms of behavior change techniques), and outcomes would represent an important step in advancing our understanding of behavior change and intervention effects.

Table 1 (excerpt): TCS item definitions
- Targeted construct: a psychological construct that the study intervention is hypothesized to change. Evidence that the construct relates to (correlates with/predicts/causes) behavior should be presented in the Introduction or Method (rather than the Discussion).
- Item 3, Intervention based on single theory: the intervention is based on a single theory (rather than a combination of theories or theory + predictors).
- Item 18, Appropriate support for theory: support for the theory is based on appropriate mediation, or refutation of the theory is based on obtaining appropriate null effects (i.e., changing behavior without changing the theory-relevant constructs).
- Item 19, Results used to refine theory: the authors attempt to refine the theory upon which the intervention was based by either (a) adding or removing constructs, or (b) specifying that the interrelationships between the theoretical constructs should be changed, and spelling out which relationships should be changed.