By David Comerford

 

Applying behavioral insights in practice is still in its early stages, and much remains unknown. A striking demonstration of this point is that behavioral scientists themselves did a poor job of predicting the effectiveness of a range of behavioral interventions, typically overestimating their effects by orders of magnitude. We might then expect that practitioners, too, overestimate the gains that the interventions they roll out will deliver.

This matters because devising and implementing an intervention is costly. If behavioral interventions repeatedly deliver less bang for the buck than promised, then we might expect behavioral science – and behavioral scientists – to be treated with skepticism.

This piece aims to help practitioners calibrate their forecasts of how effective an intervention is likely to be. I outline six steps to help assess whether an effect found in a published study is likely to be useful in practice. These six steps are expanded on in a broader working paper that is available here.

A Six-Step Guide to Using Behavioral Science Research

1. Read the title and abstract of a paper until you are confident that you know what question the paper addresses.

Academic researchers have an incentive to make a specific contribution to knowledge. Often the questions they ask will explore an effect on one specific outcome, ignoring others that are relevant. In some cases, researchers will be more interested in proving a theory than in its real-world applicability. Make sure that you have a precise understanding of which specific intervention caused a change in which specific outcome.

2. Think for yourself – what would my ideal experiment testing this intervention look like?

Typically, you would like to see an experiment that observes the real-world behavior of your real-world stakeholders. Published studies, however, typically use methods that are convenient for demonstrating proof of concept, e.g., student samples or hypothetical surveys.

3. Compare your ideal experiment with the methods employed in the paper. What are the differences in Context, Outcome, Sample, and Treatment (COST)?

Context

Context matters. A striking example is a study in which the authors manipulated the size of forks and estimated the effect on how much people ate. In the laboratory, smaller forks caused less food to be consumed than larger forks. In a field experiment at a restaurant, where the subjects were unsuspecting diners, the effect not merely failed to replicate but actually reversed: diners given smaller forks consumed significantly more than those given larger forks. The authors explain the reversal as deriving from differences in attention between a lab setting and a real-world dining experience. The general point is that a directional result may reliably obtain under specific conditions yet be worse than useless for predicting real-world behavior.

Outcome

In the real world, you will likely care about various outcomes, e.g., sales volumes, customer satisfaction, etc. Be aware that a positive result on one specific outcome is compatible with a negative result on other valued outcomes. For instance, in its early days, the behavioral literature celebrated the power of defaults and social norms. Since then, we have come to recognize that people sometimes avoid situations where they expect to be nudged. An intervention that causes your customers to increase the behavior you desire may have the negative side effect of costing you customers.

Sample

The effect of an intervention is the change in an outcome induced by that intervention. The sample selected for study will influence the magnitude of that change. In the full paper that this post summarizes, I review a study that tests whether computer use harms or helps student performance. The sample in that study was students at the United States Military Academy at West Point. We might consider that elite army personnel are especially self-disciplined relative to the wider population. To the extent that this is true, we might expect that the average West Point student had characteristics that buffered them against distraction from computers. In short, a negative effect found in a West Point sample might be even more negative in a sample drawn from the general population.

Treatment

The treatment is the specific intervention that was applied in the study. In the study on computer use at West Point, there were two treatment groups. One group was allowed to use electronic devices with their screens obscured from the teacher. The other was only allowed to use tablet computers lying flat on the desk. As a teacher, I find these two treatments useful because I could apply either of them in my classroom. That said, there is a third treatment that was not applied but that I would be curious to test – I could forewarn students that computer use in classrooms has been shown to reduce grades and then see whether this information helps students make profitable use of computers. The big-picture point here is that it might make sense for you to tweak a published intervention.

4. Read the results, discussion, and conclusions of the paper.

The reported results will be more sensibly interpreted once you have reviewed the methods section in detail, applying the COST mnemonic above.

5. In what direction would the differences you identified in Step 3 push the treatment effect relative to the results of your ideal experiment?

Step 5 is the key point for extrapolating from the study to your real-world context: is there anything about the authors’ methods that would cause you to question whether this intervention would deliver similar results when applied to your business? If so, do these methodological differences lead you to think the results would be more positive or more negative?

6. Ask yourself – would I act on the conclusion drawn from these data?

Steps 1-5 consider the question: is the intervention likely to deliver a desirable change in a specific dependent variable? Step 6 considers whether that effect is desirable enough to warrant implementing the intervention. Things to consider when answering this question include:

- Are the results robust enough that you believe them?
- Are the sample sizes large enough to give reliable results (e.g., more than 100 individuals)?
- Are the expected gains large enough to be worth the financial and reputational costs of implementing the treatment? A 0.5% gain in grades may be robustly statistically significant but may not be worth intervening for.
- Is the proposed intervention ethical?
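
To make the sample-size and cost-benefit questions concrete, here is a minimal back-of-envelope sketch in Python. Every number in it is a hypothetical assumption chosen for illustration, not a figure from any study: it checks (a) roughly how many participants per arm are needed to detect a standardized effect of a given size, and (b) whether an expected gain covers the cost of implementing the intervention.

```python
# Back-of-envelope checks for Step 6. All numbers below are
# hypothetical assumptions for illustration, not taken from any study.
from statistics import NormalDist

def n_per_arm(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate sample size per arm for a two-sample comparison,
    using the standard normal approximation."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided test
    z_power = z(power)
    return 2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2

# (a) Reliability: a "small" standardized effect (d = 0.2) needs roughly
# 400 people per arm, so a significant result from 50 people per arm
# deserves extra scrutiny.
print(f"n per arm to detect d = 0.2: {n_per_arm(0.2):.0f}")

# (b) Value: a hypothetical 0.5% gain on an outcome worth 200 per person,
# rolled out to 10,000 people, against a 15,000 implementation cost.
gain_per_person = 0.005 * 200
expected_benefit = gain_per_person * 10_000
implementation_cost = 15_000
print(f"Expected benefit: {expected_benefit:.0f} vs. cost: {implementation_cost}")
```

On these made-up numbers the expected benefit (10,000) falls short of the implementation cost (15,000), illustrating how a statistically significant effect can still fail the Step 6 test.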

Readers of this website will already know that governments, firms, and charities are increasingly looking to behavioral science to help guide their policies and tactics. This is a heartening development – there are undoubtedly win-wins to be enjoyed by leveraging behavioral insights. This guide should help practitioners choose the winning win-wins.

 

This article was edited by Carina Müller.

 

David Comerford
David Comerford researches judgment and decision making, particularly in relation to forecasting and planning. He has published in a wide range of outlets, from the New England Journal of Medicine to Economica. He is a professor of economics and behavioural science at the University of Stirling.