Via Marc Bellemare, an interesting paper (PDF) by Bold et al. on “scaling up” randomized controlled trials, with an application to educational reform in Kenya. The general problem here is that RCTs are a great way to identify the effects of particular interventions on particular outcomes, but armed with a finding that, say, “exposure to a contract teacher in government schools in Western Kenya raises test scores by 0.21 standard deviations relative to being taught by civil service teachers,” we often have a difficult time “scaling up” this finding to the national level. The problems are both practical (how do we get the entire country to do something?) and conceptual (should we expect the Western Kenya findings to apply across Kenya? To say nothing of Uganda, India, Bolivia, etc.). We call the latter a question about external validity.
Marc describes another problem, which he labels implementation bias: “with only one implementing partner, it is impossible to tell whether the success or failure of any intervention is due to the perception people in the treatment group have of the implementing partner.” I have no disagreement that that’s an issue, and Bold et al.’s paper is a great way to explore it, but I never thought about calling that implementation bias. To me, the term implementation bias conjures up two other problems.
- An intervention may only be possible in the very places where it’s most (or perhaps least) likely to be successful. If you want to generalize about the effects of a public accountability intervention in Yogyakarta, Indonesia, you ought to wonder whether that intervention could have been implemented in Solo in the first place. This is related to the “OMFG exogenous variation” syndrome: we seize on findings that we can identify using the techniques we prefer, and ignore the ones that we cannot. The danger is that there is some feature of Yogya that allows public accountability to be manipulated and that is also related to the outcome we want to measure (see the sketch after this list).
- We may only implement things that are relatively uncontroversial to implement. Take the classic paper (PDF) by Miguel and Kremer on deworming. The authors find (once again in Kenya) that where “school-based mass treatment with deworming drugs was randomly phased into schools,” there are large externalities to deworming: the benefits spill over to children who did not receive the treatment. This is a great finding and a great paper, and we can think of its findings as evidence that public health interventions are investments in human capital, but I wonder how many minds it changed. Now take something far more expensive and far more controversial, like forced sterilization of people with low IQs (which I of course do not support): could we ever know whether that would be a cost-effective investment in human capital? Probably not, because we can’t implement it. Between the two extremes of deworming and forced sterilization lie lots of public health interventions that are more or less controversial but that may or may not be good investments. We might only implement the ones that seem most likely to yield good outcomes.
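To see why this kind of selection matters for scaling up, here is a minimal simulation sketch (mine, not from Bold et al. or from Marc; the site characteristic, the effect sizes, and the feasibility cutoff are all made-up assumptions). The point it illustrates: if the places where an intervention can actually be fielded are systematically the places where it works best, then even a perfectly run RCT in those places misstates the population-average effect of a nationwide scale-up.

```python
# Minimal sketch of implementation-selection bias (illustrative numbers only).
import numpy as np

rng = np.random.default_rng(0)
n_sites = 100_000

# Unobserved site characteristic, e.g. local institutional quality.
quality = rng.normal(size=n_sites)

# Assume the true effect of the intervention is larger in higher-quality sites.
true_effect = 0.10 + 0.15 * quality

# Assume the intervention is only feasible to implement in higher-quality sites.
feasible = quality > 0.5

print(f"Population-average effect:             {true_effect.mean():.3f}")
print(f"Average effect in implementable sites: {true_effect[feasible].mean():.3f}")

# The second number is roughly 0.27 versus roughly 0.10 for the population:
# an RCT run only where implementation was possible can be internally valid
# and still overstate what a nationwide scale-up would deliver.
```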
Are all of these things implementation bias? Or are there other proper names for them? I don’t know, and I suppose that it doesn’t matter substantively, but as an interested observer I’d like to know.