kirstyevidence

Musings on research, international development and other stuff

Fighting the RCT bogeyman


Sometimes I have the feeling that development experts want to paint the debate about randomised controlled trials (RCTs) as more polarised than it really is. They seem to think they are fighting against a maniacal, narrow-minded, statistics-obsessed pro-RCT bogeyman who they fear is about to take over all development funding. It may seem that they are committing the classic straw-man fallacy of painting the opposing argument as ludicrous in order to strengthen their own – but having had many discussions on this topic, I believe that many people really do believe in this bogeyman. So, here is my attempt to kill him!

I think RCTs can give some important information on some questions which can help us to make some decisions in the international development sector. BUT this does not mean that I think that RCTs are the only form of evidence worth considering or that they are the best method in all cases – and I am not sure that anyone does think this.

I get frustrated that every time I mention something about RCTs, people seem to respond by arguing with things that I have not actually said! For example, I often get people telling me that RCTs don’t necessarily tell us about individual responses (I agree), that many interventions cannot practically be evaluated by RCTs (I agree), that the results of RCTs may not be transferable to different contexts (I agree), that policy decisions are made on factors other than just evidence of ‘what works’ (I agree), that scientists often get things wrong (I agree – in fact I even have a blog post on it!), etc. etc.

What bothers me is that by focussing on a debate that doesn’t in truth exist, we are missing the opportunity to have much more useful and interesting discussions. For example, I would love it if more people thought about more innovative ways of integrating RCTs (or quasi-experimental approaches) with qualitative research so that each arm adds value to the other. A valiant attempt is described here but, as you can see, the poor social scientists were frustrated by the ‘hard’ scientists’ unwillingness to change their protocol. I wonder whether qualitative research could be integrated more easily into adaptive trial designs, which allow the protocol to be changed as evidence is gathered?
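
(To make the adaptive-design idea concrete, here is a purely illustrative Python sketch – not drawn from the study linked above – of a toy two-arm trial with a pre-specified interim analysis that shifts allocation. All arm names and response rates are invented; in a real mixed-methods design, the interim point could equally be where qualitative findings feed back into the protocol.)

```python
import random

random.seed(1)

# Hypothetical response probabilities for two intervention arms (invented for illustration).
TRUE_RATES = {"arm_A": 0.30, "arm_B": 0.45}

def run_batch(arm, n):
    """Simulate n binary outcomes (e.g. 'adopted the practice') for one arm."""
    return sum(random.random() < TRUE_RATES[arm] for _ in range(n))

# Stage 1: equal allocation across arms.
n1 = 100
successes = {arm: run_batch(arm, n1) for arm in TRUE_RATES}

# Interim analysis: a pre-specified rule shifts stage-2 allocation towards the
# better-performing arm (this is also a natural point to bring in qualitative
# findings before adapting the protocol).
leader = max(successes, key=successes.get)
stage2_n = {arm: (150 if arm == leader else 50) for arm in TRUE_RATES}

# Stage 2: adapted allocation.
for arm, n2 in stage2_n.items():
    successes[arm] += run_batch(arm, n2)

for arm in TRUE_RATES:
    total = n1 + stage2_n[arm]
    print(f"{arm}: {successes[arm]}/{total} positive outcomes ({successes[arm] / total:.0%})")
```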

Another discussion that I would love to hear more on is how we measure impacts such as behaviour, attitudes and capacity in a more objective way (a discussion which is valid whether one is using an RCT approach or not). I feel that too much evaluation in international development relies on self-reporting of these things – for example, measuring whether people report an increase in capacity following a training event. When I worked at INASP we did some work to compare policy makers’ self-reported ability to use research with their actual ability as measured using a diagnostic test (sorry – this work is still ongoing but I hope it will be published eventually). Similarly, an excellent piece of work here compared researchers’ reported barriers to using journals with their actual skills in using them, while this report compared the work of the Ugandan parliament in relation to science and technology with the reported competence of MPs. In all cases there was little correlation. This makes me think that we need to be more creative in measuring actual impacts – rather than relying so much on self-reporting.
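
(As a minimal illustration of the kind of check I mean – with entirely made-up numbers, not the INASP data – one can simply correlate self-reported ability with scores on a diagnostic test and see how weakly the two track each other.)

```python
# Hypothetical data: self-rated ability to use research (1-5) versus marks on a
# diagnostic test (0-100). Both columns are invented for illustration only.
from scipy.stats import spearmanr

self_reported = [4, 5, 3, 4, 2, 5, 3, 4, 5, 2]
diagnostic    = [55, 40, 62, 48, 58, 45, 70, 50, 52, 65]

rho, p_value = spearmanr(self_reported, diagnostic)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.2f}")
# A rho close to zero (or negative) would mirror the finding above: how able
# people say they are and how they actually perform need not move together.
```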

These are just a couple of the interesting questions that I think we could be exploring – but I am sure if people could stop spending their time fighting the RCT bogeyman, they could come up with a lot more interesting, valid and important questions to discuss.

14 thoughts on “Fighting the RCT bogeyman”

  1. First of all, I have learnt what a bogeyman is! Secondly, I totally agree with you Kirsty – let’s discuss real issues, and not an imaginary war. When I carried out an impact evaluation of an educational programme in rural Nicaragua in 2011, I used Propensity Score Matching (PSM), a quasi-experimental approach – however, living in that context and carrying out focus groups (qualitative observation) beforehand got me thinking about the research questions I wanted to answer with my PSM, and helped me to understand and interpret the results with awareness of the context the data came from. Finally, I also agree that self-reported skills/knowledge are not a reliable indicator – looking for more objective proxies to measure them is a challenging but also interesting (almost fun!) task.

    Cheers
    Antonio

  2. Pingback: Measuring capacity… to play tiddly winks « kirstyevidence

  3. Pingback: Kirsty Newman: The interview - Research to Action

  4. Hi Kirsty – I fought this particular bogeyman 14 years ago and he’s still not dead! At the socio-economic methodologies programme we got fed up with the sterile debate about the gold standard for research, so we put together a team of statisticians from Reading with a team of participatory gurus from NRI and asked them: how can you combine quantitative and qualitative evidence? What can and can’t it tell you? How can and can’t you mix them together? The results are still available: see Marsland, Wilson, Abeyasekara and Kleih, 1998 – old, but still good. http://www.nrsp.org.uk/pdfs/bookshelf/BPG10_Marlsand%20et%20al_survey%20methods.pdf

  5. Yet again, and rather boringly, I agree with you on all of this. I try to get people to understand the benefits of doing mixed methods (qualitative supplementing the quantitative), in addition to being able to change your direction as you go along. Luckily (or not) my area doesn’t really do a lot of RCTs. Well, none, actually. But they want to. I wonder whether some of this has come from the Cabinet Office / Goldacre paper. Its strength (being a simple intro) was also its weakness, and left it open for people to criticise what wasn’t there – the discussion of the complexities of using RCTs for policy formation/evaluation, the problems associated with reductionism when applied to complex problems, or even whether quantitative methods can and should be used over and above qualitative inquiries.

    My last comment, and slightly tenuous, is that we published on the use of surveys in Afghanistan. We ended up stopping a specific one due to methodological issues, and a feeling that we were getting much better data (and engagement) from the Shuras. See http://www.rusi.org/publications/journal/ref:A49AD5EF900195/#.UGnyNqN428E

    • “When we talk of hard evidence, we will therefore have in mind evidence from a randomized experiment” (Banerjee, Making Aid Work, 2007)

      One problem with randomistas is that they are constantly shifting the goalposts, but the above quote should suffice to make the point. As should the burgeoning existence of JPaL, 3ie, IPA, etc, etc. Not sure what you want from me: a stone tablet with Banerjee and Duflo’s ten commandments with fingerprints? Go to any JPaL training workshop for policymakers.

      As for your example from DFID: ask yourself why qualitative data is okay for uncovering mechanisms but not causal effects. Banerjee just waves his hands frantically in the air at this kind of question and I doubt you will do any better.

      The broader point, somewhat patronising but has to be said, is that you are part policy advisor, part policy advocacy person; your job – as with too many others – is not to reflect on the merits of whatever tools are in vogue at the time but to sell them as best you can. Further discussion is therefore futile.

      • I recognise this is Kirsty’s blog, and so Kirsty, I apologise for responding to this, but I am intrigued by Anecnon’s comment. I will make several points:
        – I’ve not seen Banerjee’s paper that you quote (2007), but perhaps he meant ‘hard evidence for a given stage in inquiry’? This leads into my next point…
        – You appear to discriminate between ‘mechanisms’ and ‘causal effects’. I would therefore ask what are mechanisms, if not mechanisms of causation? Were you trying to say that qualitative research is good for exploring phenomena (and responses to given interventions into a complex system) and therefore identifying candidate causative mechanisms as part of hypothesis- and theory-building, whilst quantitative research methods (e.g., RCTs and meta-analyses of them) are good for inferentially testing them to aid generalisation?
        – I think that Kirsty’s example of a programme that uses both quantitative and qualitative research is great, and I would hope you would applaud these efforts. Qualitative research can help us to understand what MIGHT work for a given set of people, in a given set of circumstances, when the intervention is implemented in a given way. If a decision (e.g., follow-on funding) has greater evidential requirements, then that is a good base for an RCT to test the hypotheses the qualitative research helped to generate.

        Or am I wrong?

  6. Pingback: I disagree that I disagree! « kirstyevidence

  7. Sorry Kirsty, but the bogeyman does exist (try Banerjee, 2007). Just because you (perhaps) happen to hold a more nuanced view of the use of RCTs does not mean that the crude view does not exist, or that it is not in fact much more prevalent than the view you hold. I don’t see why this is so difficult to understand. The story RCT practitioners are selling to donors and governments is that RCTs are better forms of evidence than pretty much anything else. Of course, since some big-name economists like Deaton and Heckman waded in to the debate, the randomistas have had to become a lot more circumspect in their academic papers, but in the policy and funding realm the behaviour continues as before.

    • Sorry for this delayed response ‘Anecon’ – I just came across this comment and wanted to pick up a couple of points. First, I am delighted that you think that I (perhaps) have a more nuanced view than the bogeymen, but interested to hear that you are still very determined that these bogeymen exist and are indeed more prevalent than people who think like I do. The reason I find this ‘so difficult to understand’ is that I work within a major international development agency (DFID) and I have never met anyone who thinks that ‘RCTs are better forms of evidence than pretty much anything else’. As I state in this article and in the other one on which you have commented, people where I work would generally agree that RCTs can be an excellent way to demonstrate causality, but there are many other methods which will be used to answer other questions. For example, an RCT might answer whether intervention X causes outcome Y (in a given environment, compared to a suitable control), whereas qualitative data might help us to understand why the intervention works (or not). Indeed, just the other day in DFID I attended a fascinating seminar on a project which had carried out a systematic synthesis of a number of RCTs of a particular intervention – which basically demonstrated that the intervention had not worked – along with some excellent qualitative analysis to understand donors’ perceptions of causal pathways and to compare these with the actual experiences of beneficiaries. This was research funded by DFID which I expect will have big impacts on DFID policy (sorry I can’t give more details since it’s not yet been published).

      Having said that, just because I have never met these ‘bogeymen’ does not of course prove that they do not exist (although it does suggest they may not be that prevalent in my immediate work environment). I would actually be genuinely interested to find people who hold the views you are talking about – that RCTs are the be-all and end-all of evidence. You mention Banerjee (2007) – I believe in that paper he talks about how important RCTs are for demonstrating whether interventions work, but as far as I know he at no point specifically suggests that other forms of research are not important. I wonder if you could provide some specific quotes from papers – by Banerjee or others – demonstrating ‘bogeyman’-type views?

  8. Pingback: Fighting the RCT bogeyman | Millennium Development Goals and Research and Development | Scoop.it

  9. Dear Kirsty,
    the bogeyman is commonly known as the hierarchy of evidence, a concept from evidence-based medicine. Dominant opinion holds that the best evidence is meta-analysis of RCTs, then RCTs, and then a number of other types of evidence in descending order. The Wikipedia article is a bit short. I found this British Medical Journal article and the comments that followed highly enlightening: http://www.bmj.com/content/327/7429/1459. The hierarchy of evidence and the evidence-based medicine concept are disputed within medicine itself. You will notice that most RCTs that had an effect on policy were from types of somatic medicine with a direct cause-effect relationship. In anything complex, like auto-immune diseases or psychological illnesses, they hardly apply or become very expensive; the same goes for long-term prevention and for anything that depends strongly on people’s cooperation, like physiotherapy or eating habits.

    My understanding so far is that, outside medicine too, RCTs work best in what complexity theory calls the domain of the simple, less well in the complicated, and hardly ever in the complex (cf. Michael Patton’s books Developmental Evaluation and Utilization-Focused Evaluation).

    DFID has been great in supporting alternatives to RCTs and the Stern et al. study “Broadening the Range of Impact Evaluations”, but there are still lots of organisations out there that insist on RCTs for any intervention, and then either limit their interventions to issues that can be tested by RCTs or commission sub-standard studies. 3ie and J-LAB have done some really good studies, but in their PR work they make claims that cannot seriously be sustained. Lots of people struggle with the after-effects of such over-simplification. So the bogeyman might have left DFID, but he is still alive and kicking.

    Best Bernward

  10. Pingback: Experimental methodologies… and baby pandas | kirstyevidence

  11. Pingback: What do you disagree with on this flowchart? | tribsantos
