Musings on research, international development and other stuff

I disagree that I disagree!


Fight a straw man anyone?
flikrstream: RobinEllisActor

I have been meaning for some time to write a response to Andries du Toit’s paper Making Sense of Evidence: Notes on the Discursive Politics of Research and Pro-Poor Policy Making and this blog written by Enrique Mendizabal in support of the paper’s conclusions.

My initial reaction to both the paper and the blog is to agree with many of the underlying concerns presented but to protest that both authors are weakening their case by presenting the other side of the argument as far more extreme than it actually is. They are reacting against a caricatured ‘randomista’ – an unthinking automaton who slavishly demands experimental evidence (and only experimental evidence) before making any decision. To be honest, I would also react against such a person – but I am not convinced that such people actually exist!

I work for the UK Department for International Development in the Evidence into Action team. This team’s mission is to promote and support use of quality research evidence in decision-making both within DFID and more widely. Readers of du Toit’s and Mendizabal’s articles might imagine that this team is full of ‘randomistas’ – but this couldn’t be further from the truth. The team consists of 20 people with a wide variety of experience and expertise. We have social scientists, basic scientists, international development experts, evaluators and economists. We have people with experience in other government departments, academia, grass-roots charities, activist groups and think tanks. Almost every day within the team, I hear intelligent and nuanced discussions about research evidence and how it contributes to policy and practice.

There is general consensus that for some questions (e.g. whether X intervention leads to Y outcome in environment Z) experimental or quasi-experimental approaches offer the strongest evidence, and that for other questions (e.g. whether X intervention is culturally or politically acceptable, why X leads to Y, whether another environment is similar to Z etc.) alternative approaches would be more useful. However, beyond those general statements, I would not say that we have a universal preference for one form of evidence or another. It completely depends on what question needs to be answered.

For example, this afternoon within DFID, there was a seminar about male circumcision in Africa and its role in reducing HIV transmission. The reading for the seminar included a systematic review which looks at the efficacy of circumcision as a preventative measure. But there were also papers examining the social and ethical issues around the practice.

Sometimes, we might push for more experimental (or quasi-experimental) approaches. For example, I recently looked over a policy brief which discussed the outcomes of a particular type of governance intervention. I felt that this brief was relying too much on case study based evidence to support a certain hypothesis. My suggestion was that they needed to either summarise more rigorous evidence or, if this evidence does not exist, they need to discuss why this is the case and explain the limitations of the evidence which is presented.

However, on other occasions, the team might suggest that more qualitative research is needed. For example, a colleague was at a meeting recently where the results of a randomised controlled trial to increase vaccine uptake by providing food handouts were discussed. He felt that the discussion was focussing too much on a single question (does the intervention ‘work’ in this setting?) and his advice was that there was a need to consider other questions (e.g. what does the intervention mean for people’s dignity? how ethical is the intervention? is the intervention socially acceptable?) and that to answer these questions different forms of evidence would be required.

I am not in any way suggesting that DFID gets the balance right in every case; of course we make mistakes. But it is not true to suggest that the team in which I work, or DFID as a whole, is dominated by people who only value one form of research evidence as the basis for decisions.

Unfortunately I am not able to go to the PLAAS conference in November at which these issues will be discussed – but I would like to make a plea to those who are attending. Please try not to get too bogged down discussing your perceptions of policy makers’ prejudices. I suspect that the discussions will be more useful if they focus on concrete examples of policy decisions which contributors feel were overly influenced by one type of research evidence. By analysing these examples, it may be possible to come up with some genuinely useful case studies and some recommendations on how policy makers can consider a range of factors (including citizens’ voice, local cultures, power dynamics etc. as well as research evidence) as they make their decisions.

17 thoughts on “I disagree that I disagree!”

  1. Just in case people do not read my blog or Du Toit’s article… I did not mention DFID’s Evidence into Action team. But, still, you have to accept that the name itself says a lot…. I am glad to hear that DFID has interesting and nuanced discussions about their interventions. Shame they are not public, though. That would certainly be a good thing.

  2. Thanks Enrique – and yes, sorry I should have made it clear that neither article specifically referenced DFID.
    On making the discussions more public, this is a good point. I have to admit that before joining DFID I also suspected they were rather technocratic and did not expect discussions of evidence to be so nuanced (and I promise that I have not been brainwashed since joining!!). I think that the fact that people have such misconceptions suggests that DFID is not doing a good job at getting their view out there. I hope that that will change in the future…

  3. It seems like you are suggesting that a mix of methods is needed to understand the range of questions that is being asked about a given intervention – makes sense. But it also seems like you are saying that for any given intervention, experimental evidence probably forms part of the package. Is that fair? Are there circumstances where you would say experimental evidence is not an essential part of the mix of methods you need in order to understand whether an intervention is worthwhile?

  4. Hi Kirsty, it’s always valuable to get first hand insights into how DFID works – or for that matter insights into any organisation with such a degree of influence. From the outside even some of the most mundane details of how things are done day to day can be quite illuminating. A couple of things occur to me – but in relation to any agency (I don’t know enough about DFID’s internal structures and daily interactions to comment on that particular case) – and I’m no expert in this either, so this is probably a bit woolly…

    The first would be whether what you describe for the evidence team is also true of policy teams – or is there a danger of having pretty nuanced, ‘evidence literate’ evidence teams while policy making carries on as before, either because of the knowledge and skills in particular teams (and their ability to handle different types of evidence), or just because of the realities of being a Whitehall-facing department where policy is determined by a whole host of factors which may have little to do with the evidence base?

    Secondly, what would be really interesting to know more about (again in any agency) is where the expertise on research and evidence comes together with the expertise on political economy and governance. Political economy and power analyses have gained a fair bit of attention in recent years and from what I know there have been lots of attempts to thread these kinds of approaches into daily practice. If the same is beginning to happen with ‘evidence’ approaches, how are the two being combined (if they are), or how might this be done best?

  5. Interesting points. I personally would have thought that you always try to triangulate, i.e. add as many angles/evidence to the discussion as feasible under the circumstances. How much evidence is enough, then, is a normative question, which can be answered coherently (although of course not necessarily with full consensus). I would also be very curious about “genuinely useful case studies”.

  6. Interesting post and comments. Like Jon and Enrique said, good to hear of the nuanced and multi-disciplinary approach at DFID and indeed any organisation with that much influence.

    The question of combining political analysis with evidence as suggested by Jon is an interesting one and I think one that will go some way to addressing Matt Greenall’s question: “Are there circumstances where you would say experimental evidence is not an essential part of the mix of methods you need in order to understand whether an intervention is worthwhile?”

    Experimental and quasi-experimental approaches go some way to address the ‘X-Y-Z’ scenarios that Kirsty discusses but political economy and power analyses may better inform if an intervention is ‘worthwhile’. I am of course making a distinction between something working in an empirical sense and an intervention being worthwhile (which may include not working to its fullest potential).

    Are there examples out there of organisations, programmes or projects combining the two approaches?

  7. Kirsty, you betray your biases unconsciously. Basically, from the above it is clear that you consign non-RCT evidence to discussing issues like ethics and ‘social acceptability’. What that tells me, for a start, is that, and with all respect, you clearly don’t understand the debates going on in econometrics about whether we need randomisation at all to have ‘rigorous evidence’. It sounds like you have a diverse research team who in the end still have to validate whatever positions they develop with an RCT, which in the end suggests that you still see RCT-based evidence as superior to other forms, contradicting your protestations to the contrary. I would welcome correction if my perception is mistaken.

    • Hi Anecon, you mention debates in econometrics about randomisation and rigorous evidence which you suggest Kirsty is unaware of. Could you explain what you are referring to? I also think that some of the points you’re making suggest that you haven’t really understood the essence of the article. For example you still seem convinced that the author believes RCT evidence is intrinsically superior to other forms of evidence but I think in her article she clearly states that the relevant form of evidence depends on the question.

      • Thanks Stephen. I have understood the essence of the article, I’m just sceptical because there has been some deliberate obfuscation around these issues recently (see my reference to Rachel Glennerster below).

        As for the econometrics debate: structural econometricians argue that there are some questions that are better addressed by structural models combined with observational data than RCTs, simply because RCTs often answer such a narrow version of the question of interest that the evidence is of questionable usefulness beyond the trial itself. So an academic economist might get themselves an easy American Economic Review article (where most of the work has been done by research peons anyway), but they have not produced anything that is necessarily useful to policymakers. There are three journal special issues (Journal of Economic Perspectives, Journal of Economic Literature and Journal of Econometrics) from 2010 dealing with some of these disagreements – see in particular the contributions by Michael Keane, James Heckman and Angus Deaton. Martin Ravallion has also written about this, and James Heckman has done a huge amount of theoretical work with coauthors. The simple message is: RCTs are heavily over-sold. (Which is not to say, of course, that the structural modellers are producing work that is better).

  8. Thanks for all the comments. Here are a few responses:

    Matt, that is a good question. I am tempted never to say never; I suspect that if we thought about it carefully we could maybe come up with a policy scenario where only experimental or only qualitative research would be useful… but yes, I would guess that for most policy decisions a mixture of evidence gathered using different approaches and answering different questions would be best.

    Alex, in response to your question about combining different types of evidence, yes I think this happens all the time. For example, ‘The Wisdom of Whores’ by Elizabeth Pisani gives a great overview of how different forms of evidence fed into HIV policy in various contexts (and it’s also a bloomin good read!).

    Jon, your question about policy teams is spot-on; for evidence-informed policy making those who make policy/practice decisions need to understand the evidence. I think this is at the root of the work that Alex is continuing at INASP and I think it will continue to be an area of development within DFID; currently policy staff are offered some Critical Appraisal Skills training and there are discussions about whether/how to extend this.

    Anecon, like Stephen, I get the impression that you have not totally understood (or perhaps you just don’t believe me!) when I say that I don’t think that RCTs are superior to all other forms of evidence. There are many questions which cannot be answered by RCTs so it would be absurd to say they are the best form of evidence in all cases. I am also surprised that you think I have ‘consigned’ non-RCT evidence to being about ethics and social acceptability: a. because the word ‘consign’ suggests you think these are not particularly important issues (I disagree) and b. because at no point in the argument do I state that these are the only things that non-RCTs can tell you.

    Thanks again for all the comments!

  9. Thanks Kirsty. Issues of ethical and social acceptability are evidently important. When I use the word ‘consign’ what I am gesturing at is that you seem to have partitioned questions into two types, and seem to believe that only RCTs can answer the one type of question, namely that relating to evidence. I simply infer this from the examples you present. E.g. you don’t present an example relating to evidence itself, but rather examples about issues that are not inherently about evidence (ethics, social acceptability, etc).

    Even when you say “There are many questions which cannot be answered by RCTs so it would be absurd to say they are the best form of evidence in all cases”, I am not reassured. This statement is entirely consistent with a view within which RCTs are best if they are possible, and again that is a point many people disagree with. The plain question is this: do you accept that for some questions, *even if you can do some kind of RCT* that may not be the best way of addressing the issue of interest? I suspect not.

    The reason I am pushing the point is because many ‘randomistas’ these days are making statements that are rather inconsistent with past views and with the methodology they advocate. E.g. Rachel Glennerster did this on Development Drums some months back. The bogeyman is still with us, it’s just doing its best to take on other guises.

    • My answer to your plain question is… Yes!
      I’m sorry that you doubt that that is what I think since it suggests I have not managed to get my point across well in the blog.

      • Excellent, I consider myself reassured then. And I wouldn’t say the blog didn’t get its point across, I am just being picky about details (albeit unapologetically so). I think the majority of readers will have got the point you intended to make. Thanks for the replies.

  10. Du Toit’s article is a useful one, and eloquently describes some of the genuine challenges to drafting and implementing evidence-based policy. However, the paper leaves the sense that du Toit is himself rather uncertain about the way ahead: the policy implications of his research, if you like. On the one hand, he notes that “social scientists have to give up the claim that their work can provide a privileged and incontestable ground for policymaking, situated beyond the messiness of ‘guesswork and ideology.’” But on the other, he also argues that “it is important to defend the critical independence of academic research, and not to allow a situation in which the need for ‘user uptake’ can cause researchers to abandon their integrity and independence, so that ‘evidence-based policymaking’ starts turning into ‘policy-based evidence making’.” Too often, the article verges on the very negativity about the value of evidence in policy making that it says it eschews.

    I suspect that the challenge is somewhat similar to that of ‘underground’ music artists. So long as their work remains untainted by more popular cultural influences, they are assured of maintaining their integrity and credibility in the eyes (or ears) of a relatively limited audience or fan base: they are often destined to remain ‘undiscovered’ or unpopular. But frequently, artists are tempted by the greater commercial successes of ‘the mainstream’. They engage in the sights and sounds of popular culture, and in so doing, their music becomes more commercially successful. They are, in a sense, rather like the researcher who has begun to produce findings more easily ‘usable’ in the policy environment. Musicians and researchers alike are prone to ‘capture’.

    Given this challenge, (and speaking as a ‘knowledge intermediary’ or Evidence Broker sitting between DFID’s Policy Department and their Research & Evidence Department) I do not see it as problematic that the ‘evidence based policy’ discourse is “normative” and, in some ways, “aspirational.” Striving to shape policies such that they are ‘more’ evidence-based (even if we sometimes do not succeed) requires that we set ourselves a standard. In aiming for ‘evidence-based policy’, we may be able to achieve ‘evidence-informed policy’.

    With regard to ‘ends’ or ‘means’, I am typically more comfortable (as a civil servant) leaving the desired ‘ends’ to policy-makers, and attempting to focus my own efforts on what the evidence says about the ‘means’. In so doing, I would always recognise that RCTs, for example, are just one aspect of the diverse array of relevant evidence.

    Lastly, I note du Toit’s point that whilst evidence matters in policymaking “what matters is not the careful and meticulous description of one particular aspect of reality, but rather using evidence rhetorically to buttress arguments.” This is certainly true, in the same way as evidence is used by lawyers in a courtroom to buttress their arguments. We trust, nevertheless, that in a well-managed courtroom, occupied by a professional legal cadre, overseen by a legitimate judiciary, evidence is used more or less responsibly. The ‘risk’ of manipulation of evidence (in policy, no less than in the courtroom) is no reason not to strive for its greater usage.

  11. Pingback: Research uptake: what is it and can it be measured? | on think tanks

  12. Pingback: Chapter 2…in which kirstyevidence meets a randomista! | kirstyevidence

  13. Pingback: Experimental methodologies… and baby pandas | kirstyevidence
