Musings on research, international development and other stuff


Impact via infiltration

Two blogs ago I linked to an article which I attributed to “the ever-sensible Michael Clemens”. Shortly afterwards @m_clem tweeted:

.@kirstyevidence compares @JustinSandefur to @jtimberlake 🔥—but me, I’m “sensible”.Some guys got it, some don’t.

I mention this in part because it made me chortle but mainly because it prompted me to look back at that Justin Sandefur interview. And on re-reading it, I was really struck by one of Sandefur’s models for how research relates to policy:

My third model of what research has to do with development policymaking is borderline cynical: let’s call it a vetting model.  The result of your narrow little research project rarely provides the answer to any actual policy question.  But research builds expertise, and the peer review publication process establishes the credibility of independent scientific experts in a given field.  And that — rather than specific research results — is often what policymakers are looking for, in development and elsewhere.  Someone who knows what they’re talking about, and is well versed in the literature, and whose credentials are beyond dispute, who can come in and provide expert advice.

Since that interview was published, I wrote a literature review for DFID which looked at the impact of research on development. And, having spent months of my life scouring the literature, I am more convinced than ever that the Sandefur/Timberlake effect (as it will henceforth be known) is one of the main ways in which investment in research leads to change.

This pathway can be seen clearly in the careers of successful researchers who become policy makers/advisors. For example, within DFID, the chief scientist and chief economist are respected researchers. But the significant impacts they have had on policy decisions within DFID must surely rival the impacts on society they have had via their academic outputs?

And the case may be even stronger if you also examine ‘failed scientists’ – like, for example, me! The UK Medical Research Council invested considerable amounts of funding to support my PhD studies and post-doc career. And I would summarise the societal impact of my research days as… pretty much zilch. I mean, my PhD research was never even published and my post-doc research was on a topic which was niche even within the field of protozoan parasite immunology.


Undercover nerds – surely the societal impact of all those current and former academics goes beyond their narrow research findings?

In other words, I wouldn’t have to be very influential within development to achieve more impact than I did in my academic career. My successful campaign while working at the Wellcome Trust to get the canteen to stock diet Irn Bru probably surpasses my scientific contributions to society! But more seriously, I do think that the knowledge of research approaches, the discipline of empirical thinking, the familiarity with academic culture – and, on occasion, the credibility of being a ‘Dr’ – have really helped me in my career. Therefore, any positive – or indeed negative – impact that I have had can partly be attributed to my scientific training.

Of course, just looking at isolated individual researchers won’t tell us whether, overall, investment in research leads to positive societal impact – and if so, whether the “S/T effect” (I’m pretty sure this is going to catch on so I have shortened it for ease) is the major route through which that impact is achieved. Someone needs to do some research on this if we are going to figure out if it really is a/the major way in which research impacts policy/practice.

But it’s interesting to note that other people have a similar hypothesis: Bastow, Tinkler and Dunleavy carried out a major analysis of the impact of social science in the UK* and their method for calculating the benefit to society of social science investments was to estimate the amount that society pays to employ individuals with post-grad social science degrees.** In other words they assumed that the major worth of all that investment in social science was not in its findings but in the generation of experts. I think the fact that the authors are experimenting with new methodologies to explain the value of research that go beyond the outdated linear model is fabulous.

But wait, you may be wondering, does any of this matter? Well yes, I think it does because a lot of time and energy are being put into the quest to measure the societal impact of research. And in many cases the impact is narrowly defined as the direct effect of research findings and/or research-derived technologies. The recent REF impact case studies did capture more diverse impacts including some that could be classified within the S/T™ effect. But I still get the impression that such indirect effects are seen as secondary and unimportant. The holy grail for research impact still seems to be linear, direct, instrumental impact on policy/practice/the economy – despite the fact that:

  1. This rarely happens
  2. Even when we think this is happening, there is a good chance that evidence is in fact just being used symbolically
  3. Incentivising academics to achieve direct impact with their research results can have unintended and dangerous results

Focussing attention on the indirect impact of trained researchers, not as an unimportant by-product but as a major route by which research can impact society, is surely an important way to get a more accurate understanding of the benefits (or lack thereof) of research funding.***

So, in summary, I think we can conclude that Justin Sandefur is quite a sensible bloke.

And, by the way, have any of you noticed how much Michael Clemens resembles George Clooney?


* I have linked to their open access paper on the study but I also recommend their very readable book which covers it in more detail along with loads of other interesting research – and some fab infographics.

** Just to be pedantic, I wonder if their methodology needs to be tweaked slightly – they have measured value as the cost of employing social science post-grad degree holders but surely those graduates have some residual worth beyond their research training? I would think that the real benefit would need to be measured as the excess that society was willing to pay for a social science post-grad degree holder compared to someone without..?

*** Incidentally, this is also my major reason for supporting research capacity building in the south – I think it is unrealistic to expect that building research capacity is going to yield returns via creation of new knowledge/technology – at least in the short term. But I do think that society benefits from having highly trained scientific thinkers who are able to adapt and use research knowledge and have influence on policy either by serving as policy makers themselves or by exerting evidence-informed influence.


Scottish independence and the fallacy of evidence-BASED policy

As I may have mentioned before, I am a proud Scot. I have therefore been following with interest the debates leading up to the Scottish referendum on independence which will take place on the 18th September (for BBC coverage see here or, more entertainingly, watch this fabulous independence megamix). Since I live in England, I don’t get to vote – and even if I did, as a serving civil servant it would not be appropriate for me to discuss my view here. But I do think the independence debate highlights some important messages about evidence and policy making – namely the fact that policy cannot be made BASED on evidence alone.

The main reason for this is that before you make a policy decision you need to decide what policy outcome you wish to achieve – and this decision will be influenced by a whole range of factors including your beliefs, your political views, your upbringing etc. etc. So in the case of the independence debate, as eloquently pointed out by @cairneypaul in this blog, the people of Scotland need to decide what their priorities for the future of Scotland will be. Some will feel that financial stability is the priority, others will focus on the future of the Trident nuclear deterrent, some will focus on their desire for policy decisions to be made locally, while others will care most about preservation of a historic union.

Only once people are aware of what their priorities are will evidence really come into play. In an ideal world there would then be a perfect evidence base which would provide an answer on which option (yes or no) would be most likely to lead to the desired policy outcome(s). But of course we all know that we don’t live in an ideal world, and so in the independence debate – as in most policy decisions – the evidence is contradictory, incomplete and contested. And therefore a second reason why a decision cannot be fully ‘evidence-based’ is that voters will need to assess the evidence, and a certain degree of subjectivity will inevitably come into this appraisal.

It is for the above reasons that I strongly prefer the term ‘evidence-informed’ to the term ‘evidence-based’*. Evidence-informed decision making IS possible – it involves decision makers consulting and appraising a range of evidence sources and using the information to inform their decision. As such, two policy makers may make completely different policy decisions which have both been fully informed by the evidence. Likewise, my decision to happily eat a large slice of chocolate cake instead of going to the gym can be completely evidence-informed since I get to choose which outcomes I am seeking :-).

A final point is that since evidence can inform policies designed to lead to diverse outcomes, evidence-informed policy making is not inevitably a ‘good thing’; if a policy maker has nefarious aims, she can use evidence to help her achieve these in the same way that a more altruistic policy maker can use evidence to benefit others. Thus efforts to support evidence-informed policy will only be beneficial when those making decisions are actually motivated to improve the lives of others.


*n.b. I am a big supporter of the ‘evidence-based policy in development’ network since I suspect the name choice is mainly historical rather than a statement of policy. In fact, judging by discussions via the listserv, I would suspect that most members prefer the term evidence-informed policy.



Implementation science: what is it and why should we care?


The 30 participants were mostly members of DFID’s Evidence into Action team plus a few people who follow me on twitter – admittedly not a very rigorous sampling strategy but a useful quick and dirty view!

Last week I attended a day-long symposium on ‘implementation science’ organised by FHI 360. I had been asked by the organisers to give a presentation, and it was only after agreeing that it occurred to me that I really had no idea what implementation science was. It turns out I was not alone – I did a quick survey of colleagues engaged in the evidence-informed policy world and discovered that the majority of them were also unfamiliar with the term (see pie chart). And even when I arrived at the conference full of experts in the field, the first couple of hours were devoted to discussions about what implementation science does and does not include.

To summarise some very in-depth discussions, it seems that there are basically two ways to understand the term.

The definitions that seem most sensible to me describe implementation science as the study of how evidence-informed interventions are put into practice (or not) in real world settings. These definitions indicate that implementation science can only be done after efficacy and effectiveness studies have demonstrated that the intervention can have a positive impact. As @bjweiner (one of the conference speakers) said, implementation science aims to discover ‘evidence-informed implementation strategies for evidence-informed interventions’.

A second category of definitions takes a much broader view of implementation science. These definitions include a wide variety of additional types of research including impact evaluations, behaviour change research and process evaluations within the category of implementation science. To be honest, I found this latter category of definitions rather unhelpful – they seemed to be so broad that almost anything could be labelled implementation science. So, I am going to choose to just go with the narrower understanding of the term.

Now I have to mention here that I thoroughly enjoyed the symposium and found implementation scientists to be a really fascinating group to talk with. And so, as a little gift back to them, and in recognition of the difficulties they are having in agreeing on a common definition, I have taken the liberty of creating a little pictorial definition of implementation science for them (below). I am sure they will be delighted with it and trust it will shortly become the new international standard ;-).
implementation science

So what else do you need to know about implementation science?

Well, it tends to be done in the health sector (although there are examples from other sectors) and it seems to focus on uptake by practitioners (i.e. health care providers) more than uptake by policy makers. In addition it is, almost by definition, quite ‘supply’-driven – i.e. it tends to focus on a particular evidence-informed intervention and then study how that can be implemented/scaled up. I am sure that this is often a very useful thing – however, I suspect that the dangers of supply-driven approaches that I have mentioned before will apply; in particular, there is a risk that the particular evidence-informed intervention chosen to be scaled up may not represent the best overall use of funds in a given context. It is also worth noting that promoting and studying the uptake of one intervention may not have long-term impacts on how capable and motivated policy makers/practitioners are to take up and use research in general.

A key take home message for me was that implementation science is ALL about context. One of my favourite talks was given by @pierrembarker who described a study of the scale-up of HIV prevention care in South Africa. At first the study was designed as a cluster randomised controlled trial; however, as the study progressed, the researchers realised that, for successful implementation, they would need to vary the approach to scale-up depending on the local level conditions, and thus an RCT, which would require standardised procedures across study sites, would not be practical. Luckily, the researchers (and the funders) were smart enough to recognise that a change of plan was needed and the researchers came up with a new approach which enabled them to tailor the intervention to differing contexts, and at the same time generate evidence on outcomes which was as robust as feasible. Another great talk was given by Theresa Hoke of @FHI360 who described two programmes to scale up interventions that almost completely failed (paper about one of them here). The great thing about the implementation science studies was that they were able to demonstrate clearly that the scale-up had failed and to generate important clues as to why this might be the case.

One final cool thing about implementation science is how multi-disciplinary it is; at the symposium I met clinicians, epidemiologists, qualitative social scientists and – perhaps most intriguingly – organisational psychologists. I was particularly interested in the latter because I think it would be really great if we could get some of these types involved in evaluating/investigating ‘demand-side’ evidence-informed policy work funded by organisations including DFID, (the department formerly known as) AusAID and AHPSR. These programmes are really all about driving organisational change, and it would be very useful to get an expert’s view on what approaches (if any!) can be taken by outside actors to catalyse and support this.

Anyway, sorry for such a long post but as you can tell I am really excited about my new discovery of implementation science! If you are too, I would strongly recommend checking out the (fully open access) Implementation Science Journal. I found the ‘most viewed’ articles a good place to start. You will also soon be able to check out the presentations from the symposium (including my talk in which I call for more unity between ‘evidence geeks’ like me and implementation scientists) here.


Unintended consequences: When research impact is bad for development

Development research donors are obsessed with achieving research impact and researchers themselves are feeling increasingly pressurised to prioritise communication and influence over academic quality.

To understand how we have arrived at this situation, let’s consider a little story…

Let’s imagine around 20 years ago an advisor in an (entirely hypothetical) international development agency. He is feeling rather depressed – and the reason for this is that despite the massive amount of money that they are putting into international development efforts, it still feels like a Sisyphean task. He is well aware that poverty and suffering are rife in the world and he wonders what on earth to do. Luckily this advisor is sensible and realises that what is needed is some research to understand better the contexts in which they are working and to find out what works.

Fast-forward 10 or so years and the advisor is not much happier. The problem is that lots of money has been invested in research but it seems to just remain on the shelf and isn’t making a significant impact on development. And observing this, the advisor decides that we need to get better at promoting and pushing out the research findings. Thus (more or less!) was born a veritable industry of research communication and impact. Knowledge-sharing portals were established, researchers were encouraged to get out there and meet with decision makers to ensure their findings were taken into consideration, a thousand toolkits on research communications were developed and a flurry of research activity researching ‘research communication’ was initiated.

But what might be the unintended consequences of this shift in priorities? I would like to outline three case studies which demonstrate why the push for research impact is not always good for development.

First let’s look at a few research papers seeking to answer an important question in development: does decentralisation improve provision of public services? If you were to look at this paper, or this one or even this one, you might draw the conclusion that decentralisation is a bad thing. And if the authors of those papers had been incentivised to achieve impact, they might have gone out to policy makers and lobbied them not to consider decentralisation. However, a rigorous review of the literature which considered the body of evidence found that, on average, high quality research studies on decentralisation demonstrate that it is good for service provision. A similar situation can be found for interventions such as microfinance or Community Driven Development – lots of relatively poor quality studies saying they are good, but high quality evidence synthesis demonstrating that overall they don’t fulfil their promise.

My second example comes from a programme I was involved in a few years ago which aimed to bring researchers and policy makers together. Such schemes are very popular with donors since they appear to be a tangible way to facilitate research communication to policy makers. An evaluation of this scheme was carried out and one of the ‘impacts’ it reported on was that one policy maker had pledged to increase funding in the research institute of one of the researchers involved in the scheme. Now this may have been a good impact for the researcher in question – but I would need to be convinced that investment in that particular research institution happened to be the best way for that policy maker to contribute to development.

My final example is on a larger scale. Researchers played a big role in advocating for increased access to anti-HIV drugs, particularly in Africa. The outcome of this is that millions more people now have access to those drugs, and on the surface of it that seems to be a wholly wonderful thing. But there is an opportunity cost in investment in any health intervention – and some have argued that more benefit could be achieved for the public if funds in some countries were rebalanced towards other health problems. They argue that people are dying from cheaply preventable diseases because so much funding has been diverted to HIV. It is for this reason we have NICE in the UK to evaluate the cost-effectiveness of new treatments.

What these cases have in common is that in each case I feel it would be preferable for decision makers to consider the full body of evidence rather than being influenced by one research paper, researcher or research movement. Of course I recognise that this is a highly complicated situation. I have chosen three cases to make a point but there will be many more cases where researchers have influenced policy on the basis of single research studies and achieved completely positive impacts. I can also understand that a real worry for people who have just spent years trying to encourage researchers to communicate better is that the issues I outline here could cause people to give up on all their efforts and go back to their cloistered academic existence. And in any case, even if pushing for impact were always a bad thing, publicly funded donors would still need to have some way to demonstrate to tax payers that their investments in research were having positive effects.

So in the end, my advice is something of a compromise. Most importantly, I think researchers should make sure they are answering important questions, using the methods most suitable to the question. I would also encourage them to communicate their findings in the context of the body of research. Meanwhile, I would urge donors to continue to support research synthesis – to complement their investments in primary research. And to support policy making processes which include consideration of bodies of research.


Make love not war: bringing research rigour and context together

I’ve just spent a few days in Indonesia having meetings with some fascinating people discussing the role of think tanks in supporting evidence-informed policy. It was quite a privilege to spend time with people who had such deep and nuanced understanding of the ‘knowledge sectors’ in different parts of the world (and if you are interested in learning more, I would strongly recommend you check out some of their blogs here, here and here).

However, one point of particular interest within the formal meetings was that research quality/rigour often seemed to be framed in opposition to considerations of relevance and context. I was therefore interested to see that Lant Pritchett has also just written a blog with essentially the same theme – making the point that research rigour is less important than contextual relevance.

I found this surprising – not because I think context is unimportant – but because I do not see why the argument needs to be dichotomous. Research quality and research relevance are two important issues and the fact that some research is not contextually relevant does not in any way negate the fact that some research is not good quality.

How not to move a discussion forward

To illustrate this, let’s consider a matrix comparing quality with relevance.

| | Low quality | High quality |
| --- | --- | --- |
| Low contextual understanding | The stuff which I think we can all agree is pointless | Rigorous research which is actually looking at irrelevant/inappropriate questions due to poor understanding of context |
| High contextual understanding | Research which is based on deep understanding of context but which is prone to bias due to poor methodology | The good stuff! Research which is informed by good contextual understanding and which uses high quality methods to investigate relevant questions. |

Let me give some examples from each of these categories:

Low quality, low contextual understanding

I am loath to give any examples for this box since it will just offend people – but I would include in this category any research which involves a researcher with little or no understanding of the context ‘parachuting in’ and then passing off their opinions as credible research.

High quality, low contextual understanding

An example of this is here – a research study on microbicides to prevent the transmission of HIV which was carried out in Zambia. This research used an experimental methodology – the most rigorous approach one can use when seeking to prove causal linkages. However, the qualitative research strand which was run alongside the trial demonstrated that due to poor understanding of sexual behaviours in the context they were working in, the experimental data were flawed.

Low quality, high contextual understanding

An example of this is research to understand the links between investment in research and the quality of university education which relies on interviews and case studies with academics. These academics have very high understanding of the context of the university sector and you can therefore see why people would choose to ask them these questions. However, repeated studies show that academics almost universally believe that investment in research is crucial to drive up the quality of education within universities, while repeated rigorous empirical studies reveal that the relationship between research and education quality is actually zero.

High quality, high contextual understanding

An example here could be this set of four studies of African policy debates. The author spent extended periods of time in each location and made every effort to understand the context – but she also used high quality qualitative research methods to gather her data. Another example could be the CDD paper I have blogged about before where an in-depth qualitative approach to understand context was combined with a synthesis of high-quality experimental research evidence. Or the research described in this case study – an evaluation carried out in Bolivia which demonstrates how deep contextual understanding and research rigour can be combined to achieve impact.

Some organisations will be really strong on relevance but be producing material which is weak methodologically and therefore prone to bias. This is dangerous since – as described above – poor quality research may well give answers – but they may be entirely the wrong answers to the questions posed. Other organisations will be producing stuff which is highly rigorous but completely irrelevant. Again, this is at best pointless and at worst dangerous if decision makers do not recognise that it is irrelevant to the questions they are grappling with.

In fact, the funny thing is that when deciding whether to concentrate more on improving research relevance or research quality… context matters! The problem of poor quality and the problem of low contextual relevance both occur and both reduce the usefulness of the research produced – and arguing about which one is on average more damaging is not going to help improve that situation.

One final point that struck me from reading the Pritchett blog is that he appears to have a fear that a piece of evidence which shows that something works in one context will be mindlessly used to make the argument that the same intervention should be used in another. In other words, there is a concern that rigorous evidence will be used to back up normative policy advice. If evidence were to be used in that way, I would also be afraid of it – but that is fundamentally not what I consider to be evidence-informed policy making. In fact, I disagree that any research evidence ever tells anyone what they should do. Thus, I agree with Pritchett that evidence of the positive impact of low class sizes in Israel does not provide the argument that class sizes should be lowered in Kenya. But I would also suggest that such evidence does not necessarily mean that policy makers in Israel should lower class sizes. This evidence provides some information which policy makers in either context may wish to consider – hence evidence-informed policy making. The Israeli politicians may come to the conclusion that the evidence of the benefit of low class sizes is relatively strong in their context. However, they may well make a decision not to lower class sizes due to other factors – for example finances. I would still consider this decision to be evidence-informed. Conversely, the policy makers in Kenya may look at the Israeli evidence and conclude that it refers to a different context and that it may therefore not provide a useful prediction of what will happen in Kenya – however, they may decide that it is sufficient to demonstrate that in some contexts lower class sizes can improve outcomes and that that is sufficient evidence for them to take a decision to try the policy out.

In other words, political decisions are always based on multiple factors – evidence will only ever be one of them. And evidence from alternative contexts can still provide useful information – providing you don’t overinterpret that information and assume that something that works in one context will automatically transfer to another.


Should we be worried about policy makers’ use of evidence?

A couple of papers have come out this week on policy makers’ use of evidence.


Policy makers are apparently floating around in their own little bubbles – but should this be a cause for concern?

The first is a really interesting blog by Mark Chataway, a consultant who has spent recent months interviewing policy makers (thanks to @PrachiSrivas for sharing this with me). His conclusion after speaking to a large number of global health and development policy makers, is that most of them live in a very small bubble. They do not read widely and instead rely on information shared with them via twitter, blogs or email summaries.

The blog is a good read – and I look forward to reading the full report when it comes out – but I don’t find it particularly shocking and actually, I don’t find it particularly worrying.

No policymaker is going to be able to keep abreast of all the new research findings in his/her field of interest. Even those people who do read some of the excellent specialist sources mentioned in the article will only ever get a small sample of the new information that is being generated. In fact, trying to prospectively stay informed about all research findings of potential future relevance is an incredibly inefficient way to achieve evidence-informed decision-making. For me, a far more important question is whether decision makers access, understand and apply relevant research knowledge at the point at which an actual decision is being made.

Enter DFID’s first ever Evidence Survey – the results of which were published externally this week.

This survey (which I hear was carried out by a particularly attractive team of DFID staff) looked at a sample of staff across grades (from grade ‘B1d to SCS’ in case that means anything to you..) and across specialities.

So, should we be confident about DFID staff’s use of evidence?

Well, partly…

The good news is that DFID staff seem to value evidence really highly. In fact, as the author of the report gloats, there is even evidence that DFID values evidence more than the World Bank (although if you look closely you will see this is a bit unfair to our World Bank colleagues since the questions asked were slightly different).

And there was recognition that the process for getting new programmes approved does require staff to find and use evidence. The DFID business case requires staff to analyse the evidence base which underlies the ‘strategic need’ and the evidence which backs up different options for intervening. Guidance on how to assess evidence is provided. The business case is scrutinised by a chain of managers and eventually a government minister. Controversial or expensive (over £40m) business cases have an additional round of scrutiny from the internal Quality Assurance Unit.

Which is all great…

But one problem which is revealed by the Evidence Survey, and by recent internal reviews of DFID process, is that there is a tendency to forget about evidence once a programme is initiated. Anyone who has worked in development knows that we work in complex and changing environments and that there is usually not clear evidence of ‘what works’. For this reason it is vital that development organisations are able to continue to gather and reflect on emerging evidence and adapt to optimise along the way.

A number of people on Twitter have also picked up on the fact that a large proportion of DFID staff failed some of the technical questions – on research methodologies, statistics etc. Actually, this doesn’t worry me too much since most of the staff covered by the survey will never have any need to commission research or carry out primary analysis. What I think is more important is whether staff have access to the right levels of expertise at the times when they need it. There were some hints that staff would welcome more support and training so that they were better equipped to deal with evidence.

A final area for potential improvement would be on management prioritisation of evidence. Encouragingly, most staff felt that evidence had become more of a priority over recent years – but they also tended to think that they valued evidence more than their managers did – suggesting a continued need for managers to prioritise this.

So, DFID is doing well in some areas, but clearly has some areas it could improve on. The key for me will be to ensure there are processes, incentives and capacity to incorporate evidence at all key decision points in a programme cycle. From the results of the survey, it seems that a lot of progress has been made and I for one am excited to try to get even better.

Guest post on Pritchett Sandefur paper

Readers, I am delighted to introduce my first ever guest post. It is from my colleague Max – who can be found lurking on twitter as @maximegasteen – and it concerns the recent Pritchett/Sandefur paper. Enjoy! And do let us know your thoughts on the paper in the comments.

Take That Randomistas: You’re Totally Oversimplifying Things… (so f(x) = a_0 + ∑_{n=1}^∞ (a_n cos(nπx/L) + b_n sin(nπx/L))…)

The quest for internal validity can sometimes go too far…
(Find more fab evaluation cartoons on

Development folk are always talking about “what works”. It’s usually around a research proposal saying “there are no silver bullets in this complex area” and then a few paragraphs later ending with a strong call “but we need to know what works”. It’s an attractive and intuitive rhetorical device. I mean, who could be against finding out ‘what works’? Surely no-one* wants to invest in something that doesn’t work?

Of course, like all rhetorical devices, “what works” is an over-simplification. But a new paper by Lant Pritchett and Justin Sandefur, Context Matters for Size, argues that this rhetorical device is not just simplistic, but actually dangerous for sensible policy making in development. The crux of the argument is that the primacy of methods for neat attribution of impact in development research and donors’ giddy-eyed enthusiasm when an RCT is dangled in front of their eyes leads to some potentially bad decisions.

Pritchett and Sandefur highlight cases where, on the basis of some very rigorous but limited evidence, influential researchers have pushed hard for the global scale-up of ‘proven’ interventions. The problem with this is that while RCTs can have very strong internal validity (i.e. they are good at demonstrating that a given factor leads to a given outcome) their external validity (i.e. the extent to which their findings can be generalised) is oftentimes open to question. Extrapolating from one very different context, often at small scale, to another context can be very misleading. They go on to use several examples from education to show that estimates using less rigorous methods, but from the local context, are a better guide to the true impact of an intervention than a rigorous study from a different context.

All in all, a sensible argument. But that is kind of what bothers me. I feel like Pritchett and Sandefur have committed the opposite rhetorical sin to the “what works” brigade – making something more complicated than it needs to be. Sure, it’s helpful to counterbalance some of the (rather successful) self-promotion of the more hard-line randomistas’ favourite experiments, but I think this article swings too far in the opposite direction.

I think Pritchett and Sandefur do a slight disservice to people who support evidence-informed development (full disclosure: I am one of them) by thinking they would blindly apply the results of a beautiful study from across the world in the context in which they work. At the same time (and here I will enter into ‘doing a disservice to the people working in development’ territory) I would love to be fighting my colleagues on the frontline who are trying to ignore good quality evidence from the local context in favour of excellent quality evidence from elsewhere. But in my experience I’ve faced the opposite challenge, where people designing programmes are putting more emphasis on dreadful local evidence to make incredible claims about the potential effectiveness of their programme (“we asked 25 people after the project if they thought things were better and 77.56% said it had improved by 82.3%” – the consultants masquerading as researchers who wrote this know who they are).

My bottom line on the paper? It’s a good read from some of the best thinkers on development. But it’s a bit like watching a series of The Killing – lots of detail, a healthy dose of false leads/strawmen but afterwards you’re left feeling a little bit bewildered – did I have to go through all that to find out not to trust the creepy guy who works at the removal company/MIT?

Having said that, it’s useful to always be reminded that the important question isn’t “does it work (somewhere)” but “did it work over there and would it work over here”.  I’d love to claim credit for this phrase, but sadly someone wrote a whole (very good) book about it.

*With the possible exception of Lyle Lanley who convinced everyone with a fancy song and dance routine to build a useless monorail in the Simpsons

Improving on systematic reviews

A fast-track to blogging success in the development field is to pick a research approach (RCTs, econometrics, rigorous synthesis, qualitative research etc), ‘reveal’ that there are some drawbacks of said approach, and go on to conclude that research is bad (or at least highly suspect). Whenever I see such an article, it strikes me as a little akin to giving up on our judicial system on the basis that sometimes there are miscarriages of justice.

I mean, clearly, any method of gathering evidence to inform decisions has limitations. And of course the people making decisions will be informed by a whole lot of other factors. But these facts don’t make me want to give up on evidence entirely but rather inspire me to think about how we can reduce the drawbacks of research approaches and/or support people and processes so that evidence is routinely considered as one part of the decision-making process.

So, it was with happiness that I read this new paper from the ODI. The paper gives an overview of both ‘traditional’ literature reviews and systematic reviews and outlines some drawbacks of each. However, rather than taking the approach of declaring both useless, the authors go on to propose an intermediate approach which combines:

“..compliance with the broad systematic review principles (rigour, transparency, replicability) and flexibility to tailor the process towards improving the quality of the overall findings, particularly if time and budgets are constrained”.

What makes this paper particularly useful is that it sets out a clear process with 8 steps which potential authors can follow. They give plenty of detail of how each stage can be carried out and they include a wealth of useful tips (for example, I learnt about the concept of ‘forward-snowballing’ – who knew?). I think that many people find the idea of carrying out a rigorous review quite intimidating and will find this an invaluable guide. I also love the inclusion of a graphical representation of synthesised evidence – as I have mentioned before, I think we need to get more inventive at communicating bodies of evidence.

The authors don’t shy away from discussing the challenges of their proposed approach – with particular attention paid to the difficulties of assessing the ‘strength’ of evidence. I would tend to be slightly more positive about attempting some type of assessment of evidence strength – and I am not sympathetic to the argument that authors are unable to include method sections due to restrictive word count rules (have you seen how long some academic papers from the ODI are?!). Having said that, I do completely agree with the authors that this is the most challenging – and the most political – part of evidence synthesis and that there will always be a degree of subjectivity.

I did think the authors fell slightly into strawman territory when they list how their approach differs from SRs. A few of the differences do not really exist. For example, they mention that meta-analysis is not a useful way to synthesise data for many topics – which is true – but meta-analysis is by no means a necessary part of an SR. I would hazard a guess that most systematic reviews in development research do not use meta-analysis – see here and here for examples. They also imply that SRs do not include grey literature. This is definitely not true – any good SR should include a thorough search strategy which includes grey literature. See for example this guidance from the EPPI centre which states:

“In most approaches to systematic reviewing the aim is to produce a comprehensive and unbiased set of research relevant to the review question. Being comprehensive means that the search strategy attempts to uncover published and unpublished, easily accessible and harder to find reports of research studies.”

I do wonder whether these statements were true about some earlier SRs – for example, perhaps meta-analysis has been used inappropriately in the past, and I am sure that not all SRs (particularly when they were first introduced to the development research field) did a good job of capturing non-journal published material. This might explain the impressions reflected in the paper.

In any case, these are minor quibbles. Overall, I think it’s a good and useful paper, and I do hope that it will stimulate more people to think about how we can synthesise evidence in a way which is as objective as possible but is also practical.


Supply and demand in evidence-informed policy – this time with pictures!

I have talked before about supply and demand in evidence-informed policy but I decided to revisit the topic with some sophisticated visual aids. I am aware that using the model of supply/demand has been criticised as over-simplifying the topic – but I still think it is a useful way to think about the connections between research evidence and policy/practice (plus, to be honest, I am fairly simple!).

You can distinguish between supply and demand by considering ‘what is the starting point?’. If you are starting with the research (whether it’s a single piece of research or a body of research on a given topic) and considering how it may achieve policy influence, you are on the supply side…

In contrast, those on the demand side typically start with a decision (or a decision-making process) and consider how research can feed into this decision…

This distinction may seem obvious, but I think it is often missed. What this means in practice is an explosion of approaches to evidence-informed policy/practice which attempt to push more and more evidence out there in expectation that more supply will lead to a better world…


One problem with this is that if your supply approaches focus on just one research project – or one side of a debate – they risk going against evidence-informed policy…


*Science monster usually lives here – she is just visiting my blog today


Some supply approaches do aim to increase access to a range of research and to synthesise and communicate where the weight of evidence lies. However, even these approaches are destined to fail if there is not a corresponding increase in demand…


I think we should continue to support supply-side activities but I think we also need to get better at supporting the demand. So what would this look like in practice?

For me the two components of demand are the motivation (whether intrinsic or extrinsic) and the capacity (i.e. the knowledge, skills, attitudes, structures, systems etc) to use research. In other words, you need to want to use research and you need to be able to do so.

Motivation can be improved by enhancing the organisational culture of evidence use – but also by putting systems in place which mandate and/or reward evidence use…

Achieving this in practice needs the support of senior decision makers within a policy making institution. So for example the UK Department for International Development has transformed the incentives to use research evidence since Prof Chris Whitty came in as the Chief Scientific Advisor and Head of Research.

But incentives on their own are not enough. There also needs to be capacity and it needs to exist at multiple levels. At an organisational level, there needs to be structural capacity such as adequate internet bandwidth, access to relevant academic journals etc etc. At an individual level, those involved in the policy making process need to be ‘evidence-literate’ – i.e. they need to know what research evidence is, where they can find it, how they can appraise it, how to draw lessons from evidence for policy decisions etc etc…

Achieving this may require a new recruitment strategy – selecting people for employment who already have a good understanding of research evidence. But continuing professional development courses can also be used to ‘upskill’ existing staff.

Anyway, the above is basically a pictorial summary of this paper in the IDS bulletin so if you would like to read about the same topic in more academic terms (and without the pictures!) please do check it out. It’s not open access I’m afraid so if you want a copy please tweet me @kirstyevidence or leave a comment below.

Hope you liked the pictures!


I disagree that I disagree!

Fight a straw man anyone?
flikrstream: RobinEllisActor

I have been meaning for some time to write a response to Andries du Toit’s paper Making Sense of Evidence: Notes on the Discursive Politics of Research and Pro-Poor Policy Making and this blog written by Enrique Mendizabal in support of the paper’s conclusions.

My initial reaction to both the paper and the blog is to agree with many of the underlying concerns presented but to protest that both authors are weakening their case by presenting the other side of the argument as far more extreme than it actually is. They are reacting against a caricatured ‘randomista’ – an unthinking automaton who slavishly demands experimental evidence (and only experimental evidence) before making any decision. To be honest, I would also react against such a person – but I am not convinced that such people actually exist!

I work for the UK Department for International Development in the Evidence into Action team. This team’s mission is to promote and support use of quality research evidence in decision-making both within DFID and more widely. Readers of du Toit’s and Mendizabal’s articles might imagine that this team is full of ‘randomistas’ – but this couldn’t be further from the truth. The team consists of 20 people with a wide variety of experience and expertise. We have social scientists, basic scientists, international development experts, evaluators and economists. We have people with experience in other government departments, academia, grass-roots charities, activist groups and think tanks. Almost every day within the team, I hear intelligent and nuanced discussions about research evidence and how it contributes to policy and practice.

There is general consensus that for some questions (e.g. whether X intervention leads to Y outcome in environment Z) experimental or quasi-experimental approaches offer the strongest evidence, and that for other questions (e.g. whether X intervention is culturally or politically acceptable, why X leads to Y, whether another environment is similar to Z etc) alternative approaches would be more useful. However, beyond those general statements, I would not say that we have a universal preference for one form of evidence or another. It completely depends on what question needs to be answered.

For example, this afternoon within DFID, there was a seminar about male circumcision in Africa and its role in reducing HIV transmission. The reading for the seminar included a systematic review which looks at the efficacy of circumcision as a preventative measure. But there were also papers examining the social and ethical issues around the practice.

Sometimes, we might push for more experimental (or quasi-experimental) approaches. For example, I recently looked over a policy brief which discussed the outcomes of a particular type of governance intervention. I felt that this brief was relying too much on case study based evidence to support a certain hypothesis. My suggestion was that they needed to either summarise more rigorous evidence or, if this evidence does not exist, they need to discuss why this is the case and explain the limitations of the evidence which is presented.

However, on other occasions, the team might suggest that more qualitative research is needed. For example, a colleague was at a meeting recently where the results of a randomised controlled trial to increase vaccine uptake by providing food handouts were discussed. He felt that the discussion was focussing too much on a single question (does the intervention ‘work’ in this setting?) and his advice was that there was a need to consider other questions (e.g. what does the intervention mean for people’s dignity? how ethical is the intervention? is the intervention socially acceptable?) and that to answer these questions different forms of evidence would be required.

I am not in any way suggesting that DFID gets the balance right in every case; of course we make mistakes. But it is not true to suggest that the team in which I work, or DFID as a whole, is dominated by people who only value one form of research evidence as the basis for decisions.

Unfortunately I am not able to go to the PLAAS conference in November at which these issues will be discussed – but I would like to make a plea to those who are attending. Please try not to get too bogged down discussing your perceptions of policy makers’ prejudices. I suspect that the discussions will be more useful if they focus on concrete examples of policy decisions which contributors feel were overly influenced by one type of research evidence. By analysing these examples, it may be possible to come up with some genuinely useful case studies and some recommendations on how policy makers can consider a range of factors (including citizens’ voice, local cultures, power dynamics etc. etc. as well as research evidence) as they make their decisions.