Entries tagged with “monitoring and evaluation”.
Sun 28 Dec 2014
Raj Shah has announced his departure from USAID. Honestly, this surprises nobody at the Agency, or anyone in the development world who’s been paying attention. If anything, folks are surprised he is still around – it is well-known (or at least well-gossiped) that he was looking for the door, and at any number of opportunities, at least since the spring of 2012. There are plenty of reviews of Shah’s tenure posted around the web, and I will not rehash them. While I have plenty of opinions of the various initiatives that Shah oversaw/claims credit for (and these are not always the same, by the way), gauging what did and did not work under a particular administrator is usually a question for history, and it will take a bit of space and time before anyone should feel comfortable offering a full review of this administrator’s work.
I will say that I hope much of what Shah pushed for under USAID Forward, especially the rebuilding of the technical capacity of USAID staff, the emphasis on local procurement, and the strengthening of evaluation, becomes entrenched at the agency. Technical capacity is critical – not because USAID is ever going to implement its own work. That would require staffing the Agency at something like three or four times current levels, and nobody is ever going to approve that. Instead, it is critical for better monitoring and evaluating the work of the Agency’s implementing partners. In my time at USAID, I saw implementer work and reports that ran the gamut from “truly outstanding” to “dumpster fire”. The problem is that there are many cases where work that falls on the dumpster fire end of the spectrum is accepted because Agency staff lack the technical expertise to recognize the hot mess they’ve been handed. This is going to be less of a problem going forward, as long as the Agency continues to staff up on the technical side.
Local procurement is huge for both the humanitarian assistance and development missions of USAID. For example, there is plenty of evidence supporting the cost/time effectiveness of procuring emergency food aid in or near regions of food crisis. Further, mandates that push more USAID funding to local organizations and implementers will create incentives to truly build local capacity to manage these funds and design/implement projects, as it will be difficult for prime contractors to meet target indicators and other goals without high-capacity local partners.
A strong evaluation policy will be huge for the Agency…if it ever really comes to pass. While I have seen real signs of Agency staff struggling with how to meaningfully evaluate the impact of their programs, the overall state of evaluation at the Agency remains in flux. The Evaluation Policy was never really implementable, for example because it seems nobody actually considered who would do the evaluations. USAID staff generally lack the time and/or expertise to conduct these evaluations, and the usual implementing partners suffer from a material conflict of interest – very often, they would have to evaluate programs and projects implemented by their competitors…even projects where they had lost the bid to a competitor. Further, the organizations I have seen/interacted with that focus on evaluation remain preoccupied with quantitative approaches to evaluation that, while perhaps drawing on Shah’s interest in the now-fading RCT craze in development, really cannot identify or measure the sorts of causal processes that connect development interventions and outcomes. Finally, despite the nice words to the contrary, the culture at USAID remains intolerant of project failure, and the leadership of the Agency never mounted the strong defense of this culture change to the White House or Congress needed to create the space for a new understanding of evaluation, nor did it ever really convey a message of culture change that the staff of USAID found convincing across the board. There are some groups/offices at USAID (for example, in the ever-growing Global Development Lab) where this culture is fully in bloom, but these are small offices with small budgets. Most everyone else remains mired in very old thinking on evaluation.
At least from an incrementalist perspective, entrenching and building on these aspects of USAID Forward would be a major accomplishment for Shah’s successor. Whoever comes next will not simply run out the clock of the Obama Administration – there are two years left. I therefore expect the administration to appoint an administrator (rather than promote a career USAID staff caretaker with no political mandate) to the position. In a perfect world, this would be a person who understands development as a discipline, but also has the government and implementing experience to understand how development thought intersects with development practice in the real world. Someone with a real understanding of development and humanitarian assistance as a body of thought and practice with a long history that can be learned from and built upon would be able to parse the critical parts of USAID Forward from the fluff, could prevent the design and implementation of projects that merely repeat the efforts (and often failures) of decades ago, and could perhaps reverse the disturbing trend at USAID to view development challenges as technical challenges akin to those informed by X-Prizes – a trend that has shoved the social aspects of development to the back seat at the Agency. At the same time, someone with implementing and government experience would understand what is possible within the current structure, thus understanding where incremental victories might push the Agency in important and productive directions that move toward the achievement of more ideal, long-term goals.
There are very, very few people out there who meet these criteria. Steve Radelet does, and he served as the Chief Economist at USAID while I was there, but I have no idea if he is interested or, more importantly, if anyone is interested in him. Much the pity if not. More likely, the administration is going to go with the relatively new Deputy Administrator Alfonso Lenhardt. Looking at his background, he’s already been vetted by the Senate for his current position, has foreign service experience, time in various implementer-oriented positions, and he is well-positioned to avoid a long confirmation process as a former lobbyist and from his time as House Sergeant-at-Arms, which likely give him deep networks on both sides of the aisle. In his background, I see no evidence of a long engagement with development as a discipline, and I wonder how reform-minded a former Senior Vice President for Government Relations at an implementer can be. I do not know Deputy Administrator Lenhardt at all, and so I cannot speak to where he might fall on any or all of the issues above. According to Devex, he says his goal is to “improve management processes and institutionalize the reforms and initiatives that Shah’s administration has put in place.” I have no objection to either of these goals – they are both important. But what this means in practice, should Lenhardt be promoted, is an open question that will have great impact on the future direction of the Agency.
Tue 2 Dec 2014
Those of you who’ve read this blog before know that I have a lot of issues with “technology-will-fix-it” approaches to development program and project design (what Evgeny Morozov calls “solutionism”). My main issue is that such approaches generally don’t work. Despite a very, very long history of such interventions and their outcomes demonstrating this point, the solutionist camp in development seems to grow stronger all the time. If I hear one more person tell me that mobile phones are going to fix [insert development challenge here], I am going to scream. And don’t even get me started about “apps for development,” which is really just a modified incarnation of “mobile phones will fix it” predicated on the proliferation of smartphones around the world. Both arguments, by the way, were on full display at the Conference on the Gender Dimensions of Weather and Climate Services I attended at the WMO last month. Then again, so were really outdated framings of gender. Perhaps this convergence of solutionism and reductionist framings of social difference means something about both sets of ideas, no?
At the moment I’m particularly concerned about the solutionist tendency in weather and climate services for development. At this point, I don’t think there is anything controversial in arguing that the bulk of services in play today were designed by climate scientists/information providers who operated with the assumption that information – any information – is at least somewhat useful to whoever gets it, and must be better than leaving people without any information. With this sort of an assumption guiding service development, it is understandable that nobody would have thought to engage the presumptive users of the service. First, it’s easy to see how some might have argued that the science of the climate is the science of the climate – so citizen engagement cannot contribute much to that. Second, while few people might want to admit this openly, the fact is that climate-related work in the Global South, like much development work, carries with it an implicit bias against the capabilities and intelligence of the (often rural and poor) populations it is meant to serve. The good news is that I have seen a major turn in this field over the past four years, as more and more people working in this area have come to realize that the simple creation and provision of information is not enough to ensure any sort of impact on the lives of presumptive end-users of the information – the report I edited on the Mali Meteorological Service’s Agrometeorological Advisory Program is Exhibit A at the moment.
So, for the first time, I see climate service providers trying to pay serious attention to the needs of the populations they are targeting with their programs. One of the potentially important ideas I see emerging in this vein is that of “co-production”: the design and implementation of climate services that involves the engagement of both providers and a wide range of users, including the presumptive end users of the services. The idea is simple: if a meteorological service wants to provide information that might meet the needs of some/all of the citizens it serves, that service should engage those citizens – both as individuals and via the various civil society organizations to which they might belong – in the process of identifying what information is needed, and how it might best be delivered.
So what’s the problem? Simple: While I think that most people calling for the co-production of climate services recognize that this will be a complex, fraught process, there is a serious risk that co-production could be picked up by less-informed actors and used as a means of pushing aside the need for serious social scientific work on the presumptive users of these services. It’s pretty easy to argue that if we are incorporating their views and ideas into the design of climate services, there is really no need for serious social scientific engagement with these populations, as co-production cuts out the social-science middleman and gets us the unmitigated, unfiltered voice of the user.
If this sounds insanely naïve to you, it is*. But it is also going to be very, very attractive to at least some in the climate services world. Good social science takes time and money (though nowhere near as much time or money as most people think). And cutting time and cost out of project design, including M&E design, speeds implementation. The pressure to cut out serious field research is, and will remain, strong. Further, the bulk of the climate services community is on the provider side. They’ve not spent much, if any, time engaging with end users, and generally have no training at all in social science. All of those lessons that the social sciences have learned about participatory development and its pitfalls (for a fantastic overview, read this) have not yet become common conversation in climate services. Instead, co-production sounds like a wonderful tweak to the solutionist mentality that dominates climate services, a change that does not challenge the current framings of the use and utility of information, or the ways in which most providers do business. Instead, you keep doing what you do, but you talk to the end users while you do it, which will result in better project outcomes.
But for co-production to replace the need for deep social scientific engagement with the users of climate services, certain conditions must be met. First of all, you have to figure out how, exactly, you are going to actually incorporate user information, knowledge, and needs into the design and delivery of a climate service. This isn’t just a matter of a few workshops – how, exactly, are those operating in a nomothetic scientific paradigm supposed to engage and meaningfully incorporate knowledge from very different epistemological framings of the world? This issue, by itself, is generating significant literature…which mostly suggests this sort of engagement is really hard. So, until we’ve worked out that issue, co-production looks a bit like this:
Climate science + end user input => Then a miracle happens => successful project
That, folks, is no way to design a project. Oh, but it gets better. You see, the equation above presumes there is a “generic user” out there that can be engaged in a straightforward manner, and for whom information works in the same manner. Of course, there is no such thing – even within a household, there are often many potential users of climate information in their decision-making. They may undertake different livelihoods activities that are differently vulnerable to particular impacts of climate variability and change. They may have very different capacities to act on information – after all, when you don’t own a plow or have the right to use the family plow, it is very difficult to act on a seasonal agricultural advisory that tells you to plant right away. Climate services need serious social science, and social scientists, to figure out who the end users are – to move past presumption to empirical analysis – and what their different needs might be. Without such work, the above equation really looks more like:
Climate science => Then a miracle happens => you identify appropriate end users => end user input => Then another miracle happens => successful project
Yep, two miracles have to happen if you want to use co-production to replace serious social scientific engagement with the intended users of climate services. So, who wants to take a flyer with some funding and see how that goes? Feel free to read the Mali report referenced above if you’d like to find out**.
Co-production is a great idea – and one I strongly support. But it will be very hard, and it will not speed up the process of climate service design or implementation, nor will it allow for the cutting of corners in other parts of the design process. Co-production will only work in the context of deep understandings of the targeted users of a given service, to understand who we should be co-producing with, and for what purpose. HURDL continues to work on this issue in Mali, Senegal, and Zambia – watch this space in the months ahead.
*Actually, it doesn’t matter how it sounds: this is a very naïve assumption regardless.
** Spoiler: not so well. To be fair to the folks in Mali, their program was designed as an emergency measure, not a research or development program, and so they rushed things out to the field making a lot of assumptions under pressure.
Mon 10 Feb 2014
I’m a big fan of accountability when it comes to aid and development. We should be asking if our interventions have impact, and identifying interventions that are effective means of addressing particular development challenges. Of course, this is a bit like arguing for clean air and clean water. Seriously, who’s going to argue for dirtier water or air? Who really argues for ineffective aid and development spending?
More often than not, discussions of accountability and impact serve only to inflate narrow differences in approach, emphasis, or opinion into full-on “good guys”/“bad guys” arguments, where the “bad guys” are somehow against evaluation, hostile to the effective use of aid dollars, and indeed actively out to hurt the global poor. This serves nothing but particular cults of personality and, in my opinion, serves to squash discussion of really important problems with the accountability/impact agenda in development. And there are major problems with this agenda as it is currently framed – around the belief that we have proven means of measuring what works and how, if only we would just apply those tools.
When we start from this as a foundation, the accountability discussion is narrowed to a rather tepid debate about the application of the right tools to select the right programs. If all we are really talking about are tools, any skepticism toward efforts to account for the impact of aid projects and dollars is easily labeled an exercise in obfuscation, a refusal to “learn what works,” or an example of organizations and individuals captured by their own intellectual inertia. In narrowing the debate to an argument about the willingness of individuals and organizations to apply these tools to their projects, we are closing off discussion of a critical problem in development: we don’t actually know exactly what we are trying to measure.
Look, you can (fairly easily) measure the intended impact of a given project or program if you set things up for monitoring and evaluation at the outset. Hell, with enough time and money, we can often piece enough data together to do a decent post-hoc evaluation. But both cases assume two things:
1) The project correctly identified the challenge at hand, and the intervention was actually foundational/central to the needs of the people at hand.
This is a pretty weak assumption. I filled up a book arguing that a lot of the things that we assume about life for the global poor are incorrect, and therefore that many of our fundamental assumptions about how to address the needs of the global poor are incorrect. And when much of what we do in development is based on assumptions about people we’ve never met and places we’ve never visited, it is likely that many projects which achieve their intended outcomes are actually doing relatively little for their target populations.
Bad news: this is pretty consistent with the findings of a really large academic literature on development. This is why HURDL focuses so heavily on the implementation of a research approach that defines the challenges of the population as part of its initial fieldwork, and continually revisits and revises those challenges as it sorts out the distinct and differentiated vulnerabilities (for explanation of those terms, see page one of here or here) experienced by various segments of the population.
Simply evaluating a portfolio of projects in terms of their stated goals serves to close off the project cycle into an ever more hermetically-sealed, self-referential world in which the needs of the target population recede ever further from design, monitoring, and evaluation. Sure, by introducing that drought-tolerant strain of millet to the region, you helped create a stable source of household food that guards against the impact of climate variability. This project could record high levels of variety uptake, large numbers of farmers trained on the growth of that variety, and even improved annual yields during slight downturns in rain. By all normal project metrics, it would be a success. But if the biggest problem in the area was finding adequate water for household livestock, that millet crop isn’t much good, and may well fail in the first truly dry season because men cannot tend their fields when they have to migrate with their animals in search of water. Thus, the project achieved its goal of making agriculture more “climate smart,” but failed to actually address the main problem in the area. Project indicators will likely capture the first half of the previous scenario, and totally miss the second half (especially if that really dry year comes after the project cycle is over).
2) The intended impact was the only impact of the intervention.
If all that we are evaluating is the achievement of the expected goals of a project, we fail to capture the wider set of impacts that any intervention into a complex system will produce. So, for example, an organization might install a borehole in a village in an effort to introduce safe drinking water and therefore lower rates of morbidity associated with water-borne illness. Because this is the goal of the project, monitoring and evaluation will center on identifying who uses the borehole, and their water-borne illness outcomes. And if this intervention fails to lower rates of water-borne illness among borehole users, perhaps because post-pump sanitation issues remain unresolved by this intervention, monitoring and evaluation efforts will likely grade the intervention a failure.
Sure, that new borehole might not have resulted in lowered morbidity from water-borne illness. But what if it radically reduced the amount of time women spent gathering water, time they now spend on their own economic activities and education…efforts that, in the long term, produced improved household sanitation practices that ended up achieving the original goal of the borehole in an indirect manner? In this case, is the borehole a failure? Well, in one sense, yes – it did not produce the intended outcome in the intended timeframe. But in another sense, it had a constructive impact on the community that, in the much longer term, produced the desired outcome in a manner that is no longer dependent on infrastructure. Calling that a failure is nonsensical.
Nearly every conversation I see about aid accountability and impact suffers from one or both of these problems. These are easy mistakes to make if we assume that 1) we have correctly identified the challenges that we should address and 2) we know how best to address those challenges. When these assumptions don’t hold up under scrutiny (which is often), we need to rethink what it means to be accountable with aid dollars, and how we identify the impact we do (or do not) have.
What am I getting at? I think we are at a point where we must reframe development interventions away from known technical or social “fixes” for known problems to catalysts for change that populations can build upon in locally appropriate, but often unpredictable, ways. The former framing of development is the technocrats’ dream, beautifully embodied in the (failing) Millennium Village Project, just the latest incarnation of Mitchell’s Rule of Experts or Easterly’s White Man’s Burden. The latter requires a radical embrace of complexity and uncertainty that I suspect Ben Ramalingam might support (I’m not sure how Owen Barder would feel about this). I think the real conversation in aid/development accountability and impact is about how to think about these concepts in the context of chaotic, complex systems.
Wed 29 May 2013
I’ve just spent nearly three weeks in Senegal, working on the design, monitoring, and evaluation of a CCAFS/ANACIM climate services project in the Kaffrine Region. It was a fantastic time – I spent a good bit of time out in three villages in Kaffrine implementing my livelihoods as governmentality approach (for now called the LAG approach) to gather data that can inform our understanding of what information will impact which behaviors for different members of these communities.
This work also included a week-long team effort to build an approach to monitoring and evaluation for this project that might also yield broader recommendations for M&E of climate services projects in other contexts. The conversations ranged from fascinating to frustrating, but in the process I learned an enormous amount and, I think, gained some clarity on my own thinking about project design, monitoring, and evaluation. For the purposes of this blog, I want to elaborate on one of my long-standing issues in development – the use of panel surveys, or even broad baseline surveys, to design policies and programs.
At best, people seem to assume that the big survey instrument helps us to identify the interesting things that should be explained through detailed work. At worst, people use these instruments to identify issues to be addressed, without any context through which to interpret the patterns in the data. Neither case is actually all that good. I often find the data from these surveys to be disaggregated/aggregated inappropriately, aimed at the wrong issues, and rife with assumptions about the meaning of the patterns in the data that have little to do with what is going on in the real world (see, for example, my article on gendered crops, which was inspired by a total misreading of Ghanaian panel survey data in the literature). This should be of little surprise: the vast bulk of these tools are designed in the abstract – without any prior reference to what is happening on the ground.
What I am arguing here is simple: panel surveys, and indeed any sort of baseline survey, are not an objective, inductive data-gathering process. They are informed by assumptions we all carry with us about causes and effects, and the motivations for human behavior. As I have said time and again (and demonstrated in my book Delivering Development), in the world of development these assumptions are more often than not incorrect. As a result, we are designing broad survey instruments that ask the wrong questions of the wrong people. The data from these instruments is then interpreted through often-inappropriate lenses. The outcome is serious misunderstandings and misrepresentations of life on globalization’s shoreline. These misunderstandings, however, carry the hallmarks of (social) scientific rigor even as they produce spectacular misrepresentations of the decisions, events, and processes we must understand if we are to understand, let alone address, the challenges facing the global poor. And we wonder why so many projects and policies produce “surprise” results contrary to expectations and design? These are only surprising because the assumptions that informed them were spectacularly wrong.
This problem is easily addressed, and we are in the process of demonstrating how to do it in Kaffrine. There are baseline surveys of Kaffrine, as well as ongoing surveys of agricultural production by the Senegalese agricultural staff in the region. But none of these is actually tied to any sort of behavioral model for livelihoods or agricultural decision-making. As a result, we can’t rigorously interpret any patterns we might find in the data. So what we are doing in Kaffrine (following the approach I used in my previous work in Ghana) is spending a few weeks establishing a basic understanding of the decision-making of the target population for this particular intervention. We will then refine this understanding by the end of August through a full application of the LAG approach, which we will use to build a coherent, complex understanding of livelihoods decision-making that will define potential pathways of project impact. This, in turn, will shape the design of this program in future communities as it scales out, make sense of the patterns in the existing baseline data and the various agricultural services surveys taking place in the region, and enable us to build simple monitoring tools to check on/measure these pathways of impact as the project moves forward. In short, by putting in two months of serious fieldwork up front, we will design a rigorous project based on evidence for behavioral and livelihoods outcomes. While this will not rule out surprise outcomes (African farmers are some pretty innovative people who always seem to find a new way to use information or tools), I believe that five years from now any surprises will be minor ones within the framework of the project, as opposed to shocks that result in project failure.
Incidentally, the agricultural staff in Kaffrine agrees with my reading of the value of their surveys, and is very excited to see what we can add to the interpretation of their data. They are interested enough to provide in-town housing for my graduate student, Tshibangu Kalala, who will be running the LAG approach in Kaffrine until mid-July. Ideally, he’ll break it at its weak points, and by late July or early August we’ll have something implementable, and by the end of September we should have a working understanding of farmer decision-making that will help us make sense of existing data while informing the design of project scale up.
Thu 31 Jan 2013
Bill Gates, in his annual letter, makes a compelling argument for the need to better measure the effectiveness of aid. There is a nice, 1 minute summary video here. This is becoming a louder and louder message in development and aid, having been pushed now by folks ranging from Raj Shah, the Administrator of USAID, to most everyone at the Center for Global Development. There are interesting debates going on about how to shift from a focus on outputs (we bought this much stuff for this many dollars) to a focus on impacts (the stuff we bought did the following good things in the world). Most of these discussions are technical, focused on indicators and methods. What is not discussed is the massively failure-averse institutional culture of development donors, and how this culture is driving most of these debates. As a result, I think that Gates squanders his bully pulpit by arguing that we should be working harder on evaluation. We all know that better evaluation would improve aid and development. Suggesting that this is even a serious debate in development requires a nearly-nonexistent straw man that somehow thinks learning from our programs and projects is bad.
Like most everyone else in the field, I agree with the premise that better measurement (thought of very broadly, to include methods and data across the quantitative to qualitative spectrum) can create a learning environment from which we might make better decisions about aid and development. But none of this matters if all of the institutional pressures run against hearing bad news. Right now, donors simply cannot tolerate bad news, even in the name of learning. Certainly, there are lots of people within the donor agencies that are working hard on finding ways to better evaluate and learn from existing and past programs, but these folks are going to be limited in their impact as long as agencies such as USAID answer to legislators that seem ready to declare any misstep a waste of taxpayer money, and therefore a reason to cut the aid budget…so how can they talk about failure?
So, a modest proposal for Bill Gates. Bill (may I call you Bill?), please round up a bunch of venture capitalists. Not the nice socially-responsible ones (who could be dismissed as bleeding-heart lefties or something of the sort), the real red-in-tooth-and-claw types. Bring them over to DC, and parade out these enormously wealthy, successful (by economic standards, at least) people, and have them explain to Congress how they make their money. Have them explain how they got rich failing on eight investments out of ten, because the last two investments more than paid for the cost of the eight failures. Have them explain how failure is a key part of learning, of success, and how sometimes failure isn’t the fault of the investor or donor – sometimes it is just bad luck. Finally, see if anyone is interested in taking a back-of-the-envelope shot at calculating how much impact is lost due to risk-averse programming at USAID (or any other donor, really). You can shame Congress, who might feel comfortable beating up on bureaucrats, but not so much on economically successful businesspeople. You could start to bring about the culture change needed to make serious evaluation a reality. The problem is not that people don’t understand the need for serious evaluation – I honestly don’t know anyone making that argument. The problem is creating a space in which that can happen. This is what you should be doing with your annual letter, and with the clout that your foundation carries.
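To make the eight-failures-out-of-ten arithmetic concrete, here is a minimal back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (ten equal stakes, eight total losses, two winners returning 8x and 12x), and the function names are hypothetical; none of it is data from any real fund or donor.

```python
# Hypothetical portfolio: ten equal investments, eight of which fail outright.
# All figures are illustrative assumptions, not data from any real fund.

def portfolio_multiple(stakes, payouts):
    """Return total payout divided by total amount invested."""
    if len(stakes) != len(payouts):
        raise ValueError("stakes and payouts must have the same length")
    return sum(payouts) / sum(stakes)


def break_even_multiple(n_investments, n_failures):
    """Multiple each success must return for the portfolio to break even,
    assuming equal stakes and failures that return nothing."""
    n_successes = n_investments - n_failures
    if n_successes <= 0:
        raise ValueError("need at least one success")
    return n_investments / n_successes


stakes = [1.0] * 10                # ten stakes of one unit each
payouts = [0.0] * 8 + [8.0, 12.0]  # eight total losses, two winners

print(f"portfolio multiple: {portfolio_multiple(stakes, payouts):.1f}x")
print(f"break-even multiple per success: {break_even_multiple(10, 8):.1f}x")
```

Under these assumptions the portfolio doubles its money despite an 80 percent failure rate, and each of the two winners only needs to clear 5x for the whole portfolio to break even. A portfolio judged project by project would record eight failures; judged as a whole, it is a success.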
Failing that (or perhaps alongside that), lead by demonstration – create an environment in your foundation in which failure becomes a tag attached to anything from which we do not learn, instead of a tag attached to a project that does not meet preconceived targets or outcomes. Forget charter cities (no, really, forget them), become the “charter donor” that shows what can be done when this culture is instituted.
The evaluation agenda is getting stale, running aground on the rocky shores of institutional incentives. We need someone to pull it off the rocks. Now.
Sat 20 Oct 2012
I just witnessed a fascinating twitter exchange that beautifully summarizes the divide I am trying to bridge in my work and career. Ricardo Fuentes-Nieva, the head of research at Oxfam GB, after seeing a post on GDP tweeted by Tim Harford (note: not written by Harford), tweeted the following:
To which Harford tweeted back:
This odd standoff between two intelligent, interesting thinkers is easily explained. Bluntly, Harford’s point is academic, and from that perspective mostly true. Contemporary academic thinking on development has more or less moved beyond this question. However, to say that it “never has been” an important question ignores the history of development, where there is little question that in the 50s and 60s there was significant conflation of GDP and well-being.
But at the same time, Harford’s response is deeply naive, at least in the context of development policy and implementation. The academic literature has little to do with the policy and practice of development (sadly). After two years working for a donor, I can assure Tim and anyone else reading this that Ricardo’s point remains deeply relevant. There are plenty of people who are implicitly or explicitly basing policy decisions and program designs on precisely the assumption that GDP growth improves well-being. To dismiss this point is to miss the entire point of why we spend our time thinking about these issues – we can have all the arguments we want amongst ourselves, and turn up our noses at arguments that are clearly passé in our world…but if we ignore the reality of these arguments in the policy and practice world, our thinking and arguing will be of little consequence.
I suppose it is worth noting, in full disclosure, that I found the post Harford tweeted to be a remarkably facile justification for continuing to focus on GDP growth. But it is Saturday morning, and I would rather play with my kids than beat that horse…
Wed 15 Aug 2012
Alright, last post I laid out an institutional problem with M&E in development – the conflict of interest between achieving results to protect one’s budget and staff, and the need to learn why things do/do not work to improve our effectiveness. This post takes on a problem in the second part of that equation – assuming we all agree that we need to know why things do/do not work, how do we go about doing it?
As long-time readers of this blog (a small, but dedicated, fanbase) know, I have some issues with over-focusing on quantitative data and approaches for M&E. I’ve made this clear in various reactions to the RCT craze (see here, here, here and here). Because I framed my reactions in terms of RCTs, I think some folks think I have an “RCT issue.” In fact, I have a wider concern – the emerging aggressive push for quantifiable data above all else as new, more rigorous implementation policies come into effect. The RCT is a manifestation of this push, but really is a reflection of a current fad in the wider field. My concern is that the quantification of results, while valuable in certain ways, cannot get us to causation – it gets us to really, really rigorously established correlations between intervention and effect in a particular place and time (thoughtful users of RCTs know this). This alone is not generalizable – we need to know how and why that result occurred in that place, to understand the underlying processes that might make that result replicable (or not) in the future, or under different conditions.
As of right now, the M&E world is not doing a very good job of identifying how and why things happen. What tends to happen after rigorous correlation is established is what a number of economists call “story time”, where explanation (as opposed to analysis) suddenly goes completely non-rigorous, with researchers “supposing” that the measured result was caused by social/political/cultural factor X or Y, without any follow-on research to figure out if in fact X or Y even makes sense in that context, let alone whether or not X or Y actually was causal. This is where I fear various institutional pushes for rigorous evaluation might fall down. Simply put, you can measure impact quantitatively – no doubt about it. But you will not be able to rigorously say why that impact occurred unless someone gets in there and gets seriously qualitative and experiential, working with the community/household/what have you to understand the processes by which the measured outcome occurred. Without understanding these processes, we won’t have learned what makes these projects and programs scalable (or what prevents them from being scaled) – all we will know is that it worked/did not work in a particular place at a particular time.
So, we don’t need to get rid of quantitative evaluation. We just need to build a strong complementary set of qualitative tools to help interpret that quantitative data. So the next question to you, my readers: how are we going to build in the space, time, and funding for this sort of complementary work? I find most development institutions to be very skeptical as soon as you say the words qualitative…mostly because it sounds “too much like research” and not enough like implementation. Any ideas on how to overcome this perception gap?
(One interesting opportunity exists in climate change – a lot of pilot projects are currently testing new M&E approaches, as evaluating impacts of climate change programming requires very long-term horizons. In at least one M&E effort I know of, there is talk of running both quantitative and qualitative project evaluations to see what each method can and cannot answer, and how they might fit together. Such a demonstration might catalyze further efforts…but this outcome is years away.)
Tue 14 Aug 2012
One of the things I have had the privilege to witness over the past two years is the movement of a large donor toward a very serious monitoring and evaluation effort aimed at its own programs. While I know some in the development community, especially in academia, are skeptical of any new initiative that claims to want to do a better job of understanding the impact of programs, and learning from existing programs, what I saw in practice leads me to believe that this is a completely sincere effort with a lot of philosophical buy-in.
That said, there are significant barriers coming for monitoring and evaluation in development. I’m not sure that those making evaluation policy fully grasp these barriers, and as a result I don’t see evidence that they are being effectively addressed by anyone. Until they are, this sincere effort is likely to underperform, if not run aground.
In this post, I want to point out a huge institutional/structural problem for M&E: the conflict of interest that is created on the implementation side of things. On one hand, donors are telling people that we need to learn about what works, and that monitoring and evaluation is not meant to be punitive, but part of a learning process to help all of us do our jobs better. On the other hand, at most donors the budgets are under pressure, and the message from the top is that development must focus on “what works.” Think about what this means to a mission director or a chief of party. On one hand, they are told that M&E is about learning, and failure is now to be expected and can be tolerated as long as we learn why the failure occurred and can remedy the problem and prevent that problem in the future in other places. On the other, they are told that budgets will focus on what works. So if they set up rigorous M&E, they are likely to identify programs that are underperforming (and perhaps learn why)…but there is no guarantee that this learning won’t result in that program being cut, with a commensurate loss of staff and budget. I have yet to see anyone meaningfully address this conflict of interest, and until someone figures out how to do so, there will be significant and creative resistance to the implementation of rigorous M&E.
Any ideas, folks? Surely some of you have seen this at work…
Simply put, the donors are going to have to decide what is more important – learning what works, and improving on development’s 60+ year track record of spotty results with often limited correlation to programs and projects, or maintaining the appearance of efficiency and efficacy by cutting anything that does not seem to work, and likely throwing out a lot of babies with the bathwater. I know which one I would choose. It remains unclear where the donors’ choices will fall. In a politically challenging environment, the pressure to go with the latter approach is high, and the protection of a learning agenda that will really change how development works will require substantial political courage. That courage exists…but whether or not it comes to the fore is a different question.
Tue 14 Jun 2011
David Cameron gave a speech yesterday at the Global Alliance for Vaccines and Immunisation conference. It deserves to be read in full – I don’t agree with every word (and how I disagree with many of Cameron’s stances), but it is one of the clearest statements on why we must continue to deliver aid to the poorest and most vulnerable people in the world.
On the down side, Cameron starts out a bit too market triumphalist for my tastes:
At home we don’t tackle poverty by state hand-outs; we help people get into work, to stand on their own two feet and to take control of their own destiny. The same should be true of development. No country has ever pulled itself out of poverty through aid alone, so this government will take a new approach. The same conditions create prosperity the world over. They include access to markets, property rights, private-sector investment and they make up what I see as the golden thread of successful development. Ultimately it’s the private sector that will be the engine for growth and that’s why this government’s efforts will increasingly focus on helping developing countries achieve that growth with the jobs and opportunities it will bring.
Well, this is a bit muddled. First, last I checked England (and Great Britain more generally) was home to a robust welfare state (well, until various Tory governments from Thatcher to Cameron took a hatchet to it) that provided the safety net that enhanced the quality of life of its citizens. On the other hand (and second), I agree that no country has ever been lifted out of poverty through aid alone – but then, that’s not what aid does. At best, aid catalyzes much larger processes of change – and sometimes those changes play out constructively (I discuss this at length in Delivering Development). Third, the only countries to have really changed their status in the last half century have done so by rejecting things like the open market and behaving in very politically repressive ways to get through a series of difficult transitions that eventually made them competitive in global markets and able to productively take in foreign investment – so this claim about what works isn’t fully supported by the evidence. Andy Sumner’s work on the New Bottom Billion suggests that this might be changing as a new pile of countries “graduate” from low-income to middle-income status, but this is still unclear as many of the new “graduates” from low to middle income status have just crept above that line, often with no transformation of their economic fundamentals (leaving them vulnerable to slip-back) and still containing huge numbers of very poor people (creating the same problem, and calling into question the very concept of “graduation” to middle income status).
This is not to say that I don’t think markets have any value – I just fear those who place absolute faith in them, especially given that the environment is the site of perhaps the most serious market failure we’ve ever seen. However, as the speech progressed, I became somewhat more comfortable as, at least in the context of development, Cameron takes a somewhat more moderate tack:
We want people in Africa to climb the ladder of prosperity but of course when the bottom rungs of that ladder are broken by disease and preventable death on a massive scale, when countries can’t even get on the bottom rung of the growth ladder because one in seven of their children die before they reach their fifth birthday, we have to take urgent action. We have to save lives and then we can help people to live. So that’s where today’s announcement fits in. Because there cannot really be any effective development – economic or political – while there are still millions of people dying unnecessarily.
Absolutely correct – the “bottom of the pyramid”, as it were, often finds itself left behind when economic growth programs rev up . . . this is well-understood in both academia and the development institutions. Indeed, it is not controversial for my Bureau (DCHA – the folks who deal with disasters and conflict) to argue that its work is fundamental to creating a firm foundation for future development efforts because we address the needs of vulnerable populations who might otherwise be overlooked by Agency programming.
But what I most like is the kicking Cameron hands out to those who argue we don’t have the money for aid in these hard economic times. The kicking comes in two parts – first a moral argument:
When you make a promise to the poorest people in the world you should keep it. I remember where I was during the Gleneagles Summit and the Live 8 concert of 2005 and I remember thinking at the time how right it was that those world leaders should make such pledges so publicly. For me it’s a question of values; this is about saving lives. It was the right thing to promise; it was the right thing for Britain to do and it is the right thing for this government to honour that commitment.
So to those who point to other countries that are breaking their promises and say that makes it okay for us to do the same, I say no, it’s not okay. Our job is to hold those other countries to account, not to use them as an excuse to turn our back on people who are trusting us to help them. And to those who say fine but we should put off seeing through those promises to another day because right now we can’t afford to help, I say we can’t afford to wait. How many minutes do we wait? Three children die every minute from pneumonia alone; waiting is not the right thing to do and I don’t think that 0.7% of our gross national income is too high a price to pay for saving lives.
I actually think that most people in our country want Britain to stand for something in the world, to be something in the world. And when I think about what makes me proud of our country, yes, I think of our incredibly brave service men and women that I have the honour to meet and see so often; and yes, I think of our capabilities as an economic and diplomatic power; but I also think of our sense of duty to help others. That says something about this country and I think it’s something we can be proud of.
Where . . . the . . . hell . . . is . . . the . . . American . . . political . . . leadership . . . on . . . this? Dammit, the British just took the “City on a Hill” mantle from us. Most Americans want America to stand for something in the world, last I checked.
Oh, and Cameron addresses the unaddressable (for America, it seems) in his speech: that development, in reducing the need for future wars and humanitarian interventions, actually is cost-effective:
If we really care about Britain’s national interest, about jobs, about growth, about security, we shouldn’t break off our links with the countries that can hold some of the keys to that future. If we invest in Africa, if we open trade corridors, if we remove obstacles to growth, it’s not just Africa that will grow but us too. And if we invest in countries before they get broken we might not end up spending so much on dealing with the problems, whether that’s immigration or threats to our national security.
Take Afghanistan. If we’d put a fraction of our current military spending on Afghanistan into helping Afghanistan 15 or 20 years ago just think what we might have been able to avoid over the last decade. Or take Pakistan. Let another generation of Pakistanis enter adult life without any real opportunities and what are the risks in terms of mass migration, radicalisation, even terrorism? That’s why UK support over the next four years will get four million more children in Pakistan into school. This could be life changing for those children and it can be part of the antidote to the extremism that threatens us all. So it’s not just morally right to invest in aid, it’s actually in our own interests too.
God help us, Ron Paul seems to be the only candidate for anything willing to say that the wars we are in are costing a hell of a lot of money, and might not have been necessary. Of course, Ron Paul doesn’t like aid, either . . . actually, he doesn’t seem to like much of anything. Nobody is really taking his hobgoblin act all that seriously, which means he isn’t going to shift the debate here. Cameron, though, really glues his fiscal conservatism to a rational argument for aid – maybe we just should have worked on the aid side of things, at a fraction of the cost, and averted the whole mess in the first place. Lord help me, the Tories are sounding reasonable . . .
Now, Cameron’s ideas for transforming aid are vague, mostly about focusing on results and enhancing accountability. This is all well and good, but amazingly thorny. There’s been quite a bit of discussion about evaluation in the development community (great summary list here) and this blog (here, here and here) of late, and if nothing else, the reader might come to grips with the huge challenges that we must address before we can get to a realization of Cameron’s otherwise nonoffensive ideas.
I suppose it was asking too much to hope a leader talking about transforming development might mention that the global poor might actually have ideas of their own that we should start learning about before we go barging in . . .
Tue 4 Jan 2011
On his blog Shanta Devarajan, the World Bank Chief Economist for Africa, has a post discussing the debate about the performance and results of the Millennium Villages Project (MVP). The debate, which takes shape principally in papers by Michael Clemens and Gabriel Demombynes of the Center for Global Development and Paul Pronyk, John McArthur, Prabhjot Singh, and Jeffrey Sachs of the Millennium Villages Project, questions how the MVP is capturing the impacts of its interventions in the Millennium Villages. As Devarajan notes, the paper by Clemens and Demombynes rightly points out that the MVP’s claims about its performance are not clearly grounded in evidence, which makes it hard to tell how much of the change in the villages can be attributed to the project’s work, and how much is driven by other factors. Clemens and Demombynes are NOT arguing that the MVP has had no impact, but that there are ways to rigorously evaluate that impact – and when impact is rigorously evaluated, it turns out that the impact of MVP interventions is not quite as large as the project would like to claim.
This is not all that shocking, really – it happens all the time, and it is NOT evidence of malfeasance on the part of the MVP. It just has to do with a simple debate about how to rigorously capture results of development projects. But this simple debate will, I think, have long-term ramifications for the MVP. As Devarajan points out:
In short, Clemens and Demombynes have undertaken the first evaluation of the MVP. They have shown that the MVP has delivered sizeable improvements on some important development indicators in many of the villages, albeit with effects that are smaller than those described in the Harvests of Development paper. Of course, neither study answers the question of whether these gains are sustainable, or whether they could have been obtained at lower cost. These should be the subject of the next evaluation.
I do not, however, think that this debate is quite as minor as Devarajan makes it sound – and he is clearly trying to downplay the conflict here. Put simply, the last two sentences in the quote above are, I think, what has the MVP concerned – because the real question about MVP impacts is not in the here and now, but in the future. While I have been highly critical of the MVP in the past, I am not at all surprised to hear that their interventions have had some measurable impact on life in these villages. The project arrived in these villages with piles of money, equipment and technical expertise, and went to work. Hell, they could have simply dumped the money (the MVP is estimated to cost about $150 per person per year) into the villages and you would have seen significant movement in many target areas of the MVP. I don’t think that anyone doubts that the project has had a measurable impact on life in all of the Millennium Villages.
Instead, the whole point here is to figure out if what has been done is sustainable – that is the measure of performance here. Anyone can move the needle in a community temporarily – hell, the history of aid (and development) is littered with such projects. The hard part is moving the needle in a permanent way, or doing so in a manner that creates the processes by which lasting change can occur. As I have argued elsewhere (and much earlier than in this debate), and as appears to be playing out on the ground now, the MVP was never conceptually framed in a way that would bring about such lasting changes. Clemens and Demombynes’ work is important because it provides an external critique of the MVP’s claims about its own performance – and it is terrifying to at least some in the MVP, as external evaluations are going to empirically demonstrate that the MVP is not, and never was, a sustainable model for rural development.
While I would not suggest that Clemens and Demombynes’ approach to evaluation is perfect (indeed, they make no such claim), I think it is important because it is trying to move past assumptions to evidence. This is a central call of my book – the MVP is exhibit A of a project founded on deeply problematic assumptions about how development and globalization work, and framed and implemented in a manner where data collection and evaluation cannot really question those assumptions . . . thus missing what is actually happening (or not happening) on the ground. This might also explain the somewhat non-responsive response to Clemens and Demombynes in the Pronyk et al article – the MVP team is having difficulty dealing with suggestions that their assumptions about how things work are not supported by evidence from their own project, and instead of addressing those assumptions, are trying to undermine the critique at all costs. This is not a productive way forward, this is dogma. Development is many things, but if it is to be successful by any definition, it cannot be dogmatic.