Entries tagged with “statistics”.

Yale anthropologist Mike McGovern’s interesting review of Paul Collier’s The Bottom Billion and Wars, Guns and Votes has gotten a bit of attention recently in development circles, speaking as it does to ongoing debates about the role of statistical analysis, what counts as explanation, and where qualitative research fits into all of this.  I will take up McGovern’s good (but, in my opinion, incomplete) review in another post.  Here, I need to respond to a blog entry about this review.

On the Statistical Modeling, Causal Inference, and Social Science blog, Andrew Gelman discusses McGovern’s review.  While there is a lot going on in this post, one issue caught my attention in particular.  In his review, McGovern argues that “Much of the intellectual heavy lifting in these books is in fact done at the level of implication or commonsense guessing,” what Gelman (quoting Fung) calls “story time”, the “pivot from the quantitative finding to the speculative explanation.”  However, despite the seemingly dismissive term for this sort of explanation, in his blog post Gelman argues “story time can’t be avoided.” His point:

On one hand, there are real questions to be answered and real decisions to be made in development economics (and elsewhere), and researchers and policymakers can’t simply sit still and say they can’t do anything because the data aren’t fully persuasive. (Remember the first principle of decision analysis: Not making a decision is itself a decision.)

From the other direction, once you have an interesting quantitative finding, of course you want to understand it, and it makes sense to use all your storytelling skills here. The challenge is to go back and forth between the storytelling and the data. You find some interesting result (perhaps an observational data summary, perhaps an analysis of an experiment or natural experiment), this motivates a story, which in turn suggests some new hypotheses to be studied.

Now, I take his point – research is iterative, and answering one set of questions (or working through one new set of data) often raises new questions which can be interrogated.  But Gelman seems to presume that explanation only comes from more statistical analysis, without considering what I saw as McGovern’s subtle point: qualitative social scientists look at explanation, and do not revert to story time to do so (good luck getting published if you do).  We spend a hell of a lot of time fleshing out the causal processes behind our observations, including establishing rigor and validity for our data and conclusions, before we present stories.  This is not to say that our explanations are immediately complete or perfect, nor is it to suggest that our explanations do not raise new questions to pursue.  However, there is no excuse for the sort of “story time” analysis that McGovern is pointing out in Collier’s work – indeed, I would suggest that is why the practice is given a clearly derisive title.  That is just guessing, vaguely informed by data, often without even thinking through alternative explanations for the patterns at hand (let alone presenting those alternatives).

I agree with Gelman’s point, late in the post – this is not a failing of statistics, really.  It is a failure to use them intelligently, or to use appropriate frameworks to interpret statistical findings.  It would be nice, however, if we could have a discussion between quant and qual on how to avoid these outcomes before they happen . . . because story time is most certainly avoidable.

The BBC has posted an interesting map of Nigeria that captures the spatiality of politics, ethnicity, wealth, health, literacy and oil.  There are significant problems with this map.  The underlying data has fairly large error bars that are not acknowledged, and the presentation of the data is somewhat problematic; for example, the ethnic “areas” in the country are represented only by the majority group, hiding the heterogeneity of these areas, and other data is aggregated at the state level, blurring heterogeneous voting patterns, incomes, literacy rates and health situations. I really wish that those who create this sort of thing would do a better job of addressing some of these issues, and of pointing out the issues they cannot address, to help the reader better evaluate the data.

But even with all of these caveats, this map is a striking illustration of the problems with using national-level statistics to guide development policy and programs.  Look at the distributions of wealth, health and literacy in the country – error bars or no, this data clearly demonstrates that national measures of wealth cannot guide useful economic policy, national measures of literacy might obscure regional or ethnic patterns of educational neglect, and national vaccination statistics tell us nothing about the regional variations in disease ecology and healthcare delivery that shape health outcomes in this country.

This is not to say that states don’t matter – they matter a lot.  However, when we use national-scale data for just about anything, we are making very bad assumptions about how uniform the situation in that country is . . . and we are probably missing key opportunities and challenges we should be addressing in our work.
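For readers who like to see the arithmetic, here is a minimal sketch of how a single national figure can hide what is going on underneath it.  The three regions, their populations and their literacy rates are invented purely for illustration; they are not drawn from the BBC map or from any Nigerian data source, and the function names are my own.

```python
# Hypothetical regions: (name, population in millions, literacy rate).
# These numbers are invented for illustration; they are NOT real data.
regions = [
    ("Region A", 10.0, 0.90),
    ("Region B", 20.0, 0.60),
    ("Region C", 10.0, 0.30),
]


def national_rate(rows):
    """Population-weighted average of the regional rates."""
    total_pop = sum(pop for _, pop, _ in rows)
    return sum(pop * rate for _, pop, rate in rows) / total_pop


def regional_range(rows):
    """Lowest and highest regional rate, which the national figure hides."""
    rates = [rate for _, _, rate in rows]
    return min(rates), max(rates)


national = national_rate(regions)
low, high = regional_range(regions)

print(f"national literacy rate: {national:.2f}")
print(f"regional range: {low:.2f} to {high:.2f}")
```

With these made-up numbers the national rate comes out at 0.60, which describes only one of the three regions; a policy aimed at the "average" would overlook both the region at 0.90 and the one at 0.30.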