The limits of using ‘Big Data’ for development – an economist writes

Get your inner nerd out! The World Bank has launched a competition to help it better predict a household’s poverty status based on easy-to-collect information and machine learning algorithms. Build a statistical model that works well, and you could win a cash prize of up to US$6,000! So what’s the pitch? asks Bjorn Gelders.


“Right now measuring poverty is hard, time consuming, and expensive,” the World Bank states. “By building better models, we can run surveys with fewer, more targeted questions that rapidly and cheaply measure the effectiveness of new policies and interventions. The more accurate our models, the more accurately we can target interventions and iterate on policies, maximising the impact and cost-effectiveness of these strategies.” Sounds great! Or does it?

The use of statistical models and machine learning has become omnipresent in recent years. Hedge funds, for example, use complex algorithms to try to predict the movements of stock markets. Banks use them to estimate the risk of defaults when deciding on loan applications and insurance firms when setting premiums. Internet giants like Google and Facebook build detailed profiles of their users to better target adverts. The list is ever-growing.

Many social assistance programmes in low- and middle-income countries are using complex maths too, to ‘target the poor’. Their administrators feed data from national household surveys into a computer algorithm, which develops a formula that is used to predict whether a household is poor or not, based on characteristics such as family size, type of housing, and education. These ‘proxy means tests’ are a feature of the conditional cash transfer programmes in Latin America, and are also being introduced in countries in Africa and Asia, usually with the technical support of multilateral donors.
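
To make the idea concrete, here is a minimal sketch of how such a formula might be applied once it has been estimated. The characteristics, weights, intercept and cutoff below are all invented for illustration and are not taken from any real programme; an actual proxy means test would estimate them from survey data.

```python
# Hypothetical proxy means test (PMT) scoring sketch.
# All weights, the intercept and the cutoff are invented for illustration;
# a real PMT estimates them from national household survey data.

PMT_WEIGHTS = {
    "household_size": -0.08,       # larger households -> lower predicted consumption
    "has_cement_floor": 0.25,      # better housing -> higher predicted consumption
    "head_years_schooling": 0.04,  # more education -> higher predicted consumption
}
PMT_INTERCEPT = 4.0  # assumed baseline (log consumption per person)
PMT_CUTOFF = 3.9     # assumed eligibility threshold on the same scale


def pmt_score(household: dict) -> float:
    """Predicted (log) consumption from observable characteristics."""
    return PMT_INTERCEPT + sum(
        weight * household[name] for name, weight in PMT_WEIGHTS.items()
    )


def is_classified_poor(household: dict) -> bool:
    """A household is treated as poor if its score falls below the cutoff."""
    return pmt_score(household) < PMT_CUTOFF


# Two invented households
household_a = {"household_size": 7, "has_cement_floor": 0, "head_years_schooling": 2}
household_b = {"household_size": 3, "has_cement_floor": 1, "head_years_schooling": 9}

for label, hh in (("A", household_a), ("B", household_b)):
    print(label, round(pmt_score(hh), 2), is_classified_poor(hh))
```

The formula never sees a household’s actual income or consumption, only the proxies. A household whose circumstances differ from what its proxies suggest will be misclassified.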

The main problem, though, is that algorithms can make troublingly unfair decisions. In our paper ‘Exclusion by Design’ we show that social assistance programmes that rely on statistical models to select beneficiaries suffer from high error rates, typically excluding at least half of the very poorest households they aim to reach. The economists Brown, Ravallion, and van de Walle reach a similar conclusion based on data from nine African countries. While ‘econometric’ targeting can do a reasonable – but far from perfect – job of filtering out the most affluent households, it is not an accurate method for identifying the poorest households. What’s more, highly sophisticated models that use more detailed information and complex techniques do not perform much better.
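
As a rough illustration of what an ‘exclusion error’ means, the sketch below uses one common way of defining targeting errors: the exclusion error is the share of genuinely poor households that the model fails to select, and the inclusion error is the share of selected households that are not poor. The ten households are invented and the function name is hypothetical; the figures are not results from the paper.

```python
def targeting_errors(actually_poor, selected):
    """Return (exclusion_error, inclusion_error) for paired boolean lists.

    exclusion_error: share of truly poor households that were not selected.
    inclusion_error: share of selected households that are not truly poor.
    """
    if len(actually_poor) != len(selected):
        raise ValueError("inputs must have the same length")
    poor_total = sum(actually_poor)
    selected_total = sum(selected)
    excluded_poor = sum(p and not s for p, s in zip(actually_poor, selected))
    included_non_poor = sum(s and not p for p, s in zip(actually_poor, selected))
    exclusion = excluded_poor / poor_total if poor_total else 0.0
    inclusion = included_non_poor / selected_total if selected_total else 0.0
    return exclusion, inclusion


# Ten invented households: four are truly poor, and the model selects four.
actually_poor = [True, True, True, True, False, False, False, False, False, False]
selected = [True, True, False, False, True, True, False, False, False, False]

exclusion, inclusion = targeting_errors(actually_poor, selected)
print(f"exclusion error: {exclusion:.0%}, inclusion error: {inclusion:.0%}")
```

In this toy case the programme reaches the intended number of households, yet half of the poor are left out. Coverage figures alone do not reveal this.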

Bjorn Gelders is a Senior Social Policy Specialist at Development Pathways, specialising in child poverty, inequality, vulnerability and social protection programming. His areas of expertise include statistical analysis, microsimulation and policy analysis and he has provided technical assistance to donors, governments and NGOs across Africa, Asia and the Pacific.

When the algorithms get it wrong, there is little chance of recourse. Every year, millions of families across the developing world who really do need support are denied access to social protection because a computer says they are not ‘poor’. Pathways’ recent research into social accountability found that few social assistance programmes have functioning grievance mechanisms, and those that do are better at collecting complaints than at resolving them. Local officials or caseworkers are not empowered to override incorrect decisions. And no one holds the algorithms accountable: I have not yet seen a social assistance programme that monitors the performance of its algorithm and keeps track of its errors.

Don’t get me wrong. I’m fascinated by the explosion of ‘big data’ and machine learning in different areas of society. And I think the World Bank has set an interesting challenge for data scientists. Robust data and statistics should play a critical role in decision-making, and efforts to improve and optimise household surveys are important. But let’s not trust machines too much, especially when deciding who has their right to social security fulfilled and who does not.

The causes of poverty are multi-faceted and household incomes and consumption are highly dynamic. Trying to squeeze that reality into a statistical model is challenging at best and dangerous at worst.

2 Responses to “The limits of using ‘Big Data’ for development – an economist writes”

  1. Dominic Oyaya

    Thanks for that interesting finding. It’s true that relying on computer-generated data to decide who should benefit from poverty initiatives leaves some deserving cases out.

  2. One question arises… What is the comparator? There are plenty of human-only systems that are non-transparent and non-responsive, and that probably have substantial inaccuracies as well (False Positives and False Negatives). In your paper ‘Exclusion by Design’, do you look at what sort of human-only systems were in operation before statistical models were introduced?

    That said, I do think all algorithm-based decision making needs human oversight. For a real-life and large-scale example of bad practice, I encourage readers to look closely at the continuing horror story in Australia, known as “robo-debt”, where the federal Department of Human Services has issued algorithm-generated debt recovery notices to 20,000 welfare recipients who were later found to owe less or even nothing. The same newspaper article (November 2017) notes “The number of altered robo-debt notices is likely to have grown in the past six months, and only represents instances where welfare recipients have challenged the amounts”. And this is only after a damning Senate inquiry and substantial media exposure.

    My suggestion…

    1. Any algorithm-based approach to decision making that has an impact on human welfare should have a publicly available False Positives policy.
    A False Positive is a case that the algorithm predicts to be an X, e.g. a terrorist or loan defaulter, but that is actually not one.

    2. Such a policy should acknowledge that False Positives are likely, no matter how good the algorithm, and that False Positives are more likely the rarer the event is that the algorithm is trying to detect.

    3. The False Positive policy should also make a commitment to providing a human response to all False Positive cases, in order to (a) minimise any harm to the persons concerned that has occurred or is likely to occur, and (b) identify, through investigation of those cases, ways of improving the algorithm so that those particular types of False Positive do not occur again in the future.

    This approach is not “First Do No Harm”. Instead, it assumes some form of harm is likely despite our best efforts, so we should always “monitor and respond to potential harms”, especially in applications of algorithms affecting large numbers of people.
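
The point made in the comment above, that False Positives become more likely the rarer the event being detected, can be checked with a few lines of arithmetic. The sketch below assumes a hypothetical classifier that correctly flags 90% of true cases and correctly clears 90% of non-cases; these rates, the prevalence values and the function name are illustrative assumptions, not figures from any real system.

```python
def share_of_flags_that_are_wrong(prevalence, sensitivity, specificity):
    """Share of all flagged cases that are False Positives.

    prevalence:  fraction of the population that truly is an X
    sensitivity: fraction of true X cases the algorithm flags
    specificity: fraction of non-X cases the algorithm correctly clears
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    flagged = true_pos + false_pos
    return false_pos / flagged if flagged else 0.0


# Same hypothetical algorithm, applied to increasingly rare events.
for prevalence in (0.30, 0.05, 0.001):
    wrong = share_of_flags_that_are_wrong(
        prevalence, sensitivity=0.90, specificity=0.90
    )
    print(f"prevalence {prevalence:.1%}: {wrong:.1%} of flagged cases are False Positives")
```

With the algorithm’s accuracy held fixed, the share of flags that are wrong rises steeply as the event gets rarer. This is why a False Positives policy matters most for rare events.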
