Big Data and Monitoring Sustainable Development Goal 3: Not Counting Those Left Behind?

The latest global initiative to address the vast inequalities in the world, including health inequalities, is the 2030 Agenda for Sustainable Development. Also known as the Sustainable Development Goals (SDGs), this agenda has 17 goals that were adopted in September 2015 by all UN Member States, with a rallying call to “leave no one behind”.

To know whether anyone is being left behind, the SDGs need an accountability system that includes monitoring and evaluation. To that end, the campaign has developed 169 targets, and each target has at least one indicator (232 indicators in all) on which all countries must measure and report. In SDG3 (health for all), there are 13 targets and 27 indicators, covering a wide spectrum of health measures from maternal and neonatal mortality to traffic accident deaths and HIV incidence, to name just a few.

However, indicators and their measurement are contentious subjects because they can result in unintended consequences. For example, before the SDGs there were the Millennium Development Goals (MDGs), which covered the years 2000–2015 and had similarly aspirational objectives to halve global poverty and achieve various health and other social targets. Arguments have been made that poorly chosen MDG indicators diverted attention from other critically important life-saving programmes. Case studies presented in a compelling ‘Power of Numbers’ series led those editors to conclude that “target-setting is a valuable but a limited and blunt tool, and that the methodology for target-setting should be refined to include policy responsiveness in addition to data availability criteria.”

In our recently published paper, “Neglecting human rights: accountability, data and Sustainable Development Goal 3”, Paul Hunt and I examine accountability and SDG3, including monitoring. We posit that international human rights law places obligations on States at all times, including for activities and policies related to the SDGs. We looked at the SDG3 indicators and found gaps through which breaches of human rights could fall undetected, especially around participation and quality health care. We also looked at SDG3 data availability and examined suggestions that Big Data can help fill statistical gaps. In this blog I present some of that research examining the issues raised by the critiques of the MDGs, including whether the targets and indicators will capture the human rights duties of States to respect, protect and fulfill health rights.

Robust statistics are frequently absent in those countries and communities most ‘left behind’. The Inter-agency and Expert Group on Sustainable Development Goal Indicators (IAEG-SDGs), which had overall responsibility for the development of the SDG indicators, has noted that only 97 of the 232 indicators have regularly produced data – and even then, the poorest countries are not regularly or reliably collecting the data for all those 97. When the Sustainable Development Solutions Network (SDSN) first proposed the SDG indicators in a report for the UN Secretary-General, it acknowledged the indicators would take time to achieve, and estimated the global cost of improving data information systems to enable annual reporting at $1 billion annually, of which ‘at least $100–200 m will be required in incremental ODA [official development assistance]’.

To develop information systems to the level where they can generate data takes more than money; it also takes time – time to extend the systems out to where data must be collected, to train people in gathering and transferring the data, and to build capacity in the national statistics offices for data analysis. It requires increasing budgets to employ and train more people – difficult enough in wealthy countries, let alone low- and middle-income countries. It has therefore been suggested that we could turn to Big Data: proponents claim that data arising from online search queries, web posts, Twitter and other social media can provide more timely and even more accurate statistics than the traditional surveys and other tools used by national statistics offices. Furthermore, it is argued that such data collection methods are quicker and cost less, so they could appeal to cash-strapped institutions or governments. Online searches have already been shown, in some cases, to predict disease outbreaks more accurately than traditional methods. But mistakes have also been made: outbreaks have been predicted that simply did not happen. The fact that people use search terms such as ‘flu’ does not mean that the searcher has flu or any other disease. In public health, this can result in a ‘false positive’ – something is thought to be present when it isn’t.

An even greater risk to people’s health rights is the ‘false negative’ – something is happening, but it isn’t detected. To illustrate: there are 24 countries in which internet access reaches fewer than 10% of the population. If epidemic surveillance depends upon social media and online searches, then any outbreak of disease amongst the majority communities living entirely off the grid will not be captured. In this false negative scenario, the risk is high that an outbreak of infectious disease is not curtailed because it isn’t identified. People’s lives and health are threatened, especially those of people too poor, marginalized, or remote to have access to the internet or to other forms of electronic communication.
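The difference between the two kinds of error can be made concrete with a small sketch. The records below are entirely hypothetical, as is the simplifying assumption that an online-only surveillance system can raise a flag only where people have internet access; it is an illustration of the reasoning above, not a model of any real system.

```python
from collections import Counter


def classify(flagged: bool, outbreak: bool) -> str:
    """Label one surveillance result against what actually happened."""
    if flagged and outbreak:
        return "true positive"
    if flagged and not outbreak:
        return "false positive"  # alarm raised, but no outbreak (e.g. people searching 'flu')
    if not flagged and outbreak:
        return "false negative"  # real outbreak missed (e.g. the community is offline)
    return "true negative"


def online_only_flag(has_internet: bool, outbreak: bool) -> bool:
    """Hypothetical online-only system.

    Assumption: it can only see communities that are online and, for
    simplicity, it flags every real outbreak it can see.
    """
    return has_internet and outbreak


# Hypothetical records: (community, has internet access, real outbreak occurred)
records = [
    ("urban, online", True, True),
    ("urban, online", True, False),
    ("rural, offline", False, True),
    ("rural, offline", False, True),
]

counts = Counter(
    classify(online_only_flag(net, real), real) for _, net, real in records
)

# Share of real outbreaks that the system actually detects.
sensitivity = counts["true positive"] / (
    counts["true positive"] + counts["false negative"]
)

print(dict(counts))
print(f"Share of real outbreaks detected: {sensitivity:.2f}")
```

In this toy example both offline outbreaks are counted as false negatives, however well the system performs among the communities that are online.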

It has long been recognized that there is a digital divide both between and within countries that, globally, leaves four billion people without internet access. The consequences of this divide could become an even greater human rights risk if our means of monitoring disease or reporting on SDG health indicators become dependent on Big Data. It has been suggested that Big Data could monitor SDG indicators on malaria, TB and HIV, “complementing traditional data sources and filling the gaps where they exist”. But there are two problems in poor countries with limited online access. First, Big Data arises from only small segments of the population, undoubtedly the wealthier urban communities. Second, countries without high internet coverage are the same countries that are missing the traditional data generated by national statistics offices. Of the 24 countries with fewer than 10% of the population having online access, the maternal mortality ratios range from 115 to 1374 (average 532) per 100,000 live births, compared with the average of 14 across OECD countries. Nations that have low internet access rates also have poor health systems, including health information systems.

This means that if we want to know whether people are being left behind in the global campaign to eliminate poverty and achieve health for all, we must first build robust statistics systems and strong health systems that include good flows of health data. This should be the primary focus of overseas development assistance to improve data – not investment in Big Data initiatives in countries with weak national statistics systems. Big Data of course has important roles to play in supplementing data collection in countries that have well-functioning national statistics offices; but in those without, developing reliance on Big Data tools will increase the risk that those people living off the grid become even less visible, with health rights unrealized, and are left even further behind.

Disclaimer: The views expressed herein are those of the author(s) alone.