We’ve already established that big data is a
game changer. This is as true for business as it is for our daily lives,
including our individual and public health.
Big data is already being used in
health research, preventative health care, treatment and therapy. For example,
new websites are popping up that allow patients to track their symptoms and
outcomes in real time. This allows others in similar medical situations to
analyze the individual and aggregate outcomes to help make their medical
decisions. Some of these websites, like PatientsLikeMe,
even show an expected outcome for your case
for each potential intervention based on the available data on the website
(watch the website creator explain it here). A January 2013 McKinsey report entitled “The
‘big-data’ revolution in health care” describes the impact of big data and how
it is helping and can continue to help lead to right living, right care, right
provider, right value and right innovation.
The May 2011 McKinsey report Big
data: The next frontier for innovation, competition, and productivity estimates that “big data can enable more than $300
billion a year in value creation in US health care”. As exciting as the potential of big data is for public health, I
can’t help but ask: does big data represent the next big thing in health inequality?
It has long been established that health is
strongly correlated with socioeconomic status (just check out Gapminder if you aren’t convinced). Low- and middle-income countries bear a
disproportionate mortality and morbidity burden. In addition, within a country,
poor, vulnerable and marginalized groups have inferior health and health
outcomes.
Traditionally, these are the groups who
receive the lowest level of health care coverage and fewest research dollars –
currently less than 10 percent of medical research is devoted to
diseases that account for more than 90 percent of the global burden of disease. Unfortunately, they are also the groups who generate the least
data and, consequently, where big data will have the least impact.
If big data is going to revolutionize our
world as quickly and as profoundly as predicted, and if “primary data pools are
at the heart of big-data revolution in healthcare” (McKinsey, 2013 report), it
is these already underserved and overburdened groups who will benefit the
least. This risks creating an even greater health inequality.
What’s the solution? We need to work to
generate a robust data reserve for underserved groups and start using it. Here
are some simple steps that we can take to get there:
Go digital: Many of the
information systems used in global public health don’t give us big data because
they don’t use digital data collection. Many large and resource-intensive sociodemographic
and epidemiological surveys are still conducted using paper and are never fully
entered into a digital format. New technologies and tools should make these
collection methods a thing of the past. Joel Selanikio explains the problem and presents
one great solution in his TED talk (here).
Ask for help…from non-experts: A research team from the Harvard School of Public Health used 1,000 non-scientist volunteers to analyze an enormous data
set of tuberculosis bacterium growth videos. The volunteers were able to
analyze the information in two days, a task that would normally have taken the
research team three months. We need to start unleashing the power of
crowdsourced solutions.
Open your mind to open data:
The culture of traditional research has created incentives for groups to
protect their data. This has to become a thing of the past. Key players like the
United Nations, including the World Health Organization, have already made
their data sets public. All organizations, countries and researchers should
follow suit and allow the public access to non-personal health information so
that evidence-based public health can be crowdsourced for all diseases across
various contexts.
Paper Data Collection on Health in Rural India