When data science fails

3-minute read

The pandemic that Google failed to predict


In 2008, Google created Google Flu Trends to detect the emergence and spread of influenza in several countries.

As Google explained in an article published in Nature, the method consisted of analyzing data derived from millions of influenza-related searches.

For example, increases in searches for influenza drugs could indicate that the disease was on the rise.
Google often managed to stay ahead of even the official reporting systems of the most advanced countries, such as the United States or France.
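The core idea behind this kind of "nowcasting" can be sketched very simply: fit a relationship between past search volumes and reported cases, then use this week's searches to estimate cases before official figures arrive. The sketch below is illustrative only, not Google's actual model, and all numbers in it are made up.

```python
# Illustrative sketch (NOT Google's actual model): estimating flu cases
# from search volume with a simple least-squares fit.
# All figures below are synthetic, for demonstration only.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# Hypothetical weekly volumes of flu-related searches and reported cases.
searches = [120, 150, 200, 260, 310, 400]
cases    = [60,   75, 100, 130, 155, 200]

a, b = fit_line(searches, cases)

# Once fitted, the model can "nowcast" this week's cases from this
# week's searches, days before surveillance reports are published.
this_week_searches = 350
estimated_cases = a * this_week_searches + b
```

The weakness the article goes on to describe lives precisely in this step: the fit only captures that searches and cases moved together in the past, not *why* people searched, so any search that happens to rise during flu season can contaminate the estimate.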

But Google also failed. And the failures were noticeable from the beginning.

It missed the first wave of the H1N1 pandemic in 2009. And between 2011 and 2013 Google overestimated the number of cases for 100 out of 108 weeks.


Yes, Google got it wrong, even with its millions of data points.



Data blindness


So why did Google's data fail? What did it fail to show?

The answer is not a lack of data or of better data scientists. Of course, Google had access to the best.


Rather, Google's problem, like that of many companies, was its over-reliance on data analytics, without recognizing the need for other, non-quantitative paradigms to interpret contexts and people.

Google had the search data, but could not understand why people were searching.

This led to an overestimation of cases: Google ended up associating much of what was searched for during flu season with influenza, even when it had nothing to do with actual cases of the disease.


For example, it took searches such as "high school basketball" as an indicator of influenza.

Or it counted many different illnesses with similar symptoms as cases of influenza.

For more or less obvious reasons, it could not make diagnoses, as a doctor does when meeting a patient.



Interpret, not analyze


All companies face these limitations with data analysis. Although it may seem that having more and more data yields a better understanding of phenomena, that understanding is rarely more complete or better.

However, the data paradigm is so common that a data-free approach seems inaccurate and uncertain. Data are, after all, irrefutable and undeniable.



So we believe that whoever has the data has the truth.


And this is exactly the mistake. We cannot ask more from data than they can give us.

Qualitative paradigms are necessary, based on our ability to make synthetic and general interpretations of things.

And interpreting is a capacity that only human beings have. 



The data do not answer why. And they lack something: context.

Although data can be obtained from a context, that context is never explained by the data. We cannot reconstruct or understand a context by piecing together a lot of data. 

In this case, the whole is greater than its parts. 

Let's think of a very simple example: from a table we can know its color, texture, material, size, frequency of use, among other things. We can obtain all the measurable data of a table. 

But no matter how much we analyze the table, no matter how much we break it down into its parts, we will never know why, for example, a family uses it for eating, but not for breakfast. Nor can we understand why that family bought that table and not another. 

We cannot, precisely, interpret contexts and people. To do so, we need an overview, a perspective on the whole of a context and ways of being in the world.



The way forward for companies


The same thing can happen to many organizations. They certainly have the data on their best-selling products. Based on this data, they decide, for example, to increase the supply of a product. They conclude that their consumers value it more.

But why do they value it more? The data can't answer that. It only gives a partial signal. The full signal is the understanding, the interpretation, of what their consumers are doing and looking for.

It's the same with SEO: why create content built around "keywords" if we don't understand why a user searches for those words? We will never find that answer in Google Analytics.


Hence the value of human-centered design and design research, fed by the human sciences, such as philosophy and anthropology, and by tools like ethnographic interviews: they give us an empathetic, contextual understanding of people.

That's what we do every day in the Xperience research area: understand users and people in order to transform business models, create value propositions, and design experiences and channels that match their interests and ways of understanding life.

Do you want to talk about how to transform the understanding of your users and customers? Don't hesitate to contact us.



Mateo Rodríguez Arias

I am the research director at Xperience. I have 10 years of experience in Research, Branding, Trends, and Service Design. I teach graduate courses at the Colegiatura Colombiana. I studied graphic design at the Colegiatura Colombiana and hold a specialization in creative intervention from the same university. I have a master's degree in philosophy from the Universidad de Navarra, and another in digital communication. I have also studied advanced illustration, anthropology, trade shows, and marketing.