The Cult of Big Data

“Data is quickly becoming the planet’s most abundant resource” – IBM
“Data is the new currency” – Microsoft
“90% of all information ever created was produced in the last two years” – Mike Lynch, Autonomy

The philosopher’s stone, the perpetual motion machine, cold fusion. We once dreamed physics or chemistry would offer the panacea, but now it is immaterial – data.

Its touch has not yet cured scrofula, but you can be sure the answer’s contained somewhere within the genome.

Data is talked about like it’s new – as though the Library of Alexandria did not contain information, and the book was never a technology.

According to Google Books (our modern-day equivalent), “data” seems to be a creation of the 20th century. Google Search analytics show the term “big data” coming almost out of nowhere in 2011 and accelerating in popularity through 2012.

[Figure: Google Books Ngram chart]

[Figure: Google search trend chart]

And since 2011, apparently, we’ve created 90% of the data ever in existence.

We risk mistaking this for 90% of the things worth knowing, or 90% of the answers to the many and multifaceted concerns of the world.

Big problems – but hey, we have big data! say the rhetoricians at TED, SXSW, the Economist, Harvard Business Review, Davos. Big data gonna fix it all – now that we can measure the world, now that we can store it, now that we know.

Look at all the economic data that exists – hundreds of years of interest rates, hundreds of thousands of stocks. All accurate to hundredths of a second, hundredths of a point. Do you think the next financial crisis will be predicted sufficiently to stop it?

Managing data is difficult. Doing a decent analysis is harder. Mostly analysis shows you the conclusions you were expecting to find, because your expectations shaped the methods of analysis you chose to use.

The algorithm is the alchemy of our age. We forget that it simply means a sequence of instructions. And instructions require someone to do the instructing, and someone to say when something has been found. Algorithms are social processes, cultural objects – not objective. Take any big enough dataset and you will find statistically significant differences. That you will find them is a truth about the universe; the differences themselves are not necessarily true.
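A rough sketch of that last point, for anyone who wants to see it happen. It assumes nothing but random noise: two groups are drawn from the same distribution, over and over, and a standard two-sample z-test is applied each time. The function names, the 200 variables and the 0.05 threshold are illustrative choices, not anything taken from a real dataset.

```python
import math
import random
import statistics


def two_sided_p(a, b):
    """Two-sample z-test p-value (normal approximation, reasonable for large samples)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    # For a standard normal, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def count_spurious(n_variables=200, n_per_group=500, alpha=0.05, seed=0):
    """Count 'significant' differences between groups that are, by construction, identical."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_variables):
        # Both groups come from the same distribution: there is no real effect to find.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    print(count_spurious(), "of 200 noise variables look 'significant' at p < 0.05")
```

On average about one comparison in twenty clears the threshold, so the more variables you have, the more "findings" you collect. Which of them get reported is a human decision.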

This is not an original argument – you’ve read Kate Crawford’s piece on “The Hidden Biases in Big Data”.

So instead, let’s ask – Why do we need Big Data to be this panacea? What is it about Big Data that makes it the Next Big Thing?

In 2007-08, the West didn’t just suffer a financial crisis – it suffered a crisis of meaning. What was the narrative? How did the world actually work? “Boom and bust” had not stopped; trickle-down economics had turned out to have stopped some time in the 1970s, and capitalism was quite evidently not working. But there was also no alternative – capitalism certainly didn’t fall and mostly neither did governments. We had a go at the narrative of power shifting from West to East – Dubai! Where is Dubai now? And China’s GDP statistics are understood to be fictional.

Meanwhile we see a succession of the hottest years on record. Beijing residents wear gas masks to protect against air pollution, and earlier this month carbon dioxide exceeded 400ppm, the highest level since the Pliocene.

Data is urgently needed as “the planet’s most abundant resource” because nothing else is in fact abundant.

Capitalism is above all an ethos of growth, and data offered the best equation for growth yet – Moore’s Law.

Exponential growth for data storage, for processing power – for data itself, and (we hope) the ability to make some sense of it. If we all pretend to be confident that data can be turned into transformative business value and profit – then capitalism is safe. We can continue the long march upwards into the sunny uplands of prosperity – and big data will tell us how to fix the air and the water and the oil supply so that’ll be alright too.

But “The best minds of my generation are thinking about how to make people click ads.”

Advertising may be able to create desires, but it can’t create the capacity to fulfil them. Consumers’ disposable incomes have fallen. Retail spending has been essentially static since 2007. In the UK, retail sales are worth £311 billion. UK advertising spend is £16.8 billion. These are limits to growth for the value of consumer data.

Not that the speakers at digital media conferences have noticed.

[Figure: “data new oil”]

So what?

Well, big data isn’t just about consumer data, that’s for sure – much as advertising does provide the financial foundation of the web. And I’m certainly not wishing to write off the power of data analysis.

I spoke earlier this month as part of Auto Italia’s Immaterial Labour Isn’t Working series. In the discussion following, an audience member commented wryly that “Big Data is a CEO term. Anyone who actually works with it just calls it ‘data’”.

I am interested in that distinction. How did data suddenly get Big (and capitalised) – when did a dataset’s size become a proxy for its power? I’m interested in the hopes, beliefs and rhetoric attached to it. Perhaps I’m interested in data as an object of faith.

Comments

  1. Malte

    There’s a tiny detail. Up until the 20th century, most data was created with a certain purpose. Now most data is created as a side effect, or rather is available for uses it wasn’t created for, purposes no one even thought of when creating it. Of course, to a certain point you could read a book from the Library of Alexandria to answer questions the author didn’t ask. But with the connection of data sources there is an incredible rise in asking the questions after getting the data. For example in medicine, by connecting millions of patients’ health histories to discover effects and cures and connections between diseases. In the end, the rise of data boils down to a huge network effect. Of course you still have to ask questions, but the new thing is that the data itself can point you towards the questions and provide the answers.
