The Cult of Big Data

“Data is quickly becoming the planet’s most abundant resource” – IBM
“Data is the new currency” – Microsoft
“90% of all information ever created was produced in the last two years” – Mike Lynch, Autonomy

The philosopher’s stone, the perpetual motion machine, cold fusion. We once dreamed physics or chemistry would offer the panacea, but now it is immaterial – data.

Its touch has not yet cured scrofula, but you can be sure the answer’s contained somewhere within the genome.

Data is talked about like it’s new – as though the Library of Alexandria did not contain information, and the book was never a technology.

According to Google Books (our modern day equivalent) it seems “data” is a creation of the 20th century. Google Search analytics show the term “big data” coming almost out of nowhere in 2011 and accelerating in popularity through 2012.

[Image: Google Books ngram chart]

[Image: Google Search trend]

And since 2011, apparently, we’ve created 90% of the data ever in existence.

We risk mistaking this for 90% of the things worth knowing, or 90% of the answers to the many and multifaceted concerns of the world.

Big problems – but hey, we have big data! say the rhetoricians at TED, SXSW, the Economist, Harvard Business Review, Davos. Big data gonna fix it all – now that we can measure the world, now that we can store it, now that we know.

Look at all the economic data that exists – hundreds of years of interest rates, hundreds of thousands of stocks. All accurate to hundredths of a second, hundredths of a point. Do you think the next financial crisis will be predicted sufficiently to stop it?

Managing data is difficult. Doing a decent analysis is harder. Mostly analysis shows you the conclusions you were expecting to find, because your expectations shaped the methods of analysis you chose to use.

The algorithm is the alchemy of our age. We forget that it simply means a sequence of instructions. And instructions require someone to do the instructing, and someone to say when something has been found. Algorithms are social processes, cultural objects – not objective. Take any big enough dataset and you will find statistically significant differences. That is a truth about the universe; the differences found are not necessarily.
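A minimal sketch of the point about sample size, under stated assumptions: two groups of equal size are compared with a two-sample z-test, each has unit variance, and the observed difference between them is a hypothetical 0.01 standard deviations (far too small to matter in practice). The function name `two_sided_p_value` is made up for this illustration. It shows only one mechanism behind the claim: hold a trivial difference fixed, grow the dataset, and the p-value collapses.

```python
import math


def two_sided_p_value(effect_size: float, n_per_group: int) -> float:
    """Two-sided p-value of a two-sample z-test.

    effect_size is the observed mean difference in units of the
    (assumed common) standard deviation; n_per_group is the size
    of each of the two groups.
    """
    # Standard error of the difference in means is sqrt(2 / n),
    # so the z statistic is effect_size * sqrt(n / 2).
    z = effect_size * math.sqrt(n_per_group / 2)
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical, practically meaningless difference between two groups.
tiny_difference = 0.01

for n in (1_000, 10_000, 100_000, 1_000_000):
    p = two_sided_p_value(tiny_difference, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n per group = {n:>9,}: p = {p:.3g} ({verdict} at 0.05)")
```

The difference never changes; only the amount of data does. Whether a "significant" result means anything is still a judgement someone has to make.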

This is not an original argument – you’ve read Kate Crawford’s piece on “The Hidden Biases in Big Data”.

So instead, let’s ask – Why do we need Big Data to be this panacea? What is it about Big Data that makes it the Next Big Thing?

In 2007-08, the West didn’t just suffer a financial crisis – it suffered a crisis of meaning. What was the narrative? How did the world actually work? “Boom and bust” had not stopped; trickle-down economics had turned out to have stopped some time in the 1970s, and capitalism was quite evidently not working. But there was also no alternative – capitalism certainly didn’t fall and mostly neither did governments. We had a go at the narrative of power shifting from West to East – Dubai! Where is Dubai now? And China’s GDP statistics are understood to be fictional.

Meanwhile we see a succession of the hottest years on record. Beijing residents wear gas masks to protect against air pollution, and earlier this month carbon dioxide exceeded 400ppm, the highest level since the Pliocene.

Data is urgently needed as “the planet’s most abundant resource” because nothing else is in fact abundant.

Capitalism is above all an ethos of growth, and data offered the best equation for growth yet – Moore’s Law.

Exponential growth for data storage, for processing power – for data itself, and (we hope) the ability to make some sense of it. If we all pretend to be confident that data can be turned into transformative business value and profit – then capitalism is safe. We can continue the long march upwards into the sunny uplands of prosperity – and big data will tell us how to fix the air and the water and the oil supply so that’ll be alright too.

But “The best minds of my generation are thinking about how to make people click ads.”

Advertising may be able to create desires, but it can’t create the capacity to fulfil them. Consumers’ disposable incomes have fallen. Retail spending has been essentially static since 2007. In the UK retail sales are worth £311 billion. UK advertising spend is £16.8 billion. These are limits to growth for the value of consumer data.

Not that the speakers at digital media conferences have noticed.

[Image: data new oil]

So what?

Well, big data isn’t just about consumer data, that’s for sure – much as advertising does provide the financial foundation of the web. And I’m certainly not wishing to write off the power of data analysis.

I spoke earlier this month as part of Auto Italia’s Immaterial Labour Isn’t Working series. In the discussion following, an audience member commented wryly that “Big Data is a CEO term. Anyone who actually works with it just calls it ‘data’”.

I am interested in that distinction. How did data suddenly get Big (and capitalised) – when did a dataset’s size become a proxy for its power? I’m interested in the hopes, beliefs and rhetoric attached to it. Perhaps I’m interested in data as an object of faith.

Seeing Like A Database: the problems with big data

“Big data” has been one of the buzzwords of 2011, and grand claims are being made for its power:

The world is becoming data-ized as digital information and numerical measurement is being applied to all aspects of what people do, particularly things that couldn’t be measured before because it was impractical or impossible. (Think: using wireless and GPS in cars to base insurance premiums on where and when people actually drive, as has been possible since 2007.)

The impact will be as profound as the scientific method in the 18th century — which quickly moved past the sciences and left its mark on all areas of human endeavor. For instance, what is “quantitative decision making” in management, if not the scientific method applied to business…. Likewise, the BigData revolution is plowing through the sciences, and also jumped into mainstream areas, such as business and government.

Data; boring but… by Ken Cukier, 6 March 2011

The problem with these claims is that they conflate increased power to capture and store data with (i) being able to extract meaningful insights from it, and (ii) being able to successfully act on and implement these insights, with (iii) no unexpected or adverse effects. Clearly cracking the first part doesn’t save the world on its own.

Further, big data evangelism often trips over into technocratic thinking, a belief that ‘nerdpower makes right’. Excitable blogposts about exabyte datasets, rather than defining the right problems to solve. Wide-eyed admiration for the amount of data that can be gathered, without recognition for the ethical rightness (or otherwise) of doing so.

Which is to say that big data is fundamentally political. Whether we choose to theorise it as technology or knowledge [actually, there’s a good PhD proposal…], the act of recording the world in this way privileges particular values, worldviews and types of action.

In his blog post Lessons of the Victorian data revolution, Pete Warden insightfully makes the connection with technocratic thinking and brings in that great study of central planning, James Scott’s Seeing Like A State:

James Scott’s “Seeing Like a State” looks at the legacy of the Victorian scientific revolution, and shows how the very success of its ideas had a dark side. [Similarly,] Creating datasets may help technical people […] to understand problems and propose solutions, but it also means that […] other people with deep, lived experience of the domains will be overruled. In the 20th century the prestige of the scientific toolkit was used to justify disasters like the collectivization of agriculture, as technocrats around the world wielded numbers to take power away from “inefficient” smallholders. Those figures were mostly proven bogus by reality, as plans with no knowledge of conditions on the ground failed when confronted with the wildly variable conditions of soil, weather and pests that farmers had spent a lifetime learning to cope with.

Lessons of the Victorian data revolution by Pete Warden

If you’ve not read ‘Seeing Like A State’, incidentally, I recommend it. In it Scott surveys the great utopian schemes of the 20th century, from the Le Corbusier-inspired urban planning of Brasilia to Russian collectivisation of agriculture and China’s Great Leap Forward. Each was well-intentioned and yet a spectacular failure, with millions of deaths between them. His argument is that centrally-managed planning does not work because it rides roughshod over the complex interdependencies on the ground.

Perhaps, under ‘big data’ ideology, we might ask – is this not simply a problem of too little information? We have the capacity to measure everything now – did Corbusier or Stalin fail because their data was not sufficiently granular?

Scott would disagree. The problem at hand is not quantity of knowledge but its very type. Common to each central planning disaster is a belief in a high-modernist ideology claiming that science can improve every aspect of human life, and an authoritarian central power willing to effect large-scale re-orderings of society and nature. “Big data will solve everything” can, clearly enough, be another iteration of the same. Scott’s criticism – and, in fact, Friedrich Hayek’s criticism of centrally-planned economies (The Road to Serfdom, 1944) – is that this disregards local and personal knowledge (Scott might add, embodied and tacit knowledge), and the complex diversity of organisation required and ends sought. (Hayek may believe that this can be summarised through the price mechanism, but Scott’s metis (local knowledge) is rather less reducible than that.)

Central planning – or big data – may seek to make the complexity of local situations legible to systemised, technocratic thinking – but the two are essentially incommensurable. Talk of ‘big data’ needs to be visible as something bringing with it a particular modernist worldview, and alongside that a particular relationship of power over the specificities – places, people – represented as nodes and datapoints. Technology is rarely value-neutral.

*

This is not to say, however, that ‘big data’ is necessarily socially oppressive. Perhaps there are alternatives – I am still thinking this through.

In my recent Bugged Planet post, I drew attention to Indy Johar’s tweet where he noted “the asymmetry of personal data, open for the 99% & deep analytics for the 1%” [source]. This raises the question, what if the analytics were open to the 99%? What would this take, what would this look like, and would it actually redistribute power in any way?