Seeing Like A Database: the problems with big data

“Big data” has been one of the buzzwords of 2011, and grand claims are being made for its power:

The world is becoming data-ized as digital information and numerical measurement is being applied to all aspects of what people do, particularly things that couldn’t be measured before because it was impractical or impossible. (Think: using wireless and GPS in cars to base insurance premiums on where and when people actually drive, as has been possible since 2007.)

The impact will be as profound as the scientific method in the 18th century — which quickly moved past the sciences and left its mark on all areas of human endeavor. For instance, what is “quantitative decision making” in management, if not the scientific method applied to business…. Likewise, the BigData revolution is plowing through the sciences, and also jumped into mainstream areas, such as business and government.

Data; boring but… by Ken Cukier, 6 March 2011

The problem with these claims is that they conflate an increased power to capture and store data with (i) the ability to extract meaningful insights from it, (ii) the ability to act on and implement those insights successfully, and (iii) doing so without unexpected or adverse effects. Clearly, cracking the first part doesn’t save the world on its own.

Further, big data evangelism often tips over into technocratic thinking, a belief that ‘nerdpower makes right’. It produces excitable blogposts about exabyte datasets rather than attempts to define the right problems to solve, and wide-eyed admiration for the amount of data that can be gathered without any consideration of the ethical rightness (or otherwise) of gathering it.

Which is to say that big data is fundamentally political. Whether we choose to theorise it as technology or knowledge [actually, there's a good PhD proposal...], the act of recording the world in this way privileges particular values, worldviews and types of action.

In his blog post Lessons of the Victorian data revolution, Pete Warden insightfully makes the connection with technocratic thinking and brings in that great study of central planning, James Scott’s Seeing Like A State:

James Scott’s “Seeing Like a State” looks at the legacy of the Victorian scientific revolution, and shows how the very success of its ideas had a dark side. [Similarly,] Creating datasets may help technical people [...] to understand problems and propose solutions, but it also means that [...] other people with deep, lived experience of the domains will be overruled. In the 20th century the prestige of the scientific toolkit was used to justify disasters like the collectivization of agriculture, as technocrats around the world wielded numbers to take power away from “inefficient” smallholders. Those figures were mostly proven bogus by reality, as plans with no knowledge of conditions on the ground failed when confronted with the wildly variable conditions of soil, weather and pests that farmers had spent a lifetime learning to cope with.

Lessons of the Victorian data revolution by Pete Warden

If you’ve not read ‘Seeing Like A State’, incidentally, I recommend it. In it Scott surveys the great utopian schemes of the 20th century, from the Le Corbusier-inspired urban planning of Brasília to Russian collectivisation of agriculture and China’s Great Leap Forward. Each was well-intentioned and yet a spectacular failure, in some cases at the cost of millions of lives. His argument is that centrally-managed planning does not work because it rides roughshod over the complex interdependencies on the ground.

Perhaps, under ‘big data’ ideology, we might ask – is this not simply a problem of too little information? We have the capacity to measure everything now – did Corbusier or Stalin fail because their data was not sufficiently granular?

Scott would disagree. The problem at hand is not the quantity of knowledge but its very type. Common to each central planning disaster is a belief in a high-modernist ideology claiming that science can improve every aspect of human life, and an authoritarian central power willing to effect large-scale re-orderings of society and nature. “Big data will solve everything” can, clearly enough, be another iteration of the same. Scott’s criticism – and, in fact, Friedrich Hayek’s criticism of centrally-planned economies (The Road to Serfdom, 1944) – is that this disregards local and personal knowledge (Scott might add, embodied and tacit knowledge), and the complex diversity of organisation required and ends sought. (Hayek may believe that this can be summarised through the price mechanism, but Scott’s metis (local knowledge) is rather less reducible than that.)

Central planning – or big data – may seek to make the complexity of local situations legible to systemised, technocratic thinking – but the two are essentially incommensurable. Talk of ‘big data’ needs to be recognised as bringing with it a particular modernist worldview, and alongside that a particular relationship of power over the specificities – places, people – that it represents as nodes and datapoints. Technology is rarely value-neutral.

*

This is not to say, however, that ‘big data’ is necessarily socially oppressive. Perhaps there are alternatives – I am still thinking this through.

In my recent Bugged Planet post, I drew attention to Indy Johar’s tweet where he noted “the asymmetry of personal data, open for the 99% & deep analytics for the 1%” [source]. This raises the question, what if the analytics were open to the 99%? What would this take, what would this look like, and would it actually redistribute power in any way?

  • http://twitter.com/sedicious Sedicious

    Yes.  Google “sousveillance” and “transparent society” for more thoughts along this line.

    • http://twitter.com/hautepop hautepop

      Ah, that’s the topic of the next blog post I plan to write!

      The key issue to me is, however, that there’s a massive asymmetry of resources between what I (as a regular individual) can monitor and what companies and governments, with much more time, technology and expertise, can do. It’s not a level playing field, and as such is very likely to compound existing asymmetries of power more than it can disrupt them.

      Then, additionally, there is @Michael’s point above – by turning myself into an object of sur/sousveillance I am necessarily submitting to a process of reduction, constructing myself as an object in a database rather than a phenomenological subject. Who gains most from that? Those who benefit from an increasingly technocratic worldview – which is to say, probably not the individual subject.

  • Michael

    @Sedicious

    I’m not convinced that sousveillance is the solution to this problem. Decentralising the power to collect and analyse data would be a step away from central planning, but it wouldn’t take us outside the mindset that makes central planning so appealing, which is the data-driven mindset.
    If central planning’s claim is that more data will empower the state by enabling it to make better decisions, then sousveillance’s claim is that more data will empower individuals by enabling them to make better decisions. Both positions assume that what we need for better decision-making is more data.

    On the surface that’s a hard assumption to challenge: who could argue that less data would ever be preferable? But what the data-driven mindset overlooks is that choosing to treat the world as data excludes other forms of power-knowledge. When we ask for more data, we get less of something else. Treating the world as data – even Big Data – is an act of reduction, regardless of whether the world’s reduced to database rows in a government server room or database rows in a million smartphones.

    Central planning might take as its motto “When one person dies it’s a tragedy; when a million people die it’s a statistic.” In that case perhaps the motto of sousveillance should be “When one person dies it can be a statistic too.”