In his classic 1963 study of the emergence of clinical medicine, Michel Foucault described the presence of what he called the ‘speaking eye’. His point was that novel analytical spaces had emerged in which new possibilities for observation were coupled with a new descriptive language. Over the last 20 years we have seen something similar happening with data analytics. The difference, as I explore in my new book The Data Gaze, is that we are all frequently, sometimes constantly, exposed to the gaze of the new codified clinic. The rise of data analytics has brought with it an unrelenting stare. New types of knowledge and norms are emerging as we become exposed to its roaming gaze. Whether we have noticed it or not – occasionally stories like those about Cambridge Analytica or a periodic hacking scandal make us momentarily aware – data analytics have taken on huge influence over our lives.
It is a dual process of distribution that makes the gaze of data analysis so hard to escape. This distribution works in two distinct ways. In terms of the materiality of these systems, they operate through distributed computation. The analytics are facilitated by the delegation of processing to different devices, which are nodes within large computational networks. Over the last decade a large assemblage of products has emerged for handling and managing this distribution. Open source software projects like Hadoop, managed by the Apache Software Foundation, have been crucial to this, as have related developments including YARN, which enable delegation across these distributed networks as well as facilitating the isolation of data sources and nodes. There are numerous other products that work in different combinations – managed by data engineers – to handle the different data flows required.
This first computational stage of distribution means that ever greater amounts of data can be stored, accessed and analysed together. The data and analytics are distributed across these infrastructures. This distributed computing enables the analyses to run across different data sources and to operate in real time. Far from occupying a centralised space, this is a form of surveillance that is networked into our lives and is part of the material infrastructures in which we live. The data we produce are stored and cleaned in this codified clinic ready for analysis – data are also duplicated across nodes to limit loss and damage, which opens us up to scrutiny whilst also rendering things unforgettable.
More importantly, analytics are also distributed across organisations. We often think of data analytics as being performed by specialists in some dedicated office – with data visualisations produced that are then passed on to their customers. This does happen, but a good deal of the analytics industry is geared toward providing software ‘solutions’ that can turn us all into data analysts. Here the hyperbole about what the software can achieve mixes with the reality, yet the result is that data analysis spreads far beyond the specialist or the dedicated service provider and becomes embedded in the organisations and institutions that surround us. Since the 1960s a new division of labour has emerged that is demarcating the various types of organisational roles of those involved in data analytics. These roles have slowly been built upon and formed into discrete responsibilities, often based upon background, qualifications and the types of tools that the individual uses. This shows how these roles have become established within lots of different types of organisational structures, rather than sitting outside those spaces.
As this might suggest, this is not a centralised form of power; it is distributed across both computing networks and our organisations. The analytics industry is seeking to turn us all into data analysts and is therefore seeking to embed the gaze of data analysis into social structures of different types. Rather than being a distant presence, data analytics become embedded into the very structures of our organisations and lives. Its many protagonists lend it ever greater scope and vision. Many eyes look upon us through our data.
I discovered that the data gaze does not stop there. Its pursuit of a kind of pure vision of the world means that it even seeks to watch over itself. The infrastructures and the practices of data analytics are based on a kind of double vision – a hyper-surveillance perhaps – in which it watches the data, and thereby watches us, whilst also watching over its own processes and systems. This is how data analytics build their legitimacy and persuade people of their objectivity and power. Norms are concretised through the persuasive power of this double focus. It gets its legitimacy from the pursuit, if not the achievement, of a self-observed and ever more tailored codified clinic. Powerful ideals and visions blend with the material realities and practices. It is hard to separate the two, particularly as those ideals – of amped-up competitiveness, objectivity, efficiency and speed – provide the justification and direction for the material realities.
When we take all of this together, we get a vision of a form of knowledge that is based upon an ability to see directly into organisational structures and individual lives. The distribution of a data gaze across computing networks and organisational structures has produced a decentred form of surveillance that is acting upon us from different angles. Through these processes of distribution, data analytics spiral outwards into the structures of our lives. They don’t operate at a distance or within dedicated spaces; they operate within the ordinary structures that frame our lives. What we are looking at with data analytics is the way that power operates in what has been described as ‘platform capitalism’ – it is in data analytics that the forces of those shifting dynamics are felt. It is the dual process of distribution, both computational and organisational, that makes the gaze of data analytics so hard to escape.