Penny Andrews, Jo Bates and Paula Goodale talk about the Data Journeys methodology, developed at the Information School at the University of Sheffield, which aims to help people understand the diverse factors that shape data flows between teams, organisations and sectors.
Public and third sector organisations face increasing pressures to share and open their data in order to enable the development of more efficient and personalised services, and to keep sensitive data secure. Human factors are one of the most significant issues managers face when evaluating and addressing data sharing challenges.
The Data Journeys methodology aims to help people to understand how societal and organisational cultures, public policies and economic issues come together to shape how data move, or fail to move, between different teams, organisations and sectors.
In our research developing the Data Journeys methodology, we focused specifically on the journey of weather temperature data produced by public organisations and citizen scientists. We then followed these data as they moved through different sites of practice, from production on into re-use in different contexts including climate science and financial markets. The Data Journeys methodology allowed us to begin to capture some of the ways in which diverse social worlds are becoming increasingly interconnected and interdependent as a result of these complex data flows.
How did we get here?
The research underlying this work was developed on an AHRC-funded project: The Secret Life of a Weather Datum (SLWD), which Jo and Paula worked on with their collaborator Yuwei Lin (University of Stirling). This project aimed to pilot a new approach for better understanding and communicating how values and practices influence the transformation of weather data on its journey from production through to various contexts of ‘big data’ re-use, and how these values and practices themselves transform as they interact with the data at various moments over the course of its journey.
Through focusing on the small in ‘big’ weather data by tracing the (metaphorical) journey of a single weather datum from its production to its re-use in different contexts, the project developed an innovative conceptualisation of ‘big data’ production and use that encourages engagement with the complex socio-cultural processes shaping contemporary data flows.
The project developed four distinct, but interconnected case studies, which enabled exploration of values and practices shaping weather data production, collation, distribution and re-use across institutions of the state and market, and in the collective actions of citizen groups.
The case studies focused on:
1. The initial production of the weather datum by Sheffield’s Weston Park weather station and its journey to the UK’s Met Office
2. Re-use of the data in the weather risk and derivatives markets in the UK’s financial sector
3. Re-use of the weather data within key centres of UK Climate Science including the Met Office Hadley Centre
4. The intersecting journeys of data produced by amateur weather observers and citizen scientists who have been transcribing historical weather data from archived ship log books
Developing Data Journeys: collecting data
We started by identifying a number of UK-based sites of weather data production, processing, and use. We then iteratively mapped the journeys of data between these and other relevant organisations, projects, datasets and individuals using post-it notes on flipchart paper.
We decided to focus on the journeys of data produced at our local weather station (Sheffield Weston Park) and data produced by amateur weather observers and citizen scientists. We then followed these data on their journeys from sites of production on into processing by the Met Office, and then on into re-use in climate science and financial markets.
In total, primary data were gathered in relation to eight sites of data practice: Sheffield’s Weston Park weather station, Met Office headquarters in Exeter, the Climatic Research Unit at the University of East Anglia, the Intergovernmental Panel on Climate Change (IPCC), archives that store historical weather observations, the Old Weather citizen science project, amateur weather observers in distributed locations, and a firm that supplies weather data to the weather derivatives market. We used a range of methods to collect our data: in-depth interviews incorporating an oral history component, field observations, digital ethnography of selected forums and Twitter hashtags, and documentary analysis of policies, legislation and other relevant sources.
Understanding Data Journeys: what we found out
We found that even at the micro level of a digital datum – for example, the production of a temperature recording of 18.5℃ on 24 June 2014 – we could see how all the different elements involved in data production and sharing, evolving over time, seemed to become real at particular moments.
Through examining the journey of data between different sites, we were able to identify socio-cultural factors that enable and restrict the movement of data and note sites of potential movement, blocked movement and lack of movement.
Even when data can move between sites, they do not necessarily move smoothly or easily from one place to another – they experience ‘friction’ because they can’t be separated from their social and material contexts. This is clear in our examples of climate data sharing and attempts to recover historical weather data. Jo’s upcoming article on ‘The Politics of Data Friction’ in the Journal of Documentation will explore this idea in greater depth.
Using Data Journeys helped us to examine the ways in which data producers’ cultural values become embedded in the data that they produce, which affects its reliability and biases, and even whether or not it is created in the first place. This is particularly important in the current context of ‘fake news’, lack of transparency from governments and corporations, and the popular misconception that ‘the data can’t lie’.
Information Professional readers will recognise many of the factors that affect the movement of data, such as the design and implementation of public policy around the re-use of public sector information; organisational cultures; public funding priorities; the way data is processed for use; and the dynamic nature of ‘data journeys’ over time as organisational changes and the wider social context make their mark.
Sharing Data Journeys
When working on this project and the Data Journeys methodology, we also aimed to explore ways in which research data and findings could be shared with non-academic audiences.
Our initial findings have been published on a public-facing website – http://lifeofdata.org.uk – that was developed as part of the project. The interactive website draws upon a tube map metaphor in order to represent visually the journey of data as they move between the different sites of data practice that we explored. Each of these sites is represented by a station on the tube map, and clicking on a station allows the user to dig deeper into the detail.
Within each station the user is invited to explore the different data practices, cultures, and public policy frameworks that contribute to the production of digital data, and shape their movement between, and use across, different sites. Where possible, original research data including audio interviews and photographic images are embedded into the website to bring the story to life. The website is designed to incorporate participants’ memories of particular moments during the evolution of data practices over time, which ‘humanises’ the data journeys.
Sharing qualitative research data is still a tricky journey, but we uploaded what we had permission to share to the Internet Archive and licensed it using Creative Commons licences (where permissions were granted by participants). See http://bit.ly/2gc8E0F
Everything developed as part of the project is openly licensed. The WordPress code is Open Source and the website, lifeofdata.org.uk, is licensed under Creative Commons.
Engaging with our community
Members of the project team also contributed to the development of a guide to building a Raspberry Pi weather station – sheffieldpistation.wordpress.com. The Open Source code for the weather station was published on GitHub, and one re-user of the code has told us about the development of teaching-related projects he is involved in using the code. Tied to this activity, we also developed a short 10-15 minute ‘citizen science’ activity in which participants get to put together a Raspberry Pi weather station and transmit a temperature observation. About 250 people built their own weather stations and sent temperature observations to WOW – the Met Office amateur observers website.
The aim of our public engagement activities has been to spark curiosity and interest in data, coding, and technology, to help people understand the growing relevance of data within the economy and society, and to demonstrate to them that they can get involved in a variety of data practices and citizen science projects. This hands-on activity was an excellent opportunity for us to introduce participants to the concept of data journeys, which is at the heart of the SLWD project, as well as more general issues around data sharing, open data, and citizen science. Quite a few participants had Raspberry Pis at home, but didn’t know how to get started – we provided a postcard with a link to the website on how to build and code a weather station from scratch.
What’s the next stop?
Project members have written journal articles (http://bit.ly/2g20LqU), and Principal Investigator Jo currently has two PhD students who are using the Data Journeys methodology in projects on Smart Cities and NHS data governance. The team hopes to continue empirical work on Data Journeys. Jo is also working with Sheffield City Council to explore ways to develop the impact of the methodology outside academia, for example through student projects, knowledge transfer partnerships, collaborative PhDs and more. If you’re an information professional working with data, data journeys (and collaborations with the team) could be exactly what you’re looking for!
Find out more and get involved
If you’re interested in applying the methodology, but lack the time and resources, contact the research team to explore possible collaboration opportunities by emailing Dr Jo Bates at firstname.lastname@example.org.
If you use Data Journeys in your own work, please let us know (and cite us!).
Penny Andrews (@pennyb) is a doctoral researcher in the Information School at the University of Sheffield and a research fellow in Politics at the University of Leeds.
Jo Bates (@j0bates) is a lecturer in Information Politics in the Information School at the University of Sheffield.
Paula Goodale (@PaulaGoodale) is a lecturer at the University of Sheffield.