Toxic data

19 November 2019
Posted by: Rob Mackinlay

Five years ago, just before Christmas, I spat into a test tube, slipped it into an envelope and dropped it into a postbox already stuffed with seasonal greetings cards. A few weeks later I received an email letting me know that my personalised DNA test results were available, and I was able to log in to the testing company’s website to view them. The email and the website warned me that genomic testing can sometimes reveal unpleasant and unwelcome information, and not to take any drastic action impulsively. This is sage advice, given tragic cases such as that of the woman who had a preventative double mastectomy before discovering there was an error in her results.

As I’ve written before in Information Professional, my family has a mutation in the OPA1 gene, which interferes with vision. I could easily imagine someone opening their DNA test reports to discover something similarly life-changing – such as a positive result for the BRCA mutations that make them more likely to develop breast cancer. As it happens my own results were clear, although there was a treasure trove of fascinating information alongside them. The reports also came with the option to download my DNA sequence dataset. Of course I immediately did this, and just for fun I posted a copy to GitHub, the Microsoft-owned site where people collaborate on open source coding projects. Now I’m an open source human – you can grab a copy of my DNA and clone me. Just think of it as an early Christmas present from me to you, dear reader! A few years later that DNA testing company, 23andMe, inked a deal to share customer DNA sequences with pharmaceutical giant GSK to accelerate the drug discovery process. This was a truly fascinating moment – we often give exquisitely sensitive personal data to apps, but how often do we stop to consider how likely it is that they will sell it, share it or even leak it accidentally?

If this seems overblown, recall the recent Privacy International investigation which found that a number of period tracker apps shared some very private data with Facebook – such as whether you are trying to get pregnant, details of your health and wellbeing, whether you are using birth control and even (assuming you record it) when you have sex. Perhaps it’s no wonder some people have reported that online advertisers seem to magically know they’re expecting a baby. At Jisc we recently supported the All-Party Parliamentary Group on Data Analytics enquiry into Technology and Data Ethics, which came up with a number of very interesting ideas for improving trust and transparency in personal data. My personal favourite was the idea of a food labelling approach, with apps listing the personal data they consume and what they do with it – perhaps with a traffic light to signal how sensitive the data is. Just picture that being a precondition for developers to get their app listed in the Apple App Store or the Google Play Store.

And as I wrote in Information Professional earlier this year, we are on the cusp of a revolution in the way we store and use data, with the first commercially viable DNA-based data storage and retrieval systems only a few years away. Soon enough we will be encoding not just our photo collections but our most intimate data using technology that allows us to retain everything, and it will become increasingly important to figure out what to retain and what to dispose of. But perhaps we’re already at the stage where the personal data we have accumulated has the potential to become damaging. Much as an oil spill could cause serious environmental damage, the punitive sanctions associated with a “data spill” under the General Data Protection Regulation (GDPR) could prove ruinous for individuals and organisations alike. My bet is that there will be a whole new role for librarians and information professionals in ensuring that labelling and metadata make the data detox process as painless as possible.

Contributor: Martin Hamilton @martin_hamilton, Futurist at Jisc

Published: 17 November 2019
