Q&A with Stephan Holländer, scientific information specialist: Success of AI in academic libraries depends on good underlying data
Stephan will give a presentation as part of a breakout on data behaviour
with Julian Schwarzenbach and Caroline Carruthers at the forthcoming CILIP Conference 2019
in Manchester, in which he will be focusing on the use of unstructured data in libraries.
We hear from other sectors of the information industry about using Artificial Intelligence (AI). Why do we hear so little in this respect from libraries on this side of the Atlantic?
For librarians this topic means a break with some traditions that were established over the last century and are still practised in the present one. Librarians are first trained to handle structured data and have developed rules for library catalogues. With the introduction of the second generation of computer library systems, the rules of cataloguing were developed further and adapted to computer technology. With the introduction of the internet, librarians saw other sources of information emerge, and scientific books and journals became digital and virtual. Now we are on the brink of yet another change in the way libraries will work. The core business of librarians will be not only structured information but also the collection and analysis of data.
In what way have the working methods in libraries changed?
The huge growth of mobile devices, internet-connected devices, and applications on the market has drastically increased opportunities for data collection. As data is collected in libraries, librarians can use the information to develop products and services, improve marketing and communications, or monetize information. As libraries become more accustomed to serving digital resources and converting their own holdings into digital media, they may also be in a position to create research archives from ongoing research projects. One of the most important assets to science is the scientific record, which includes the accumulated body of knowledge contained in books, journals, conference proceedings and transcripts of scientific meetings, but also data banks, data sets, and online data from research projects. Some of these collections of data are freely accessible thanks to the open access policies set out in national research guidelines in different countries and at their universities; data repositories in the private sector, by contrast, have more controlled or restricted access.
And how is Artificial Intelligence of use in this respect for libraries?
Artificial intelligence (AI) could become an invaluable tool for organising large collections of information and making them accessible. Google’s Life Tags project is a searchable archive of Life magazine photographs that used artificial intelligence to attach hundreds of tags to organize the archive. Another Google project, Talk to Books, lets users type in a statement or a question, and the system retrieves whole sentences in books related to what was typed, with results based not on keyword matching but on more complex training of AI to identify what a good response looks like. Thanks to new devices and software, librarians and library patrons will be able to discover patterns in huge amounts of unstructured data and so refine their search for information relevant to their question. The library systems currently in use are not capable of working with large amounts of unstructured data. Such capabilities have yet to arrive as add-ons to the systems already in use.
What will be the benefit from analysing unstructured data with AI?
Pattern recognition is studied in many fields, including psychology, ethology, cognitive science and computer science. Pattern recognition is based on either a priori knowledge or on statistical information extracted from the patterns. The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate multi-dimensional space. The components of pattern recognition are: data acquisition, pre-processing, feature extraction, model selection and training, and evaluation.
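The components listed above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the tiny corpus, the labels "science" and "service", and all function names are made up for the example, and the nearest-centroid model with bag-of-words features is just one simple choice among many, not a method described in the interview.

```python
from collections import Counter
import math

# Data acquisition: a tiny, made-up labelled corpus (hypothetical examples).
TRAIN = [
    ("protein folding simulation results", "science"),
    ("genome sequencing data set", "science"),
    ("library opening hours and loans", "service"),
    ("renew loans and reserve a reading room", "service"),
]
TEST = [
    ("sequencing results for the protein data", "science"),
    ("reserve a room during opening hours", "service"),
]

def preprocess(text):
    """Pre-processing: lowercase the text and split it into word tokens."""
    return text.lower().split()

def extract_features(tokens):
    """Feature extraction: bag-of-words counts, a point in a word-count space."""
    return Counter(tokens)

def train(examples):
    """Training: add up the feature vectors of each class into one centroid."""
    centroids = {}
    for text, label in examples:
        features = extract_features(preprocess(text))
        centroids.setdefault(label, Counter()).update(features)
    return centroids

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(centroids, text):
    """Model (assumed choice): pick the class whose centroid is most similar."""
    features = extract_features(preprocess(text))
    return max(centroids, key=lambda label: cosine(features, centroids[label]))

def evaluate(centroids, examples):
    """Evaluation: share of held-out examples that are classified correctly."""
    correct = sum(classify(centroids, text) == label for text, label in examples)
    return correct / len(examples)

model = train(TRAIN)
accuracy = evaluate(model, TEST)
print(f"accuracy on the toy test set: {accuracy:.2f}")
```

With real library data, each stage would be far more involved (for example, cleaning full-text documents or research data sets), but the sequence of steps stays the same.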
Are our present library systems capable of handling unstructured data, and what would be the added value if they were?
Here I have to refer to the situation in my home country, Switzerland. Present library systems, and the library systems to be launched in 2021 in the majority of academic libraries in Switzerland, are not yet able to handle unstructured data. But I am convinced that is about to change. We already see large repositories of scientific data being created by scientific libraries. As in the past, when the producers of library systems see growing demand in the near future, they will react by offering add-ons that go beyond the present capabilities of those library systems. One earlier example is the OpenURL link resolver, developed by Herbert van de Sompel, Patrick Hochstenbach and their colleagues at Ghent University, which integrated web resources into information research with library systems.
Library systems will then be capable of analysing patterns, which might also lead to a diversification in how people search. Such search results could also be highly personalised. AI can potentially improve existing systems (supporting discovery) and enhance current activities (such as metadata creation). There is also the possibility of these systems and activities actually being superseded by AI systems, thereby transforming the ways in which people locate information of relevance to them.
In our joint presentation, Julian Schwarzenbach, as a data specialist, and I, as an information professional, will explore this from the perspective of our two specialisations. We will look at the opportunities and the benefits for academic libraries when they start their respective activities in this fast-evolving field.