Until relatively recently, research into how digital resources are used within the academy has largely focused on e-journals, and in particular on the analysis of the digital ‘fingerprints’ left by users of electronic journal databases, a methodology we call deep log analysis. However, the virtual scholar uses a much wider range of digitally delivered content and there is a danger that all scholarly information seeking is defined or coloured by what has been witnessed in the e-journal environment.
As a first step towards obtaining a more rounded picture of how digital resources are used, Ciber’s Virtual Scholar Programme is subjecting e-books to the same robust evidence-based methods used for journals. In the belief that e-books have the potential to transform the scholarly environment, the project is called SuperBook.1
The lure of the e-book
E-books are enormously attractive to very large and important academic communities largely untouched by the arrival of e-journals, which has revolutionised the information-seeking behaviour of many academics, scientists and researchers in particular. They should be very useful to the biggest scholarly community of all – students, who typically have difficulty obtaining key textbooks and readings, and to whom journal articles are of very limited value because of their research orientation and speciality. They will also be welcomed by scholars in the Arts and Humanities (and probably in some of the social sciences, e.g. Anthropology), whose research tends to be written up in books, not journals.
So we have a large academic community, conducting much of their daily lives online but largely untouched by the scholarly digital revolution. Sit back and watch the e-book rush.
There is another reason why e-books are likely to light the scholarly communication touch paper. They are extremely popular right across the academic spectrum and until relatively recently it has not been easy to exploit their contents with the speed and ease which most digital consumers expect. Books provide consolidated, authoritative, established and organised information for which there is a huge consumer demand.
But the knowledge in books has been underused because of the difficulties of finding content and the inconvenience of having to trawl through far more information than is needed. In an electronic form, their contents can be more readily accessed – keyword searching, chapter-level descriptions and abstracts mean their contents can be exploited in a way just not possible before. Chapters, paragraphs and sentences are now the unit of consumption and, as we know from our e-journal studies, this means they will appeal to the digital information consumer, especially students, who prefer bite-size chunks of information.
The expected popularity of e-books will come with an as yet unknown price for university libraries. Fewer people will visit the library. Academic libraries tend to occupy an enormous amount of premium space and this space is largely filled by books and students. Provide access in students’ dorms, bars and recreational space to the relatively small number of books they need and who would bet against a huge drop in library visits? This, of course, will then lead to questions about whether the library needs all the space it occupies.
While the advent of e-books in numbers will mean libraries will become more remote from their users, it will also mean that publishers will become ever closer, because they will have all the knowledge of how the user behaves – the users’ footfalls now take place in their virtual space.
Furthermore, they can offer the products directly to the user. Indeed, with e-textbooks, e-monographs, e-journals and e-reference works being bundled together in some publishers’ offerings it might be the publisher who will provide the e-library experience.
While there has been much talk about the great market potential for e-books, there has been little in the way of evidence-based user studies which would support academic and publishing decision makers. There are many self-report studies2 but there are clearly considerable dangers in asking users to comment on the impact and future of a newly emergent technology of which they have little experience or understanding.
The SuperBook project
SuperBook is an action research study funded by Emerald and Wiley publishers, which involved ‘dropping’ more than 3,000 carefully selected e-books from OUP (Oxford Scholarship Online), Wiley (Interscience), and Taylor & Francis into the University College London (UCL) information environment, with minimum fanfare, and then assessing, by means of deep log analysis, what happened.
What in effect was being created was an e-book observatory in which behaviour could be observed by librarians, publishers and academics, and changes introduced and then evaluated. This article concentrates on just one of the e-book collections, that of Oxford Scholarship Online, the one we have done most work on.
Oxford Scholarship Online (OSO) is a cross-searchable library containing the full text of more than 1,200 OUP books in the areas of economics and finance, philosophy, political science, and religion. Specially-commissioned (author) abstracts and keywords are provided at book and, unusually, chapter level. In the case of OSO the e-books are e-monographs rather than e-textbooks, so it would be expected to appeal to all humanities and social science scholars – staff and students.
Conventional usage data, of the kind provided by publishers in Counter-compliant form for libraries, can only provide periodic, broad and shallow indicators of activity, whereas, in this project, deep log analysis (DLA) provides a detailed and real-time assessment of the information-seeking behaviour of user communities. This data can be used to help determine impacts and outcomes through further qualitative means.
DLA involves the processing of huge volumes of usage and search data provided in the raw transactional logs of publishers/aggregators. This is then related to user demographics to provide a whole range of evidence-based user portraits – hence the word ‘deep’. In turn, this information provides the foundation for follow-up user surveys and interviews (planned for the autumn).
The e-logs can furnish a whole range of metrics that could be used to profile the information-seeking behaviour of virtual scholars in connection with searching and viewing e-books.
Usage was measured in the following ways:
1 Number of pages reviewed
2 Number of sessions conducted
3 Site penetration (number of views per session)
4 Time spent viewing a page
5 Duration of a session
6 Number of pages printed.
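As a rough illustration of how the first five of these measures might be derived from a raw transaction log, here is a minimal Python sketch. It is not the processing used in the project: the record layout (a user identifier and a timestamp per page view), the 30-minute inactivity cut-off that ends a session, and all function names are assumptions made for the example. Printing is left out because it would need an extra field in each record.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity cut-off: a gap longer than this starts a new session.
SESSION_GAP = timedelta(minutes=30)


def sessionise(records, gap=SESSION_GAP):
    """Group (user_id, timestamp) page views into per-user sessions."""
    by_user = defaultdict(list)
    for user, ts in records:
        by_user[user].append(ts)

    sessions = []
    for stamps in by_user.values():
        stamps.sort()
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > gap:
                sessions.append(current)
                current = [ts]
            else:
                current.append(ts)
        sessions.append(current)
    return sessions


def usage_metrics(records, gap=SESSION_GAP):
    """Return page views, sessions, site penetration and timing averages."""
    sessions = sessionise(records, gap)
    n = len(sessions)
    page_views = sum(len(s) for s in sessions)
    durations = [(s[-1] - s[0]).total_seconds() for s in sessions]
    # Time on a page is estimated from the gap to the next view in the same
    # session; the last page of each session therefore has no estimate.
    page_times = [
        (later - earlier).total_seconds()
        for s in sessions
        for earlier, later in zip(s, s[1:])
    ]
    return {
        "page_views": page_views,
        "sessions": n,
        "views_per_session": page_views / n if n else 0.0,
        "mean_session_seconds": sum(durations) / n if n else 0.0,
        "mean_page_seconds": sum(page_times) / len(page_times) if page_times else 0.0,
    }
```

One consequence of estimating time from gaps between requests is that single-page sessions register a duration of zero, so averages of this kind probably understate real viewing time.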
Use was related to these user characteristics:
1 Subject I – defined by subject of book viewed
2 Subject II – defined by sub-network label of server used (which provided departmental location)
3 Geographical location of user (on-site/off-site)
4 Academic status (staff, student – defined by sub-network used, e.g. halls of residence = student users).
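The idea of inferring location, status and department from the sub-network a request came from can be sketched in a few lines of Python. The address ranges below are reserved documentation ranges, not UCL's, and the labels attached to them are invented for the example; a real table would have to come from the institution's network administrators.

```python
import ipaddress

# Hypothetical sub-network table: (network, (location, status, label)).
# The ranges are reserved documentation addresses, used here as placeholders.
SUBNET_LABELS = [
    (ipaddress.ip_network("192.0.2.0/25"), ("on-site", "staff", "Philosophy")),
    (ipaddress.ip_network("192.0.2.128/25"), ("on-site", "student", "Halls of residence")),
    (ipaddress.ip_network("198.51.100.0/24"), ("on-site", "staff", "Genetics")),
]


def classify(ip):
    """Return (location, status, label) for an IP address string."""
    addr = ipaddress.ip_address(ip)
    for network, labels in SUBNET_LABELS:
        if addr in network:
            return labels
    # Anything outside the known sub-networks is treated as off-site.
    return ("off-site", "unknown", "unknown")
```

A lookup like this only identifies the machine's network, not the person, which is why status can be assigned with any confidence only for networks such as halls of residence.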
The following information-seeking characteristics were evaluated in depth:
1 Referrer link used (e.g. Google)
2 Form of navigation to content adopted
3 Age of book viewed
4 Individual e-book titles used
5 Number of e-books used
6 Scatter of usage over titles available
7 Whether the books used were catalogued or not.
OSO was introduced to UCL in November 2006 and this paper covers usage over the period January to March 2007. The e-books had had sufficient time to bed down, and during this period their profile was raised, largely by giving them greater prominence on the library web page.
Findings
OSO obtained relatively high levels of use, despite the fact that e-monographs were a relatively new information resource for most scholars and that they had very limited promotion. Nearly 11,000 pages were viewed during the three months. More significantly, most of the people who found the service viewed a relatively large number of pages, certainly by e-journal standards. In well over half of all online sessions more than four pages were viewed, and in one in nine more than 10 pages were viewed – high levels, especially when you take into account that an HTML full-text ‘page’ is the equivalent of five printed pages.
A good proportion of users were obviously taking advantage of the rich choice of titles available, with one quarter of sessions involving more than three books. Furthermore, the percentage of books available that were used at least once was about 40 per cent for economics, philosophy and politics.
The figure was only 21 per cent in the case of religion, something which is explained by the fact that UCL does not have a department of religious studies. Political science recorded the highest percentage of titles which had their pages printed, a sure sign the user has found something interesting.
On average, 15 online sessions a day were undertaken, with a maximum of 60 being recorded on 19 March. Sessions typically lasted more than three and a half minutes, which demonstrates clearly the relatively short time users spent online on any single site.
Use highly concentrated
Overall, use was highly concentrated, with just two of the over 1,200 titles available accounting for more than 12 per cent of the page views, and the top 20 titles accounting for 43 per cent of usage. These are figures that hint at the true potential of the e-book.
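A concentration figure of this kind is simply the share of all page views taken by the most-viewed titles. The Python sketch below shows one way it could be calculated, assuming the log has been reduced to one title identifier per page view; the function name and input format are ours, chosen for illustration.

```python
from collections import Counter


def top_n_share(title_views, n):
    """Share of all page views accounted for by the n most-viewed titles."""
    counts = Counter(title_views)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total
```

For example, `top_n_share(views, 20)` would give the proportion of views accounted for by the top 20 titles in `views`.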
Levels of use, and the subjects involved, varied considerably from month to month. More than half of all the views made in March, and well over one-third in February, were of political science titles. However, this subject accounted for just a fifth of views in January. This could be because we are dealing with resources that support courses and programmes which might last just 10 weeks and have very different rhythms.
Mondays accounted for 20 per cent of weekly page views, Saturdays the lowest at seven per cent. Unexpectedly, Sundays recorded a relatively high level of use (14 per cent).
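Day-of-week shares like these can be obtained by counting page views per weekday and dividing by the total. A minimal Python sketch follows, assuming one calendar date per page view; the function name is hypothetical.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_shares(view_dates):
    """Map each day name to its share of page views (0.0 to 1.0)."""
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    counts = Counter(d.weekday() for d in view_dates)
    total = sum(counts.values())
    return {
        name: (counts.get(index, 0) / total if total else 0.0)
        for index, name in enumerate(DAY_NAMES)
    }
```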
Over two-thirds of usage took place on site within UCL. There were quite big differences in the information-seeking behaviour of on-site and off-site users. Off-site users adopted a more direct approach, with nearly three quarters of their views being to full-text pages; the figure for on-site users was just over half. Furthermore, off-site users were more likely to view an e-book, with only one in 10 sessions not recording a view to a book, compared to four in 10 for on-site users (those users not viewing a book would have just looked at the homepage, a subject list or help page).
These differences may reflect access expectations on the part of the users in regard to a new, innovative resource. Users within UCL may have a stronger expectation that the service would continue to be offered and may just have checked the site out in anticipation of using it in the future. Those people accessing via a search engine were most likely to record more views in a session and were more likely to view text pages. Perhaps, in the relatively early days of e-book access, search engine users did not expect to find the material free, and were eager to grab the opportunity to view or squirrel away pages while the going was good.
UCL scholars could use a www (external) search facility to locate OSO e-books or the (internal) search facility. In general those people employing the internal facility used about 2.1 words when composing a search expression, and external search engine users 3.3 words. There were also differences between the subjects of the book viewed. Those searching externally and finding political science titles or economics and finance titles used about one more word in their search expressions than those finding philosophy and religion titles (about four words compared with three).
A major difference between e-books and e-journals is that in the latter we see a very high concentration of use in the most recent articles, sometimes as much as 60 per cent in the most recent two years, with usage rapidly tailing off with time. In the case of e-books, the most recent two years accounted for 17 per cent of views and 45 per cent of use was accounted for by titles published three to six years ago.
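An age profile of this sort comes from subtracting each viewed title's publication year from the year of viewing and grouping the results into bands. The Python sketch below illustrates the calculation; the band boundaries are our assumption for the example and are not necessarily those used in the study.

```python
# Assumed age bands as (label, minimum age, maximum age) in years;
# None means no upper limit.
AGE_BANDS = [
    ("0-2 years", 0, 2),
    ("3-6 years", 3, 6),
    ("7-11 years", 7, 11),
    ("over 11 years", 12, None),
]


def age_band_shares(pub_years, reference_year, bands=AGE_BANDS):
    """Share of page views per age band, given one publication year per view."""
    counts = {label: 0 for label, _, _ in bands}
    total = 0
    for year in pub_years:
        age = reference_year - year
        for label, low, high in bands:
            if age >= low and (high is None or age <= high):
                counts[label] += 1
                total += 1
                break
    return {label: (c / total if total else 0.0) for label, c in counts.items()}
```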
This could be because: students are not so pre-occupied with the most current material; it takes a while for a book to become a standard text; lecturers are laggards when it comes to updating reading lists; social science and humanities titles do not become obsolete as quickly as their counterparts in science.
However, current books received the most menu-only views, suggesting they have a current awareness use. There were interesting subject differences, with economics and finance, surprisingly, recording the greatest use of books aged over 11 years (23 per cent of content viewed). Philosophy, also surprisingly, made the greatest use of current material and about 18 per cent of use was of books published in the current (2006) period.
Perhaps the most significant and interesting finding was that catalogued books (one-third of the books were randomly catalogued) were much more likely to be used. UCL catalogued e-books attracted more than twice the usage of non-catalogued ones. Clearly the catalogue is where people look for books, and lecturers are unlikely to recommend readings if the books are not in the UCL library book catalogue.
For users searching within UCL, in some circumstances, it was possible to identify where people were searching from, whether they were students (in the case of the halls of residence) and what departments/discipline groupings staff were searching from. Fourteen per cent of UCL usage related to the student halls of residence network. In departments, Philosophy stood out and this can be partly explained by the fact that OUP is acknowledged to have the best philosophy ‘list’ of any publisher. More surprisingly perhaps, the genetics network came second, demonstrating that when book content is made readily available the appeal is wide.
It is very early days in the roll-out of e-books and it will take time for things to settle down. There is a tremendous amount of volatility and this might persist for a while as the news about e-book access spreads and new audiences are created.
However, we have seen enough to know that e-books will become very popular and that behaviour is quite diverse, especially in regard to subject field. It is also clear that the information-seeking behaviour that has been described differs from that associated with e-journals – e-books are used more intensively and older titles are of interest. E-book use may well affect e-journal use, as users avail themselves of the increased choice.
Jisc is to replicate the SuperBook project on a national scale (the National E-books Observatory Project): the usage and impact of more than 30 core e-textbooks will be monitored in more than 100 British universities over a period of one year. [See article by Milloy, November Update p. 32.]
References
1 www.ucl.ac.uk/slais/research/ciber/superbook/
2 e.g. C. Armstrong, L. Edwards and R. Lonsdale. ‘Virtually there? E-books in UK academic libraries.’ Program: Electronic Library and Information Systems, Vol. 36, No. 4, 2002, pp. 216–27;
Heting Chu. ‘Electronic books: viewpoints from users and potential users.’ Library Hi Tech, Vol. 21, No. 3, 2003, pp. 340–46;
M. Langston. ‘The California State University E-Book Pilot Project: implications for cooperative collection development.’ Library Collections, Acquisitions, & Technical Services, Vol. 27, No. 1, 2003, pp. 19–32;
M. Levine-Clark. ‘Electronic book usage: a survey at the University of Denver.’ Portal: Libraries and the Academy, Vol. 6, No. 3, 2006, pp. 285–99.
Further reading
D. Nicholas et al. ‘What deep log analysis tells us about the impact of big deal: Case study OhioLink.’ Journal of Documentation, Vol. 62, No. 4, 2006, pp. 482–508.
David Nicholas (david.nicholas@ucl.ac.uk) is Professor at the School of Library, Archive & Information Studies at University College London. He is also Managing Director of Ciber. Paul Huntington is a Research Fellow at UCL and a Managing Director of Ciber. Ian Rowlands is Senior Lecturer at UCL Slais, and a founding member of Ciber.