This article is from the July/August 2004 issue of Update.
The government published the first version of the Electronic Government Metadata Standard (e-GMS) [1] back in 2002. The standard is now at version 3 and the abbreviation is well known, but what exactly is it?
Metadata is ‘data about the data’. In our terms, it’s the library catalogue, including information about the items on the shelves, such as the author, title and publisher. This is metadata — just enough information about a document to tell you what it is, and where to find it.
The term ‘metadata’ is more readily associated with websites — but this is only a small part of what the e-GMS has to offer. If you are a public sector organisation, you should be looking at the e-GMS.
Quite simply, the e-GMS is the Office of the e-Envoy’s guide to what metadata should be used in public sector applications and websites. It is based on the Dublin Core metadata standard, ISO 15836:2003.
The e-GMS contains 25 elements of metadata with more than 90 refinements. It is designed to be used by any public sector organisation, and therefore not all the elements will apply to everyone.
In fact, the standard recommends you create a ‘local’ version of it to use in your organisation. The Accessible & Personalised Local Authority Websites (Aplaws) project [2] has created tools for implementing the e-GMS on local authority websites, including refinements to the e-GMS data elements, a content management system, and local government categories that map on to the Government Category List (GCL).
Why have an e-GMS?
The e-GMS is part of the government’s grand scheme to have government services e-enabled by 2005. The idea is to get all public sector websites and applications giving ‘like’ data so we can share information. The metadata standard is part of the government’s e-Government Interoperability Framework (e-GIF).
For instance, if you want to share your MARC21 library catalogue with someone who already has a catalogue, but isn’t using MARC21, you have to look at the fields in both catalogues and ‘map’ the ‘like’ fields together. Your ‘Author’ field could map to their ‘Creator’ field for example. If, however, you were both using MARC21, the job would be considerably easier as you’d know instantly what ‘maps’ to what. This is the thinking behind the e-GMS — to get everyone working from the same starting point.
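As a rough illustration of what ‘mapping’ fields involves, the sketch below renames the fields of one catalogue record to those of another scheme. Only the Author to Creator pairing comes from the example above; the other field names, the sample record and the `map_record` helper are hypothetical.

```python
# Hypothetical field map from a local catalogue to a shared scheme.
# Only the Author -> Creator pairing comes from the article; the other
# entries are illustrative assumptions.
FIELD_MAP = {
    "Author": "Creator",
    "Title": "Title",
    "Publisher": "Publisher",
}


def map_record(record, field_map=FIELD_MAP):
    """Rename the keys of `record` using `field_map`.

    Returns the mapped record plus a sorted list of fields that had no
    mapping, so they can be reviewed by hand rather than silently dropped.
    """
    mapped = {}
    unmapped = []
    for field, value in record.items():
        if field in field_map:
            mapped[field_map[field]] = value
        else:
            unmapped.append(field)
    return mapped, sorted(unmapped)


# Made-up sample record for demonstration only.
local_record = {"Author": "A. Librarian", "Title": "Annual report", "Shelfmark": "027.4"}
shared_record, needs_review = map_record(local_record)
```

Every field that lands in `needs_review` is a decision someone has to make by hand, which is exactly the work a shared standard is meant to remove.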
The same is true for websites, but here the thinking is slightly different. As anyone who has used a search engine will know, when you do a search only a few of the sites you get back are relevant to you. However, if everyone were to add metadata to their web pages, then searches would retrieve more relevant hits.
In addition, if everyone used the same terminology to describe their content you could get similar, relevant content from a variety of sources. This is something librarians have been doing for years — classifying items, using thesauri to describe resources. Through the e-GMS, the government is applying this to a wider audience.
You will also find ‘data entry formats’ under the encoding scheme heading. Data entry formats are used to ensure a consistent approach to the way you add data to an element. For example, a date can be entered in a number of ways (07/08/04 or 7/8/2004 or 7 Aug 2004, etc). Your data entry format will ensure you pick one way of adding a date and stick to it.
The preferred way to enter a date in terms of the e-GMS is CCYY-MM-DD, e.g. 2004-08-07, as this is the way most computer systems use dates.
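A data entry format can often be enforced by software rather than by memory. The sketch below, in Python, converts the three example dates above to CCYY-MM-DD. It assumes day-first (UK) ordering and English month abbreviations; the function name and the list of accepted patterns are illustrative, not part of the e-GMS.

```python
from datetime import datetime

# Input patterns tried in order. Day-first ordering is an assumption based
# on UK usage; a US-style source would need month-first patterns instead.
# "%b" relies on English month abbreviations (the default C locale).
INPUT_PATTERNS = ("%d/%m/%y", "%d/%m/%Y", "%d %b %Y")


def to_ccyy_mm_dd(text):
    """Return `text` as CCYY-MM-DD, or raise ValueError if no pattern fits."""
    cleaned = text.strip()
    for pattern in INPUT_PATTERNS:
        try:
            return datetime.strptime(cleaned, pattern).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"Unrecognised date: {text!r}")


normalised = [to_ccyy_mm_dd(d) for d in ("07/08/04", "7/8/2004", "7 Aug 2004")]
```

Rejecting anything that does not match a known pattern is deliberate: a date that cannot be read with confidence is better queried than guessed.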
The government has created the Government Data Standards Catalogue [3] to help specify the data entry format for most ‘data’, from how to enter a person’s name and organisation to how to enter monetary amounts.
- Resource
‘Resource’ is the name given to the actual thing you are applying metadata to, be it a hardcopy document, a photograph, a CD-Rom or an electronic file.
- Metadata schemes
Metadata schemes define the data elements and refinements you use to add metadata to a resource. The e-GMS is one such scheme, and Aplaws can be seen as an application profile based on e-GMS.
The e-GMS contains only four mandatory elements:
- Creator
- Date
- Subject (only the refinement of Category is mandatory)
- Title
Every ‘resource’ you describe must have the above four elements. In addition, three elements are mandatory if they are applicable:
- Accessibility
- Identifier
- Publisher
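The two lists above lend themselves to an automated check. The following sketch assumes each resource’s metadata is held as a flat dictionary keyed by element name, which is an assumption about storage rather than anything the e-GMS prescribes; the `check_record` helper and the sample values are hypothetical.

```python
# Element names as listed in the article.
MANDATORY = ("Creator", "Date", "Subject.Category", "Title")
MANDATORY_IF_APPLICABLE = ("Accessibility", "Identifier", "Publisher")


def check_record(record):
    """Return (missing, to_review) element names for one resource's metadata.

    `missing` lists mandatory elements that are absent or empty.
    `to_review` lists 'mandatory if applicable' elements that are absent,
    since only a person can decide whether they apply.
    """
    missing = [name for name in MANDATORY if not record.get(name)]
    to_review = [name for name in MANDATORY_IF_APPLICABLE if name not in record]
    return missing, to_review


# Made-up sample record; the council name and title are placeholders.
record = {
    "Creator": "Example Borough Council",
    "Date": "2004-08-07",
    "Title": "Library opening hours",
    "Publisher": "Example Borough Council",
}
missing, to_review = check_record(record)
```

Here `missing` would flag Subject.Category, while Accessibility and Identifier are only queued for a human decision.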
And a further two elements are recommended:
Encoding schemes
The standard does not sit alone. The Office of the e-Envoy has created several standards to be used with the e-GMS, such as the e-GMS Audience Encoding Scheme (e-GMSAES) and the GCL [4].
The Office of the e-Envoy goes one step further in order to help you maintain these encoding schemes by offering the GCL as a ‘.csv file’ or in an XML format, as well as updating it twice a year. Such support means you can easily import the encoding scheme into your information management systems and enable creators of the metadata to choose from drop-down lists to speed up the input process.
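Importing an encoding scheme can be as simple as reading one column of terms. The sketch below uses an in-memory stand-in for a downloaded file: the single-column layout, the `term` header and every term except ‘Libraries and archives’ are assumptions, so check the real file’s layout before relying on it.

```python
import csv
import io

# Stand-in for a downloaded GCL file. The single-column layout, the "term"
# header, and every term except "Libraries and archives" are assumptions.
SAMPLE_CSV = """term
Libraries and archives
Example category B
Example category A
"""


def load_terms(csv_file, column="term"):
    """Read one column of controlled terms; return them sorted and de-duplicated."""
    reader = csv.DictReader(csv_file)
    terms = {(row.get(column) or "").strip() for row in reader}
    terms.discard("")  # ignore blank rows or missing values
    return sorted(terms)


def is_valid_term(term, terms):
    """True if `term` exactly matches one of the controlled terms."""
    return term in terms


gcl_terms = load_terms(io.StringIO(SAMPLE_CSV))
```

The sorted list can feed a drop-down directly, and `is_valid_term` gives a cheap check on anything typed in by hand.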
The e-GMS provides a good deal of help, offering its own encoding schemes as well as others ranging from those of the Public Record Office (PRO) to ISO 639-2 [5], but it does not pretend to cover everything. Instead it offers a basis from which you can work and add to as needs arise.
Take the Subject element. The refinement of ‘Subject.Category’ is mandatory and you must have at least one term from the Government Category List for it. However, there are other refinements for Subject that you can use your own local thesauri/taxonomy for: Keyword, Person, Process Identifier, Programme and Project.
Indeed, the Local Authority Websites (Laws) project [6] has done considerable work to extend the GCL to a Local GCL (LGCL) version. This local version has 300+ more terms in it specifically identified as of use to local authorities. Entries are mapped to a GCL term in order to help users see which term from the GCL they should use.
Applying the e-GMS
So we know what metadata is and what the standard is, but how do you add such information to a web page? Well, in Hyper Text Mark-up Language (HTML) it appears at the top of the code, usually within the <head></head> tags using the following syntax:
<META name="[metadata scheme].[Element name].[Refinement name]" scheme="[name of encoding scheme]" content="[data]">
So for example you could have:
<META name="e-GMS.Subject.Category" scheme="GCL" content="Libraries and archives">
The above example shows that we are using the e-GMS for this particular bit of metadata. The Government Category List is used to control what appears in the ‘content’. The e-GMS stipulates that you must have at least one GCL term in the Subject.Category element/refinement.
If you want to use more than one Subject.Category, you can separate the [data] entries with a semicolon, although the e-GMS would prefer you to create a new line of HTML code for each term used.
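If your pages are generated by a script or content management system, the one-line-per-term approach is easy to automate. This Python sketch builds the tags from a list of terms; the helper names and the second sample term are hypothetical, and the element name simply follows the example above.

```python
from html import escape


def meta_tag(name, content, scheme=None):
    """Build one META line with straight double quotes and escaped values."""
    parts = [f'name="{escape(name)}"']
    if scheme:
        parts.append(f'scheme="{escape(scheme)}"')
    parts.append(f'content="{escape(content)}"')
    return "<META " + " ".join(parts) + ">"


def subject_category_tags(terms):
    """Return one tag per term, rather than one semicolon-separated tag."""
    return [meta_tag("e-GMS.Subject.Category", term, scheme="GCL") for term in terms]


# The second term is a made-up placeholder that exercises escaping.
tags = subject_category_tags(["Libraries and archives", "Example & sample term"])
```

Escaping matters because a stray quotation mark or ampersand in a term would otherwise break the tag.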
If you are of a more techie inclination then you might be interested in metadata in XML. You should visit the GovTalk website [7], which has more information and several XML schemas (not to be confused with metadata schemes) as examples.
The Dublin Core website also has a number of guidelines for implementing metadata in XML [8].
The e-GMS is not only for websites. For a database application, instead of ‘tagging’ web pages with metadata, the e-GMS shows you what database fields you should have. If all public sector organisations apply the standard, exchanging data between them should be easy. In library terms it’s like using the MARC21 ‘schema’ in order to share your library catalogue with others.
In database application terms some of the elements of the e-GMS take on new meaning — for example ‘Date.Submitted’. Does this mean the date the record of metadata you are creating is submitted to the ‘system’ or does it mean the date the resource was submitted to the collection? Then again, isn’t that ‘Date.Acquired’? Do you need to have both versions? It doesn’t actually matter all that much, as long as your interpretation isn’t in conflict with the notes given in the e-GMS itself and you are clear about what each element will mean to you. You should also have the meaning documented and the document passed to the Office of the e-Envoy as a local version of the e-GMS.
What if you are already using another metadata scheme? Fear not! The e-GMS gives indications for each element as to how it maps on to other popular metadata schemes such as Dublin Core and the Australian Government Locator Service. So if you are already applying metadata to your website, but need to make sure you are e-GMS-compliant, it is easy to find out. Because the e-GMS follows existing international standards you may well find you are already compliant.
There are several reasons why the e-GMS is a good thing for libraries and archives, chiefly its elements dealing with records management, the Freedom of Information Act, and the Environmental Information Regulations.
Records management
If you are involved with records management or archives within your organisation — even if you aren’t in the public sector — you will be interested in the Preservation, Disposal and Rights elements in particular. All have several refinements that allow you to gather information about when the resource should be reviewed for potential disposal, what will happen to it if it is disposed of (e.g. sent to the PRO), and what you need in order to be able to see the resource.
This last point will be particularly useful as we move further into the digital archive age. The preservation refinements allow you to note what equipment or software you will need in order to read the resource in the future. Alternatively, if you have digitised a resource, you can note the format it originally came in.
In addition, with the increasing use of automated systems to ‘catalogue’ resources, pulling out such metadata automatically is a goal most people are aiming for. This will help libraries in the long run as it will make ‘creators’ of resources think about metadata at the moment of creation, and thus speed up the process of adding metadata or cataloguing a resource.
Templates for documents can be created to allow the automatic collection of metadata so that when a document is submitted to an internet or intranet site most of the work of adding metadata is already done.
The real challenge for librarians and archivists is to maintain the integrity of the metadata collected — especially if the responsibility for applying metadata to a resource is devolved to the creator of that resource. While all public sector organisations are struggling to meet the government deadline of 2005, librarians have an opportunity to influence records management and company-wide policies.
References
1 Electronic Government Metadata Standard (e-GMS) version 3. Office of the e-Envoy, 2004 (www.govtalk.gov.uk/schemasstandards/metadata_document.asp?docnum=872).
2 www.aplaws.org.uk/
3 Government Data Standards Catalogue. Office of the e-Envoy, 2003 (www.govtalk.gov.uk/gdsc/html/default.htm).
4 Government Category List. Office of the e-Envoy, 2004 (www.govtalk.gov.uk/schemasstandards/gcl.asp).
5 Codes for the Representation of Names of Languages Part 2: Alpha-3 Code ISO639-2 (www.loc.gov/standards/iso639-2/langhome.html).
6 www.laws-project.org.uk/
7 www.govtalk.gov.uk/schemasstandards/xmlschema.asp
8 Andy Powell. Dublin Core Metadata Initiative, 2003. Guidelines for implementing Dublin Core in XML (http://dublincore.org/documents/dc-xml-guidelines/).
Liane Broadley is an Internet Content Librarian at the Royal Borough of Kensington and Chelsea and has been looking at how the e-GMS could be implemented in the authority (liane.broadley@rbkc.gov.uk).