
SEAMLESS: An Organisational and Technical Model for Seamless Access to Distributed Public Information

Mary Rowlatt, Cathy Day, Jo Morris and Robert Davies describe SEAMLESS: An Organisational and Technical Model for Seamless Access to Distributed Public Information. The article outlines some of the lessons the research team have learned during the course of the project and also indicates how Essex Libraries are planning to continue to develop and maintain SEAMLESS once the LIC funded research project is completed.

Introduction

The SEAMLESS project [1] aims to develop a new model for citizens' information – one which is distributed, and based on partnerships and common standards. The project, which is funded for two years by the Library and Information Commission under the former BLRIC Management and Co-operation Programme, started in February 1998. Currently the team are working with 29 local organisations, covering a wide range of sectors, to develop the necessary standards and set up a prototype system.

The objectives of the SEAMLESS project are to:

The team have developed a SEAMLESS interface to the diverse data resources provided by the participating organisations. Initial targets include information pooled locally on the SEAMLESS server using Z’mbol [2], a number of remote databases using Z39.50 [3] or Z’mbol, together with resources which exist as web pages (or as word-processed documents which have been converted to HTML format and have had the SEAMLESS web page metatags added) which are retrieved and indexed using HARVEST [4].

Number of Participating Organisations

One of our concerns when we started the project was that other information providers might not be keen to work with us. In fact the opposite proved to be true. In many respects the SEAMLESS project has been an exercise in managing expectations and keeping participating organisations down to a manageable number. Our original proposal stated that we would work with 6 to 12 local organisations to develop a prototype distributed citizens' information system; however, we are currently working with 29.

There are, we think, two reasons for this. Firstly, SEAMLESS arrived on the scene when the time was "right". SEAMLESS chimed well with the emerging government agenda for the Information Age, and with many of the government's key initiatives: partnership working, modernising local government and empowering citizens. Also, we arrived at a time when organisations were becoming more aware of the potential of ICTs and the Internet and were looking for new and more effective ways of working. In this context SEAMLESS seemed to offer them some benefits.

Secondly, we have been able to achieve good publicity and a high profile locally. This has been effective in creating momentum and kudos for the project. Instead of having to persuade organisations to join the project, we found that we actually had to turn organisations away in order to keep the project manageable. The downside was that SEAMLESS was sometimes seen, by quite influential people, as a magic solution that could solve any information problem, and we sometimes had to work hard to bring their expectations down to a more realistic level.

Sophistication of Local Information Systems

When we wrote the proposal we assumed a certain level of sophistication in the information systems in use by local information providers. We anticipated that most information providers would be using databases, a significant number would have their own web sites, some would be running servers connected to the Internet, and maybe a few would be running Z39.50 compliant servers.

What we found was a much more complicated picture. Many of the smaller agencies and voluntary organisations had little more than word processors. The system therefore had to be designed to cope with word-processed documents as well as databases and web sites.

Quite a few organisations had websites, but in many cases they were contracted out to external bodies which not only managed and hosted the site for them, but also created the content. This added a further level of complexity to meetings and discussions. It also meant that the organisations themselves had to pay to get the work done. We found that this could sometimes be a disincentive – in general we found it easier to persuade organisations to commit staff time to the project rather than hard cash.

One area where anticipated problems did not materialise was with the data itself. We were aware of a number of research projects which had been funded under the JISC's eLib [5] and the Telematics for Libraries (EU) [6] programmes with the aim of searching collections of distributed resources. However, these had largely focused on bibliographic data – the catalogues of academic libraries, museums and archives. Catalogue data is by nature very structured and we were not sure that a similar approach could cope with the huge variety of unstructured data we expected to find in the domain of citizens' information. However, so far at least, the SEAMLESS profile has been able to accommodate everything we've wanted to add to the system.

Distributed vs. Centralised Citizens' Information Systems

When we started the project we were thinking of two models for citizens' information: the traditional library-based, centralised database of community information, and a new, library-led, distributed citizens' information system. Citizens' information seemed to us to be a rather more powerful and active construct than community information in that it consists of the actual data itself, rather than signposts to the organisations providing it, and it encourages and facilitates direct interaction between the user and the provider through the provision of interactive services and communication facilities.

Now, however, the picture appears somewhat more complex and fragmented. Firstly, Essex, along with Cambridgeshire, Suffolk, Southend, Thurrock and Peterborough, was recently awarded £500,000 by the DCMS/Wolfson Public Libraries Challenge Fund to develop connectivity and co-operation in the Eastern Region. One stream of the Co-East project [7] focuses on linking up community information databases in the region using the SEAMLESS standards. Given the short timescale of the project, which is to be completed by the end of March 2000, it is inevitable that the other authorities will have to use their existing library based, centralised community information databases, at least in the first instance. Immediately therefore we are faced with a ‘mixed economy’ of centralised and distributed information systems.

In addition, we have become aware that a number of government initiatives, for example the University for Industry, the Early Years Development Scheme, and NHS Direct, are crucially dependent on the development of what are, in effect, new citizens' information systems, this time quite outside the library sphere. Some of these systems may well be distributed systems in their own right. SEAMLESS is currently working with all these organisations with a view to including their data in the SEAMLESS system, which raises a further prospect: that of a grouping or hierarchy of distributed systems working together.

One Metadata System

When we wrote the original proposal we had an idea that we might be able to identify the definitive citizens' information profile. However, it very quickly became apparent that although it might simplify things considerably if everyone adopted the same profile, this was unlikely to be achievable in a real world environment where a number of different profiles were already in use. This is especially true if you consider that many of the systems we would need to link with are developing quite outside the library sphere.

Rather than create a totally new profile for SEAMLESS, we based the SEAMLESS profile [8] on existing and widely used profiles. The SEAMLESS Information Profile is a set of 33 attributes, based on a subset of the GILS [9] (Government, or Global, Information Locator Service) Profile, with some additional attributes from the IMS [10] (Instructional Management Scheme) to cater for more detailed information about educational courses. Discussions with the participating information providers indicated a desire to incorporate the Alta Vista format for the keyword and description attributes. These therefore appear without the SEAMLESS 'se.' prefix, with the intention that they can be recognised by the Alta Vista crawler as well as by SEAMLESS.
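To make the prefix convention concrete, the following is a minimal sketch of how a record might be rendered as web page metatags. It assumes the tags take the familiar HTML `<meta name="..." content="...">` form; the attribute names used here (`title`, `originator`, `keywords`) and the sample values are hypothetical and are not taken from the SEAMLESS profile itself.

```python
from html import escape

# Assumption: keyword and description tags are written without the "se." prefix
# so that a general web crawler can recognise them. "keywords" is an assumed name.
UNPREFIXED = {"keywords", "description"}


def meta_tag(attribute, value):
    """Return one HTML meta tag, adding the "se." prefix where it applies."""
    name = attribute if attribute in UNPREFIXED else "se." + attribute
    return '<meta name="{}" content="{}">'.format(
        escape(name, quote=True), escape(value, quote=True)
    )


def render_head_tags(record):
    """Render every attribute of a record as a block of meta tags, one per line."""
    return "\n".join(meta_tag(name, value) for name, value in record.items())


# Hypothetical record, for illustration only.
record = {
    "title": "Adult evening classes",
    "originator": "Example College",
    "keywords": "education, evening classes",
    "description": "Part-time courses for adults.",
}
print(render_head_tags(record))
```

Generating the tags from a structured record, rather than typing them by hand, also escapes quotation marks and ampersands consistently.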

In order to achieve interoperability with other systems we have produced a draft mapping [11] between the SEAMLESS profile and the Dublin Core. We are also working on a mapping between Dublin Core, the USMARC Community Information Format, GILS and the SEAMLESS profile and have had some discussions with the Library of Congress and other experts about this. We hope to be able to publish the results of this work in Spring 2000.
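A mapping of this kind can be thought of as a lookup table from one profile's attribute names to another's. The sketch below illustrates the idea only: the SEAMLESS attribute names on the left are hypothetical, and the pairings are not the project's draft mapping. The Dublin Core element names (Title, Creator, Description, Subject) are real elements.

```python
# Hypothetical crosswalk, for illustration only. It is NOT the draft mapping
# produced by the project, and the left-hand attribute names are assumed.
SEAMLESS_TO_DC = {
    "se.title": "DC.Title",
    "se.originator": "DC.Creator",
    "description": "DC.Description",
    "keywords": "DC.Subject",
}


def to_dublin_core(record):
    """Translate a record's attribute names to Dublin Core.

    Returns (mapped, unmapped): attributes with no Dublin Core equivalent
    in the table are kept separately rather than silently dropped.
    """
    mapped, unmapped = {}, {}
    for name, value in record.items():
        target = SEAMLESS_TO_DC.get(name)
        if target is None:
            unmapped[name] = value
        else:
            mapped[target] = value
    return mapped, unmapped
```

Returning the unmapped attributes separately makes it visible where a richer profile carries information that a simpler one cannot express.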

Duplication of Data

In our original proposal we recognised that many organisations currently expend considerable time and effort collecting data to supplement their own core data and that this results not only in duplication of data but represents an unnecessary workload. We stated that one of the potential benefits of SEAMLESS would be that this duplication of effort would be reduced because organisations would be able to concentrate on their core data and rely on the system to supply the other information they need.

Rationalisation of data in this way will take time and even with the limited amount of data currently in the Beta version we have found that duplication of data is likely to be problematic. We plan to tackle this in a number of ways. Firstly, we plan to be more proactive in managing the supply of content to the system by assisting participating organisations to assess what is their core data, and then concentrating on incorporating that into the system.

Secondly, we are beginning to think that perhaps we should focus, in the first instance, on the larger organisations which tend to provide the majority of the public services in the region, and therefore maintain as core data a great deal of potentially useful information for SEAMLESS.

Thirdly, we do not wish to disenfranchise or delay the smaller organisations but there is a clear management overhead in introducing each additional organisation to the system. We are already working with some of the large 'umbrella' or co-ordinating groups such as the Essex Community Foundation and the Essex Community Volunteer Service. By incorporating their databases into SEAMLESS we will be able to create an initial presence for the smaller organisations at very little overhead.

Lastly we plan to explore whether it might be possible to develop specific tools to cope with the problems caused by data duplication and also investigate whether changes to the system architecture might be beneficial.

Need for Tools

The application of the metadata to web pages and Word documents, or 'tagging,' has caused some problems. During the course of the project most of the organisations have been in the position of applying metadata to pre-existing documents. Applying metadata in this way is time consuming and expensive. Clearly it would be much more efficient if we could move from retrospective tagging to tagging at source, i.e. at the time when the document is created, and we are working with the organisations to develop ways of doing this.

The accuracy and quality of tagging has been difficult to control and mistakes may result in poor information retrieval or documents being rejected by the system entirely. The tags, although not difficult to understand, have to conform exactly to the required syntax, which includes opening and closing brackets and quotation marks around attribute names and variables. It is very easy to make mistakes, and not very easy to spot them by eye.
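Errors of this kind are well suited to automatic checking. The following is a minimal sketch of a syntax checker, assuming the tags take the form `<meta name="..." content="...">` with both values in double quotation marks. The exact SEAMLESS syntax rules are not given here, so the pattern is an assumption, as are the function name and the sample page.

```python
import re

# Assumed tag shape: <meta name="..." content="...">. Anything that starts
# like a meta tag but does not match this pattern is reported.
STRICT_META = re.compile(
    r'<meta\s+name="[^"<>]+"\s+content="[^"<>]*"\s*/?>', re.IGNORECASE
)
META_START = re.compile(r"<meta\b", re.IGNORECASE)


def check_meta_syntax(html_text):
    """Return (line_number, line) pairs for lines holding a malformed meta tag."""
    problems = []
    for number, line in enumerate(html_text.splitlines(), start=1):
        for start in META_START.finditer(line):
            # Try to match a complete, well-formed tag at this position.
            if not STRICT_META.match(line, start.start()):
                problems.append((number, line.strip()))
                break  # report each line once
    return problems


# Hypothetical page fragment: line 3 lacks a closing quotation mark,
# line 4 lacks its closing bracket.
sample = """<head>
<meta name="se.title" content="Bus timetable">
<meta name="se.originator content="Example Council">
<meta name="keywords" content="buses, travel"
</head>"""

for number, line in check_meta_syntax(sample):
    print("line {}: {}".format(number, line))
```

The checker works line by line, so it assumes that each tag is written on a single line.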

There is also an issue about how many of the attributes organisations choose to apply. Only 6 of the attributes are mandatory and there is a danger that this minimum set becomes the norm, with a corresponding negative effect on the sophistication of both searching and display within the system. We also found that the production of database reports proved more difficult than we had expected, largely due to the lack of experience in report generation among the partners. Although this has been time consuming, we have been able to resolve these problems by providing individual support and guidance as required.

Some of the technically more advanced organisations, however, have managed not only to automate the process successfully, but to build it into their normal work practices such that it is not seen as a burden at all. A good example of this is Anglia Polytechnic University who have developed a system which produces a tagged version of their prospectus every time it is updated. We plan to work closely with organisations such as this with a view to sharing best practice and making the task easier for others.

There is a need to find ways to simplify the process for participating organisations, and to improve the accuracy and reduce the overheads of data preparation. We are planning to develop a number of tools which might assist in this process including tagging templates, syntax checkers and metadata generators.

Semantic Interoperability

The application of the SEAMLESS profile only achieves interoperability at the technical level. It ensures that the SEAMLESS system can 'read' the data from other data sources and that it 'looks' in the right fields for particular sorts of information. In order for the system to work effectively we have also had to achieve some level of semantic interoperability, to ensure that participating organisations are using a common vocabulary to index their data. This has been achieved through the development of the SEAMLESS thesaurus and place name authority list.
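The role of a controlled vocabulary can be illustrated with a small sketch in which free-text index terms are mapped to preferred terms, and anything not found is set aside for review. The entries below are hypothetical and are not drawn from the SEAMLESS thesaurus; the function name is likewise an assumption.

```python
# Hypothetical thesaurus entries: each variant maps to one preferred term.
THESAURUS = {
    "childcare": "Child care",
    "child care": "Child care",
    "nurseries": "Child care",
    "buses": "Public transport",
    "public transport": "Public transport",
}


def normalise_terms(terms):
    """Map free-text index terms to preferred terms.

    Returns (preferred, unknown): preferred terms in first-seen order without
    repeats, and the original terms that the thesaurus does not recognise.
    """
    preferred, unknown = [], []
    for term in terms:
        key = " ".join(term.lower().split())  # ignore case and stray spaces
        match = THESAURUS.get(key)
        if match is None:
            unknown.append(term)
        elif match not in preferred:
            preferred.append(match)
    return preferred, unknown
```

With these entries, two organisations that index the same service as "Childcare" and "Nurseries" end up under the same preferred term, which is the effect a common vocabulary is meant to have on retrieval.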

However, one of the problems that has become apparent in the Beta testing is that not all organisations are indexing to the same level of detail. This has an impact on retrieval from the system as more detailed indexing leads to improved recall and precision. In the real world, though, there is a very real tension between exhaustive indexing and the workload involved. There is a similar problem with websites. Some organisations apply the SEAMLESS metatags to all of their pages, whilst others only apply them to the higher level pages. Again this affects information retrieval.

Now that we have a larger body of data in the Beta system it is easier for the organisations to assess the impact of their metadata and indexing practices, and we are hopeful that this will encourage them to apply both the metatags and the index terms more exhaustively. We are also working on improving and enlarging both the thesaurus and the place name authority list, and are investigating whether these can be automated to ease the burden on participating organisations.

All of this represented a fairly steep learning curve for both the project team and the partners, but it is encouraging to note that all of the organisations we started with are still working with us and many now feel that they have learned some useful new skills along the way.

Partnership Building

We have found that building, supporting and maintaining the partnerships has been an ongoing and time-consuming commitment and that, if the system is to grow and develop, dedicated staff will be required to manage both the technical side of the system and the partnerships.

There are a number of reasons for this. Firstly, on the technical side, there is a continuing task to be performed in administering the system and the server, and in developing and maintaining the necessary tools (the metadata tools, thesaurus etc.) to support the process. This task will grow as the number of distributed resources making up the system increases. In addition, both the technology and available standards are developing and changing very rapidly in this area and there will be a continuous need to monitor this and adapt the system as necessary.

On the organisational side we have found a need to maintain continuous contact with our existing partners, to cope with changes in their staff, their information, and their systems. Even within the short timespan of the project so far, 5 of our partners have launched new websites, two organisations have merged, and in 6 organisations our main contact has changed. Staff changes can be positive and in most of these cases we have noticed increased involvement and activity on the project following the appointment of new staff. We were pleased to note that in all cases the ‘germ’ of the idea has survived within the organisation and the management were keen to maintain their participation in the project.

The introduction of new partners as we build and expand the system will require the input of considerable staff time. In addition there will also be a continuous need to ensure compliance with the standards adopted, to monitor quality and performance, to provide feedback and evaluation to partners and management, and to market, promote and develop the system. Within the research project this management role has been taken by Essex Libraries and participating organisations have recently expressed a desire that the Library continue to exercise this role as the project begins the transition from research project to live system.

Conclusions

The SEAMLESS project, a relatively small project, has had a big impact at local, national and regional levels and the team are working with significantly more organisations than we originally envisaged. In co-operation with our local information partners we have:

We have achieved a high profile and strong support from a wide range of local organisations in the County, and have a ‘waiting list’ of organisations wishing to join the project. The success of the project to date has also enabled us to influence the County Council agenda. We have been able to input to the Corporate Information Strategy and to ensure that the strategy reflects the importance of citizens' information. Most significantly we have secured long term funding from the County Council to enable us to build upon the SEAMLESS project with a view to developing a fully functioning Citizens' Information system in Essex. In the first instance this funding will enable us to appoint two permanent members of staff to work on SEAMLESS, to develop the metadata tools and improve and automate the thesaurus. We plan to launch the system to the public during Spring 2000.

References

  1. SEAMLESS project web site,
    <http://www.seamless.org.uk/>
  2. Z’mbol,
    <http://www.fdgroup.com/fdi/zmbol/about.html>
  3. ISO 23950:1998 (ANSI/NISO Z39.50-1995), Information retrieval (Z39.50): application service definition and protocol specification, ISO, 1998
  4. Harvest,
    <http://www.tardis.ed.ac.uk/harvest/>
  5. eLib web site, UKOLN,
    <http://www.ukoln.ac.uk/services/elib/>
  6. Metadata,
    <http://www2.echo.lu/libraries/en/metadata/metahome.html>
  7. Co-East web site,
    <http://www.co-east.net/>
  8. A New Profile for Citizens' (or Community) Information?, Mary Rowlatt, Cathy Day, Jo Morris and Kevin Atkins, Ariadne issue 19, March 1999,
    <http://www.ariadne.ac.uk/issue19/rowlatt/>
  9. Government Information Locator Service,
    <http://www.usgs.gov/gils/>
  10. Instructional Management Scheme,
    <http://www.imsproject.org/>

Author Details

Mary Rowlatt
Information Services Manager
Essex Libraries
Chelmsford Library
PO Box 882
Chelmsford
Essex CM0 8PN

Tel: +44 1245 436524
Fax: +44 1245 436769
Email: maryr@essexcc.gov.uk

Cathy Day and Jo Morris
Research Assistants
SEAMLESS Project
Essex Libraries
Chelmsford Library
PO Box 882
Chelmsford
Essex CM0 8PN

Tel: +44 1245 436560
Fax: +44 1245 436769
Email: seamless@essexcc.gov.uk

Robert Davies, Director
Education for Change Ltd.
United House
North Road
London N7 9DP

Tel: +44 171 697 8881
Fax: +44 171 697 8883
Email: rob.davies@efc.co.uk
URL: http://www.efc.co.uk/

For citation purposes:
Rowlatt, M., Day, C., Morris, J. and Davies, R., "SEAMLESS: An Organisational and Technical Model for Seamless Access to Distributed Public Information", Exploit Interactive issue 4, January 2000
<URL: http://www.exploit-lib.org/issue4/seamless/>

