

Brian Kelly describes how Dublin Core metadata is used to provide enhanced searching features for Exploit Interactive.
Providing a search facility for a web site is very easy. Many tools are available, including many which are free of charge. However, as web sites grow, web administrators often become aware that end users find the search facilities of limited use. It becomes difficult, for example, to search for a document written by a particular person, to search within a particular category of the web site or to combine a variety of search criteria.
Metadata, which can be defined as structured data about data, can help to overcome these limitations. Dublin Core metadata [1], in particular, provides an agreed standard for metadata for resource discovery. Using simple Dublin Core metadata it is possible to search by an author's name. Using more advanced Dublin Core metadata it is possible to provide more sophisticated types of search queries.
An enhanced search facility has been released for Exploit Interactive [2]. Full-text searching and searching by author name or description have been available since issue 1. As can be seen in Figure 1, it is now possible to search across a particular issue.
Figure 1: Searching for "rdf" In Issue 1
Figure 1 illustrates a search for an article containing "rdf" in issue 1. As Exploit Interactive makes use of the issue number in the URL, this type of search could have been provided by a simple search tool which provided filtering capabilities based on URLs. However this approach is very limited as it is dependent on the URL naming conventions. In fact the query does not make use of the URL name; Dublin Core metadata is used to describe the issue number.
Figure 2 illustrates this point. It shows a search for "rdf" in "Feature Articles" (as opposed to Regular Columns, News and Events, or Editorial columns). Since there is no encoding of a "Feature Article" in the URL, this type of query requires an alternative approach. The approach employed is to use Dublin Core metadata to describe article types.
Figure 2: Searching for "rdf" In Feature Articles
This approach can be extended. For example it is possible to search for articles about projects which have been funded by a range of funding programmes, such as Telematics For Libraries, DIGICULT, etc.
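The principle behind these searches can be sketched in a few lines of Python. This is only an illustration and not the indexing software used for Exploit Interactive: the `Article` record, the `search` function and the sample data are all hypothetical, and the field values simply follow the Dublin Core conventions described in this article.

```python
from dataclasses import dataclass


@dataclass
class Article:
    """Hypothetical index record: an article's text plus its metadata."""
    text: str
    is_part_of: str   # value of DC.Relation.IsPartOf (the issue URL)
    dc_type: str      # value of DC.Type, e.g. "text.article.feature"
    funders: tuple    # values of DC.Subject (Exploit-article-funders scheme)


def search(articles, term, is_part_of=None, dc_type=None, funder=None):
    """Return articles containing `term` that match every criterion given."""
    hits = []
    for article in articles:
        if term.lower() not in article.text.lower():
            continue
        if is_part_of is not None and article.is_part_of != is_part_of:
            continue
        if dc_type is not None and article.dc_type != dc_type:
            continue
        if funder is not None and funder not in article.funders:
            continue
        hits.append(article)
    return hits


# Invented sample data, for illustration only.
ARTICLES = [
    Article("An introduction to RDF", "http://www.exploit-lib.org/issue1/",
            "text.article.feature", ("tfl",)),
    Article("RDF news round-up", "http://www.exploit-lib.org/issue2/",
            "text.article.news", ()),
]

feature_hits = search(ARTICLES, "rdf", dc_type="text.article.feature")
funded_hits = search(ARTICLES, "rdf", funder="tfl")
```

Because the filters test metadata values rather than URL patterns, a new category can be supported by adding a field, without any change to the naming of the files.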
The interface illustrated in Figures 1 and 2 is dependent on the browser providing support for frames and JavaScript. Although support for frames and JavaScript is common, it is by no means universal. In addition browsers from different suppliers may not be compatible. In order to overcome these problems we have also developed an interface which uses simple HTML. This is illustrated in Figure 3.
Figure 3: Simple interface for advanced searching
An additional advantage with this interface is that it may be possible to allow multiple categories to be searched (for example it may be possible to search for articles in issue 1 and issue 3 which have been funded by the Telematics For Libraries programme). We are currently investigating functionality provided by the indexing software to see if this can be done.
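The kind of combined query mentioned above can be expressed as "any of the selected values within a category, and all of the categories". The Python sketch below shows that logic only; whether the indexing software supports it is, as noted, still under investigation. The record layout, the `matches` helper and the sample records are hypothetical.

```python
def matches(record, criteria):
    """True if, for every field in `criteria`, the record shares at least
    one value with the wanted set (OR within a field, AND across fields)."""
    return all(record.get(field, set()) & wanted
               for field, wanted in criteria.items())


# Invented sample records, for illustration only.
records = [
    {"title": "A", "issue": {"1"}, "funder": {"tfl"}},
    {"title": "B", "issue": {"2"}, "funder": {"tfl"}},
    {"title": "C", "issue": {"3"}, "funder": {"tfl", "digicult"}},
    {"title": "D", "issue": {"3"}, "funder": {"elib"}},
]

# Articles in issue 1 or issue 3 that were funded by Telematics For Libraries.
criteria = {"issue": {"1", "3"}, "funder": {"tfl"}}
selected = [r["title"] for r in records if matches(r, criteria)]
```

With these sample records the query selects articles A and C: B is in the wrong issue and D has a different funder.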
A summary of the metadata used is given in Table 1.
| Description | Function | Example |
| --- | --- | --- |
| Issue number (e.g. 1) | Searching in a particular issue (or range of issues) | <meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue4/"> |
| Type of article (Regular, Feature, News, etc.) | Searching for a particular article type(s) (e.g. Regular or Feature article, but not News) | <meta name="DC.Type" content="text.article.feature" scheme="Exploit-categories"> |
| Funding body for article, such as "tfl" (Telematics For Libraries). | Searching for articles about projects funded by a particular funding body. | <meta name="DC.Subject" content="tfl" scheme="Exploit-article-funders"> |
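To show how an indexer might pick up the elements listed in Table 1, the following Python sketch reads `<meta>` tags whose name begins with `DC.` using the standard library's `html.parser` module. It is an illustration under stated assumptions, not a description of how SiteServer works: the `DublinCoreParser` class is hypothetical and the sample HTML is assembled from the examples in the table.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collects <meta name="DC.*"> elements as (name, content, scheme)."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.startswith("DC."):
            self.elements.append(
                (name, attributes.get("content"), attributes.get("scheme")))


sample = """
<meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue4/">
<meta name="DC.Type" content="text.article.feature" scheme="Exploit-categories">
<meta name="DC.Subject" content="tfl" scheme="Exploit-article-funders">
<meta name="keywords" content="EXPLOIT, TAP, Telematics for Libraries">
"""

parser = DublinCoreParser()
parser.feed(sample)
dc_elements = parser.elements  # the plain "keywords" tag is ignored
```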
The metadata included in articles is not embedded directly. Instead it is defined as simple VBScript variables in the article_defaults.ssi file. Every article has an article_defaults.ssi file, which defines the author, title, etc., as illustrated below.

    <%
    doc_title = "Using Metadata To Improve Local Searching"
    author = "Kelly, B."
    description = "Brian Kelly describes how metadata is used in Exploit Interactive ..."
    keywords = "EXPLOIT, TAP, Telematics for Libraries"

    ' Give the article type : either feature, regular, news, editorial or etc
    article_type = "regular"

    ' Give the funding body : either tfl, tap, elib, institution, national or other
    ' tfl = funded by Telematics For Libraries; tap = funded by (other) Telematics Application Programme
    ' institution = funded by institution; national = funded by national level; other = other funding body
    funding_body = "tfl, digicult"
    %>

Figure 4: Definition Of The Metadata Values
The metadata is read in by a default.asp file. This file calls another file which transforms the variables into Dublin Core metadata. The output, which is embedded in the HTML for each article, is shown below.

    <meta name="DC.Title" content="Using Metadata To Improve Local Searching">
    <meta name="DC.Creator" content="Kelly, B.">
    <meta name="DC.Description" content="Brian Kelly describes how metadata is used on the Exploit Interactive web magazine to provide advanced searching capabilities">
    <meta name="DC.Relation.IsPartOf" content="http://www.exploit-lib.org/issue5/">
    <meta name="DC.Type" content="text.article.regular" scheme="Exploit-categories">
    <meta name="DC.Subject" content="tfl" scheme="Exploit-article-funders">
    <meta name="description" content="Brian Kelly describes how metadata is used on the Exploit Interactive web magazine to provide advanced searching capabilities">
    <meta name="keywords" content="EXPLOIT, TAP, Telematics for Libraries">

Figure 5: HTML Representation Of The Dublin Core Metadata
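The transformation from variables to `<meta>` elements is carried out by a script that is not reproduced in this article. The Python sketch below shows what such a step might look like. The function names, the `issue_url` parameter and the choice to emit one `DC.Subject` element per comma-separated funder are assumptions made for illustration, not a description of the actual script.

```python
from html import escape


def meta(name, content, scheme=None):
    """Build one <meta> element, escaping the attribute values."""
    scheme_attr = f' scheme="{escape(scheme)}"' if scheme else ""
    return f'<meta name="{escape(name)}" content="{escape(content)}"{scheme_attr}>'


def dublin_core_tags(doc_title, author, description, issue_url,
                     article_type, funding_body):
    """Turn the per-article variables into a list of Dublin Core <meta> tags."""
    tags = [
        meta("DC.Title", doc_title),
        meta("DC.Creator", author),
        meta("DC.Description", description),
        meta("DC.Relation.IsPartOf", issue_url),
        meta("DC.Type", f"text.article.{article_type}", "Exploit-categories"),
    ]
    # Assumption: each funder in the comma-separated list gets its own element.
    for funder in funding_body.split(","):
        tags.append(meta("DC.Subject", funder.strip(), "Exploit-article-funders"))
    return tags


tags = dublin_core_tags(
    doc_title="Using Metadata To Improve Local Searching",
    author="Kelly, B.",
    description="Brian Kelly describes how metadata is used in Exploit Interactive ...",
    issue_url="http://www.exploit-lib.org/issue5/",
    article_type="regular",
    funding_body="tfl",
)
print("\n".join(tags))
```

Keeping the values in one place and generating the markup from them is what allows the same data to be reused elsewhere on the page.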
This approach has a number of advantages. The data can be reused (for example the title is used in the HTML <TITLE> element and the title and author name are used in the citation information). In addition, if an alternative syntax for storing the metadata is required (e.g. RDF) this can be done by simply changing a single script.
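As an illustration of the last point, the same variables could be written out as RDF/XML rather than as `<meta>` elements. The Python sketch below uses the standard library's `xml.etree.ElementTree` together with the RDF and Dublin Core element set namespaces. The `to_rdf` function is hypothetical and is not part of the Exploit Interactive scripts.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def to_rdf(about, doc_title, author, description):
    """Serialise the article variables as a small RDF/XML document."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    resource = ET.SubElement(root, f"{{{RDF_NS}}}Description",
                             {f"{{{RDF_NS}}}about": about})
    for element, value in (("title", doc_title),
                           ("creator", author),
                           ("description", description)):
        ET.SubElement(resource, f"{{{DC_NS}}}{element}").text = value
    return ET.tostring(root, encoding="unicode")


rdf_xml = to_rdf(
    about="http://www.exploit-lib.org/issue5/metadata/",
    doc_title="Using Metadata To Improve Local Searching",
    author="Kelly, B.",
    description="Brian Kelly describes how metadata is used in Exploit Interactive ...",
)
print(rdf_xml)
```

Only the output function changes; the per-article variable definitions stay exactly as they are.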
The search service described in this article has been implemented using Microsoft's SiteServer software. A short paper about this work [3] has been accepted at the Ninth International World Wide Web Conference, to be held in Amsterdam in May 2000 [4].
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath
England
BA2 7AY
URL: <http://www.ukoln.ac.uk>
Email: <b.kelly@ukoln.ac.uk>
Brian Kelly is UK Web Focus. He works for UKOLN, which is based at the University of Bath.
For citation purposes:
Brian Kelly, "Using Metadata To Improve Local Searching",
Exploit Interactive, issue 5, April 2000
URL: <http://www.exploit-lib.org/issue5/metadata/>
A UKOLN Service. Copyright © 1999
Last Updated: 7 April 2000