
The ILSES Project: Integrated Library and Survey Data Extraction Service

Libraries and survey data archives traditionally offer separate services through their respective catalogues and reference systems. For people interested in social science information, the way from articles and books to related survey data, or vice versa, has therefore been quite cumbersome. ILSES aims at a multi-level integration of both: bibliographic references for publications and meta-data for related data files, including direct access to the growing body of electronic or digitised documents. On the content side, ILSES focuses on the emerging needs of comparative cross-national and cross-temporal social research for country-specific and time series (meta-)data. The ILSES prototype was built on exemplary surveys from the European Commission's bi-annual Eurobarometer series and related publications.

Concept

The ILSES project [1] aims to develop a service that enables end-users to retrieve and access publications, documentary information and empirical data related to large-scale surveys within one integrated system. It allows for vertical ("in depth") search, starting from publications, from survey data, or from both in combination, and offers subsequent horizontal expansion to related data or publications. For content providers (libraries and data archives), ILSES offers tools and procedures for the generation of meta-data, conversion of existing formats, normalization, cataloguing and networked common access to distributed holdings [2] [3].

Modular design

The focus on end-users and content providers is reflected in the modular construction of ILSES, with a central data base administrator (ADMIN-ILSES) as its technical heart containing the meta-data dictionary, with the library and data archive content provider tools for local meta-data import and administration (LIB-ILSES, DAT-ILSES), and with the end-user modules E-ILSES, a locally installed client interface, and NET-ILSES as an additional WWW interface with a subset of basic facilities.

Figure 1: Structure of ILSES

The arrows on the left and on the right represent the access to real survey data, electronic documents and publications; the 'communication' of end-users with the meta-data database (ADMIN-ILSES) is effected via E- or NET-ILSES.

Multi-level integration

Bibliographic information and survey meta-data are usually created separately in libraries and data archives, with little regard for how closely they are related. ILSES integrates them through indexing from a common (domain specific) thesaurus ("soft links") and through direct intellectual cross-referencing ("hard links"), both from the study (survey) level down to the variable (individual survey question) level.

For the ILSES prototype it was decided to use the internet-based HASSET thesaurus [4], which was developed by the UK Data Archive starting from the "UNESCO thesaurus for social sciences, education, communication, and culture". ILSES also allows for the application of multiple, subject-specific thesauri. The roughly 10,000 thesaurus keywords and their hierarchical relations are stored in ADMIN-ILSES and assigned to publications via LIB-ILSES and to survey questions via DAT-ILSES. The end-user can select and combine keywords from the thesaurus list to start a structured search without being familiar with, for example, the actual question wording or publication titles. Free-text search interfaces on all levels (publication or study abstract, author, title, variable labels or question text) complement the ILSES information retrieval facilities.
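The thesaurus-guided search described above can be illustrated with a minimal Python sketch. It is not the ILSES implementation: the tiny thesaurus, the item identifiers and the function names `expand` and `search` are hypothetical, and the automatic inclusion of narrower terms is an assumption about how the hierarchical relations might be used.

```python
# Hypothetical miniature thesaurus: broader term -> narrower terms.
NARROWER = {
    "political systems": ["democracy"],
    "democracy": ["elections", "political participation"],
}

# Hypothetical "soft links": thesaurus keyword -> indexed items.
# Publications would be indexed via LIB-ILSES, variables via DAT-ILSES.
PUBLICATION_INDEX = {
    "democracy": {"pub-1"},
    "elections": {"pub-2"},
}
VARIABLE_INDEX = {
    "democracy": {"study-A/v10"},
    "political participation": {"study-B/v7"},
}


def expand(keyword):
    """Return the keyword together with all of its narrower terms."""
    seen = set()
    stack = [keyword]
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(NARROWER.get(term, []))
    return seen


def search(keyword):
    """Return (publications, variables) indexed with the keyword or a narrower term."""
    publications, variables = set(), set()
    for term in expand(keyword):
        publications |= PUBLICATION_INDEX.get(term, set())
        variables |= VARIABLE_INDEX.get(term, set())
    return sorted(publications), sorted(variables)
```

A single call such as `search("democracy")` then yields both hit lists at once, which is the integrated retrieval the common thesaurus is meant to enable.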

Figure 2: E-ILSES screen illustrating integrated data and publication retrieval

Publication and variable hit lists with results from an integral thesaurus guided search for the keyword "democracy"; meta-data for a selected variable (variable name and label, study number and title, question and answer text, aggregate results etc.) are displayed; publications directly linked to the selected variable are listed separately; bibliographic information for a selected publication is displayed.

In general, ILSES provides pre-coordinated collection building, controlled through common indexing and consistent cross-referencing between related information. Other systems (e.g. NESSTAR [5]) concentrate on critical mass and accept a lowest common denominator in contributed material, so their completeness decreases the deeper the search level; ILSES in principle maintains quality and consistency in both the vertical and the horizontal direction. The price to be paid is the required amount of intellectual input, which is considered worthwhile only for selected data collections of recognized major importance. ILSES is therefore expected to be an appropriate complement to other tools such as NESSTAR for accessing specialized, complex and mixed-media social science information that needs integration beforehand.

Access to electronic documents and controlled data extraction

Once relevant topics (i.e. publications and survey data units) are identified, ILSES provides, beyond bibliographic information and data documentation, direct access to electronic documents and data ordering. Full text publications are made available through their respective internet addresses; original documents such as the national field questionnaires are made available on dedicated ftp servers.

"Custom-made" data sets containing exactly the information needed in a particular research framework can be ordered directly from participating data archives. Last but not least, ILSES supports the integration (combination) of time series data from different surveys through controlled data manipulation and harmonisation. The necessary SPSS code is created in the background to be run at the data archive. These data matching procedures can be stored and modified or extended at any later stage.

Figure 3: E-ILSES screen illustrating the matching of time series data

Variable hit list with results from a free text search for "satisfaction" in variable labels of pre-selected studies (surveys); several steps are shown of preparing a time series data extraction from different surveys (match variables) by software supported comparison of variable labels and values: (1) variables which do not match at all, (2) variables with minor differences in labels or values (necessary manipulation is supported by a recode interface), (3) variables with identical labels and values ready to be matched, (4) checking variables to be matched (former recode procedures are tracked)
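The matching steps shown in Figure 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not the ILSES algorithm: the `Variable` class, the rule for what counts as a "minor difference" (labels share at least one word and the value codes are the same) and the function names are hypothetical. The generated text uses the SPSS `RECODE ... INTO` command, but the syntax ILSES actually produces is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    study: str
    name: str
    label: str
    values: dict  # value code -> answer text


def words(label):
    """Lower-cased words of a variable label, used for a rough comparison."""
    return set(label.lower().split())


def classify(a, b):
    """Sort a pair of variables into one of the three matching categories."""
    if a.label == b.label and a.values == b.values:
        return "identical"
    # Assumed rule: overlapping label wording and the same set of codes.
    if words(a.label) & words(b.label) and set(a.values) == set(b.values):
        return "minor differences"
    return "no match"


def recode_syntax(source_var, target_var, mapping):
    """Build one SPSS RECODE command from an old-code -> new-code mapping."""
    pairs = " ".join(f"({old}={new})" for old, new in sorted(mapping.items()))
    return f"RECODE {source_var} {pairs} INTO {target_var}."


# Invented example data, not taken from any Eurobarometer codebook.
SCALE = {1: "Very satisfied", 2: "Fairly satisfied",
         3: "Not very satisfied", 4: "Not at all satisfied"}
v_a = Variable("survey-1", "v10", "Life satisfaction", dict(SCALE))
v_b = Variable("survey-2", "v12", "Life satisfaction", dict(SCALE))
v_c = Variable("survey-3", "v45", "Satisfaction with life",
               {1: "Not at all satisfied", 2: "Not very satisfied",
                3: "Fairly satisfied", 4: "Very satisfied"})
v_d = Variable("survey-3", "v50", "Trust in parliament",
               {1: "Tend to trust", 2: "Tend not to trust"})
```

Here `classify(v_a, v_c)` falls into the "minor differences" category because the scale is reversed, and `recode_syntax("v45", "satlife", {1: 4, 2: 3, 3: 2, 4: 1})` produces the command that would bring `v45` into line before the series are combined. Storing the mapping rather than the generated text is one way to keep a matching procedure editable at a later stage.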

Meta-data Standards

ILSES imports meta-data from standard bibliographic formats (UNIMARC carried by the ISO 2709 format), from standard statistical software (SPSS) and from standard codebooks (detailed survey data documentation). ILSES is considering the emerging social science Data Documentation Initiative (DDI) standard [6], written as a DTD in XML [7], both for meta-data import and export and for exchange with complementary systems (e.g. NESSTAR). The DDI-DTD element definitions already incorporate most of the ILSES requirements for handling cross-national and longitudinal comparative data. DAT-ILSES goes beyond the present DDI-DTD in country-specific elements at the study and variable level of description and in time series elements. LIB-ILSES likewise goes beyond the DDI in cross-referencing between publications and data from survey questions. These elements, together with better inclusion of thesaurus indexing terms at all possible levels, can be expected to be considered in future DDI-DTD developments.
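To give an impression of what variable-level import from a DDI codebook might involve, here is a short Python sketch using the standard library XML parser. The element names (`codeBook`, `dataDscr`, `var`, `labl`, `qstn`, `qstnLit`, `catgry`, `catValu`) follow the DDI codebook DTD as far as we are aware, but the fragment is heavily simplified, is not validated against the DTD, and its content is invented; `read_variables` is a hypothetical helper, not part of DAT-ILSES.

```python
import xml.etree.ElementTree as ET

# Simplified, invented codebook fragment in the style of the DDI DTD.
SAMPLE = """<codeBook>
  <dataDscr>
    <var name="v10">
      <labl>Life satisfaction</labl>
      <qstn><qstnLit>Invented question text about life satisfaction</qstnLit></qstn>
      <catgry><catValu>1</catValu><labl>Very satisfied</labl></catgry>
      <catgry><catValu>2</catValu><labl>Fairly satisfied</labl></catgry>
    </var>
  </dataDscr>
</codeBook>"""


def read_variables(xml_text):
    """Extract name, label, question text and value labels for each variable."""
    root = ET.fromstring(xml_text)
    variables = []
    for var in root.iter("var"):
        variables.append({
            "name": var.get("name"),
            # Only the direct child <labl> is the variable label;
            # the <labl> elements inside <catgry> are value labels.
            "label": var.findtext("labl", default=""),
            "question": var.findtext("qstn/qstnLit", default=""),
            "values": {
                cat.findtext("catValu"): cat.findtext("labl")
                for cat in var.findall("catgry")
            },
        })
    return variables
```

Country-specific and time series elements of the kind DAT-ILSES needs are deliberately left out, since the text notes that the DDI-DTD of the time did not yet provide them.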

Partners

The ILSES project ran within the European Commission's Telematics for Libraries programme from September 1996 to September 1999. It was carried out by a consortium of Dutch, German, French, and Irish institutes. In accordance with their respective expertise, iec ProGAMMA (Groningen) was responsible for software development and overall project management, NIWI (Amsterdam) for the design of the library module and the bibliographic input, and Zentralarchiv (Cologne) for the design of the data archive module and the loading of meta-data through it. The University of Amsterdam's Department for Political Sciences contributed the end-user perspective, seconded by Trinity College (Dublin) and CIDSP (Grenoble) in the organisation of validation workshops for end-users and content providers.

Technical Specifications

ILSES has been developed as an open system using different software modules assembled around a central data base. Software and underlying data base work locally or centrally over the Internet (ftp and http protocols). The software is written in C++; the user interfaces (screens) have been developed in Delphi. ILSES has been developed under Paradox, but with its client-server model any relational database that understands SQL (e.g. Oracle) can be used and manipulated.
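Because the meta-data sit in a relational database reached through SQL, direct cross-references between publications and variables can be pictured as a link table. The following Python sketch uses SQLite purely for illustration; the table and column names are hypothetical and do not reproduce the ADMIN-ILSES data dictionary.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with invented publications, variables and links."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE variable (
            id INTEGER PRIMARY KEY,
            study TEXT NOT NULL,
            name TEXT NOT NULL,
            label TEXT
        );
        -- One row per direct cross-reference between a publication and a variable.
        CREATE TABLE hard_link (
            publication_id INTEGER NOT NULL REFERENCES publication(id),
            variable_id INTEGER NOT NULL REFERENCES variable(id),
            PRIMARY KEY (publication_id, variable_id)
        );
    """)
    con.executemany("INSERT INTO publication VALUES (?, ?)",
                    [(1, "Invented article A"), (2, "Invented article B")])
    con.executemany("INSERT INTO variable VALUES (?, ?, ?, ?)",
                    [(1, "survey-1", "v10", "Life satisfaction"),
                     (2, "survey-1", "v11", "Trust in parliament")])
    con.executemany("INSERT INTO hard_link VALUES (?, ?)",
                    [(1, 1), (2, 1), (2, 2)])
    return con


def publications_for(con, study, name):
    """Titles of all publications linked directly to one survey variable."""
    rows = con.execute(
        """SELECT p.title
           FROM publication AS p
           JOIN hard_link AS h ON h.publication_id = p.id
           JOIN variable AS v ON v.id = h.variable_id
           WHERE v.study = ? AND v.name = ?
           ORDER BY p.title""",
        (study, name),
    )
    return [title for (title,) in rows]
```

The query uses only standard SQL joins, so the same idea would carry over to any SQL-capable back end of the kind mentioned above.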

Perspectives

ILSES complements traditional publication retrieval with the user's own (re-)analysis of relevant survey data, sharpens the perspective on selected data through relevant publications, allows inspection of how indicators were translated in the original-language field questionnaires of cross-national survey programs, and supports the controlled build-up of cumulative time series data sets. Together these facilities make ILSES a unique service for comparative social research. Cross-checking one's own findings, general quality control and avoidance of duplicated work are only some of the aspects relevant to social science research in general. With ILSES, the building of integrated research databases around important large-scale data collections such as Eurobarometer [8] [9], the International Social Survey Program (ISSP), World Values Surveys, or National Elections Studies can be managed both intellectually and in terms of sharing the workload, in close collaboration between all parties involved: data collectors, researchers, authors and publishers. With LIB-ILSES and DAT-ILSES each would have a networked tool for creating meta-data, thesaurus indexing and information linking. Because integrated databases add value and because end-users are given their own tools (E-ILSES and NET-ILSES), the support burden on archives and libraries per information or data request can be expected to fall considerably as well.

Future Development

The future of ILSES depends on both content and software development. The prototype contains meta-data for five Eurobarometer surveys (study description, SPSS variable and value definitions, question and answer text, aggregate results by country etc. for over 3000 variables), including thesaurus-controlled indexing and cross-linkages to national field questionnaire pages. It also contains about a hundred indexed and cross-referenced related publications (from a larger pool of roughly 600 bibliographic references to associated literature). As a start, the present five surveys could be expanded so that more than 50 Eurobarometer surveys, with all related information, become available through ILSES. On the software side, rapidly increasing user expectations call for on-line data browsing and simple pre-analysis facilities. For content providers, developments in standards call for direct access from ILSES to Z39.50 databases and for new modules to accommodate, for example, the DDI data documentation standard.

References

  1. ILSES (Integrated Library and Survey Data Extraction Service)
    URL: <http://www.gamma.rug.nl/ilses> (including links to the project partners)
  2. Vries, R. E. de (1998) Can the Library and Data Archive Meet in the Active Support of Research in Social Sciences? The Case of ILSES. IASSIST Quarterly Vol. 22 No. 4, 1998
    URL: <http://datalib.library.ualberta.ca/iassist/iq.html>
  3. Vries, R. E. de (1997) ILSES: how library and data archive meet in active support of research in social sciences, INSPEL Vol. 31 No. 4, 1997. Also available at:
    URL: <http://www.fh-potsdam.de/~IFLA/INSPEL/>
  4. HASSET thesaurus project at the UK Data Archive
    URL: <http://biron.essex.ac.uk/services/zhasset.html>
  5. NESSTAR (Networking European Social Science Tools and Resources)
    URL: <http://www.nesstar.org/>
  6. DDI (Data Documentation Initiative)
    URL: <http://www.icpsr.umich.edu/DDI/codebook.html>
  7. Treadwell, W. (2000) Maximizing the Search Potential of Social Science Codebooks Through the Application of the Codebook DTD, IASSIST Quarterly Vol. 24 No. 4, 2000 (forthcoming):
    URL: <http://datalib.library.ualberta.ca/iassist/iq.html>
  8. EUROBAROMETER Service Homepage at the Zentralarchiv
    URL: <http://www.za.uni-koeln.de/data/en/eurobarometer/index.htm>
  9. Moschner, M. & Jensen, U. (1999) ILSES – ein neuer Service für die komparative Forschung. Integration von Literatur-Recherche und Daten-Extraktion am Beispiel der Eurobarometer, ZA-Information No. 45, 1999.

Author Details

Meinhard Moschner
Zentralarchiv für Empirische Sozialforschung (ZA)
University of Cologne, Germany

Tel: +49 (0)221 47694 21
Fax: +49 (0)221 47694 44
E-mail: moschner@za.uni-koeln.de
URL: <http://www.za.uni-koeln.de>

Meinhard Moschner is responsible at ZA for data preparation, data documentation, user advice and data service for the Eurobarometer survey series.

Repke E. de Vries
Netherlands Institute for Scientific Information Services (NIWI),
Amsterdam, The Netherlands

Tel: +31 20 4628600
Fax: +31 20 6685079
E-mail: repke.de.vries@niwi.knaw.nl
URL: <http://www.niwi.knaw.nl/us/research/research.htm>

Repke de Vries is participating in the NIWI Research Programme "Exploring the future of Information and Communication in Research" and studies ICT related changes in work- and information environments for researchers.

For citation purposes:
Meinhard Moschner and Repke de Vries, "The ILSES Project: Integrated Library and Survey Data Extraction Service", Exploit Interactive, issue 7, 2nd October 2000
URL: <http://www.exploit-lib.org/issue7/ilses/>

