Z39.50 Profile for Access to Digital Library Objects


Draft Four of this profile, dated August 15, is available:
This is a companion profile to the Z39.50 Profile for Access to Digital Collections (similar to the CIMI profile in that respect).

It has been developed by LC staff, and we now solicit comments. Please review and comment by September 16.

LC provides access to its digital collections over the web via HTTP and would like to provide enhanced access via Z39.50, through the use of this profile when complete. We hope that institutions providing similar collections will provide access via this profile, and we invite these institutions to participate in its development. Z39.50 client developers, institutions that will want to acquire clients to access these digital collections, and any other interested parties are also invited to participate.

The text of section 1 of the profile follows.


Library of Congress
(08/15/96)

The Z39.50 Profile for Access to Digital Library Objects (hereafter referred to as the DL Profile) is a companion profile to the Z39.50 Profile for Access to Digital Collections (hereafter referred to as the Collections Profile).

1. Overview

As a Z39.50 Profile, the DL profile specifies a subset of Z39.50 features to support functional and user requirements for search and retrieval of information in digital library collections, specifically the Library of Congress digital library collections and similar collections. The use of this profile might be one of several mechanisms used to access library digital objects. This particular mechanism is distinguished by the definition of an enveloping structure, called an Object Descriptive Record, which may (logically) encapsulate a digital object along with information describing the object. When the Descriptive Record does not (logically) encapsulate the object it describes, it instead provides a pointer to the object. In either case, in this profile, Z39.50 access to a digital object (for search or retrieval) is via its Object Descriptive Record.
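The two cases described above (a record that logically encapsulates its object, and a record that points to it) can be illustrated with a minimal sketch. The class name, field names, and example URL below are hypothetical and are not defined by this profile; the sketch also assumes that a record does exactly one of the two, which is a reading of the text rather than a stated rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectDescriptiveRecord:
    """Illustrative envelope only; names are assumptions, not profile-defined."""
    description: str                          # information describing the object
    embedded_object: Optional[bytes] = None   # set when the object is encapsulated
    pointer: Optional[str] = None             # set when the object lives elsewhere

    def __post_init__(self) -> None:
        # Assumption: the record either encapsulates the object or points to it.
        if (self.embedded_object is None) == (self.pointer is None):
            raise ValueError("provide exactly one of embedded_object or pointer")

    def encapsulates(self) -> bool:
        """True when the object travels inside the record itself."""
        return self.embedded_object is not None


# One record of each kind (example values are made up).
inline = ObjectDescriptiveRecord("A small image", embedded_object=b"\x00\x01")
remote = ObjectDescriptiveRecord("A large image",
                                 pointer="http://example.org/object/1")
```

In both cases a client would reach the object through the record, never around it.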

1.1 Extensions to Collections Profile

The Collections profile includes several scope limitations and delegates to companion profiles the responsibility for extending the scope in these areas. As a companion profile to the Collections profile, the DL profile extends the Collections profile in the following areas:

1.1.1 Model of a Digital Object

This profile provides a general and flexible model for the structure of a digital object. In this model, a digital object may consist of constituent parts, any of which may in turn consist of constituent parts, and so on. Constituent parts are represented as Z39.50 elements.
Consider a single digital object consisting of several images (e.g. photos or text images). Although the set of images comprises a single digital object, each must be distinctly representable and the object must convey the fact that there are distinct images, how many, and their individual characteristics. Thus they are represented as separate elements of a Z39.50 record.
Next suppose that the digital object not only includes a number of images, but also additional constituent parts, further structured; for example, each such constituent part may consist of several images. This introduces an intermediate level of aggregation.
The model of a digital object adopted by this profile assumes arbitrary levels of aggregation and is represented as a tree, where each non-leaf node has an arbitrary number of subtrees and/or leaves, and leaf nodes represent data.
Every node, whether a leaf or non-leaf node, has a string tag, whose purpose is to convey to the user what that node represents. A description (via a description meta-element) may also accompany any node, in case the string tag is not sufficiently descriptive.
This model could represent, for example, a digital object consisting of 10 boxes, each with 20 folders, each with 30 photos. String tags such as 'box', 'folder', and 'photo' could be used to convey the type of element (the type would be conveyed to the user, not the client; this profile does not attempt to define machine-processable content types). As a more complex example, a folder might include a variety of photos, maps, correspondence, etc., and perhaps each piece of correspondence consists of several sequential digitized pages.
As another example, a digital object may consist of multiple volumes, each with a table of contents and several chapters, each chapter with sections, etc. String tags such as 'volume', 'tableOfContents', 'chapter', 'section', etc. could be used.
Repeating elements, and designation of the ordinal occurrence of an element among a set of repeating elements, are supported by this profile. For example, consider an object with multiple "volumes", i.e. "volume 1", "volume 2", etc. "Volume 2" would be represented by the second occurrence of the element whose tag is 'volume'. "Third chapter", "fifth image", or "first five pages" would be similarly represented. Element specification eSpec-1 and record syntax GRS-1 provide these capabilities.
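The tree model and the selection of repeating elements by ordinal occurrence can be sketched in a few lines. This is an in-memory illustration only: the class and method names are hypothetical, and the code does not implement eSpec-1 or GRS-1, which are the mechanisms the profile actually relies on.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One node of a digital object tree (hypothetical, not profile-defined)."""
    tag: str                              # string tag shown to the user
    description: Optional[str] = None     # optional description meta-element
    data: Optional[bytes] = None          # payload, carried by leaf nodes
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def occurrence(self, tag: str, n: int) -> "Node":
        """Return the n-th (1-based) child with the given tag, e.g. volume 2."""
        matches = [c for c in self.children if c.tag == tag]
        if not 1 <= n <= len(matches):
            raise IndexError(f"no occurrence {n} of element '{tag}'")
        return matches[n - 1]

    def first(self, tag: str, k: int) -> List["Node"]:
        """Return the first k children with the given tag, e.g. first five pages."""
        return [c for c in self.children if c.tag == tag][:k]

    def count_leaves(self) -> int:
        if self.is_leaf():
            return 1
        return sum(c.count_leaves() for c in self.children)


def build_example(boxes: int = 10, folders: int = 20, photos: int = 30) -> Node:
    """Build the boxes/folders/photos object used as an example in the text."""
    root = Node("root")
    for _ in range(boxes):
        box = Node("box")
        for _ in range(folders):
            folder = Node("folder")
            folder.children = [Node("photo", data=b"") for _ in range(photos)]
            box.children.append(folder)
        root.children.append(box)
    return root


obj = build_example()
second_box = obj.occurrence("box", 2)             # "box 2"
first_folder = second_box.occurrence("folder", 1)
five_photos = first_folder.first("photo", 5)      # "first five photos"
```

Because selection depends only on a tag and a position among siblings with the same tag, the same two operations would cover "third chapter" or "fifth image" without any knowledge of what the tags mean, which matches the profile's decision to leave content types to the user.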

1.1.2 Categories of Digital Object

The profile defines the following categories of digital objects:

1.1.3 Categories of Associated Descriptions

The profile defines the following categories of Associated Descriptions:

1.1.4 Authentication, Rights and Permissions, Access Control, and Resource Control

This profile supports:

1.1.5 Character Set and Language Support for Search terms

This profile supports special characters within a search term, and language designation of the search term. For special characters, support is required for character set negotiation as specified in the Z39.50 Implementors Group (ZIG) Implementors Agreements. See /z3950/agency/agree.html. [Details to be developed.] For designation of the language of a search term, see 4.3.3. This profile supports search terms that read right-to-left (e.g. Arabic, Hebrew); see 4.1.1.

1.2 Pilot Collections

This profile is intended to provide access to the digital collections described briefly below. This list is intended as illustrative and is by no means exhaustive; it is used as a representative set of collections on which the specifications of this profile are based.

1.2.1 Detroit Publishing Company Collection

More than 25000 negatives, 20000 prints, and 2900 transparencies from the Detroit Publishing Company, 1880-1920. U.S. scenes (mostly), including buildings, towns, cities, universities, battleships, yachts, resorts, natural landmarks, and industry.
Images available in four versions: GIF, TIFF thumbnail, reference JPG, uncompressed TIFF.

1.2.2 Nation's Forum Collection

Collection of 59 sound recordings of speeches, made to preserve the voices of prominent Americans; 1918 and 1920. World War I topics, postwar issues, and the 1920 presidential election. Also available for each speech:

1.2.3 WPA Life Histories Collection

Life History Manuscripts from the Folklore Project, WPA Federal Writers' Project, 1936-40. 2900 documents from 300 writers from 24 states. 2000-15000 words each. 23000 page images total. Histories describe informant's family education, income, occupation, political views, religion and mores, medical needs, diet and miscellaneous observations.
Each document available in HTML, SGML (using American Memory DTD and Panorama viewer) and scanned page image (bitonal TIFF G3 or G4).

1.2.4 Finding Aid for Shirley Jackson Papers

Shirley Jackson was a master American short-story writer and novelist of the mid-20th century, best known for modern Gothic horror, in particular for her classic story, The Lottery, 1948. She also wrote stories about contemporary domestic life. Her papers, given to LC in 1967, consist of diaries, journals, correspondence, literary manuscripts, and miscellaneous papers. There are 7400 items, none digitized. The finding aid is SGML-tagged using the Encoded Archival Description standard, beta version.

1.2.5 Coolidge-Consumerism Collection

17,000 pages (images) of 1920s primary-source materials: manuscripts, monographs, and serials. Also photos and motion pictures. Documents various aspects of economic life in the U.S. during the 1920s. Includes the Calvin Coolidge Papers, focusing on the life of Calvin Coolidge during the six years he was president (1923-1929).
Scanned page images (bitonal TIFF G4) available for the contents of 152 manuscript folders selected from 14 manuscript collections. 73 folders have page images only; 79 have HTML and SGML also.
Photographs (170) available in GIF, TIFF thumbnail, reference JPG, and uncompressed TIFF. Text is available for 78 monographs and 56 serials.

1.2.6 Legislative Information System

Currently named THOMAS and available on the Web, the Legislative Information System (LIS) under development is a constantly growing collection of large, heterogeneous databases of legislative and legal information which includes the fulltext of the Congressional Record, Bills and Laws; Committee Reports and other documents; and Congressional Research Service products (e.g., Bill Digests).
Besides fulltext ASCII (searchable by boolean and relevancy-ranked queries) the expanded LIS will support SGML-tagged documents, PDF, and audio and video format standards for various data sets.

1.3 Metadata and Variants

A variant specification, metadata element (e.g. from tagSet-M or tagSet-G), or GRS-1 metadata field, may apply at any node (leaf or non-leaf) of the digital object tree. For any given such metadata component type -- a variant specification of a given class and type, metadata element of a given tag, or specific type of GRS-1 metadata -- the rules of applicability and inheritance are as follows; for any leaf node:

1.4 Representations of a Digital Object

A single digital object may have several representations, for example a "thumbnail", "highly compressed", "high resolution", "original", or "reference image". When these characterizations apply to the digital object as a whole, they are represented as (Z39.50) variants applied at the root of the object tree (i.e. at element 'root' of datatype Object; see 3.1.1). Representations may also apply at nodes subordinate to the root, and the rules of applicability and inheritance stated in 1.3 apply. [Note: there is currently a proposal to add a new feature to variant-1, necessary to support this.]

1.5 Z39.50 Access to Digital Objects

As in the Collections profile, a digital object may be accessed via Z39.50 or via some other protocol. For the DL profile however, when a digital object is accessible via Z39.50, it may be accessed via its Object Descriptive Record only. The use of Z39.50 to search or retrieve a digital object directly (not via its Object Descriptive Record) is not supported by this profile.

[remainder of profile not available in html]