Formal statement of Conformance to ISO 14721:2003

May 2004

This formal statement of conformance to the OAIS standard is based on the official standard, which is not available online. The text hosted at NASA may differ from the official standard but may prove helpful. Text in italics relates specifically to the LOCKSS system.

LOCKSS System: Formal Statement of Conformance to ISO 14721:2003 [DRAFT]

Section 1.4 of ISO 14721:2003 sets out the requirements for conformance. Specifically, the requirements are in two parts:

  • Support the model of information described in Section 2.2.
  • Fulfill the responsibilities listed in Section 3.1.

The information model of Section 2.2 describes information transferred to and from the custody of the OAIS conforming system in terms of Information Packages. These contain three parts:

  • Content Information. In the LOCKSS system the Knowledge Base of the Designated Community is embodied in web browsers. The Content Information consists of bit streams with associated HTTP header information including MIME types sufficient for browsers to render the bit stream.
  • Preservation Description Information, consisting of Provenance (the source), Context (links to other objects), Reference (identifiers for retrieval) and Fixity. In the LOCKSS system Provenance is provided by the URL from which the content was collected; Context is provided by the links embedded in the content; Reference is provided by the original URL and by the availability to search engines of the text and the metadata it includes; and Fixity is provided by the mutual auditing protocol, which supplies regular assurance that the content agrees with other replicas.
  • Packaging Information. In the LOCKSS system Packaging Information is encoded in instances of Java classes implementing the LOCKSS plugin API. In most cases this is a generic implementation driven by XML files.
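
The three parts above can be pictured as a small data structure. The following is a minimal illustrative sketch in Python, not the LOCKSS implementation (which the text says is written in Java); every class, field and function name here is hypothetical. It assumes the content is HTML so that Context can be read from anchor tags, and it stores no checksum because, as described above, Fixity comes from the auditing protocol rather than from a stored value.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Dict, List


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags (the 'Context' of a page)."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


@dataclass
class ContentInformation:
    body: bytes                # the bit stream as collected
    headers: Dict[str, str]    # HTTP headers stored alongside it

    @property
    def mime_type(self) -> str:
        # Header names are case-insensitive in HTTP, so compare in lower case.
        for name, value in self.headers.items():
            if name.lower() == "content-type":
                return value.split(";")[0].strip()
        return "application/octet-stream"


@dataclass
class PreservationDescription:
    provenance: str       # URL the content was collected from
    reference: str        # original URL, used as the retrieval identifier
    context: List[str]    # links embedded in the content
    # No fixity field: agreement with other replicas is established by audit.


@dataclass
class InformationPackage:
    content: ContentInformation
    pdi: PreservationDescription
    packaging: str = "hypothetical-plugin-id"  # stand-in for the plugin instance


def describe(url: str, body: bytes, headers: Dict[str, str]) -> InformationPackage:
    """Build a package from one collected URL (hypothetical helper)."""
    collector = _LinkCollector()
    collector.feed(body.decode("utf-8", errors="replace"))
    return InformationPackage(
        content=ContentInformation(body=body, headers=headers),
        pdi=PreservationDescription(provenance=url, reference=url,
                                    context=collector.links),
    )
```

In this sketch Provenance and Reference both hold the original URL, mirroring the statement above that the same URL serves as source and as retrieval identifier.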

The system is required to support three types of Information Packages:

  • Submission Information Package (SIP). In the LOCKSS system SIPs are created by the publisher, who places a "publisher manifest page" containing metadata on their website and publishes the URL. Individual LOCKSS system administrators direct their systems to preserve this page and the content it describes; each system then collects both.
  • Archival Information Package (AIP). Internally, the LOCKSS system preserves content in a repository defined by a set of Java classes. The AIP consists of instances of these classes, representing the content itself, metadata obtained from the publisher manifest page and the HTTP headers, and an instance of a Java class implementing the LOCKSS plugin API encapsulating metadata not obtained from these sources. This instance is normally driven by externalized metadata in the form of XML files.
  • Dissemination Information Package (DIP). The LOCKSS system disseminates information by acting as an HTTP proxy, making it appear to the Designated Community that the SIP is still available from its original URLs (with any changes required by preservation operations such as format conversion). The entire SIP, including the publisher manifest page with its metadata, is available. Thus the LOCKSS DIP is the same as the LOCKSS SIP.
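
The claim that the DIP is the same as the SIP can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not LOCKSS code: `ingest`, `disseminate` and the `fetch` callback are hypothetical names, the "publisher site" is an in-memory dictionary rather than a real web server, and only links found directly on the manifest page are followed. It also ignores format conversion, so what is served is byte-for-byte what was collected.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Tuple
from urllib.parse import urljoin

Response = Tuple[Dict[str, str], bytes]  # (HTTP headers, body)


class _Links(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)


def ingest(manifest_url: str,
           fetch: Callable[[str], Response]) -> Dict[str, Response]:
    """Collect the manifest page and the pages it links to (the SIP),
    keyed by original URL. The result plays the role of the stored AIP."""
    store: Dict[str, Response] = {}
    headers, body = fetch(manifest_url)
    store[manifest_url] = (headers, body)
    parser = _Links()
    parser.feed(body.decode("utf-8", errors="replace"))
    for href in parser.links:
        url = urljoin(manifest_url, href)  # resolve relative links
        if url not in store:
            store[url] = fetch(url)
    return store


def disseminate(store: Dict[str, Response], url: str) -> Response:
    """Answer a request for an original URL from the preserved copy (the DIP)."""
    return store[url]


# A stand-in publisher site; the URLs and content are invented for illustration.
HTML = {"Content-Type": "text/html"}
site: Dict[str, Response] = {
    "http://publisher.example/vol1/manifest.html":
        (HTML, b'<a href="article1.html">1</a> <a href="article2.html">2</a>'),
    "http://publisher.example/vol1/article1.html": (HTML, b"<p>First</p>"),
    "http://publisher.example/vol1/article2.html": (HTML, b"<p>Second</p>"),
}

archive = ingest("http://publisher.example/vol1/manifest.html", site.__getitem__)
```

Because the store is keyed by original URL and returns the stored headers and body unchanged, every URL in the SIP, including the manifest page, can be requested from `disseminate` exactly as it was requested from the publisher.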

ISO 14721:2003 also requires that Information Packages be associated with Descriptive Information sufficient to locate them. The Descriptive Information in the LOCKSS system consists of the URLs at which the information was originally published, and the searchable information they contain including metadata and the full text.

The mandatory requirements of Section 3.1 apply to the organization operating the OAIS archive, requiring the OAIS conforming system to enable the organization to:

  • Negotiate for and accept appropriate information from information Producers. An organization's LOCKSS system will, as directed by the authorized administrator, collect content from information Producers in the form of an appropriate SIP. The LOCKSS SIP must contain a "publisher manifest page" instantiated as an HTML page that describes and links to the relevant content and contains a statement that institutional subscribers have permission to collect and preserve that content.
  • Obtain sufficient control of the information provided to the level needed to ensure Long-Term Preservation. An organization's LOCKSS system will, as directed by the authorized administrator, collect via HTTP the entire SIP containing the content and the "publisher manifest page" and store it together with all available HTTP header information (including MIME type). This information is sufficient at the time of collection for a browser to render the content.
  • Determine, either by itself or in conjunction with other parties, which communities should become the Designated Community and, therefore, should be able to understand the information provided. The LOCKSS "publisher manifest page" permission includes permission for the institution's readers to access the material subject to the institution's subscription agreement. The Designated Community is thus the institution's readers.
  • Ensure that the information to be preserved is Independently Understandable to the Designated Community. At the time of collection, the SIP collected is sufficient for a browser to render the content, because it is collected in exactly the same way that a browser would access it. The LOCKSS system's DIP replicates the SIP exactly by acting as a proxy for the original SIP to the Designated Community, so the information preserved is Independently Understandable.
  • Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original. LOCKSS systems preserving the same SIP cooperate to audit and repair it, ensuring that the information is preserved against all reasonable contingencies. LOCKSS systems preserving the same SIP collect it independently from the Producer, audit their independently collected SIPs and come to consensus as to the SIP's content. This audit allows the SIP to be authenticated and traceable to the original Producer.
  • Make the preserved information available to the Designated Community. An organization's LOCKSS system's DIP replicates the SIP exactly by acting as a proxy for the original SIP to the Designated Community.
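
The audit-and-repair idea in the list above (independently collected replicas are compared, and a replica that disagrees with the consensus is repaired) can be sketched as follows. This is deliberately not the LOCKSS auditing protocol, whose details this statement does not give; it substitutes a plain majority vote over SHA-256 digests to show the general shape. The function name and the peer labels are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def audit_and_repair(replicas: Dict[str, bytes]) -> List[str]:
    """Compare replicas of one URL held by several peers.

    Peers whose content differs from the majority are overwritten with the
    majority content. Returns the names of the peers that were repaired.
    Raises ValueError when there are no replicas or no strict majority.
    """
    if not replicas:
        raise ValueError("no replicas to audit")

    # Each peer "votes" with the digest of the content it holds.
    digests = {peer: hashlib.sha256(data).hexdigest()
               for peer, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority agreement among replicas")

    # Take the agreed content from any peer in the majority.
    good = next(data for peer, data in replicas.items()
                if digests[peer] == winner)

    repaired = []
    for peer in list(replicas):
        if digests[peer] != winner:
            replicas[peer] = good
            repaired.append(peer)
    return repaired
```

With three peers holding `b"x"`, `b"x"` and `b"y"`, the third is reported as repaired and ends up holding `b"x"`. A two-way split among two peers raises an error rather than picking a side, since a simple vote cannot tell which copy is damaged.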