Overview of the OAIS reference model
This article describes the functions of an OAIS as set out in the Reference Model for an Open Archival Information System (rmOAIS), written by the CCSDS. The rmOAIS, which is also published as an ISO standard (ISO 14721), provides a framework for the long-term preservation of data. It can be applied by any organisation or network that is responsible for the long-term archival and distribution of data to a Designated Community. The main point of reference for this article is Figure 4-8, OAIS Data Flow Diagram, rmOAIS, Section 4.1.2.
Long-term storage in this context refers to the archiving of data over time periods during which technology, specifications, standards and interested parties may change. Thinking in terms of decades is not unreasonable. It is also worth noting, however, that an organisation may unintentionally find itself involved in data archival, and even those purposefully involved may find that the archival period and parameters extend beyond what was originally intended. It is therefore beneficial to consider, at the outset of any project involving the collation and storage of data, the vital attributes of the data and how they should be recorded:
- the data source
- the transport method
- the requirements of metadata descriptors for cataloguing and retrieval
- the storage method and integrity requirements
- what the data could be used for
- who will use the data
To define the functions an OAIS may serve, it is necessary to describe the parties involved in using and administering the service: the Management, Producers, Archivers and Consumers.
Management is usually the body that funds the OAIS, approves financial and administrative actions, periodically reviews performance, and tries to ensure the OAIS is put to good use by the community in question. Management may also be involved in usage analysis, particularly in examining Consumer search criteria, Consumer feedback and responses to any questionnaires. However, further detail on this type of management process is outside the scope of this article.
The process of data archival and delivery of information objects begins with the Producers of the data. A Submission Agreement is established between the Producers and the OAIS, identifying the Submission Information Packages (SIPs) to be submitted. These Agreements may be mandatory requirements, voluntary undertakings, or even informal ("virtual") Submission Agreements, but acceptable file formats and content requirements must be defined and met to ensure successful assimilation of the data in the SIP into the Archival Information Package (AIP).
The Packaging Information (a data wrapper), Content Information (a Data Object, such as data on disk, plus Representation Information) and Preservation Descriptive Information (PDI) are provided in a Data Submission Session, which yields a Submission Information Package (SIP).
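The composition of a SIP described above can be sketched as a small data structure. This is a minimal illustration only: the class names, field names, sample values and the choice of "provenance" and "fixity" as required PDI entries are assumptions made for the example, not definitions taken from the rmOAIS.

```python
from dataclasses import dataclass, field


@dataclass
class ContentInformation:
    data_object: bytes         # the bits to be preserved
    representation_info: str   # how those bits should be interpreted


@dataclass
class SubmissionInformationPackage:
    packaging_info: dict       # the "wrapper" that identifies and binds the parts
    content_info: ContentInformation
    pdi: dict = field(default_factory=dict)  # Preservation Descriptive Information

    def missing_pdi(self, required=("provenance", "fixity")):
        """Return the required PDI keys the Producer has not supplied.

        The default `required` tuple is a hypothetical policy for this sketch.
        """
        return [key for key in required if key not in self.pdi]


# Hypothetical submission from a Producer.
sip = SubmissionInformationPackage(
    packaging_info={"package_id": "sip-0001", "producer": "example-lab"},
    content_info=ContentInformation(
        data_object=b"station,temp_c\nA1,12.5\n",
        representation_info="UTF-8 CSV; columns: station id, temperature in Celsius",
    ),
    pdi={"provenance": "Collected by example-lab field sensors"},
)

print(sip.missing_pdi())
```

Keeping the Representation Information next to the Data Object reflects the point that the bits alone are not enough for a Designated Community to interpret them.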
The Archiver (i.e. the OAIS) has specific responsibilities that it must undertake:
- Negotiate for and accept appropriate information from information Producers.
- Obtain sufficient control of the information provided to the level needed to ensure Long-Term Preservation.
- Determine, either by itself or in conjunction with other parties, which communities should become the Designated Community and, therefore, should be able to understand the information provided.
- Ensure that the information to be preserved is Independently Understandable (e.g. without expert advice) to the Designated Community.
- Follow documented policies and procedures which ensure that the information is preserved against all reasonable contingencies, and which enable the information to be disseminated as authenticated copies of the original, or as traceable to the original.
- Make the preserved information available to the Designated Community.
(These responsibilities are listed and detailed in the rmOAIS, Section 3.)
The OAIS performs operations on the SIP to check its integrity and to ensure that all the data needed to supply the Designated Community (DC) with meaningful information is stored. It is important to remember that the Knowledge Base (KB) of the DC affects what must be archived. As it may also be necessary to deliver information to several Designated Communities with differing Knowledge Bases, the level of detail relevant to each must be tracked. This may need updating over time as the KB of a DC changes. The final package ready for archival is the Archival Information Package (AIP). Further detail of these processes follows in the OAIS Functions section.
The remaining stage is delivery of the DIP (Dissemination Information Package) to Consumers via one or more Data Dissemination Sessions. This is achieved via an Order Agreement – either a contract to supply certain data over time or an individual request for information. The Consumer must know what data they require from the archive. This may well necessitate a Search Session utilising the OAIS Finding Aids – these are essentially the tools to search the Descriptive Information, or perhaps even content, of the AIPs that the OAIS holds. Iterative searches may be required, embodied in the passing of various search parameters and findings on those parameters.
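The iterative Search Session described above can be sketched as repeated filtering of Descriptive Information records. This is a minimal sketch: the `search` function, the record fields and the sample catalogue are hypothetical, and a real Finding Aid would typically sit on top of a database or search index.

```python
def search(records, **criteria):
    """Return the records whose fields contain every criterion.

    Matching is a case-insensitive substring test, chosen only to keep
    the sketch short.
    """
    def matches(record):
        return all(
            str(wanted).lower() in str(record.get(field_name, "")).lower()
            for field_name, wanted in criteria.items()
        )

    return [record for record in records if matches(record)]


# Hypothetical Descriptive Information held by the OAIS.
catalogue = [
    {"aip_id": "aip-1", "title": "River temperature 2001", "producer": "example-lab"},
    {"aip_id": "aip-2", "title": "River flow 2001", "producer": "example-lab"},
    {"aip_id": "aip-3", "title": "Air temperature 2001", "producer": "other-lab"},
]

# First pass: a broad query. Second pass: refine the earlier findings.
broad = search(catalogue, title="temperature")
narrow = search(broad, producer="example-lab")

print([record["aip_id"] for record in broad])
print([record["aip_id"] for record in narrow])
```

Feeding the findings of one query into the next mirrors the passing of search parameters and results back and forth between the Consumer and the OAIS.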
Once the search has been successful, the Order Agreement can be placed based on the search results, the delivery method and the data format. This defines a Dissemination Information Package (DIP) to be distributed in a Data Dissemination Session. An Order Agreement may also be recurring: for example, a monthly Data Dissemination Session may be required to keep a Consumer supplied with updates to the data set they require as that data set grows.
OAIS Functions
This is a brief synopsis of the functions carried out during interactions between the above parties. Each function is defined in detail in the rmOAIS, section 4.
Ingest: The acceptance of Submission Information Packages (SIPs) from Producers, performing quality assurance and creating a valid Archival Information Package. Submission receipts and similar administrative tasks such as logging the time of arrival and so on may also be carried out. Descriptive Information will be extracted from the data and also made ready for storage in the archive.
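The Ingest steps above (quality assurance, a submission receipt, AIP creation and extraction of Descriptive Information) can be sketched as a single function. This is a minimal sketch under stated assumptions: the dictionary layout, the `ACCEPTED_FORMATS` set standing in for the terms of a Submission Agreement, and the use of SHA-256 for fixity are all illustrative choices, not requirements of the rmOAIS.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical formats permitted by a Submission Agreement.
ACCEPTED_FORMATS = {"text/csv", "application/pdf"}


def ingest(sip):
    """Turn a SIP (a plain dict here) into (receipt, aip, descriptive_info)."""
    # Quality assurance: reject content the agreement does not cover.
    if sip["format"] not in ACCEPTED_FORMATS:
        raise ValueError(f"format not covered by agreement: {sip['format']}")
    if not sip["data"]:
        raise ValueError("empty data object")

    # Administrative task: log the time of arrival in a receipt.
    receipt = {
        "sip_id": sip["id"],
        "received_at": datetime.now(timezone.utc).isoformat(),
    }

    # Create the AIP, recording a digest for later integrity checks.
    aip = {
        "aip_id": f"aip-{sip['id']}",
        "data": sip["data"],
        "format": sip["format"],
        "fixity": {
            "algorithm": "sha256",
            "digest": hashlib.sha256(sip["data"]).hexdigest(),
        },
    }

    # Extract Descriptive Information for the archive's catalogue.
    descriptive_info = {
        "aip_id": aip["aip_id"],
        "title": sip["title"],
        "keywords": list(sip["keywords"]),
    }
    return receipt, aip, descriptive_info


receipt, aip, descriptive_info = ingest({
    "id": "0001",
    "format": "text/csv",
    "data": b"station,temp_c\nA1,12.5\n",
    "title": "River temperature 2001",
    "keywords": ["river", "temperature"],
})
print(aip["aip_id"], descriptive_info["keywords"])
```

Returning the Descriptive Information separately from the AIP reflects the split between what goes to Archival Storage and what goes to Data Management.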
Archival Storage: AIPs received from Ingest are submitted to permanent storage. The storage hierarchy is maintained and error checking is carried out. Also, Disaster Recovery is of particular concern here, e.g. ensuring sufficient backup of the archive and adhering to all accessibility commitments.
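One common way to carry out the error checking mentioned above is to store a checksum with each AIP and recompute it periodically. The sketch below assumes an in-memory dictionary as the archive and SHA-256 as the digest; both are illustrative stand-ins, and the function names are hypothetical.

```python
import hashlib


def store(archive, aip_id, data):
    """Store an AIP's bytes together with a digest computed at write time."""
    archive[aip_id] = {"data": data, "sha256": hashlib.sha256(data).hexdigest()}


def find_corrupted(archive):
    """Return the ids of AIPs whose current bytes no longer match their digest."""
    return [
        aip_id
        for aip_id, record in archive.items()
        if hashlib.sha256(record["data"]).hexdigest() != record["sha256"]
    ]


archive = {}
store(archive, "aip-1", b"first object")
store(archive, "aip-2", b"second object")

# Simulate silent corruption of one stored object.
archive["aip-2"]["data"] = b"second 0bject"

print(find_corrupted(archive))  # → ['aip-2']
```

A failed check of this kind would normally trigger recovery from a backup copy, which is where the Disaster Recovery arrangements come in.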
Data Management: Population and maintenance of Descriptive Information and administrative data used to manage the archive. The database itself must be suitably maintained and updated. Any data queries are also handled, results are output and search/result records are kept.
Administration: Obtaining suitable Submission Agreements from Producers and managing the archive's hardware and software. Archive operations should be monitored, standards and policies should be adhered to, and improvements and updates made where necessary. Customer support is also an administrative consideration.
Preservation Planning: Monitoring the OAIS and providing recommendations to ensure that information stored in the OAIS remains accessible in the long term. The contents of the archive are evaluated and updated as required, for example as technology becomes obsolete or as the Knowledge Base and expectations of the Designated Community alter.
Access: Handling requests from Consumers and applying access limitations for protected information, generating particular DIP responses and issuing them to Consumers as appropriate.
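The Access function above can be sketched as a check on access limitations followed by construction of a DIP. This is a minimal sketch: the `restricted_to` field, the `AccessDenied` exception and the shape of the DIP are assumptions made for illustration and are not prescribed by the rmOAIS.

```python
class AccessDenied(Exception):
    """Raised when a Consumer may not receive the requested AIP."""


def generate_dip(aip, consumer_id, order_id):
    """Build a DIP for one AIP, or raise AccessDenied.

    `restricted_to` is a hypothetical field: a set of Consumer ids allowed
    to receive the AIP, or None when the AIP is unrestricted.
    """
    restricted_to = aip.get("restricted_to")
    if restricted_to is not None and consumer_id not in restricted_to:
        raise AccessDenied(f"{consumer_id} may not access {aip['aip_id']}")

    # The DIP carries the data plus what the Consumer needs to interpret it.
    return {
        "order_id": order_id,
        "source_aip": aip["aip_id"],
        "data": aip["data"],
        "representation_info": aip["representation_info"],
    }


open_aip = {
    "aip_id": "aip-1",
    "data": b"station,temp_c\nA1,12.5\n",
    "representation_info": "UTF-8 CSV",
}
protected_aip = dict(open_aip, aip_id="aip-2", restricted_to={"alice"})

dip = generate_dip(open_aip, "bob", "order-1")
print(dip["source_aip"])

try:
    generate_dip(protected_aip, "bob", "order-2")
    denied = False
except AccessDenied:
    denied = True
print(denied)
```

Checking the limitation before any data is copied into the DIP keeps protected information from leaving the archive by accident.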
This concludes the brief outline of the parties involved in OAIS infrastructure, and the functions performed between them. For further information, please refer to the rmOAIS document.