Pandora Archive

PANDORA Digital Archiving System (PANDAS)

The PANDORA Digital Archiving System, known as PANDAS, was developed by the National Library following an unsuccessful attempt to find an off-the-shelf system (or systems) that would provide integrated, web-based management of web archiving.

The need for such a system became evident as the scale of the Library's archiving activity increased, and as the Library sought the best possible efficiencies in building a collaborative, selective and quality-assessed web archive. The system also had to enable PANDORA partners to contribute to the Archive from various geographic locations.

PANDAS was first implemented in June 2001, and a second, much enhanced version was released in August 2002.

Workflows

PANDAS was designed to support the workflows defined by the staff of the National Library's Digital Archiving Section, and also adopted by the other PANDORA partners. These workflows include:

identifying, selecting and registering candidate titles;
seeking and recording permission to archive;
setting harvest regimes;
gathering (harvesting) files;
undertaking quality assurance checking;
initiating archiving processes; and
organising access, display and discovery routes to, and metadata for, the archived resources.

Functions

PANDAS supports these workflows by means of the following functions:

the management of administrative metadata about titles that have been either selected for archiving, rejected, or are being monitored pending a decision;
the management of access restrictions;
the scheduling and initiation of the harvesting of titles selected for archiving;
the management of the quality checking and assurance process and associated problem fixing;
the preparation and organisation of archived instances for public display through title entry pages, and title and subject listings; and
the provision of defined management reports.

Manuals

For further information on how PANDAS supports these functions, refer to the PANDAS Manual (complete) and the PANDAS Quick Start Guide.

Persistent identifiers

PANDAS assigns a system-generated running number to each title when it is registered. This number becomes part of the persistent URL for each archived title's title entry page. The PANDAS persistent URL is generated according to a schema developed by the National Library for its digital collections.

The persistent URL, or persistent identifier, is recorded on the title entry page for every title.

As well as providing a persistent identifier at the title level, PANDAS also creates one for each of the title's component parts, for instance, for an article within an issue of an electronic journal, or for an image or a table within a web site. The persistent identifier for any part of a title that a researcher may wish to cite can be ascertained by using the citation service. This is available towards the bottom of every title entry page, just under the persistent identifier for the title.
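
The title-level numbering described in this section can be illustrated with a minimal Python sketch. It is not the PANDAS implementation: the `TitleRegistry` class, its method names and the `BASE_URL` value are hypothetical, and the Library's actual persistent URL schema is not reproduced here. The sketch only shows the general idea of assigning a running number at registration and embedding it in a stable URL.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator

# Hypothetical base URL for illustration only; not the Library's actual schema.
BASE_URL = "https://example.org/archive/"


@dataclass
class TitleRegistry:
    """Toy registry that hands out running numbers at registration time."""

    _numbers: Iterator[int] = field(default_factory=lambda: count(1))
    titles: Dict[int, str] = field(default_factory=dict)

    def register(self, name: str) -> int:
        """Register a title and return its system-generated running number."""
        number = next(self._numbers)
        self.titles[number] = name
        return number

    def persistent_url(self, number: int) -> str:
        """Build the persistent URL for a registered title's entry page."""
        if number not in self.titles:
            raise KeyError(f"no title registered under number {number}")
        return f"{BASE_URL}{number}"


registry = TitleRegistry()
first = registry.register("Example Journal")
second = registry.register("Example Web Site")
print(registry.persistent_url(first))
print(registry.persistent_url(second))
```

Because the number is assigned once and never reused in this sketch, the resulting URL stays stable even if the title's name changes later, which is the property a persistent identifier is meant to provide.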

Technical documents

For more detailed information about PANDAS see:

Paul Koerbin's staff paper after September
PANDORA: Technical details
Roadmap for future development of the PANDAS software system

PANDAS Evaluation System

Following expressions of interest from other institutions planning to set up web archiving programs, the Library created the PANDAS Evaluation System to enable access to the software by other agencies for assessment purposes.

To request access to the PANDAS Evaluation System, contact Margaret Phillips, Director Digital Archiving, mphillips@nla.gov.au


Last updated 12 August 2004