One of the first problems in archiving data, once you have decided what to archive, is managing the archival process itself.
Several management software solutions are available for this. One highly recommended option is library (biblio) software such as OpenBiblio or PMB (PhpMyBibli), slightly modified to track the archival process and index tapes.
Another idea is to run Gazelle as a private torrent tracker. With minimal modifications we could use it to track requests, torrent the data privately, and archive it as a group. Gazelle is the same archive backend that WhatCD uses. It includes a forum, supports private (invite-only) registration, and has a system for seed requests and for upload requests for torrents that don’t yet exist. One stipulation of running Gazelle is that you cannot share your torrent file as-is: because it is a private, authenticated tracker, each user connects and torrents with a private hash key, so the file must be edited before it is shared. We would also have to decide how concerned we are about ratio. If ratio is not a concern, it should be okay to allow other trackers, and possibly DHT, to be added to the torrent files.
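As a rough illustration of the edit needed before a torrent file can be shared, the sketch below replaces the per-user key in an announce URL with a placeholder. It assumes the key is the path segment directly before `announce` (for example `https://tracker.example/<key>/announce`). That layout, the `strip_passkey` name, and the `PASSKEY` placeholder are assumptions for illustration, not something confirmed for our setup. Rewriting the announce field inside the `.torrent` file itself would additionally need a bencode library, which is not shown here.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_passkey(announce_url: str, placeholder: str = "PASSKEY") -> str:
    """Replace the per-user key in an announce URL with a placeholder.

    Assumption: the key is the path segment directly before "announce".
    URLs that do not match that shape are returned unchanged.
    """
    parts = urlsplit(announce_url)
    segments = parts.path.split("/")
    # Only rewrite when there is a non-empty segment right before "announce".
    if len(segments) >= 2 and segments[-1] == "announce" and segments[-2]:
        segments[-2] = placeholder
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    # Hypothetical tracker host and key, for illustration only.
    print(strip_passkey("https://tracker.example/abc123/announce"))
```

The recipient would then substitute their own key for the placeholder before loading the torrent.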
The proposed system comprises the following points:
- The present archival content is catalogued. Everything already on tapes and archived is listed to avoid duplicates.
- Other members who are running LTO or other archives are integrated into the “Library” system as external libraries, from which one can request archived content for one's own library.
- The items in the queue to be archived are made available as an index, either listed as if held in another library or marked as “In curation”.
- Items that are in curation are acquired sequentially by some means, either in order of a designated priority or in order of submission.
- All items should have a content hash generated and sit inside a containing folder that also holds a torrent file, which serves as both checksum and metadata. We need a system of scripts for acquiring metadata for movies and anime. See what to archive: Discovery.
- For the purpose of this archive we will likely treat tapes as shelves. Once a book is on a shelf, that is, once an item is written to tape, it counts as successfully archived.
- Requests for archived content will be handled on a first-come, first-served basis, at the discretion of the archivist. That is, releasing or retrieving content for archive teams at other archives may take priority, with requests from the people who discovered the content likely ranking lower, because redundancy across systems is paramount in the event of a disaster.
- A system for contacting other archivists privately should be put in place in the event that someone would like to request content stored on tapes in LTO format. The tapes would be