Ingest takes an AFF image and adds it to the Drives Database.
Images start on the Imaging System.
- A task sends the images to the image archive (currently /usr/affs/sync.sh on the acquisition machine, which sends them to the appropriate directory on T and DOMEX).
- On the archive server the image is encrypted if it is not already encrypted.
- Get (Drive SN, YearOfImaging) from the AFF file. This is the drive identity tuple (DIT).
- See if the DIT is already in the database; if so, report a failure.
- Create a new database entry.
- Using afxml, import each of the metadata fields into the drives table. Right now this is done with the domex/ingest_xml.py program.
- The script ingest/ingest.py reads each AFF file that's been added.
- For each file that's not in the database, it extracts the metadata from the AFF file and adds it to the project database.
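
The DIT check and the creation of a new entry could be sketched roughly as follows. This is only an illustration and not the code in ingest/ingest.py: it assumes an SQLite database, and the table name `drives`, its columns, and the `DuplicateDrive` exception are hypothetical. The idea is to let a UNIQUE constraint on the DIT do the "already in the database" check, so that two ingest runs cannot both insert the same drive.

```python
import sqlite3


class DuplicateDrive(Exception):
    """Raised when the drive identity tuple (DIT) is already in the database."""


def create_schema(conn):
    # Hypothetical schema: the UNIQUE constraint allows one row per DIT.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS drives (
               drive_id INTEGER PRIMARY KEY,
               drive_sn TEXT NOT NULL,
               year_of_imaging INTEGER NOT NULL,
               UNIQUE (drive_sn, year_of_imaging)
           )"""
    )


def ingest_dit(conn, drive_sn, year_of_imaging):
    """Insert a new row for the DIT and return its drive_id."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO drives (drive_sn, year_of_imaging) VALUES (?, ?)",
                (drive_sn, year_of_imaging),
            )
    except sqlite3.IntegrityError as exc:
        # The DIT is already present: report a failure to the caller.
        raise DuplicateDrive((drive_sn, year_of_imaging)) from exc
    return cur.lastrowid
```

The remaining metadata fields would then be filled in for the returned drive_id by whatever does the afxml import.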
- Background task which:
- Figures out which files in the database have no matching WALK file.
- Locks the file (somehow)
- Starts walking the file.
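
The locking mechanism is still open. One possible approach, shown here only as a sketch, is a lock file created atomically next to the AFF file. The `.lock` suffix and the function names are assumptions, and exclusive creation may not be reliable on every network file system, so this would need checking before use on a cluster.

```python
import os
import socket


def try_lock(aff_path):
    """Try to claim aff_path; return True if this process got the lock."""
    lock_path = aff_path + ".lock"  # hypothetical naming convention
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # caller can succeed in creating the lock file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        # Record who holds the lock, to help track down stale locks.
        f.write(f"{socket.gethostname()} {os.getpid()}\n")
    return True


def unlock(aff_path):
    """Release the lock taken by try_lock."""
    os.remove(aff_path + ".lock")
```

A walker would call try_lock() before starting and unlock() when the walk finishes; a lock left behind by a crashed walker has to be cleaned up separately.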
The background task is a script designed to run on a single machine or on a cluster. It:
- Gets a list of all the AFF files.
- Gets a list of all the XML files and each file's version number (the version of fiwalk that produced it).
- Detects whether there are any dead files (cases where fiwalk crashed).
- Finds an AFF file for which there is no matching XML file.
- We might want to prioritize, so that it walks the unwalked files first.
- If all are walked, it re-walks ones that were walked with older versions of the walk program.
- Locks it (somehow)
- Starts walking.
- We need a work queue database table. It should be lockable to say "this is in process" and should record the machine that is currently running the queue.
- Copyright/license restrictions need to be noted on ingest.
- It would be nice to have default rules, for example: ingest on this machine during this time frame is covered by this copyright.
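
The walk selection and the work queue could look something like the sketch below. It is a set of assumptions rather than an existing implementation: fiwalk versions are represented as tuples of integers, the table `work_queue` and its columns are hypothetical, and SQLite stands in for the project database. Detection of dead files is not covered.

```python
import sqlite3


def pick_next(aff_files, walked, current_version):
    """Return the next AFF file to walk, or None if nothing needs walking.

    aff_files: iterable of AFF file names.
    walked: dict mapping AFF file name -> fiwalk version (tuple of ints)
            that produced its XML file.
    """
    aff_files = list(aff_files)
    # First priority: files with no XML file at all.
    unwalked = sorted(f for f in aff_files if f not in walked)
    if unwalked:
        return unwalked[0]
    # Otherwise re-walk files made by an older fiwalk, oldest version first.
    stale = [f for f in aff_files if walked[f] < current_version]
    if stale:
        return min(stale, key=lambda f: (walked[f], f))
    return None


def create_queue(conn):
    # Hypothetical table: one row per file that is currently being walked.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS work_queue (
               aff_file TEXT PRIMARY KEY,
               machine TEXT NOT NULL,
               state TEXT NOT NULL
           )"""
    )


def claim(conn, aff_file, machine):
    """Mark aff_file as in process on machine; False if already claimed."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO work_queue (aff_file, machine, state) "
                "VALUES (?, ?, 'in process')",
                (aff_file, machine),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```

Because the primary key is the file name, the database itself decides which machine wins when two of them try to claim the same file.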
Every drive has a DriveID. This is an integer that we use to track the drive in the database. Early AFF files were named in the form driveid.AFF, but this is not a requirement.
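
Since the driveid.AFF naming is not required, code that wants a DriveID from a file name has to treat it as optional. A minimal sketch, with a hypothetical function name and the assumption that the early names were purely numeric:

```python
import os
import re

# Matches early-style names such as "1042.AFF" (assumed to be all digits).
_DRIVEID_NAME = re.compile(r"^(\d+)\.aff$", re.IGNORECASE)


def drive_id_from_filename(path):
    """Return the DriveID encoded in an early-style file name, else None."""
    m = _DRIVEID_NAME.match(os.path.basename(path))
    return int(m.group(1)) if m else None
```

When this returns None, the DriveID has to come from the database.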