
Major Classes

Brian "Moses" Hall edited this page Nov 3, 2021 · 3 revisions

Shipment

(lib/shipment.rb) Responsible for the part of the filesystem occupied by a "shipment" of works. Inside the shipment there is one directory per object (i.e., volume). By default each object has one level of hierarchy: a single directory named for the object's barcode. Subclasses like DLXSShipment use a nested hierarchy and provide a facility to translate in both directions between filesystem subpath and object identifier (objid).
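
The two-way translation between subpath and objid might look roughly like the sketch below. The class names and the dot-separated objid convention are assumptions made for illustration only; they are not taken from lib/shipment.rb or DLXSShipment.

```ruby
# Hypothetical sketch; not the project's Shipment or DLXSShipment code.
class FlatLayoutSketch
  # Default layout: the directory name is the objid (barcode) itself.
  def objid_to_path(objid)
    objid
  end

  def path_to_objid(path)
    path
  end
end

class NestedLayoutSketch < FlatLayoutSketch
  # Assumed convention: each dot in the objid becomes a directory level.
  def objid_to_path(objid)
    File.join(*objid.split("."))
  end

  # Inverse of objid_to_path: directory levels are joined back with dots.
  def path_to_objid(path)
    path.split(File::SEPARATOR).join(".")
  end
end
```

Whatever the real convention is, the useful property is that the two methods are inverses, so a stage can walk the filesystem and still report results by objid.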

The class is responsible for creating and populating a source/ directory with an original copy of the shipment's contents. It can also create temporary directories on behalf of stages that need a place for file conversion intermediates.

shipment.metadata is a Hash mainly used for recording fixity data.
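
As a rough illustration of what recording fixity data in a Hash can involve, the sketch below maps each file's relative path to a SHA-256 digest and compares two such snapshots. The helper names and the choice of SHA-256 are assumptions; the actual layout of shipment.metadata is not described here.

```ruby
require "digest"

# Hypothetical helper: map each regular file under dir (by relative path)
# to its SHA-256 hex digest. Directories are skipped.
def checksums_for(dir)
  Dir.glob("**/*", base: dir).sort.each_with_object({}) do |rel, sums|
    full = File.join(dir, rel)
    next unless File.file?(full)
    sums[rel] = Digest::SHA256.file(full).hexdigest
  end
end

# Hypothetical helper: relative paths that were added, removed, or modified
# between two snapshots produced by checksums_for.
def changed_files(old_sums, new_sums)
  (old_sums.keys | new_sums.keys).reject { |rel| old_sums[rel] == new_sums[rel] }.sort
end
```

Storing the first snapshot and comparing it with a later one is enough to tell which files have been touched since the previous run.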

Processor

(lib/processor.rb) This class runs the show for a single shipment: it creates a Shipment object for that shipment and tells it to do the necessary setup. It then creates the stages indicated by its configuration file(s) and command-line options. Before running the stages, it constructs an Agenda object, which encapsulates the stages to run and the objids to process. It then passes the relevant part of the agenda to each stage in turn.
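
The loop that hands objids to each stage in turn could be sketched as follows. This is a simplified stand-in, not the Processor implementation: stages are modeled as plain callables that return the objids that failed, and run_stages is a hypothetical name.

```ruby
# Hypothetical sketch: stages is a Hash of name => callable. Each callable
# receives the objids still eligible and returns the ones that failed.
# Returns a Hash recording which objids each stage was given.
def run_stages(stages, objids)
  remaining = objids.dup
  handed_out = {}
  stages.each do |name, stage|
    handed_out[name] = remaining.dup
    failed = stage.call(remaining)
    remaining -= failed # failed objids are not passed to later stages
  end
  handed_out
end
```

In this model an objid that fails in one stage simply never reaches the next one, while the other objids carry on.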

Stage

(lib/stage.rb with subclasses in lib/stage/) A discrete unit of work in the processing workflow. Stage is meant to be subclassed. The main entry point is the run! method, which does setup and in turn calls the run method that subclasses are required to implement. Stage provides a Shipment (and various wrappers around Shipment methods) to subclasses, as well as a ProgressBar instance.
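
The run!/run split is a template-method arrangement, which can be sketched as below. The class names, constructor arguments, and the errors list are assumptions for illustration; the real Stage class also supplies a ProgressBar and Shipment wrappers that are omitted here.

```ruby
# Hypothetical sketch of the run!/run split; not the project's Stage class.
class StageSketch
  attr_reader :shipment, :errors

  def initialize(shipment)
    @shipment = shipment
    @errors = []
  end

  # Public entry point: shared setup first, then the subclass-specific work.
  def run!(objids)
    @errors.clear
    run(objids)
    self
  end

  # Subclasses are expected to override this method.
  def run(_objids)
    raise NotImplementedError, "#{self.class} must implement run"
  end
end

# A trivial subclass that only records which objids it was asked to handle.
class RecordingStageSketch < StageSketch
  attr_reader :seen

  def run(objids)
    @seen = objids.dup
  end
end
```

Callers only ever invoke run!, so setup cannot be skipped by a subclass that forgets to do it.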

Config

(lib/config.rb) Merges command-line options and YAML config files into a single data structure and exposes it through a Hash-style interface. The Ruby source file has an in-depth description of the class, which is not reproduced here.
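
A minimal sketch of the merge-and-lookup idea follows. It assumes that command-line options take precedence over file values and that keys are normalized to symbols; neither detail is confirmed here, so lib/config.rb remains the authority on the actual rules.

```ruby
require "yaml"

# Hypothetical sketch; not the project's Config class.
class ConfigSketch
  # yaml_text: contents of a YAML config file
  # options:   Hash of command-line options keyed by symbol
  def initialize(yaml_text, options = {})
    file_settings = YAML.safe_load(yaml_text) || {}
    # Assumption: command-line options override values from the file.
    @data = file_settings.transform_keys(&:to_sym).merge(options)
  end

  # Hash-style read access; accepts string or symbol keys.
  def [](key)
    @data[key.to_sym]
  end
end
```

For example, `ConfigSketch.new("verbose: false\n", { verbose: true })[:verbose]` returns `true` under the override assumption above.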

Agenda

(lib/agenda.rb) A list of objids to be processed, as well as a list of stages to be run. Created by Processor and updated each time a stage is run. If an error is encountered for a particular objid, that objid is removed from the agenda for subsequent stages. Instances are typically initialized with "everything"; however, when a previous run detected an error (and recorded it in status.json), only objids that have had fixity changes in the source/ directory (that is, some problem has presumably been corrected) are added to the agenda for reprocessing. A failed shipment with no updated files will produce an empty agenda and do essentially nothing.
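
The choice of which objids start on the agenda can be sketched as a single function. The name initial_objids and its keyword arguments are hypothetical; in the real code the previous error state comes from status.json and the changed objids from fixity comparison of source/.

```ruby
# Hypothetical sketch of agenda initialization.
# all_objids:     every objid in the shipment
# previous_error: true if an earlier run recorded an error
# changed_objids: objids whose source/ files have changed since that run
def initial_objids(all_objids, previous_error:, changed_objids:)
  # A clean shipment gets "everything".
  return all_objids.dup unless previous_error

  # After a failure, only objids with changed files are reprocessed.
  all_objids & changed_objids
end
```

With a previous error and no changed objids, the result is an empty list, which matches the "do essentially nothing" case described above.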

TIFF and JP2

(lib/tiff.rb and lib/jp2.rb) Wrappers around tiffinfo and tiffset for TIFF files, and exiftool for JP2 files, for querying and writing image metadata. There is currently no metadata-writing capability for JP2, although exiftool should support it in some fashion if needed.
