2.1.3. ‘metadata’ sub-directory

The metadata sub-directory is the central directory for storing metadata and digital content generated by Goobi. For each Goobi process, it contains a directory named after the process ID. Directories for individual Goobi processes are structured as follows:

1234/
1234/meta.xml
1234/images/
1234/ocr/
1234/import/
1234/validation/
1234/taskmanager/
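
The layout above can be checked from a script. The following Python sketch reports which of the listed entries exist for a given process. It assumes that the path to the metadata sub-directory is known and passed in by the caller; the function name describe_process_dir is hypothetical and not part of Goobi.

```python
from pathlib import Path

# Sub-directories named in the listing above; depending on the
# installation, not all of them have to exist.
SUBDIRS = ("images", "ocr", "import", "validation", "taskmanager")


def describe_process_dir(metadata_root, process_id):
    """Report which of the expected entries exist for one process.

    metadata_root is the path to the metadata sub-directory (assumed to
    be supplied by the caller), process_id is the Goobi process ID.
    """
    process_dir = Path(metadata_root) / str(process_id)
    report = {"meta.xml": (process_dir / "meta.xml").is_file()}
    for name in SUBDIRS:
        report[name + "/"] = (process_dir / name).is_dir()
    return report
```

Called as describe_process_dir(metadata_root, 1234), it returns a dictionary that maps each expected entry to True or False.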

Depending on the configuration, the central metadata file meta.xml may be accompanied by backup files, e.g. meta.xml.1, meta.xml.2, meta.xml.3.
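
A script that needs to know which backups exist for a process could collect them as sketched below. The sketch assumes that backups always follow the meta.xml.<number> pattern shown in the example; it makes no assumption about which number is the most recent backup. The function name list_meta_backups is hypothetical.

```python
import re
from pathlib import Path

# Assumption: backup files are named meta.xml.<number>, as in the example.
BACKUP_PATTERN = re.compile(r"^meta\.xml\.(\d+)$")


def list_meta_backups(process_dir):
    """Return the backup files of meta.xml, ordered by numeric suffix."""
    found = []
    for entry in Path(process_dir).iterdir():
        match = BACKUP_PATTERN.match(entry.name)
        if match and entry.is_file():
            found.append((int(match.group(1)), entry))
    # Sort numerically so that meta.xml.10 comes after meta.xml.2.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]
```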

Within the workflow, the images directory is accessible to various users for a limited period. Its structure is shown below:

1234/images/abc_media/
1234/images/abc_jpg/
1234/images/abc_source/
1234/images/master_abc_media/

When you are working with digital content, the most important directory is the one ending in _media. The directory beginning with master_ is normally used to store all master images in their unmodified state. The other directories are intended to be freely accessible within the workflow and can be added to whenever necessary. Both the directory ending in _source and the directory ending in _media are copied when exporting to the presentation system (e.g. intranda viewer).
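
The naming conventions described above can be applied in a script to tell the image directories apart. The following Python sketch groups the sub-directories of an images directory by the master_ prefix and the _media and _source suffixes. The function name classify_image_dirs and the group labels are hypothetical; checking for master_ first is a choice made here so that master_abc_media is not also counted as a _media directory.

```python
from pathlib import Path


def classify_image_dirs(images_dir):
    """Group the sub-directories of an images directory by naming convention."""
    result = {"master": [], "media": [], "source": [], "other": []}
    for entry in sorted(Path(images_dir).iterdir()):
        if not entry.is_dir():
            continue
        name = entry.name
        # The master_ prefix is tested first because a master directory
        # such as master_abc_media also ends in _media.
        if name.startswith("master_"):
            result["master"].append(name)
        elif name.endswith("_media"):
            result["media"].append(name)
        elif name.endswith("_source"):
            result["source"].append(name)
        else:
            result["other"].append(name)
    return result
```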

The images directory may be accompanied by an ocr directory. This contains all the OCR results that are generated within the workflow and added to the process. There is a separate directory for each OCR result format:

1234/ocr/abc_alto/
1234/ocr/abc_doc/
1234/ocr/abc_pdf/
1234/ocr/abc_wc/
1234/ocr/abc_xml/

The validation directory is used in cases where automatic validation (e.g. of the images) is performed on the Goobi server and in the workflows.

A sub-folder is created within this directory each time validation is performed. This makes it possible to retain older validation results. As illustrated below, the name generated for each sub-folder contains the date, time and type of validation.

1234/validation/2012-11-20_11-20-01_jpylyzer/
1234/validation/2012-11-20_12-02-13_jhove/
1234/validation/2012-11-23_08-12-56_jpylyzer/
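
Because each sub-folder name carries the date, time and type of validation, a script can recover these values from the name. The following Python sketch assumes the pattern YYYY-MM-DD_HH-MM-SS_type seen in the examples above; the function name parse_validation_dir_name is hypothetical.

```python
import re
from datetime import datetime

# Assumption: names follow <date>_<time>_<type> as in the examples,
# e.g. 2012-11-20_11-20-01_jpylyzer.
NAME_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})_(.+)$")


def parse_validation_dir_name(name):
    """Split a validation sub-folder name into (timestamp, validation type)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("unexpected validation folder name: " + name)
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M-%S")
    return timestamp, match.group(2)
```

Sorting the parsed timestamps then gives the validation runs in chronological order, which is one way to pick out the most recent result of a given type.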

If Goobi is being used with the intranda TaskManager, a taskmanager directory will also be found within the process directory. This is where TaskManager stores temporary data for tasks that require lengthy processing. Depending on the configuration, it is also used to permanently store and maintain all the ticket and template files created each time the TaskManager is called. The directory is structured as follows:

1234/taskmanager/2012-11-23_08-12-56_jp2validate/
1234/taskmanager/2012-11-25_14-38-15_iii-create_jpeg/

Depending on the installation, each Goobi process may also have an import directory, which is used by import plug-ins to store the original source files for the process in question. Catalogue dataset files and other source files that have been imported manually can be stored here and used in scripts as part of workflow processing. The folder structure could be as follows:

1234/import/abc.mrc
1234/import/abc.original.pdf
1234/import/eod/
