‘metadata’ sub-directory

The metadata sub-directory is the central directory for storing metadata and digital content generated by Goobi. For each Goobi process, it contains a directory with the name of the process ID. Directories for individual Goobi processes are structured as follows:

1234/
1234/meta.xml
1234/images/
1234/ocr/
1234/import/
1234/validation/
1234/taskmanager/
1234/thumbs/

Depending on the configuration, the central metadata file meta.xml may be accompanied by other back-up files, e.g. meta.xml.1, meta.xml.2, meta.xml.3

Images

Within the workflow, the images directory is accessible for a limited period to various users. Its structure is shown below:

1234/images/abc_media/
1234/images/abc_jpg/
1234/images/abc_source/
1234/images/master_abc_media/

When you are working with digital content, the most important directory is the one ending in _media. The directory beginning with master_ is normally used to store all master images in unmanipulated status. The other directories are intended to be freely accessible within the workflow and can be added to whenever necessary. Both the directory ending in source and the directory ending in _media are copied when exporting to the presentation system (e.g. intranda viewer).

OCR

The images directory may be accompanied by an ocr directory. This contains all the OCR results that are generated within the workflow and added to the process. There is a separate directory for each format of the OCR results.

1234/ocr/abc_alto/
1234/ocr/abc_doc/
1234/ocr/abc_pdf/
1234/ocr/abc_wc/
1234/ocr/abc_xml/

Thumbnails

Smaller versions of the images in images can be saved in the thumbs folder, which Goobi uses to display the images in low resolution. This considerably increases the speed of image display for larger images. For each subfolder of images, one or more subfolders can be created in thumbs with the same name as the images subfolder, extended by an additional underscore _ and a size specification in pixels. This size specification must correspond to the maximum height and width of the images in the respective subfolder. The file names of the images in the thumbs subfolder must correspond to those of the images in the corresponding images subfolder, but with the file extension .jpg.

1234/thumbs/abc_media_800/
1234/thumbs/abc_media_3000/
1234/thumbs/master_abc_media_800/
1234/thumbs/master_abc_media_3000/

If there are matching images in thumbs for an image file in images, these are automatically used in Goobi to display thumbnails and zoomable images when zoomed out.

Validation

The validation directory is used in cases where automatic validation (e.g. of the images) is performed on the Goobi server and in the workflows.

A sub-folder is created within this directory each time validation is performed. This makes it possible to retain older validation results. As illustrated below, the name generated for each sub-folder contains the date, time and type of validation.

1234/validation/2012-11-20_11-20-01_jpylyzer/
1234/validation/2012-11-20_12-02-13_jhove/
1234/validation/2012-11-23_08-12-56_jpylyzer/

If Goobi is being used with the intranda TaskManager, a taskmanager directory will also be found within the folder. This is where TaskManager stores temporary data to perform tasks that require lengthy processing. Depending on the configuration, it is also used to permanently store and maintain all the ticket and template files created each time the TaskManager is called. The directory is made up as follows:

1234/taskmanager/2012-11-23_08-12-56_jp2validate/
1234/taskmanager/2012-11-25_14-38-15_iii-create_jpeg/

Import

Again depending on the installation, for each Goobi process there is an import directory, which is used by import plug-ins to store original source files for the process in question. Catalogue dataset files and other source files that have been manually read in and imported can be stored here and used in scripts as part of workflow processing. The folder structure could be as follows:\

1234/import/abc.mrc
1234/import/abc.original.pdf
1234/import/eod/

Zuletzt aktualisiert