‘metadata’ sub-directory

The metadata directory is the central directory for storing Goobi metadata and digital content. Within it, each Goobi process has its own directory, named after the Goobi process ID. The directories for the individual Goobi processes are organised as follows:

└── 1234
    ├── export
    │   └── ...
    ├── images
    │   └── ...
    ├── import
    │   └── ...
    ├── ocr
    │   └── ...
    ├── taskmanager
    │   └── ...
    ├── thumbs
    │   └── ...
    ├── validation
    │   └── ...
    └── meta.xml

In addition to the central metadata file meta.xml, the directory may contain backup copies of it, such as meta.xml-2021-11-28-175618166 or meta.xml.2024-05-08-135432256, depending on the configuration.
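
To sort or prune such backups in a script, the timestamp can be read from the file name. The following is a minimal sketch, not part of Goobi itself. It assumes that the suffix after meta.xml is a date followed by hours, minutes, seconds and milliseconds, as the two example names suggest; the function name backup_timestamp is hypothetical.

```python
import re
from datetime import datetime

# Assumption: backups are named "meta.xml" + "-" or "." + "YYYY-MM-DD-HHMMSSmmm",
# as in the two example names. Other configurations may use a different scheme.
BACKUP_PATTERN = re.compile(r"^meta\.xml[.-](\d{4}-\d{2}-\d{2}-\d{9})$")


def backup_timestamp(file_name):
    """Return the datetime encoded in a backup file name, or None if it does not match."""
    match = BACKUP_PATTERN.match(file_name)
    if match is None:
        return None
    # %f accepts one to six digits, so the three trailing digits are read as milliseconds.
    return datetime.strptime(match.group(1), "%Y-%m-%d-%H%M%S%f")
```

Sorting the file names by this value would give the backups in chronological order, regardless of whether a dash or a dot separates the suffix.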

Directory images

The images directory is temporarily made available to various users within the workflow and has the following structure:

└── 1234
    ├── images
    │   ├── abc_8765432_raw
    │   │   ├── 00000001.raw
    │   │   ├── 00000002.raw
    │   │   ├── 00000003.raw
    │   │   ├── 00000004.raw
    │   │   └── 00000005.raw
    │   ├── abc_8765432_master
    │   │   ├── 00000001.tif
    │   │   ├── 00000002.tif
    │   │   ├── 00000003.tif
    │   │   ├── 00000004.tif
    │   │   └── 00000005.tif
    │   ├── abc_8765432_media
    │   │   ├── 00000001.jpg
    │   │   ├── 00000002.jpg
    │   │   ├── 00000003.jpg
    │   │   ├── 00000004.jpg
    │   │   └── 00000005.jpg
    │   └── abc_8765432_source
    │       ├── source_files_1.pdf
    │       ├── source_files_2.xls
    │       └── source_files_3.wmv
    └── meta.xml

Subdirectory media

The most important folder for working with the content is the one that ends in _media. Files that are to be used for publication (e.g. in a Goobi viewer) are stored here. The files stored here are usually compressed derivatives that can be used for working with the digitised material and publishing it in very good quality.

Subdirectory master

The master files are located in the directory that ends with _master. These are usually the unaltered originals, such as those generated by scanners. They are typically uncompressed and not optimised in any other way (straightened, cropped or similar).

Subdirectory source

The directory whose name ends in _source holds source files that Goobi users should be able to upload or view. The files stored here are included in the standard Goobi workflow export. When exporting to the Goobi viewer, for example, this means that the data from the source folder is also exported to the Goobi viewer's hotfolder.

Further subdirectories

In addition to the directories described here, there may also be other subdirectories. The example above, for instance, includes a directory whose name ends in _raw, which stores RAW files from cameras or other data. Goobi users can also be given access to these folders. An explanation of how such additional directories are configured in Goobi workflow can be found here.
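
Because the role of each subdirectory is encoded in the part of its name after the last underscore (_media, _master, _source, _raw), a script can pick the right folder by that suffix. The sketch below illustrates the idea; the function names folder_role and find_folder are hypothetical and not part of any Goobi API.

```python
def folder_role(folder_name):
    """Return the part after the last underscore, e.g. 'media', or None if there is none."""
    _prefix, separator, suffix = folder_name.rpartition("_")
    return suffix if separator else None


def find_folder(folder_names, role):
    """Return the first folder name whose suffix equals the given role, or None."""
    for name in folder_names:
        if folder_role(name) == role:
            return name
    return None
```

For the example process, find_folder(["abc_8765432_raw", "abc_8765432_master", "abc_8765432_media"], "media") would select abc_8765432_media.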

OCR

In addition to the images directory, there may also be an ocr directory. This contains all the OCR results that were generated within the workflow and added to the process. Each format of OCR result has its own directory containing the corresponding files.

└── 1234
    └── ocr
        ├── abc_8765432_alto
        │   ├── 00000001.xml
        │   ├── 00000002.xml
        │   ├── 00000003.xml
        │   ├── 00000004.xml
        │   └── 00000005.xml
        ├── abc_8765432_doc
        │   ├── 00000001.doc
        │   ├── 00000002.doc
        │   ├── 00000003.doc
        │   ├── 00000004.doc
        │   └── 00000005.doc
        └── abc_8765432_pdf
            ├── 00000001.pdf
            ├── 00000002.pdf
            ├── 00000003.pdf
            ├── 00000004.pdf
            └── 00000005.pdf
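
The one-directory-per-format layout makes it straightforward to group OCR results by format. The following sketch assumes the file paths are given relative to the ocr directory and that the format is the part of the folder name after the last underscore, as in the example; group_ocr_by_format is a hypothetical helper name.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def group_ocr_by_format(relative_paths):
    """Group OCR result file names by the format suffix of their parent folder."""
    grouped = defaultdict(list)
    for raw_path in relative_paths:
        path = PurePosixPath(raw_path)
        # "abc_8765432_alto" -> "alto"
        fmt = path.parent.name.rpartition("_")[2]
        grouped[fmt].append(path.name)
    return dict(grouped)
```

The result maps each format to its files, which helps when checking that every page has a result in every format.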

Thumbnails

Smaller versions of the images in images can be saved in the thumbs folder, which Goobi uses to display the images in low resolution. This significantly increases the speed of image display for larger images. For each subfolder of images, one or more subfolders can be created in thumbs with the same name as the images subfolder, extended by an additional underscore _ and a size specification in pixels. This size specification must correspond to the maximum height and width of the images in the respective subfolder. The file names of the images in the thumbs subfolder must correspond to those of the images in the corresponding images subfolder, but with the file extension .jpg.

└── 1234
    └── thumbs
        ├── abc_8765432_master_800
        │   ├── 00000001.jpg
        │   ├── 00000002.jpg
        │   ├── 00000003.jpg
        │   ├── 00000004.jpg
        │   └── 00000005.jpg
        ├── abc_8765432_master_3000
        │   ├── 00000001.jpg
        │   ├── 00000002.jpg
        │   ├── 00000003.jpg
        │   ├── 00000004.jpg
        │   └── 00000005.jpg
        ├── abc_8765432_media_800
        │   ├── 00000001.jpg
        │   ├── 00000002.jpg
        │   ├── 00000003.jpg
        │   ├── 00000004.jpg
        │   └── 00000005.jpg
        └── abc_8765432_media_3000
            ├── 00000001.jpg
            ├── 00000002.jpg
            ├── 00000003.jpg
            ├── 00000004.jpg
            └── 00000005.jpg

If there are matching images in thumbs for an image file in images, then these are automatically used in Goobi to display thumbnails and zoomable images when zoomed out.
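
The naming rule for thumbnails can be expressed as a small path calculation. This is an illustrative sketch only: the function name thumbnail_path and its parameters are hypothetical, and it only derives the expected location without creating or resizing any image.

```python
from pathlib import PurePosixPath


def thumbnail_path(process_dir, image_folder, image_name, size):
    """Derive the expected thumbnail location for one image and one size in pixels."""
    # The thumbs subfolder is the images subfolder name plus "_" and the size.
    thumbs_folder = f"{image_folder}_{size}"
    # The thumbnail keeps the file name of the image but always ends in .jpg.
    thumb_name = PurePosixPath(image_name).with_suffix(".jpg").name
    return PurePosixPath(process_dir) / "thumbs" / thumbs_folder / thumb_name
```

For the master image 00000001.tif in process 1234 and a size of 800 pixels, this yields 1234/thumbs/abc_8765432_master_800/00000001.jpg, matching the tree above.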

Validation

If automatic validation (e.g. of images) is set up on the Goobi server and used in the workflows, the process folder also contains a validation directory.

Within this directory, a subfolder is created for each validation run so that older validation results can also be stored. The generated subfolders are always named in such a way that the folder name lists the date and time as well as the type of validation. This looks as follows, for example:

└── 1234
    └── validation
        ├── 2022-11-20_11-20-01_jpylyzer
        │   ├── 00000001.xml
        │   ├── 00000002.xml
        │   ├── 00000003.xml
        │   ├── 00000004.xml
        │   └── 00000005.xml
        ├── 2022-11-20_12-02-13_jhove
        │   ├── 00000001.xml
        │   ├── 00000002.xml
        │   ├── 00000003.xml
        │   ├── 00000004.xml
        │   └── 00000005.xml
        └── 2022-11-23_08-12-56_jpylyzer
            ├── 00000001.xml
            ├── 00000002.xml
            ├── 00000003.xml
            ├── 00000004.xml
            └── 00000005.xml
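
Since each run folder name carries the date, the time and the type of validation, a script can split it back into those parts, for example to find the most recent run of a given type. The sketch below assumes the pattern date_time_type seen in the example; parse_validation_folder and latest_run are hypothetical names.

```python
from datetime import datetime


def parse_validation_folder(folder_name):
    """Split a run folder name into (timestamp, validation type)."""
    # Split at the first two underscores only, so the type may itself contain underscores.
    date_part, time_part, validation_type = folder_name.split("_", 2)
    timestamp = datetime.strptime(f"{date_part}_{time_part}", "%Y-%m-%d_%H-%M-%S")
    return timestamp, validation_type


def latest_run(folder_names, validation_type):
    """Return the name of the newest run folder of the given type, or None."""
    matching = [n for n in folder_names if parse_validation_folder(n)[1] == validation_type]
    return max(matching, key=lambda n: parse_validation_folder(n)[0], default=None)
```

With the three folders from the example, latest_run(..., "jpylyzer") would return 2022-11-23_08-12-56_jpylyzer.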

TaskManager

If Goobi is used together with the intranda TaskManager, the process folder also contains a taskmanager directory. Within this directory, the TaskManager saves temporary data for the execution of long-running tasks. Depending on the configuration, all ticket and template files used for the individual TaskManager calls are also stored here permanently. The content of this directory could look like this, for example:

└── 1234
    └── taskmanager
        ├── 2022-11-23_08-12-56_jp2validate
        │   ├── 00000001.xml
        │   ├── 00000002.xml
        │   ├── 00000003.xml
        │   ├── 00000004.xml
        │   └── 00000005.xml
        └── 2022-11-25_14-38-15_iii-create_jpeg
            ├── 00000001.jpeg
            ├── 00000002.jpeg
            ├── 00000003.jpeg
            ├── 00000004.jpeg
            └── 00000005.jpeg

Import

Depending on the individual installation, there is also an import folder for each Goobi process. This folder is used by import plug-ins to store original source files associated with the respective process. Imported catalogue records or other source files can be stored here and used by scripts as part of workflow processing. The contents of the folder could look like this, for example:

└── 1234
    └── import
        ├── eod
        │   ├── 00000001.tif
        │   ├── 00000002.tif
        │   ├── 00000003.tif
        │   ├── 00000004.tif
        │   └── 00000005.tif
        ├── abc.mrc
        └── abc.original.pdf

Export

A folder with the name ‘export’ can also exist within the process directory. Files with the same structure as those in the images directory can be saved in this folder. The files in this folder are then included in the export and exported to the Goobi viewer hotfolder, for example. This may be necessary if files are to be included in the export without them being present in the images folder and without them being referenced within the METS file.

The content could look like this, for example:

└── 1234
    └── export
        └── abc_8765432_media
            ├── intro.mp3
            ├── overview.jpg
            └── some_file.pdf
