Overview

Goobi workflow Handbook

Documentation for the Open-Source-Software Goobi workflow from intranda

About this manual

Welcome to the Goobi Manual. This is the official documentation produced by intranda GmbH as the core developer of the open-source workflow management application known as Goobi. The entire body of documentation is divided into various sections. It is based on the experience we have gained over several years, both in the development of Goobi and during the installation and support process, involving numerous productive systems at establishments in various European countries. As well as providing answers to the questions most frequently asked by users, the manual explains typical scenarios in which Goobi can be used for different projects and examines the range of approaches adopted by different institutions.

Further development of Goobi and maintenance of this documentation

If you have any questions about this documentation, suggestions for the further development of this manual or general questions about Goobi, digitisation projects and, of course, the further development of Goobi, please feel free to contact intranda GmbH at any time:

Contact

Address:

intranda GmbH, Bertha-von-Suttner Str. 9, D-37085 Göttingen

Phone:

+49 551 291 76 100

Fax:

+49 551 291 76 105

Email:

URL:

Copyright

Please note that this document must not be amended or distributed in amended form. It must not be used for commercial purposes.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Overview of documentation

All the official documentation relating to Goobi has been produced by intranda GmbH. It is divided into three main areas, each with a different focus on the same application, and aimed respectively at users, project managers and administrators.

Goobi User

The aim of this part is to provide a simple guide for Goobi users who do not have any specific responsibilities at administrator, project manager or other technical levels. Using examples to illustrate a wide range of typical scenarios, you will learn how to find your way around Goobi, perform basic tasks and complete individual steps in the workflow.

Goobi Managers

At manager level, the documentation takes a look behind the scenes of Goobi. You will find detailed descriptions of how to set up workflows and how to manage projects, users and user groups, as well as explanations of Goobi’s control functions, rulesets and other areas. This part is aimed primarily at those who are involved in the management of digitisation projects and who want to gain a deeper understanding of how Goobi works.

Goobi Administrators

The technical part for Goobi administrators goes way beyond the application’s actual functions. It takes an in-depth look at what goes on behind the user interface, with a particular focus on the configuration and fine-tuning of different settings, creating rulesets, specifying data imports, file system management, storage integration and other technical issues.

What is Goobi?

Goobi is the most popular open-source workflow management software for use in digitisation projects. It covers the entire spectrum of tasks involved in the digitisation process – from book to online presentation. Many libraries, museums, archives, publishers and even scanning companies have already adopted the software.

Goobi allows you to model, manage and supervise freely definable production processes and is used on a daily basis by many institutions to handle all the steps involved in creating a digital library. These include importing data from library catalogues, scanning, content-based indexing, and the digital presentation and delivery of results in popular standardised formats.

Goobi has to meet enormously varied requirements while remaining as user-friendly as possible. It also needs to allow for simultaneous use at a number of centres. As a multilingual web-based application, Goobi meets all these requirements. The following pages provide a detailed description of Goobi's functions and how they can help you coordinate your digitisation projects in a clear and organised way.

Users

Goobi for Users

Basically, Goobi is a web-based application. To a very large extent, the way it is used and operated will be familiar to you from numerous websites. On the user side, all you need to be able to use Goobi is access to the internet and a web browser to take you to the Goobi site.

The basics

Before we look in detail at the methodology of specific user groups, let us take a look at some basics. Any of the terms used in this section that may not be familiar to you can be found together with an explanation in the glossary.

To open Goobi for the first time, simply type the address of your local Goobi installation in your browser’s address bar, for example:

http://workflow.goobi.io

The page displayed by your browser should be similar to the one below:

Logging in

To work with Goobi you will need a valid user name and password. You will usually receive these from your system administrator. In most cases, they will be the same as those you were given for access to the operating system, for example. Please contact your system administrator or the Goobi support team at intranda GmbH if you have any problems with access or user authentication.

Once you have a valid user name and password, simply open the Goobi start page. Enter your user name and password and click Log in.

Once you have logged in, the Goobi screen will change. The menu bar will now contain a range of options for working with Goobi (depending on your authorisation level). Your screen will appear roughly the same as in the diagram below.

Your user name will appear at the right-hand edge of the menu bar when you are logged in. Click on your user name to display a menu with a Settings option. This will show you which user groups and projects you belong to. The other menu options available in the menu bar will vary depending on these user groups.

As the illustrations above show, access to the functions offered by Goobi varies considerably depending on the level of authorisation of the user. Whereas some members of the project team may only be able to view their specific work, e.g. as a scan operator, users with a higher level of authorisation can choose from a much wider range of functions.

Users with a higher level of authorisation can view details of all the source works covered by those projects to which they have been assigned. They can also add new source works to Goobi based on Production templates.

Users with administrator rights have access to the widest range of functions and can view processes for all projects, not just for those projects to which they belong. They are also authorised to make changes to processes and workflows, add users, define user groups and assign individual users to user groups and projects. In addition to these settings, administrators can draw on a range of additional functions allowing them to configure projects in detail, specify rulesets with their structure data and metadata and define the authentication process.

This section focuses on topics for those project members who work with Goobi on a day-to-day basis, for the most part through the menu option My tasks.

Logging out

For security reasons, you should always log out of Goobi whenever you leave your work station. Depending on the configuration, Goobi will automatically log you out after a period of inactivity. To log out manually, simply click on your user name at the right-hand edge of the menu bar and select the Log out option from the drop-down sub-menu.

Switch between available languages

In the top right corner you can switch between several available languages. Goobi is supplied and installed by intranda in German, English and Spanish. Adding other languages is straightforward, and this can be done at any time if required.

Help function

Towards the right-hand edge of the menu bar there is an option to call the integrated Help function. This can be used for many of the input boxes in Goobi and shows you which information is required and where. As well as an explanation of the content required in each field, the Help function will often provide examples.

Please note, however, that the Help function does not provide explanations for every area of Goobi. Explanations are available only for input forms and boxes.

Changing your password

As with any other technical system, you should change your password from time to time for security reasons. To do so, click on your user name in the menu bar and then select the Change password option. This will display the following screen:

To change your password, enter your new password in the two boxes below. The change will take effect when you click Save. From your next login onwards, you will have to use your new password. Do not disclose your password to any other users. A secure password contains at least 8 characters and mixes special characters, numbers, and both lower and upper case letters.
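The password rules above can be written down as a small check. The following is only an illustrative Python sketch, not Goobi's own validation code: the function name `is_secure_password` is hypothetical, and treating "special characters" as ASCII punctuation is an assumption made for this example.

```python
import string

MIN_LENGTH = 8  # minimum length stated in the text above


def is_secure_password(password: str) -> bool:
    """Return True if the password satisfies all rules listed in the text."""
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # Assumption: "special character" means ASCII punctuation.
    has_special = any(c in string.punctuation for c in password)
    return (len(password) >= MIN_LENGTH
            and has_lower and has_upper and has_digit and has_special)
```

A password such as `Gp7!xYz2` passes this check, whereas `password` (no upper case letter, number or special character) and `Ab1!` (too short) do not.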

Processes

All the processes that you or other users create will remain in Goobi’s database. To display an overview of these processes, select the option Workflow - Processes in the menu bar.

Goobi will display a table of all the processes that you are allowed to access. This will depend on whether you are a member of those projects or an authorised Goobi administrator. These details are stored in your user account and will determine how much of the stored data will be displayed in the Current processes table. Equally (unless you are an authorised administrator), Goobi will not display processes belonging to projects that have been marked as completed by an administrator.

The process list gives you an overview of all those processes that have not yet completed all the corresponding steps. In addition to these active processes, if you wish to display processes that have completed all their steps, select the table menu option Modify display and check the Show finished processes box. This will display all the Goobi processes that you are authorised to access regardless of their current status. If you are a Goobi administrator, by ticking the checkbox Show deactivated projects you can also list those processes for which the overall project has already been marked as completed.
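The visibility rules described in the last two paragraphs can be summarised as a filter. This is a minimal Python sketch under stated assumptions, not Goobi's actual implementation: the `Process` class, its field names and the function `visible_processes` are all hypothetical and exist only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class Process:
    title: str
    project: str
    finished: bool             # all workflow steps completed
    project_deactivated: bool  # project marked as completed by an administrator


def visible_processes(processes, user_projects, is_admin=False,
                      show_finished=False, show_deactivated=False):
    """Filter a process list the way the paragraphs above describe."""
    result = []
    for p in processes:
        # Non-administrators only see processes of projects they belong to.
        if not is_admin and p.project not in user_projects:
            continue
        # Deactivated projects are hidden unless an administrator asks for them.
        if p.project_deactivated and not (is_admin and show_deactivated):
            continue
        # Finished processes are hidden unless "Show finished processes" is ticked.
        if p.finished and not show_finished:
            continue
        result.append(p)
    return result
```

In this sketch the two checkboxes are independent: an administrator who ticks only Show deactivated projects still does not see finished processes.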

The overview of current processes allows you to work with specific processes and modify their data or the status of individual steps independently of the workflow. To do this, you do not yourself have to be a member of the user group that is registered in Goobi as the group with responsibility for the individual workflow steps making up the processes in the list. This means that you can, for example, access Goobi’s integrated METS Editor regardless of whether the status of the current process indicates that it is at the stage of obtaining and recording structure data and metadata. From this screen, you can also export the data to other systems, e.g. a presentation system for digitised material such as the intranda viewer. You can also, for example, generate PDF files from the digitised material in combination with the structure data and metadata that have been recorded.

Each of the columns in the process view table can be sorted into ascending or descending order by clicking on the column headings. The little sort direction symbols at the right-hand edge of the column heading cell will change accordingly.

The Process title column contains the unique identifier for each process. It is generated automatically from the metadata whenever new processes are added to Goobi.

The Status column shows you immediately how far a particular process has advanced in the workflow. The colour system used here indicates how many of the workflow steps have already been completed by the users involved. The red section of the bar shows how many steps are still locked and cannot yet be processed by users. The yellow section shows how many steps are currently being processed or waiting to be processed. For details of the status of individual process steps, just click the process in the Process title column. Goobi will display a detailed list of all the steps for that process and their current status.
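To make the status bar concrete, the proportions of its sections can be derived from the statuses of the individual steps. The following Python sketch is an illustration only: the status names and the function `status_bar` are assumptions, and the section for completed steps is simply labelled `completed` here.

```python
from collections import Counter

# Status names are assumptions chosen for this sketch.
LOCKED, OPEN, IN_PROGRESS, DONE = "locked", "open", "in_progress", "done"


def status_bar(step_statuses):
    """Return the share of each bar section as whole percentages."""
    counts = Counter(step_statuses)
    total = len(step_statuses)
    if total == 0:
        return {"red": 0, "yellow": 0, "completed": 0}
    return {
        # Red: steps that are still locked.
        "red": round(100 * counts[LOCKED] / total),
        # Yellow: steps waiting to be processed or currently being processed.
        "yellow": round(100 * (counts[OPEN] + counts[IN_PROGRESS]) / total),
        # Remaining section: steps that have already been completed.
        "completed": round(100 * counts[DONE] / total),
    }
```

For a workflow of four steps with two completed, one in progress and one locked, this yields a bar that is half completed, a quarter yellow and a quarter red.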

In the example above, you can see which individual steps out of the workflow have already been completed by users, which are currently being processed and which are still locked until the preceding steps have been completed. You can search for one or more specific processes from the list at any time using the Filter processes box. Your search term will need to be entered using the correct syntax. For more information, please see section Searching processes.

A detailed description of all the functions provided by Goobi in process view can be found in the section Goobi Management, which contains further information about the functions described above and all other available options.

How to find a process

If you want to search for a particular process, you can use Goobi’s straightforward search box. To do so, select the Workflow - Search for volume option from the menu bar.

The input box used to search for a process in Goobi allows you to combine as many search parameters as you like. When you first open the input box, it will display the fields for the most common search criteria.

Each row in the search box contains several fields. The first of these allows you to specify which field you wish to search. The remaining fields relate to the value of the field being searched. In some fields (e.g. Process title and ID) you can enter a text-based search. In others, however, such as Project and Step, Goobi will offer you a choice of the actual search options available.

If the default number of rows shown for combining fields in a complex search is not sufficient, you can add (or remove) rows at any time.
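The idea of combining several search rows can be sketched as follows. This Python example is not Goobi's search implementation: it assumes that all rows must match at the same time, that text fields are matched as case-insensitive substrings, and that selection fields must match exactly. The names `TEXT_FIELDS` and `matches` are hypothetical.

```python
# Fields that accept free text in this sketch; all others are exact choices.
TEXT_FIELDS = {"Process title", "ID"}


def matches(process: dict, rows) -> bool:
    """Return True if the process satisfies every (field, value) search row."""
    for field_name, value in rows:
        actual = str(process.get(field_name, ""))
        if field_name in TEXT_FIELDS:
            # Text-based search: case-insensitive substring match (assumption).
            if value.lower() not in actual.lower():
                return False
        elif actual != value:
            # Selection fields such as Project or Step must match exactly.
            return False
    return True
```

Adding a row narrows the result in this model, and an empty list of rows matches every process.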

How to create a new process

There is a specific screen for adding new processes to Goobi’s store of data. Select the option Workflow - Production templates from the menu bar. You will be shown a list containing all the production templates already available in Goobi together with their configured workflows.

This list gives you an overview of all those production templates that you as a project member are authorised to view. The only difference between this template list and the process list lies in the Actions column. To add a new process in Goobi, click the correct symbol in the Actions column.

Tip:

To avoid having to keep every production template for every project, production templates can also be used for other projects. Simply click on the small arrow next to the button to select the desired project from the list of usable projects.

This will open a new window allowing you to import the bibliographical data for the new process.

This manual contains a very broad overview of the ways in which new processes can be created in Goobi by importing metadata from a library catalogue. For more information and details of alternative methods of creating one or more processes at the same time in Goobi, please refer to the . To import bibliographical data from one of the configured library catalogues (Pica, Aleph, Z39.50, etc.), select a catalogue from the Search in OPAC drop-down list and enter the relevant identifier. Next, click on the corresponding Import button. Goobi will now fill the relevant fields automatically and set the correct publication type.

Once you have queried the catalogue, fill in all the fields whose label is followed by an asterisk. These are compulsory fields, although you can of course fill in all the other fields, too. Do not enter anything into the Author-Title-Key (ATS) or the Title-Key (TSL) fields. The fields for Process title, Tif header document name, Tif header image description, Author-Title-Key and Title-Key are filled in automatically by Goobi when you save the new process or when you select the Generate button. The Author-Title-Key is a combination of some of the characters from the main title and the names of the people involved. The process name is made up of the Author-Title-Key generated by Goobi and a unique identifier. This way, each process within Goobi is given an unambiguous and meaningful name.
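The way a process name is assembled from a key and an identifier can be illustrated with a short Python sketch. The exact rule Goobi uses to pick characters is not given in this manual, so everything specific here is an assumption: taking the first four letters of the surname and of the main title, lower-casing them, joining with an underscore, and the function names themselves.

```python
import re


def _letters(text: str, count: int) -> str:
    """Keep only ASCII letters and digits, lower-cased, truncated to `count`."""
    return re.sub(r"[^A-Za-z0-9]", "", text).lower()[:count]


def author_title_key(author_surname: str, main_title: str, count: int = 4) -> str:
    """Hypothetical key: leading characters of the surname plus the title."""
    return _letters(author_surname, count) + _letters(main_title, count)


def process_title(author_surname: str, main_title: str, identifier: str) -> str:
    """Join the generated key and the unique identifier into one name."""
    return f"{author_title_key(author_surname, main_title)}_{identifier}"
```

Because the identifier is unique, two works with the same key still receive different process names in this model.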

Before you click on the Save button, please ensure that at least one digital collection has been selected for this process. To select more than one digital collection, hold down the Control key. Finally, click on Save to move to the next page.

Goobi will now display an overview page that allows you to work with the new process you have just created. For example, you can generate a docket. To do so, just click on the Print docket link. To create another process, click Create new process. If you want to open the new process to modify its individual workflow or settings, click on Open newly created process.

Once a new process has been created, it can be viewed (together with its first open workflow step) by all those users and user groups who have been configured for that step. So, if the first step in the workflow is Scanning, for example, the newly created process will be visible to all scan operators assigned to the overall project containing that process. Accordingly, the task can be accepted and performed by any of these scan operators.
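The rule for who can see and accept a newly opened step can be expressed as a single condition. This is a hedged Python sketch, not code from Goobi; the function `can_see_task` and its parameters are hypothetical names for the concepts in the paragraph above.

```python
def can_see_task(user_groups, user_projects, step_groups,
                 process_project, step_status):
    """Return True if an open step should appear in the user's task list."""
    return (
        step_status == "open"                          # the step is ready to be accepted
        and process_project in user_projects           # user is assigned to the project
        and bool(set(user_groups) & set(step_groups))  # user is in a configured group
    )
```

A scan operator assigned to the project sees an open Scanning step, while a member of a different user group, or of a different project, does not.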

How different user groups work with Goobi

The next section examines some typical user groups in order to illustrate their structure as part of digitisation projects at numerous establishments and the way they work with Goobi on a daily basis. In terms of the way tasks are performed, any user groups or methods that may be in place at your own establishment but not specifically represented below will nevertheless work along similar lines to those shown in the examples. Accordingly, all the examples given are also applicable to individual workflows and other project-specific arrangements.

Scanning

The methods used by scan operators working with Goobi on a daily basis do not vary substantially from those of other users with different qualifications or responsibilities. All users must first log in and select My tasks from the menu bar.

This list of tasks contains all the workflow steps that the scan operator is authorised to perform as a member of one or more projects and user groups. The symbols in the Actions columns indicate the current status of the workflow steps in those rows. As described above in section My tasks, each user can see immediately which tasks are still open and waiting to be processed and which are already in progress, i.e. being processed by that user or another user with the same level of authorisation and qualifications. The symbols in the column Priority indicate error messages or priorities.

From an individual user’s perspective, working with Goobi on a daily basis generally involves selecting a task from the list of those offered and then clicking on the corresponding Actions button at the end of the row to view full details of that task.

The top-left box in the Details of the task window contains some general information about the task selected and accepted for processing by the user. You can use the Process log box immediately below this to enter whatever information you wish. This information will be visible to all users who subsequently accept other tasks as part of the same process at a later stage of the workflow. Its purpose is therefore to act as an open area for communication between different users in the form of general notes or observations. It can be used, for example, to draw attention to the fact that the volume in question needs to be worked on with particular care on account of its properties, or to a particular feature of the process that users at other work stations at a later stage of the workflow need to be aware of. Comments entered by users manually in the Process log are usually displayed in green.

It is also possible to upload files for this process, which are then listed within the process log. In contrast to the digital copies, these files do not serve to describe the work itself, but can contain information about the work or the method of operation to be observed. For example, this function can be used to upload routing slips or offers from restoration service providers. The files uploaded in this way are available for all subsequent workflow steps, as are the comments.

In addition, you can switch to the file area at the top of the process log. Here the files of the different directories of a process are listed. Depending on the respective user authorisation, these files can also be downloaded and deleted there. In the lower part of the process log it is possible to select a file to upload. After selecting a file, an optional description can be added. It is also possible to specify whether the file should only be used internally or whether it should also be considered for later export.

Information that has been transferred by external applications or scripts to the currently displayed process is displayed in different colours within the process log. The colours used here have the following meanings:

Colours in the Process log

| Colour | Meaning |
| --- | --- |
| Grey | Messages containing detailed program information for debugging purposes. This information is intended primarily for more precise analysis and not so much for users. |
| Blue | Messages containing general information. |
| Orange | Alerts warning about a critical status. |
| Red | Error messages documenting errors that have arisen. |
| Green | Messages entered manually by users in the Goobi interface. |
| Black | Uploaded files |

To the right of the general properties box, you will find a number of extended task properties that can or in some cases must be entered by the user. These will depend on the configuration of your particular site, the process and the workflow step. In the screenshot shown above, for example, Goobi has been configured in such a way that the user must enter details of the opening angle and the scanning device.

Once a task has been accepted by the scan operator, Goobi will create a new folder within that operator’s work drive for storing the digitised files. Depending on the installation and configuration, users are generally allocated a network drive for use on the local work station computer. After a task has been accepted, that drive will contain an additional folder to which the scanned files can usually be saved directly.

Once a task has been completed, i.e. when all the required pages of the physical source have been scanned and saved in digital form in the folder provided by Goobi, the user needs to click on the Finish this task link in the Possible actions area. This tells Goobi that the task has been completed. Goobi will then check for graphics files in the specified folder within the user’s work drive, check the names given to those files and end the task for the user. Goobi will also remove the folder provided for that specific process within the user’s work drive so the scan operator can no longer access it and the digitised material it contains. Whenever a user tries to close a task, Goobi will draw attention to any compulsory fields that need to be completed, e.g. the opening angle or the scanning device (see screenshot above). This means a task cannot be closed until all the required information has been entered in full by the user.
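The checks made when a task is closed can be pictured as a function that collects problems and only allows the task to finish when none remain. The Python sketch below is an illustration under stated assumptions, not Goobi's implementation: the set of image file extensions, the function name `problems_before_finishing` and the message texts are hypothetical, and the check on file names mentioned above is left out.

```python
from pathlib import Path

# Assumption: file extensions treated as image files in this sketch.
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".png"}


def problems_before_finishing(folder: Path, properties: dict, required: list) -> list:
    """Return a list of problems; an empty list means the task may be closed."""
    problems = []
    # Every compulsory field must contain a non-empty value.
    for name in required:
        if not str(properties.get(name, "")).strip():
            problems.append(f"Compulsory field missing: {name}")
    # The task folder must contain at least one image file.
    images = []
    if folder.is_dir():
        images = [f for f in folder.iterdir()
                  if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES]
    if not images:
        problems.append("No image files found in the task folder")
    return problems
```

Returning all problems at once, rather than stopping at the first, mirrors the behaviour described above in which Goobi draws attention to every compulsory field that still needs to be completed.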

If, after accepting and starting a task, you decide that you do not want to work on the selected task, you can simply return it. Goobi will then reset that workflow step to its original open status. It is now available to any other authorised users or even to the same scan operator at a later stage. This arrangement can be useful if a task that has been selected involves too much work and cannot be completed, for example, on the last day before that user’s holiday, as it would otherwise remain unavailable to other users for a relatively long period.

While you are working on a task, if you notice an error that has been made at an earlier stage of the workflow that needs to be rectified before you can complete your work, you can send an error message to the work station responsible for that earlier task. This tends to be very rare in the case of scan operators. A description of such error messages can be found in section Quality control.

Once you have closed a task, you will return automatically to the My tasks list. The completed task will no longer appear in the list. You can now continue with the next task from your list.

Quality control

At many sites that work with Goobi, a manual check is made on the quality of the scanning work performed in earlier steps of the workflow. The methods used by the staff who conduct these quality checks are almost identical to those used by the original scan operators. As described above for scan operators, staff in the Quality Control user group first log in to Goobi and then select My tasks from the menu bar. Next, from the list of available completed tasks, they choose one to review. The choice will depend on the priority level and any error messages.

Once the reviewer has selected a task, Goobi will create a new folder within that user’s work drive (in the same way as it did for the scan operator) containing all the digitised material produced in an earlier step of the workflow (usually at the scanning stage).

In most cases, the workflow in Goobi is configured in such a way that quality control staff are not expected to make any changes within the new folder and have read-only access. Any general observations entered at an earlier stage are also visible to the reviewer in the Process log in the bottom left of the window. Any configured additional properties that users from this group need to enter or select during this task are shown next to the general properties. Using any standard image viewer, the user can now review each of the images in the new folder and check the quality of the digitised output.

If the quality matches project requirements, the reviewer will then click on the Finish this task link to remove that task from the My tasks list. The new folder created by Goobi to allow the reviewer read-only access to the digitised material will then be removed automatically from that user’s work drive, thus preventing any further access to the data for that specific task within the specified workflow.

If a reviewer finds that the quality of individual files of digitised material is unsatisfactory, or that certain pages are missing or repeated, Goobi provides an option to send a correction message to a previous work station.

To do this, select the Report error tab in the Possible actions box. Goobi will display a list of all those tasks previously completed, together with details of the users responsible for those tasks. The reviewer can enter a description of the error found during quality control in the free text box below the selected task.

After the reviewer clicks on the link Send correction message, the incorrectly performed task will reappear in the My tasks list of the work station in question. The user will be able to view details of the error message by holding the cursor over the red warning message or by re-accepting the correction task. Having corrected the error, the user can then enter a description of the solution found. It is therefore possible to assign processes more than once within the workflow (in the event, for example, of an error).
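The effect of a correction message on the workflow can be modelled with a few lines of Python. This is a simplified sketch, not Goobi's data model: the `Step` class, the status names and the function `send_correction_message` are hypothetical, and locking the reporting step until the correction is done is an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    status: str = "locked"  # one of: locked, open, in_progress, done
    messages: list = field(default_factory=list)


def send_correction_message(steps, reporting, target, text):
    """Reopen an earlier step and attach the reviewer's error description."""
    if target >= reporting:
        raise ValueError("correction messages go to an earlier step")
    if steps[target].status != "done":
        raise ValueError("only completed steps can be reported")
    steps[target].status = "open"        # reappears in the My tasks list
    steps[target].messages.append(text)  # visible to the user who corrects it
    # Assumption: the reporting step waits until the correction is done.
    steps[reporting].status = "locked"
    return steps[target]
```

Because the earlier step returns to an open status, the same step can be worked on more than once within a single process.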

Manual script steps and plugin steps

In addition to the tasks we have already described, it is possible to configure Goobi so that users can call up any external programs, scripts or plugins for specific purposes from within Goobi. To this end, Goobi provides an option to configure additional buttons in the Possible actions area for individual steps of the workflow. The names of these buttons are specified by the Goobi administrator when configuring the workflow. When they are activated, they call up one or more configured instructions on the server.

Based on the way Goobi has been configured by the administrator, the user can decide whether the scripts are actually executed or not. The user simply needs to click on one or more of the proposed actions. Depending on the configuration, the user will receive a message to confirm that the server has executed the script correctly.

Automatic script-run steps

As well as the manual scripts described in the last section, Goobi can also be configured to execute individual workflow steps automatically if they have been configured to call up scripts.

To do this, as for the manual scripts, the administrator will assign a series of different shell commands and stipulate that they should be performed automatically one after the other in the specified sequence. Automatic script-run steps are not generally included in a user’s task list. If an error occurs during the execution of the configured server-based script calls with the result that the script cannot actually be executed, the automatic workflow step will keep the same status and will be displayed automatically in the assigned user’s task list. Automatic script-run steps are therefore only visible in the user’s task list if errors occur during execution.

To identify the error, the user can employ the same method as that described above for manual script-run steps, i.e. by starting the scripts in question manually and thus identifying the source of the error. In cases where the automatic scripts make use of the function allowing users to send information or error messages concerning a specific process, the error messages are also listed in the Process log within that task without having to run the script again to identify the error.

Automatic script-run steps that successfully execute the configured scripts automatically close the current workflow step and activate the next step in the workflow.
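The behaviour of an automatic script-run step can be sketched in Python as a loop over the configured commands. This is a minimal illustration, not how Goobi runs scripts internally: the function `run_automatic_step`, the returned status strings and the log format are assumptions made for this example.

```python
import subprocess
import sys


def run_automatic_step(commands):
    """Run the commands in order and stop at the first failure.

    Returns ("done", log) if every command succeeded, so that the next step
    can be activated, or ("unchanged", log) if one failed, so that the step
    keeps its status and appears in a user's task list.
    """
    log = []
    for command in commands:
        # Each command is a list of arguments, executed without a shell.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            log.append(f"error: {command!r} exited with {result.returncode}")
            return "unchanged", log
        log.append(f"ok: {command!r}")
    return "done", log


if __name__ == "__main__":
    ok = [sys.executable, "-c", "pass"]
    fail = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_automatic_step([ok, fail, ok]))
```

Commands after a failing one are not executed, so the log shows exactly how far the sequence got before the step had to be handed to a user.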

Metadata processing

One of the main areas of work for those users with library training in the field of digitisation projects involves the comprehensive indexing of digitised objects. Basically, this covers pagination, structuring and the recording of metadata.

Those members of the digitisation project who are responsible within their user groups for recording structure data and metadata will complete their tasks in Goobi in a similar way to the other user groups described earlier (scan operators and quality control).

After logging in, they also need to click on My tasks in the menu bar and select the task they wish to process next from the list of tasks displayed. The screen containing the details of the selected task is very similar to that described above for scan operators and quality control users. Here, too, the user will find a box of general properties on the left and the Process log for general observations concerning the process. However, one additional action is available in the Possible actions area. The Edit metadata link allows the user to open Goobi’s Metadata Editor. The precise functions and user operation of the Metadata Editor are described in a separate section of this manual on account of its complexity.

The remaining actions offered by Goobi correspond to those described earlier for scan operators and quality control staff. When the user closes the task by clicking the Finish this task link, Goobi will validate the structure data and metadata. The task can only be closed completely if all the configured rules on structure data and metadata have been observed.

Unlike project members working as scan operators or in quality control, users involved in processing structure data and metadata do not generally work directly with data from the file system. Consequently, at this point Goobi does not provide access to the digitised material in the file system within the user’s work drive. If required, however, this can also be configured for users processing structure data and metadata to allow for read-only access to the images and allowing them to be displayed using an image viewer. In this case, when the task is closed, the folder containing the digitised material would be removed from the user’s work drive to prevent any further access to that material outside the authorised workflow.

Export to the DMS

One of the workflow steps frequently included in digitisation projects involves making the digitised material available to the public together with the corresponding structure data and metadata, which can sometimes take considerable time and effort to record. This digitised material can be made available in a range of systems independently of Goobi and can be used, for example, on completely different hardware from that required to run Goobi. In this step, Goobi will export the digitised material, together with the structure data and metadata in the form of a METS file, to a Document Management System (DMS). This step can either be completely or partially automated, depending on the configuration.

If the procedure is only partially automated, the data is exported manually by the responsible person in the user group using roughly the same methods as those described earlier for the other user groups. After logging in and selecting a task from the My tasks list, the user will be shown a screen containing the details of that task. As in the previous examples, in addition to the task details, this screen contains a Process log for messages about the corresponding process. However, an additional button can be found in the Possible actions area. This button can be selected to launch the actual export to the external DMS.

Please note that exporting the digitised material together with the corresponding structure data and metadata to an external DMS can take some time depending on the configuration and the volume of data. If Goobi has been configured to validate the export to the DMS, it will prevent any further work until the communication between Goobi and the DMS has been completed. You will not see any message to confirm that the export has taken place until this communication is over, and in the case of large volumes of data this can easily take several minutes. If the export fails, Goobi will display a detailed message on the screen describing the error that prevented the export. Such errors are usually caused by a failure to observe the rules for structure data and metadata. As part of the DMS export task, Goobi can also (depending on the configuration) allow users access to the structure data and metadata through the Metadata Editor, so that they can resolve any such validation errors themselves immediately. In most cases, however, export errors lead to the sending of an error message to the responsible work station earlier in the workflow. A description of how to send error messages to other work stations within the workflow can be found in the section Quality control.

If Goobi has been configured by the system administrator in such a way that the user does not have to wait until communication with the DMS has ceased, it will display a positive message immediately (after the user clicks the Export to DMS link) to indicate that the export process is now running in the background. The results of the export are still validated in the background (even though validation is not displayed on the user screen) and are stored with details of system events in the form of log files. This means that the user does not have to wait and can continue to work with Goobi immediately by clicking the Finish this task link. The task is then removed automatically from that user’s My tasks list.

From Version 1.9 of Goobi onwards, users have been able to perform a fully automated DMS export in addition to the manual option. In this case, DMS export tasks will only appear in a user’s task list if an error has occurred during the export process. Assuming that Goobi has been correctly configured, the validation of earlier workflow results should ensure that the automatic export is successful. In such cases, the export process is fully automatic within the workflow and will on completion activate the next task in the workflow.

In Goobi 2.0, the Export function has been updated so that users can integrate various plugins at this point allowing them to respond flexibly to the requirements of different presentation environments. If you wish to use this plugin-based export function, the workflow step must be configured with the Export DMS setting. However, you will also need to enter the plugin that has been provided. If the special plugin-based export function has not been configured for that workflow step, Goobi will use the default export function.

User interface

Once you have opened the Metadata Editor, you have full access to all the editing options relating to the pagination, structure data and metadata of the digitised material. The Metadata Editor is divided into a number of sections.

Structure tree

The grey-shaded area on the left of the screen contains the structure tree, where you can see in hierarchical form all the structure elements that have already been obtained from the source material. When you select a structure element, it will appear in bold in the tree view. For each structure element, the descriptor is based on the type chosen for that element.

Icon

Description

Symbol for collapsing and expanding the structure element hierarchy

Click on the icon just in front of the structure element to expand or collapse individual sections of the structure tree. If you hold the cursor over the small symbol next to each structure element, you will see a pop-up with further details of that structure element without having to open it.

Page display

The right-hand side of the Metadata Editor gives you an overview of the individual digital images that form part of the current process together with the number of pages, the current magnification level, the currently displayed image number and information about which derivative of the available digitised material you are currently viewing.

You can move between individual pages using the Forward and Previous links just above the image. You can also move quickly to the previous and next images by selecting them directly.

The current page is shown in the middle of the page range with the previous and next two pages on each side (provided they are available for display). Clicking on any of the pages will take you directly to the corresponding scanned image. Goobi also features a number of keyboard combinations for repeated navigation between different pages of the source material.

Keyboard combinations for navigating between images in the Metadata Editor

Keyboard combination           Description of function
Ctrl + Shift + Cursor left     Move to previous image
Ctrl + Shift + Cursor right    Move to next image
Ctrl + Shift + Cursor up       Move 20 images forward
Ctrl + Shift + Cursor down     Move 20 images back
Ctrl + Shift + Pos 1 (Home)    Move to first image
Ctrl + Shift + End             Move to last image

Using these keyboard combinations allows you to move quickly and easily through the digitised material, even over large areas. Another navigation option allows you to move directly to a specific image by entering that image number in the Go to image box and pressing return. Goobi will then automatically display the requested image.
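
The navigation rules described above can be illustrated with a small sketch. It is not taken from Goobi's source code: the function names and action names are hypothetical, and the behaviour at either end of the image list (stopping at the first or last image) is an assumption.

```python
def navigate(current: int, total: int, action: str) -> int:
    """Return the 1-based image number reached from `current` by `action`."""
    steps = {"previous": -1, "next": 1, "forward_20": 20, "back_20": -20}
    if action == "first":
        target = 1
    elif action == "last":
        target = total
    elif action in steps:
        target = current + steps[action]
    else:
        raise ValueError(f"unknown action: {action}")
    # Assumption: moves past either end stop at the first or last image.
    return max(1, min(total, target))


def go_to_image(requested: int, total: int) -> int:
    """Mimic the Go to image box; out-of-range numbers are clamped (assumption)."""
    return max(1, min(total, requested))
```

For a process with 100 images, navigate(5, 100, "forward_20") would move from image 5 to image 25.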

Zoom

In addition to all the above navigation options, you can also change the way Goobi displays the scanned image. To do so, just click on either magnifying glass symbol below the page navigation bar to increase or reduce the magnification/zoom level. The current magnification is shown between the two magnifying glass symbols. If you want to select a particular level of zoom, simply click between the two magnifying glass symbols, enter a figure and press the Enter key to confirm.

Rotation

Goobi also allows you to rotate the image in 90 degree stages to view information that can only be read in landscape format. To do so, just click on the arrow pointing to the right to rotate the image clockwise and on the arrow pointing left to rotate the image anti-clockwise.

Selecting the image folder

Underneath the image you will find an option to select from different derivatives of the image (where available). In the drop-down list entitled Folder, Goobi lists all the image folders linked to the currently selected process. For example, if you already have a number of derivative images for a specific process (e.g. master images, scaled or compressed versions of images or derivatives with a different tone such as bitonal images), you can switch between these derivative images simply by selecting the corresponding folder.

Metadata indexing

The main area of the Metadata Editor also contains a navigation bar to perform a range of steps during metadata indexing. Selecting Pagination, Structure data, Metadata and File replacement allows you to switch between the individual editing modes in the Metadata Editor. Each of these modes is explained in greater detail below and should be performed in the specified order.

Pagination

Pagination is a key element of digitisation projects. It involves matching page labels (printed page numbers in the source material) to the scanned images. In older works, the page numbering often changes within the actual source text. Often, there is no printed page number, or the same page number is used more than once, and sometimes the pagination can change several times within a single work. A typical example of altered pagination occurs when the preface or introduction makes use of Roman numerals only to be replaced by Arabic numerals after the index.

To edit the pagination for the source work, open the pagination area in the Metadata Editor by clicking on the option in the navigation bar. Goobi will calculate the number of images in the current process folder and produce a vertical list. The box immediately to the right is used to allocate a pagination number to each image. As you can see below in the box entitled Page selection, Goobi has initially determined that all the pages should be numbered consecutively using Arabic numerals, beginning with the number 1.

Using either the keyboard combinations or the image navigation bar above the scanned image, you can now navigate through the entire set of images to obtain the printed page number for each image. In most cases, the first few pages of books do not contain a page number. Click the first checkbox for image 1 in the Page selection box. Next, in the Define pagination box, select unnumbered from the drop-down list and then click on the link From first selected page. For pagination purposes, you have now told Goobi to treat the volume as though it did not contain any printed page numbers. Next, move to the first image that shows a printed page number.

Tip: While you are navigating through the images, if you come across a page where the pagination changes, just click on the image. This will automatically select the corresponding checkbox in the Page selection list. To actually mark the checkbox, press the space bar on your keyboard. Alternatively, you can of course identify the corresponding image in the Page selection list and then mark the checkbox using your mouse.

When you are selecting pages, please remember that you must go by the image number and not by the printed page number in the source. In Goobi, you can always tell which is the image number. Two numbers are always shown for each image, separated by a colon. The number on the left is the number of the file within the file system, i.e. the image number. The number to the right of the colon is the printed page number in the source, known as the page label. You should always take the image number (to the left of the colon) as your guide to avoid accidentally choosing the wrong pages. It is worth noting that in some source material the same printed page number (page label) can occasionally be used more than once.
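
The difference between image number and page label can be sketched as follows. This is only an illustration: the entry strings and function names are hypothetical, and the label "uncounted" for pages without a number is an assumed placeholder.

```python
def parse_entry(entry: str) -> tuple[int, str]:
    """Split an 'image number: page label' entry into its two parts."""
    image_number, label = entry.split(":", 1)
    return int(image_number.strip()), label.strip()


def images_with_label(entries: list[str], label: str) -> list[int]:
    """Return every image number that carries the given page label."""
    return [img for img, lab in map(parse_entry, entries) if lab == label]


# Hypothetical page selection list in which the printed number 1 occurs twice.
entries = ["1: uncounted", "2: 1", "3: 2", "4: 1"]
duplicates = images_with_label(entries, "1")
```

Because the page label 1 belongs to both image 2 and image 4 in this example, only the image number identifies a page unambiguously.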

Once you have identified the page where the pagination of the source material changes and marked the corresponding checkbox, choose the required pagination type in the Define pagination box. Next, enter the current page number/label and click once again on the link From first selected page. This will instruct Goobi to allocate the required page number to each image, beginning consecutively with the page you have selected.

Sometimes, the pagination of the source material can change more than once. Pages without a page number and repeated instances of the same page number occur frequently. Pages on which a new structure element begins (e.g. new chapters) but that do not contain a printed page number (although page numbering is continued on subsequent pages) are marked with a simulated pagination. This simulated pagination can be recognised in that a logical numbering is assumed despite the absence of a page number, and consequently the assumed page number is shown in brackets.

Example: If a chapter starts on a page without a printed page number and the next page contains the printed page number 4, you would give the first page of the chapter a simulated (i.e. assumed) page number, in this case the number 3. Simulated page numbers are thus an interpretation derived from the missing page identifier and the page identifiers available on subsequent pages of the source. Simulated page numbers are shown in square brackets. In this example, the designation used for the first page of the chapter would therefore be [3].
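
As a rough illustration of the From first selected page behaviour and of simulated page numbers, the following sketch renumbers a list of page labels. It is a simplified model, not Goobi's implementation: the function names, the label "uncounted" for unnumbered pages and the use of upper-case Roman numerals are assumptions.

```python
def to_roman(number: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    result = []
    for value, symbol in numerals:
        while number >= value:
            result.append(symbol)
            number -= value
    return "".join(result)


def paginate_from(labels, start, first_number=1, kind="arabic", simulated=False):
    """Return a copy of `labels`, renumbered consecutively from index `start`."""
    new_labels = list(labels)
    for offset in range(len(labels) - start):
        if kind == "unnumbered":
            label = "uncounted"  # assumed placeholder for pages without a number
        else:
            number = first_number + offset
            label = to_roman(number) if kind == "roman" else str(number)
            if simulated:
                label = f"[{label}]"  # simulated numbers appear in square brackets
        new_labels[start + offset] = label
    return new_labels


labels = [str(n) for n in range(1, 7)]                # default: 1 to 6
labels = paginate_from(labels, 0, kind="unnumbered")  # no printed numbers at first
labels = paginate_from(labels, 2, 1, kind="roman")    # preface starts at image 3
labels = paginate_from(labels, 4, 1, kind="arabic")   # main text starts at image 5
```

Each call overwrites all labels from the chosen image onwards, which is why the pagination is best worked through from the front of the volume to the back.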

For efficient pagination, use the keyboard combinations described above for navigating between images. If the source is particularly large and you cannot or do not wish to check every page, you can still verify that your pagination is correct by checking a selection of pages. In most cases, it is relatively easy to detect mismatches in the pagination. To do so, use the keyboard combinations to navigate 20 pages at a time through the set of images, comparing the printed page number on your selected page with the pagination numbers allocated through the automatic consecutive numbering option. You can identify mismatches where the pagination sequence no longer coincides with the sequence of printed page numbers.

Alternatively, you can at any time click the icon next to the page number in the Page selection box to display the corresponding image in the image display area on the right.

Goobi supports different page numbering methods for pagination purposes. As well as allocating Arabic and simulated Arabic page numbers, Roman and simulated Roman page numbers, free text and unnumbered pages, you can also specify the sequence to be used for consecutive numbering. Using the symbols in the Define pagination box, you can determine how the page sequence in the book should actually appear.

Goobi supports the following page sequences:

Pagination types supported by the Metadata Editor

Pagination type

Description of pagination type

Tip: If you want to make a change within an existing pagination sequence, you can use a Goobi feature that does not automatically overwrite all the subsequent pages using automatic numbering. To do so, select one or more pages in the Page selection box, enter the required numbering and pagination type and then click on the link entitled Only the selected pages. In this way, only the selected pages will be affected by the change.
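
The effect of Only the selected pages can be sketched as follows. The function name is hypothetical, and the way several selected pages are numbered relative to each other (consecutively, in image order) is an assumption rather than documented behaviour.

```python
def paginate_selected(labels, selected, first_number, simulated=False):
    """Return a copy of `labels` in which only the selected indexes change."""
    new_labels = list(labels)
    for offset, index in enumerate(sorted(selected)):
        label = str(first_number + offset)
        new_labels[index] = f"[{label}]" if simulated else label
    return new_labels


# The third image starts a chapter and has no printed number; the next page is 4.
labels = ["1", "2", "4", "4", "5"]
fixed = paginate_selected(labels, {2}, 3, simulated=True)
```

All labels outside the selection stay exactly as they were, in contrast to numbering from the first selected page.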

Setting representatives

Within the METS editor, an image can be designated as the representative of the object. The representative is an image that represents the work. This is usually the title page. Most digitisation portals such as Europeana, DDB, ZVDD, VD18 or the Goobi viewer use the first image of the METS file as the representative of the object, unless another image is explicitly defined as the representative.

To select the image to be used as a representative, simply click on the star icon next to the desired image. The selected image is then marked with a blue star icon. Within the METS file, the selected image is then labelled with the attribute USE="banner".
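
How a presentation system might pick the representative can be sketched as follows. The sample METS fragment is heavily shortened and hypothetical (file IDs, file names and the exact element carrying the USE attribute are assumptions); only the rule itself, explicit representative first and otherwise the first image, comes from the description above.

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical, shortened METS fragment for illustration only.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="DEFAULT">
      <mets:file ID="FILE_0001">
        <mets:FLocat LOCTYPE="URL" xlink:href="00000001.tif"/>
      </mets:file>
      <mets:file ID="FILE_0002" USE="banner">
        <mets:FLocat LOCTYPE="URL" xlink:href="00000002.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def representative_file_id(mets_xml: str):
    """Return the ID of the file marked USE="banner", else of the first file."""
    root = ET.fromstring(mets_xml)
    files = root.findall(".//mets:file", METS_NS)
    for file_element in files:
        if file_element.get("USE") == "banner":
            return file_element.get("ID")
    # No explicit representative: fall back to the first image.
    return files[0].get("ID") if files else None
```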

Structuring

When you have completed the task of pagination, the next step in the workflow involves structuring.

Working again in the Metadata Editor, the aim is to identify and record all the relevant structure elements of the material in a tree-like list. In this step of the workflow, all the structure elements to be indexed for that source material are marked using as precise a descriptor as possible and allocated to the corresponding structure element within the hierarchy. By way of example, for a monograph this can mean that several chapters, a foreword, an afterword, an index and other structure elements can be added under the structure element Monograph. Further sub-elements can then be added under these structure elements and so on. In fact, the Metadata Editor allows you to create a hierarchy of elements with as many levels as you wish, including sub-chapters within chapters.

To start the task of organising the structure elements using the Metadata Editor, first select the Structure data option from the navigation bar. Goobi will display an overview of the available options. It is important to remember that all changes to the structure data depend on which structure element is currently selected in the structure tree on the left (highlighted in bold).

Create new structure element

The list of elements from which you can select in the New structure element box will vary depending on which structure element has been activated. You can choose the position where you want to insert the new structure element from the following options:

Positions for new structure elements

Position

Description of position for the new structure element

The list of structure elements from which you can select will vary depending which of the four position options you choose. The range of elements available for selection depends on how the Goobi administrator has specifically defined the system for the current project using the freely configurable rulesets. Within certain structure elements, this means that structure data and metadata can be specified that are admissible within a given hierarchy. For example, within the structure element Monograph it is not possible to assign the structure element Periodical article, although you could assign a Chapter, which is a perfectly normal component of a monograph. Depending on this configuration, the list display will be based on the structure element currently selected in the left-hand navigation tree.
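
The principle that the ruleset decides which structure elements may be added below the selected element can be sketched as follows. The dictionary is a made-up stand-in for a ruleset; real Goobi rulesets are configuration files with their own format, and the element names used here are examples only.

```python
# Hypothetical stand-in for the ruleset: parent type -> permitted child types.
ALLOWED_CHILDREN = {
    "Monograph": {"Chapter", "Preface", "Index"},
    "Chapter": {"Chapter"},
    "Periodical": {"PeriodicalArticle"},
}


def addable_types(parent_type: str) -> list[str]:
    """Return the element types offered for selection below `parent_type`."""
    return sorted(ALLOWED_CHILDREN.get(parent_type, set()))


def can_add(parent_type: str, child_type: str) -> bool:
    """Check whether the (hypothetical) ruleset permits this combination."""
    return child_type in ALLOWED_CHILDREN.get(parent_type, set())
```

With this table, can_add("Monograph", "Chapter") is true, whereas can_add("Monograph", "PeriodicalArticle") is false, which mirrors the example given above.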

Structural elements for pages

The fastest way of adding new structure elements in Goobi is to choose the option As last sub-element. Use the image display to navigate to the first page of the structure element you wish to add and click on the image symbol next to the First page input box. Move forward until you come to the last page for the structure element you wish to add, and this time click on the image symbol next to the Last page input box. Clicking the image symbols instructs Goobi to apply the page number currently selected on the right of the image display. Using the little arrows on either side of the image icon, you can enter the page number for the preceding or following page. This allows you to allocate pages without spending an inordinate length of time moving between pages in the image display section. Next, click on Add structure element to insert this new structure element as the last sub-element within the current structure element. This is an efficient method of adding structure elements.

If you want to create additional sub-elements as part of a structure element that you have already added, first select the existing element to which you want them to belong. You can now use the same method as described above by adding each new structure element as the last sub-element.

This single step of the workflow therefore involves providing two important pieces of information. As well as identifying and recording the digital structure of the source material (to reflect its logical structure in the form of chapters, sub-chapters, forewords, afterwords, indices and any other structure elements that can be configured individually in your Goobi installation), when you add structure elements you are also providing the corresponding page ranges for each one of those elements.

Clicking on the small arrow symbols next to the First page and Last page text boxes instructs Goobi to set these as the beginning and end pages for that structure element. As the metadata file grows in the background, it will eventually be perfectly clear which page ranges correspond to which structure elements. If the digitised material is later made available for viewing, e.g. on a website, researchers will then be able to display all the pages corresponding to a particular sub-element, and sub-chapters downloaded in PDF format will always contain the correctly allocated pages.
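
The two pieces of information recorded in this step, the hierarchy and the page range of each element, can be modelled in a few lines. This is a simplified sketch with hypothetical class and attribute names; it works with image numbers and does not reflect how Goobi stores the data in the METS file.

```python
from dataclasses import dataclass, field


@dataclass
class StructureElement:
    type_name: str
    first_page: int  # image number of the first page
    last_page: int   # image number of the last page
    children: list = field(default_factory=list)

    def add_last_child(self, child: "StructureElement") -> "StructureElement":
        """Append `child` as the last sub-element, as with As last sub-element."""
        if child.first_page > child.last_page:
            raise ValueError("first page must not come after last page")
        self.children.append(child)
        return child

    def pages(self) -> list[int]:
        """Return all image numbers that belong to this element."""
        return list(range(self.first_page, self.last_page + 1))


monograph = StructureElement("Monograph", 1, 120)
chapter = monograph.add_last_child(StructureElement("Chapter", 11, 40))
sub_chapter = chapter.add_last_child(StructureElement("Chapter", 11, 20))
```

A viewer or PDF generator could then call sub_chapter.pages() to obtain exactly the images that belong to that sub-chapter.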

For some structure elements, Goobi will display input fields below the selected type so that metadata can be allocated directly to the structure element. Which fields are displayed (e.g. main title, keywords) will depend on the structure type that has been selected and on the ruleset configurations in Goobi.

Structure elements for image areas

If structural elements are to be assigned to image areas, this can be done in a similar way. However, instead of clicking on the image icons for the start image and the end image to be used, select the icon below for the image section.

Now use the mouse to select the area on the displayed image that you want to define for the structural element.

You can then proceed in the same way as described above and enter further metadata for the structural element before finally creating it.

Moving structure elements

There are several ways in which you can subsequently modify the structure of a document.

One of these involves moving structure elements to a different position. To do this, first select the structure element from the structure tree. Next, in the Selected docstruct box, select the function Move docstruct up or Move docstruct down to move the selected element by one position in the required direction.

However, if you want to move the structure element to a different position within the hierarchy, you will need to select the function Move docstruct to other location. This will open a dialogue box in which you can specify exactly where you want to place the structure element. You will be able to select only from those locations permitted by the Goobi ruleset.

Copying structure elements from other processes

Within Goobi, it is also possible to copy structural elements from the METS file of a process to another process. To do this, click on the Import structural element from another process function in the Selected structural element box.

First enter the name of the process from which you want to transfer the data and then click on Search for process.

You then have the option of selecting one or more structural elements of the selected process by clicking on them.

Once you have made your selection, click the Next button to go to the structure tree of the currently open process.

Now select the structure element into which the previously selected structure elements of the other process are to be inserted. Please note that only those structure elements can be selected as the target that may contain the copied elements according to the ruleset. Please also note that this function copies only the logical structural elements from another process. Images and page assignments are not copied.

Modifying and verifying data

This section explains a number of additional functions in the METS Editor that involve directly modifying the file system for the images. For this reason you should take great care when using the functions described below.

Subsequent changes to pagination

If you wish, you can modify previously created pagination sequences. Goobi allows you to move or delete individual pages or several pages at the same time. To do this, you first need to select the required pages in the Page selection box. Once you have highlighted at least one page in this way, you will be able to choose from the following functions:

Regardless of whether or not you select a page, you can also use the following function:

If one or more pages are moved up, each page will displace its predecessor in the pagination sequence and will take over all the settings and page allocations of the predecessor. If a page is moved down, its original place is taken by the next page in the pagination sequence.

If the changes affect the image currently being displayed, the image number in the page display is updated automatically. There is no change in the displayed image.

As well as moving files, you can also delete selected files. This action deletes the selected page completely from the metadata file. It affects both filename allocation and allocation to structure elements or allocated metadata.

The file is also deleted from all the process folders. You can search for files that you wish to delete from all the process folders using the selected filename. Goobi will search the available folders for files that match the filename (but not necessarily the file extension). The file will be deleted in all the subfolders of the ocr and images directories. If the currently displayed image has been deleted, Goobi will display the first image. If not, the image number will be updated.
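
The file search described above, matching on the filename while ignoring the extension, can be sketched as follows. The function name and the subfolder names used in the demonstration (master, media, alto) are hypothetical; the sketch runs entirely inside a temporary directory.

```python
import tempfile
from pathlib import Path


def delete_page_files(process_dir: Path, filename: str) -> list[Path]:
    """Delete files with the same base name in all subfolders of images and ocr."""
    stem = Path(filename).stem  # compare without the file extension
    deleted = []
    for top_level in ("images", "ocr"):
        base = process_dir / top_level
        if not base.is_dir():
            continue
        for subfolder in sorted(p for p in base.iterdir() if p.is_dir()):
            for candidate in sorted(subfolder.iterdir()):
                if candidate.is_file() and candidate.stem == stem:
                    candidate.unlink()
                    deleted.append(candidate)
    return deleted


# Demonstration with made-up folder names inside a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    process = Path(tmp)
    for relative in ("images/master/00000002.tif", "images/media/00000002.jpg",
                     "images/media/00000003.jpg", "ocr/alto/00000002.xml"):
        path = process / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    deleted_names = sorted(p.name for p in delete_page_files(process, "00000002.tif"))
    remaining = sorted(p.name for p in process.rglob("*") if p.is_file())
```

In the demonstration, the TIFF master, the JPEG derivative and the OCR file for page 00000002 are all removed, while 00000003.jpg is left in place.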

Goobi also allows you to generate new filenames so that systems that are not based on the sequence of filenames in the METS file can continue to display the images in the correct sequence. All the files are numbered using an eight-digit system based on the sequence in the METS file. The files will be renamed in all the process folders.
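
The eight-digit naming scheme can be illustrated with a small mapping function. The function name is hypothetical and starting the count at 1 is an assumption; the sketch only computes the new names and does not rename anything on disk.

```python
from pathlib import Path


def eight_digit_names(files_in_mets_order: list[str]) -> dict[str, str]:
    """Map each filename to an eight-digit name based on its METS position."""
    return {
        old_name: f"{position:08d}{Path(old_name).suffix}"
        for position, old_name in enumerate(files_in_mets_order, start=1)
    }


mapping = eight_digit_names(["title.tif", "00000001.tif", "p3.jpg"])
```

Note that a new name can coincide with an existing old name (here 00000001.tif), so an actual renaming routine would have to avoid overwriting files, for example by renaming via temporary names first.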

Uploading files

This function can be used to upload a file from the user’s computer to a selected process folder.

The user will need to specify a file and choose the position within the pagination sequence where the new file is to be inserted. Next the user can choose whether this new file is to be inserted as an uncounted page or whether it should be integrated into the existing pagination. Click the Upload file button to insert the file into the currently selected process folder. If this is the media folder, the pagination sequence will be updated. The file will be inserted in the selected position and either created as an uncounted page or integrated into the existing pagination sequence. If the file is integrated, the page number for each of the following pages will increase by one. In this context, however, it is important to note that the last file in the pagination sequence is not allocated a page number. It remains unpaginated and if necessary will again have to be changed separately in the pagination sequence.

If there is already a file in the selected folder with the same name as the file being uploaded, the file cannot be uploaded and the user will receive an error message.
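
The two insertion modes and the name check can be modelled as follows. This is a simplified sketch with hypothetical names, in which the pagination is a plain list of labels and "uncounted" is an assumed placeholder for pages without a number.

```python
def insert_page(files, labels, new_file, position, integrate):
    """Insert `new_file` at index `position` and return (file, label) pairs."""
    if new_file in files:
        raise FileExistsError(f"{new_file} already exists in the folder")
    new_files = files[:position] + [new_file] + files[position:]
    if integrate:
        # The existing labels keep their positions, so every following file
        # moves on by one label and the last file is left without a number.
        new_labels = labels + ["uncounted"]
    else:
        # The new file becomes an uncounted page; all other labels are kept.
        new_labels = labels[:position] + ["uncounted"] + labels[position:]
    return list(zip(new_files, new_labels))


files = ["a.tif", "b.tif", "c.tif"]
labels = ["1", "2", "3"]
integrated = insert_page(files, labels, "x.tif", 1, integrate=True)
uncounted = insert_page(files, labels, "x.tif", 1, integrate=False)
```

In the integrated case, c.tif ends up as the unpaginated last page and would have to be given its number separately in the pagination area.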

Downloading files

This function can be used to download a file from the current process folder to the user’s own computer. The user can select the file to be stored locally using the computer’s file browser.

Server-based imports

Provided that the server-based target folder is not empty, you can use the Server-based import box to select a folder from which previously exported images can now be imported into the current process.

In order to do this, you will need to specify the point in the pagination sequence where you want to insert the new files. You can then choose whether the new files are to be inserted as uncounted pages or whether they are to be integrated into the existing pagination.

For every subfolder in the target folder, Goobi will then search for a counterpart in the current process folder. The selected files are then imported to this folder. Once the files have been successfully imported, the subfolder in the target folder is deleted.
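
The folder handling of a server-based import can be sketched as follows. The function and folder names are hypothetical, matching subfolders by identical name is an assumption, and the update of the pagination sequence is left out. The demonstration runs inside a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def import_from_folder(import_dir: Path, process_dir: Path) -> list[Path]:
    """Move files from each import subfolder into its counterpart in the process."""
    imported = []
    for subfolder in sorted(p for p in import_dir.iterdir() if p.is_dir()):
        counterpart = process_dir / subfolder.name  # assumption: matched by name
        if not counterpart.is_dir():
            continue  # no counterpart found: leave this subfolder untouched
        for source in sorted(subfolder.iterdir()):
            if source.is_file():
                target = counterpart / source.name
                shutil.move(str(source), str(target))
                imported.append(target)
        shutil.rmtree(subfolder)  # remove the subfolder after a successful import
    return imported


with tempfile.TemporaryDirectory() as tmp:
    import_dir = Path(tmp) / "import"
    process_dir = Path(tmp) / "process"
    (import_dir / "media").mkdir(parents=True)
    (import_dir / "unknown").mkdir()
    (process_dir / "media").mkdir(parents=True)
    (import_dir / "media" / "00000009.jpg").touch()
    imported_names = [p.name for p in import_from_folder(import_dir, process_dir)]
    media_exists = (process_dir / "media" / "00000009.jpg").is_file()
    source_removed = not (import_dir / "media").exists()
    unmatched_kept = (import_dir / "unknown").is_dir()
```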

Edit OCR results

The metadata editor has an editor for OCR results with which the ALTO XML files in the process can be edited. This editor can be accessed via a button in the upper menu bar:

The ALTO editor overlays the usual interface of the metadata editor. The interface consists of two main elements. The digital copy is displayed on the left and the automatically recognised OCR text on the right. If you move the mouse over the image or over the text on the right, the recognised line and the recognised word are highlighted in the image:

The editor allows you to correct the words recognised by the OCR. To do this, simply click in the text on the right and change the words as you are used to doing with text editors. When you edit a word, it is also marked on the left of the digital copy. If a word is difficult to read, you can zoom in the image display with the mouse wheel. In addition, the page to be edited can be selected via the navigation below the image.

To save the results, just click the green button at the bottom right. To undo all changes, the editor can simply be closed.
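
What a word correction means at file level can be sketched as follows. In ALTO XML, each recognised word is a String element whose text is held in the CONTENT attribute alongside its coordinates. The sample fragment, the word IDs and the function names are hypothetical, and the sketch is not the editor's own code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened ALTO fragment with one misrecognised word.
ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String ID="w1" CONTENT="Goobl" HPOS="100" VPOS="50" WIDTH="80" HEIGHT="20"/>
    <SP/>
    <String ID="w2" CONTENT="workflow" HPOS="190" VPOS="50" WIDTH="120" HEIGHT="20"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(element) -> str:
    """Return the tag name without its namespace part."""
    return element.tag.rsplit("}", 1)[-1]


def correct_word(alto_xml: str, word_id: str, new_text: str) -> str:
    """Return the ALTO document with the CONTENT of one word replaced."""
    root = ET.fromstring(alto_xml)
    for element in root.iter():
        if local_name(element) == "String" and element.get("ID") == word_id:
            element.set("CONTENT", new_text)  # coordinates stay unchanged
            return ET.tostring(root, encoding="unicode")
    raise KeyError(word_id)


def line_text(alto_xml: str) -> str:
    """Join the recognised words of the document into one line of text."""
    root = ET.fromstring(alto_xml)
    return " ".join(e.get("CONTENT") for e in root.iter() if local_name(e) == "String")


corrected = correct_word(ALTO, "w1", "Goobi")
```

Only the text changes; the position of the word on the image is kept, so the highlighting on the digital copy still matches.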

Overview of the keyboard combinations

There are various keyboard combinations available in the metadata editor. These are summarised here once again:

Keyboard combination           Description of function
Ctrl + Shift + Cursor left     Move to previous image
Ctrl + Shift + Cursor right    Move to next image
Ctrl + Shift + Cursor up       Move 20 images forward
Ctrl + Shift + Cursor down     Move 20 images back
Ctrl + Shift + Pos 1 (Home)    Move to first image
Ctrl + Shift + End             Move to last image
Ctrl + Shift + Space           Activate or deactivate the pagination checkbox for the currently displayed image
Ctrl + Shift + Enter           Save the METS file
Ctrl + Shift + v               Carry out the validation

Management

Goobi Management

The following sections give a detailed overview of the management areas, for example the creation of workflows and the management of projects, users, user groups, statistics and rulesets.

LDAP groups

One of Goobi’s key features is the integration of both Goobi and the additional services linked to it into a central authentication system. In most cases, the authentication process is conducted via a central LDAP server that allows users to work in different systems with a single user account and just one login and one password.

This means it is possible to log in to Goobi with your own user account and to access the work drive on which Goobi provides the data to be processed by users.

Access to the work drive is provided in the background via a Samba server that authenticates user requests via the central LDAP servers. To configure the LDAP groups, you need to click on the menu item Administration - LDAP groups. Goobi will display an overview of all the LDAP groups that have been set up. In most cases, only one LDAP group will be configured.

If you click on the icon for editing one of the configured LDAP groups, you will be taken to the editing screen. Here you can configure the individual parameters for LDAP authentication.

Icon

Description

Create new LDAP group

Edit existing LDAP group

Tip: You must take great care and consult with the system administrator before making any changes to the data provided. If Goobi is configured incorrectly at this point, you could prevent all other users from using the program. Please feel free to contact intranda’s support team if you wish to reconfigure the LDAP groups but are not sure which parameters to use.

After changing the LDAP group parameters, click on the Save button to return to the list of LDAP groups and apply your changes. Clicking on the Delete button will delete the selected LDAP group permanently from Goobi. You should note that users will then no longer be able to access the authentication function. It is not possible to continue working with Goobi if an LDAP group has been incorrectly configured or deleted.

Users

In order to specify and manage the users that you want to be able to work with your Goobi installation, first select the menu item Administration - Users from the menu. This will display a list of all the users currently entered in Goobi. In the Users column, you will see the first name and surname of each registered user. The Location column tells you the establishment or town where the user is based. The User groups column shows you the user groups to which that user belongs. In the Projects column, you can see which projects each user has been assigned to.

Finally, in the Actions column, you can edit the details for each user. As an administrator, you also have the option to select a user and switch to that user’s role.

Icon

Description

To do so, click on the icon to the right in the Actions column. This will switch your own authorisation level and role as an administrator to those of the selected user, allowing you for test purposes to check how Goobi behaves for a specific user. This can be particularly useful if individual users have reported problems that cannot otherwise be traced. By switching to the authorisation level and role of a specific user, you can see exactly how the interface appears to that user without that person having to give you his/her user name and password.

Just above the list of users, you will find an option that allows you to display inactive users as well as active users. To do so, click on the checkbox Only show active users. This will remove the tick and deactivate the default setting which restricts the display to active users.

You can also use the filter above the table to search for a specific user from a large list. Simply enter part of that user’s first name or surname and press the enter key to conduct the search.

To add a new user, click on the Create new user link below the table. To edit the details for an existing user, click on the first icon in the Actions column for that user.

The Edit user dialogue box allows you to change the details already stored in Goobi for that person. As well as the Surname, First name and Location, you can also assign a Goobi Login and Password.

If Goobi has been instructed at the main configuration stage to authenticate each user against a configured LDAP, you will at this point need to specify for each person which server is to be used for authentication.

In the Metadata language input field, you can use a language code in free text form. Please ensure that the language code you use matches exactly the code defined in the ruleset with which users will be working in the METS Editor. In the example given below, the rulesets must contain a localised form of the metadata and structure data descriptors (language) for English with the language code en. If the field is left empty, the language set in the browser is used in the metadata editor.

Localisation of the metadata item 'logicalPageNumber' for various languages within a ruleset
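
Whether a ruleset actually provides labels for a given language code can be checked with a short script. The following is a minimal sketch, not part of Goobi: it assumes that each <MetadataType> carries a <Name> and one <language> element per language, and that the language code is held in an attribute called name. The sample ruleset fragment and the function name are illustrative only.

```python
import xml.etree.ElementTree as ET

# Illustrative ruleset fragment; the "name" attribute on <language>
# is an assumption about the ruleset format.
SAMPLE_RULESET = """
<Preferences>
  <MetadataType>
    <Name>logicalPageNumber</Name>
    <language name="de">Logische Seitennummer</language>
    <language name="en">Logical page number</language>
  </MetadataType>
</Preferences>
"""


def missing_translations(ruleset_xml, language_code):
    """Return the names of metadata types that have no label for language_code."""
    root = ET.fromstring(ruleset_xml)
    missing = []
    for metadata_type in root.iter("MetadataType"):
        codes = {lang.get("name") for lang in metadata_type.findall("language")}
        if language_code not in codes:
            missing.append(metadata_type.findtext("Name"))
    return missing
```

An empty result for the code entered in the Metadata language field suggests that every metadata type has a label in that language.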

You can specify in the Edit user dialogue box whether the user’s Goobi account should be active or inactive. This means you can configure Goobi in such a way that existing user accounts remain in place even though the corresponding users are no longer authorised to access those accounts. This is particularly important to avoid any potential loss of existing information from source material that has already been processed or any difficulties conducting statistical searches that might otherwise be caused by the removal of individual users from the database. It also ensures that information about source material whose workflow was completed some time ago can be analysed precisely and that project managers or administrators can identify which person performed which tasks and when. This can be particularly useful in the event of an error.

By activating the batch download checkbox, you can allow individual users to display a more extensive list of possible actions underneath the My tasks list. These enable the user to accept several tasks at the same time and to work on them all together, e.g. in an automated batch process to optimise image quality. Once tasks such as these are completed, the source work images corresponding to the process will be stored in the folder provided by Goobi for batch upload on that user’s work drive. This folder is often called 'Ready', 'Done' or similar. This can be configured individually for each Goobi installation. Placing more than one folder of related tasks in the batch upload folder allows users to close all finished tasks for which there is a related folder in the batch upload folder by means of a single click in the Possible actions area. This method enables users to accept and close a large number of tasks simultaneously.

Thanks to its comprehensive system of user authorisations, Goobi ensures that each person can be assigned not only to different projects but also to different roles. Furthermore, every user can perform more than one role at the same time. In this context, roles can also be understood as levels of qualification or as skills. These roles can be assigned in the User groups area. To assign additional roles to a specific user, simply click on the Add user groups button for that user.

In the User group window, select all the user groups to which you want to assign the user and then click on the Save button to confirm your selection. You will find that the selected user has now been assigned to those user groups. You can follow the same process to assign users to projects.

To remove a person from a particular user group or project, click on the symbol for deletion next to that user group or project. Next, to apply permanently any changes you have made, click on the Save button.

If you have added any new users whose login details need to be authenticated by the configured LDAP server, you will need to open their details again in the Edit user dialogue box after first entering them. The additional icon next to the LDAP group configuration will not be visible to the user until that user has been stored and the corresponding details re-opened in the dialogue box. Click on this symbol to re-enter the user’s details in the configured LDAP. Once this step is completed, the user will be able to log in.

In order to set a user as a super administrator, a super administrator must already exist so that this super administrator can set further super administrators. If no super administrator has been set so far, the following command must be issued within the database:

Processes

All objects that are processed in Goobi as part of the whole range of digitisation workflows, and that as such have their own individual progress status, can be viewed in the Processes area. First, click on the menu item Workflow – Processes in the menu. Goobi will now display a list of all the processes you are authorised to view. If you are a Goobi administrator (as in the screenshot below), you are authorised to view all the processes being managed in Goobi.

If you are not an administrator but are authorised to Goobi manager level, you will have been assigned to specific projects. In Process view, you will only be able to display those processes you are authorised to access through your membership of those projects.

The number of hits listed in the table of processes is determined by each individual’s user configuration. In the diagram shown above, for example, you can see the number of rows in the table is limited to ten. In order to display more processes, you would need to move to the next page using the page function below the table.

You can use the simple checkbox filters above the table to quickly filter the results displayed by Goobi or to show processes that are hidden by default. By way of example, an administrator can select the checkbox Show deactivated projects above the process table to display all the processes that were previously hidden because the corresponding project has been deactivated. Using the checkbox Show finished processes, you can display all the processes that belong to active projects but have already completed every step in their workflow. By default, if neither of these checkboxes is selected, Goobi will only show processes that are currently being processed as part of the workflow, i.e. completed processes and deactivated projects are excluded by default.

If you want to filter the list of processes displayed by Goobi still further (e.g. because the list of processes is too large), you can enter your own filter in the Filter processes input box. The options available to you here are very comprehensive and are explained in detail in section Searching processes. As an alternative to this filter function, you can perform a search by selecting the menu item Workflow - Find process in the menu. This will open a detailed filter dialogue box where you can fine-tune your search using a combination of properties, processes, task status, etc. This method of filtering using the Find process dialogue box translates your search request into a search filter string as described in section Searching processes.

After using a particular filter, if you wish to save it for future use, you can store it in the list of pre-defined filters. To do so, once you have entered the filter string, simply click on the save symbol next to the drop-down list of pre-defined filters. If you wish to reuse any of your pre-defined filters on a future occasion, simply choose it from the drop-down list to enter the filter string automatically in the Filter input box. To update the list of results using that filter, press the enter key or click on the reload symbol next to the Filter input box. The list of hits is automatically updated in the Processes window.

Each of the different columns in the table can also be sorted, allowing you, for example, to list the processes in ascending or descending order depending on the process name or status or the project to which it belongs.

You can adjust the way the processes are displayed in tabular form. You can include other columns in the table, e.g. the identifier and the date on which the process was created. You can also add selection boxes that allow the user to select individual processes for batch actions. If you choose this option, you will be able to activate a checkbox in each row of the process table. This means that you can then apply any of the actions to those processes whose checkboxes are activated rather than to the whole set of filtered processes or to the processes listed on the current page.

You can also view the workflow details for each process by clicking on the Process title. This allows you to check the current status of individual steps of the workflow. Goobi will display a small summary view of the selected process indicating the current status of each workflow step for that process. If you hold the cursor over the small square coloured symbols to the right of each workflow step, you will see a brief overview indicating which users have previously worked on that step and when.

Activities for hit lists

As well as the options described above allowing users to edit or amend each volume independently of the current status of the workflow and to process its structure data or metadata or re-export the data to the presentation system, with Goobi you can also apply different actions to a whole group of processes. These actions are applied to all of the processes displayed in the table. If you want to restrict the change or action to a particular selection of processes, you will need to filter the list accordingly. To do so, simply use the Filter processes box, the list of predefined filters or Goobi’s search function to list only those processes you wish to edit.

Once you have adjusted the list of processes to meet your requirements, you can apply the available actions (shown underneath the process table) to that list. For each action, you can also choose whether to apply the action to the entire set of matches (i.e. all the filtered processes) or just to the processes listed on the page of the table that is currently being displayed. You will be prompted to make this selection when you click on one of the available actions.

The table below contains a description of each possible action.

Description of actions that can be applied to a group of processes

Icon

Description

Harvester

The harvester can be used to automatically import data from external repositories.

Overview

To be able to access the harvester, the user must have the Edit harvester repositories right. The Harvester menu entry is then available under the Administration menu item. This opens the screen for listing all configured repositories.

The function Add repository opens the editing screen to create a new repository.

Configuration

The first step is to enter a name and select the protocol type. The following are available: OAI-PMH, Internet Archive Web Search, Internet Archive CLI and the BACH API.

For BACH, the URL to the BACH server and the authentication token must be specified.

If the Internet Archive Web Search is selected, the URL to the advanced search interface must be specified. To import only certain works, a search filter must also be specified as part of the URL. This way, only publications that are marked as Open Access and have been published can be imported.

In order to import access-protected publications, the Internet Archive CLI must be used. The CLI must be installed for this, usually under the path /usr/local/bin/ia. In addition, the environment variables IA_USERNAME and IA_PASSWORD must be set. A search filter can also be specified here to narrow down the hit list.
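
Before configuring a repository of this type, it can help to verify these prerequisites on the server. The following is a minimal sketch, not part of Goobi: the function name is hypothetical, and it only checks what is stated above (an executable at the usual path and the two environment variables).

```python
import os


def ia_cli_problems(cli_path="/usr/local/bin/ia", env=None):
    """Return a list of problems that would prevent use of the Internet Archive CLI."""
    env = os.environ if env is None else env
    problems = []
    # The CLI must exist at the given path and be executable.
    if not (os.path.isfile(cli_path) and os.access(cli_path, os.X_OK)):
        problems.append(f"CLI not found or not executable: {cli_path}")
    # Both credentials must be available as environment variables.
    for name in ("IA_USERNAME", "IA_PASSWORD"):
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    return problems
```

Note that the variables must be set in the environment of the process that runs Goobi, not just in your own shell session.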

For OAI-PMH, the URL to the OAI server must be specified. If the URL contains the parameters set and format, this information is automatically determined together with the base URL. Otherwise, they must be specified manually.
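
The way a full request URL breaks down into base URL, set and format can be sketched as follows. This is an illustration rather than Goobi's own code: it assumes that the set is passed in the query parameter set and the format in the standard OAI-PMH parameter metadataPrefix. The example URL is made up.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_oai_url(url):
    """Split an OAI-PMH request URL into base URL, set and metadata format."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # The base URL is the request URL without query string and fragment.
    base_url = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return {
        "base_url": base_url,
        "set": query.get("set", [None])[0],
        # Assumption: the "format" corresponds to OAI-PMH's metadataPrefix.
        "format": query.get("metadataPrefix", [None])[0],
    }
```

If either value comes back as None, it was not part of the URL and would have to be entered manually.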

With OAI, the From and Until parameters can also be set to limit the query to a specific time period. If the fields are empty, the entire period since the last request is automatically queried.

Test mode can also be activated. In this case, only the first records of the hit list are imported without the resumptionToken being analysed.

The other settings then apply to all types.

The Poll frequency defines the intervals at which the repository should be queried. The specification is in hours.

Delay defines a time period up to which new data is to be queried. If a number greater than 0 is entered here, a search will not look for all data up to the current date, but for data published up to the configured number of days before the current date.
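
The effect of the Delay setting on the end of the queried period can be illustrated with a small calculation. This is a sketch under the assumption that the delay is counted in whole days, as described above; the function name is hypothetical.

```python
from datetime import date, timedelta


def query_until(today, delay_days):
    """Return the latest publication date that a harvesting run should include."""
    if delay_days > 0:
        # Only data published up to delay_days before today is requested.
        return today - timedelta(days=delay_days)
    return today
```

With a delay of 3 and a run on 10 March 2024, for example, only data published up to 7 March 2024 would be queried.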

The ‘Download folder’ field is used to specify the folder in which the data is to be downloaded and saved. The folder is created automatically during the first harvesting if it does not yet exist.

Optionally, a script can then be called that is executed on each downloaded file. This can be used, for example, to perform an XSL transformation on each XML file or to write additional information in all JSON files.

If the data is not only to be downloaded but also imported as Goobi processes, the checkbox to Create processes must be activated.

The Project, Process template and Import format to be used can then be specified.

Manual harvesting

To start harvesting manually, you can use the now run once button in the Actions column of the overview. If the project is active, harvesting is then started once.

Automatic Harvesting

Automatic harvesting takes place regularly. The time at which it should run must be defined for this. This is done in the goobi_config.properties file using the line harvesterJob=0 0 */1 * * * ?. This causes the check to take place every hour on the hour. The schedule is written in cron syntax, so any interval can be configured.

Each time the check runs, Goobi determines for every configured active repository whether more time has passed since the last run than the value configured in the Poll frequency field. If so, harvesting is started.
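
The decision made for each repository during this check can be sketched as follows. This is a minimal illustration, not Goobi's implementation. It assumes that a repository which has never been harvested is always due; the text above does not state this.

```python
from datetime import datetime, timedelta


def is_harvest_due(last_run, poll_frequency_hours, now):
    """Return True if the last run is further back than the poll frequency."""
    if last_run is None:
        # Assumption: a repository without a previous run is harvested immediately.
        return True
    return now - last_run > timedelta(hours=poll_frequency_hours)
```

With an hourly check and a poll frequency of 24, a repository is therefore harvested about once a day, at the first hourly check after the 24 hours have elapsed.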

Harvesting

When a new harvest is triggered, the records that have been published or updated in the repository since the last run are determined first. For each record, Goobi checks whether it has already been processed once or is new. New files are then downloaded to the configured folder. If a script has been configured, it is called for each downloaded file.

If configured, the files are now imported. In the case of marc-xml or pica-xml, the document type is determined first. Higher-level data such as journal titles or multi-volume works are skipped. In the case of subordinate documents (journal issues, volumes of a multivolume work), the superordinate work is searched for and also downloaded. The metadata is then parsed on the basis of the ruleset from the configured process template.

The process title is created on the basis of the identifier.

Config Editor

This editor allows direct editing of the Goobi workflow configuration files from within the user interface in the web browser.

Overview

The editor can be found as a separate entry in the Administration menu, from where it can be opened.

After opening, all Goobi configuration files are listed on the left side. These can be opened for editing by clicking the respective icon.

When a file is opened, a text editor appears on the right side where the file can be edited.

After making changes to a file and clicking the Save button, a confirmation message is displayed indicating that the file was saved successfully.

If a file has been modified and the user attempts to switch to another file without saving first, a prompt will appear asking how to proceed with the unsaved changes.

Configuration

The configFileDirectories element contains a list of directories that are used to collect all displayed files in the browser interface.

Each directory should be an absolute path that contains xml or properties files. Other file types are not currently supported. The directory name may end with a slash (/), otherwise it will be added automatically.

Backups are automatically created in a subfolder called "backup/". You can override this with the optional attribute backupFolder="myOwnBackupPath/".

IMPORTANT: The directory must be an absolute path while the backupFolder parameter must be a relative path. The backup directory name may end with a slash (/), otherwise it will be added automatically. To save backup files in the selected configuration directory, overwrite the backup folder with backupFolder="".

By default, 8 backup files are kept and older files are deleted. You can override this with the optional attribute backupFiles="".
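
The retention rule can be pictured as follows. This is a sketch only: it assumes that the backups of a file are available as a list sorted from oldest to newest, and the file names used in the usage note are hypothetical.

```python
def backups_to_delete(backup_names_oldest_first, keep=8):
    """Return the backups that exceed the retention limit, oldest first."""
    excess = len(backup_names_oldest_first) - keep
    if excess <= 0:
        return []
    return list(backup_names_oldest_first[:excess])
```

With ten existing backups and the default limit of 8, the two oldest backups would be removed.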

You can filter the displayed configuration files in a directory with the fileRegex="" parameter. If the parameter is not used or is empty, it will be ignored.
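
To try out a regular expression before entering it, a quick test such as the following can be used. It is a sketch that assumes the expression has to match the complete file name; whether Goobi matches the whole name or only part of it is not stated here. Note also that Python and Java regular expressions differ in some details.

```python
import re


def visible_files(file_names, file_regex=""):
    """Return the file names that would be listed for the given fileRegex value."""
    if not file_regex:
        # A missing or empty parameter is ignored, so all files are shown.
        return list(file_names)
    pattern = re.compile(file_regex)
    # Assumption: the expression must match the entire file name.
    return [name for name in file_names if pattern.fullmatch(name)]
```

For example, the expression plugin_.*\.xml would restrict the list to plug-in configuration files.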

Ruleset editor

This editor is used to edit the ruleset files of Goobi workflow directly from the user interface within the web browser.

Overview

The editor can be found in its own entry in the Administration menu, from where it can be opened.

After opening, all Goobi ruleset files are listed on the left-hand side. These can be opened by clicking on the respective icon in order to edit them.

If you open a file, a text editor appears on the right-hand side in which the file can be edited. If you edit and save a file, a backup is automatically created in the defined backup directory.

According to the value set in the configuration file, a certain number of older backups are retained here before they are replaced by newer ones.

The file can also be validated by clicking the Validate button. All listed checks will be performed during this process.

After modifying a file and clicking the Save button, a confirmation message appears indicating the file was saved successfully.

If a file has been changed and an attempt is made to change to another file without saving it, the operator is asked how to proceed with the changes.

Validation

When the Validate button is pressed, the following checks are performed.

Well-Formedness

Are all opening tags <> properly closed? Are any invalid characters used? Are attributes placed in the wrong position?
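
A well-formedness check of this kind can be reproduced outside Goobi with any XML parser. The following minimal sketch uses Python's standard library; it illustrates the principle and is not the validator used by the ruleset editor.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return the parser's error message, or None if the text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return str(error)
    return None
```

The error message returned by the parser includes the line and column at which parsing failed, which helps to locate an unclosed tag.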

Invalid Names

Do all <MetadataType>, <Group>, <DocStrct>, and export formats have a <Name> element (or, for some export formats, an <InternalName> element) and is its value not empty? (ValidateNames)

Empty Translations

Is every value listed within a <language> element non-empty and not composed solely of whitespace? (ValidateTranslations)

Invalid Cardinality

Does each <metadata> and <group> subelement of <DocStrctType> have a valid value for the num attribute? Valid values are: 1o, *, 1m, and +. (ValidateCardinality)
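
The cardinality check can be sketched as follows. This is an illustration, not the ValidateCardinality implementation: the sample ruleset fragment and the type names in it are made up, and the sketch treats a missing num attribute as invalid, which is an assumption.

```python
import xml.etree.ElementTree as ET

VALID_NUM_VALUES = {"1o", "*", "1m", "+"}

# Illustrative fragment with one invalid entry (num="2").
SAMPLE_RULESET = """
<Preferences>
  <DocStrctType>
    <Name>Monograph</Name>
    <metadata num="1m">TitleDocMain</metadata>
    <metadata num="2">Author</metadata>
    <group num="*">Provenance</group>
  </DocStrctType>
</Preferences>
"""


def invalid_cardinalities(ruleset_xml):
    """Return (structure type, element text, num value) for each invalid num attribute."""
    root = ET.fromstring(ruleset_xml)
    problems = []
    for doc_type in root.iter("DocStrctType"):
        for child in doc_type:
            if child.tag not in ("metadata", "group"):
                continue
            # Assumption: a missing num attribute (None) also counts as invalid.
            if child.get("num") not in VALID_NUM_VALUES:
                problems.append((doc_type.findtext("Name"), child.text, child.get("num")))
    return problems
```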

Duplicate Definitions

Are all <MetadataType>, <Group>, and <DocStrctType> elements defined only once? (ValidateDataDefinedMultipleTimes)

Duplicates

Are the same <metadata>, <allowedChildType>, or <group> elements used multiple times within a <Group> or <DocStrctType>? (ValidateDuplicatesInGroups and ValidateDuplicatesInDocStrct)

Incorrect Use of Publication Types

Have <DocStrctType> elements with the attribute topStruct="true" been incorrectly used as <allowedChildType> within other <DocStrctType> elements? (ValidateTopstructs)

Undefined Metadata Types

Is there a <metadata> element used within <Group>, <DocStrct>, or in the export formats that has not been defined as a <MetadataType>? (ValidateUsedButUndefinedData)

Unused Metadata Types

Is a <MetadataType> element defined but not used in <Group>, <DocStrct>, or within export formats? (ValidateUnusedButDefinedData)

Unused Structure Types

Is a <DocStrctType> element defined but not used as an <allowedChildType> in any other <DocStrctType>? (ValidateUnusedButDefinedData)

Unmapped Structure Types

Are <DocStrctType>, <MetadataType>, and <Group> elements defined but not used in any export configuration? (ValidateDataNotMappedForExport)

Administration

Goobi Administration

This section covers several topics, for example configuration and fine-grained adaptation for data protection, as well as information about metadata and mappings.

File system

As a workflow management application for the library environment, Goobi has to be able to deal with a wide range of specific configurations and project-specific requirements. To this end, it has been designed in line with established conventions. These cover individual directory structures and the way Goobi uses these structures in different areas of the application. This section outlines the directory structures that have proven most effective and explains how external storage is integrated into the system.

Global directory structure

As a web-based application, Goobi has its own structure and is located on a defined path in the file system independently of the servlet container being used. This section explains how to organise the directory structures within which Goobi saves its data and the different configuration files.

The base path for all digitisation software in the Goobi environment is:

/opt/digiverso/

The following directories are usually located on this base path:

/opt/digiverso/goobi/
/opt/digiverso/logs/
/opt/digiverso/itm/
/opt/digiverso/viewer/

The logs directory is the main directory for log files. Goobi log files are also stored here (assuming the system is properly configured). The other directories listed above relate to frequently used applications (e.g. viewer for the Goobi viewer, itm for the intranda Task Manager and goobi for Goobi).

The base path for Goobi is:

/opt/digiverso/goobi/

In most cases, this base path will accommodate the following folder structure (see below for details of each sub-directory):

/opt/digiverso/goobi/config/
/opt/digiverso/goobi/import/
/opt/digiverso/goobi/metadata/
/opt/digiverso/goobi/plugins/
/opt/digiverso/goobi/rulesets/
/opt/digiverso/goobi/scripts/
/opt/digiverso/goobi/xslt/

‘config’ sub-directory

The config directory contains all the Goobi configuration files that do not have to be located within the application itself. These are listed below:

config_contentServer.xml
goobi_activemq.xml
goobi_config.properties
goobi_digitalCollections.xml
goobi_exportXml.xml
goobi_mail.xml
goobi_metadataDisplayRules.xml
goobi_normdata.xml
goobi_opac.xml
goobi_opacUmlaut.txt
goobi_processProperties.xml
goobi_projects.xml
goobi_rest.xml
goobi_webapi.xml
messages_de.properties
messages_en.properties

Depending on the specific installation, the config directory may also contain other configuration files in addition to those related to the application’s core components. Accordingly, we recommend that you also use this central configuration directory to store configurations for individual plug-ins that provide additional functionality.

plugin_abc.xml
plugin_xyz.xml

For subsequent ease of maintenance, the paths and file names relating to the configuration of any new Goobi plug-ins that may be developed should also adhere to this convention.

‘import’ sub-directory

Depending on the way Goobi has been installed, the import directory will contain a range of data, mostly on a temporary basis. By way of example, import plug-ins use this directory to enter metadata and associated digital content in order to create processes. The respective import plug-ins are also responsible for deleting files that are no longer needed.

‘plugins’ sub-directory

Depending on the way Goobi has been installed, the plugins directory may contain a number of plug-ins that perform imports or call Web API commands. Depending on the task, the compiled plug-ins are located in one of the directories shown below:

/opt/digiverso/goobi/plugins/administration/
/opt/digiverso/goobi/plugins/command/
/opt/digiverso/goobi/plugins/dashboard/
/opt/digiverso/goobi/plugins/export/
/opt/digiverso/goobi/plugins/GUI/
/opt/digiverso/goobi/plugins/import/
/opt/digiverso/goobi/plugins/opac/
/opt/digiverso/goobi/plugins/statistics/
/opt/digiverso/goobi/plugins/step/
/opt/digiverso/goobi/plugins/validation/

‘rulesets’ sub-directory

Within Goobi, the UGH class library is used to process metadata, map PICA imports and generate METS files. In order to manage the huge variety of configuration options, UGH uses a mechanism known as rulesets. The rulesets directory is the central storage location for these rulesets. It allows you to make individual configurations available for different projects and types of publication.

‘scripts’ sub-directory

A range of scripts can be made available centrally in the scripts directory. These scripts can be used within the workflow to automate certain tasks.

‘xslt’ sub-directory

Goobi uses a mechanism called XSLT transformation to generate dockets as PDF files. This involves generating PDF documents from existing xml files. This is done on the basis of xslt files located centrally in the xslt directory.

Directory structure of the application

The Goobi installation path may vary depending on your installation. Typically, the base path for web applications on an Ubuntu Linux system within an Apache Tomcat servlet container is shown below:

/var/lib/tomcat7/webapps

Accordingly, the Goobi application is located on the following path within the file system:

/var/lib/tomcat7/webapps/goobi/

Integrating external storage

General

Most digitisation projects involve handling very large volumes of data. In most cases, this makes it necessary to link external storage capacity to the server. This can be done in a number of ways. We recommend that the external storage is linked to the following folder in the directory tree:

/opt/digiverso/

This means that all Goobi data can be found in a central location.

Two solutions for integrating external storage are explained in schematic form below. We do not recommend linking via CIFS as this can affect performance and functionality. Furthermore, CIFS does not allow you to produce symbolic links or read-only rights.

The following information is required if you wish to integrate external storage via an NFS share:

• exporting server
• exporting directory

You can then add the storage to the directory tree via NFS. It is a good idea to add an entry into the file /etc/fstab that automatically sets up the link when the system starts up. This entry could be as follows:

example.net:/path/to/share /opt/digiverso nfs vers=3,rsize=8192,wsize=8192,soft,intr,rw,auto 0 0
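
An /etc/fstab entry consists of six whitespace-separated fields: the exported share, the mount point, the file system type, the mount options, the dump flag and the file system check order. The following sketch simply splits the example entry above into these fields, which can be handy when reviewing the options; the function name is hypothetical.

```python
def parse_fstab_entry(line):
    """Split one /etc/fstab line into its six standard fields."""
    device, mount_point, fs_type, options, dump, fsck_pass = line.split()
    return {
        "device": device,            # exporting server and directory
        "mount_point": mount_point,  # where the share appears locally
        "fs_type": fs_type,
        "options": options.split(","),
        "dump": int(dump),
        "pass": int(fsck_pass),
    }


ENTRY = "example.net:/path/to/share /opt/digiverso nfs vers=3,rsize=8192,wsize=8192,soft,intr,rw,auto 0 0"
```

Calling parse_fstab_entry(ENTRY) shows, for instance, that the share is mounted read-write (rw) and automatically at start-up (auto).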

Logical volume in the virtual machine

Another way of integrating external storage is to attach it to the virtual machine as an independent device. This can be one or more iSCSI or SAN LUNs. They are subsequently combined into a logical volume in the virtual machine using LVM. The result is an aggregated storage unit based on a number of devices.

Integration of S3 as storage

Goobi workflow allows operation with S3-compatible storage. It should be noted that a local file system is still required to store the metadata. This means that the files meta.xml, meta_anchor.xml and their backups, which exist for each process, will continue to be stored in the file system. All other data, such as images and OCR results, is stored on the S3 storage area.

To run Goobi with S3 as storage, the following two settings must be set within the configuration file goobi_config.properties:

# global config if s3 should be used
useS3=true

# the bucket that is used for the content that would normally live in /opt/digiverso/goobi/metadata/
S3bucket=workflow-data

Goobi workflow uses the AWS Java SDK internally. This means that the credentials for accessing the storage system are read either from $HOME/.aws or from environment variables. If another S3 provider is to be used instead of AWS, the connection can be configured relatively granularly. This requires a few more settings within the same configuration file:

S3AccessKeyID=keyID
S3SecretAccessKey=secretkey
S3Endpoint=http://s3.mygoobi.tld

Using S3 as a storage system should in principle work with all S3-compatible APIs. During the development of the S3 functionality, both Amazon S3 and MinIO were used for the implementation.
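
A quick plausibility check of these settings can be scripted. The following is a minimal sketch, not part of Goobi: it reads simple key=value lines and only verifies the two settings described above as mandatory (useS3 and S3bucket). The parser is deliberately simplified and does not cover the full properties file syntax.

```python
SAMPLE_PROPERTIES = """
# global config if s3 should be used
useS3=true
S3bucket=workflow-data
S3Endpoint=http://s3.mygoobi.tld
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def s3_config_problems(settings):
    """Return a list of problems with the S3 settings (empty if none were found)."""
    if settings.get("useS3", "false").lower() != "true":
        return []  # S3 is not enabled, so nothing needs to be checked
    problems = []
    if not settings.get("S3bucket"):
        problems.append("S3bucket is missing")
    return problems
```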

Services

Goobi depends on several services in order to run. These services are explained on the following pages.

MySQL database

General

Goobi saves information using a database. As a rule, the preferred option is MySQL. Ideally, the latest stable version of this database engine is installed from the operating system’s standard repositories.

Example: Ubuntu Linux 14.04 LTS

If you are using Ubuntu Linux 14.04 LTS, MySQL is installed from the standard repositories by means of the following command:

The service can be stopped and started using the following commands:

The configuration data for MySQL is located on the following path:

Apache Tomcat servlet container

General

Goobi is a web-based Java application. The Java code needs to be translated to ensure that Goobi can be called from your web browser. This job is done by a servlet container. Some examples are given below:

Servlet Container

Project-URL

Goobi is usually installed in an Apache Tomcat. This servlet container is installed from the operating system’s standard repositories. However, if the standard repositories do not contain version 7, Apache Tomcat is installed and maintained manually.

Example: Ubuntu Linux 14.04 LTS

If you are using Ubuntu Linux 14.04 LTS, you need to use the following command to install Apache Tomcat (version 7) from the standard repositories:

The service can be stopped and started using the following commands:

The configuration data is located in the following path:

In order to install Goobi, you have to place the file goobi.war into the Tomcat’s webapps folder. The file is unpacked automatically. The webapps folder is located on the following path:

Exporting to digital libraries

Goobi allows users to export processes (together with all their digitised objects) to digital libraries. As a rule, the XML file (METS) is exported last. It is generally preceded by any images, OCR results and other digital objects (e.g. audio files, video files, born digital data).

Downstream systems (e.g. intranda’s Solr Indexer) monitor the hotfolder and start processing when a specifically defined event occurs. Exporting the XML files last of all ensures that all the other files needed have been made available to the Solr Indexer. In this way, processing can begin as soon as the XML file is exported.
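
The ordering principle can be sketched in a few lines. This is an illustration of the idea, not Goobi's export code; the function name and the file names in the usage note are hypothetical.

```python
def export_order(file_names, mets_file):
    """Return file_names with mets_file moved to the end; other files keep their order."""
    if mets_file not in file_names:
        raise ValueError(f"{mets_file} is not part of the export")
    return [name for name in file_names if name != mets_file] + [mets_file]
```

For example, export_order(["PPN123.xml", "00000001.tif", "00000001.txt"], "PPN123.xml") places PPN123.xml after the image and the OCR result, so a system watching the hotfolder finds everything else in place when the METS file arrives.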

The different export parameters can be configured in Goobi. They are located under project properties in the Technical data and Mets parameters tabs.

Technical data

The following screenshot shows the Technical data tab from Goobi’s project settings:

The configuration settings are explained below:

List of project configuration parameters used to generate METS files

| Name | Typical value | Explanation |
|---|---|---|
| DMS export images folder | /opt/digiverso/viewer/hotfolder/ | This stipulates the folder to which the images are written when exporting. |
| DMS export success folder | /opt/digiverso/viewer/success/ | This stipulates the folder in which Goobi is instructed to search for a success message indicating that the exported XML file has been correctly processed. A check against this value is only made if the parameter exportWithoutTimeLimit has been set to false in the configuration file goobi_config.properties. At this point, Goobi can communicate with downstream systems in order to establish, for example, whether an item has been successfully exported. |
| DMS export error folder | /opt/digiverso/viewer/error_mets/ | This stipulates the folder in which Goobi is instructed to search for an error message indicating that an error has occurred in processing the exported XML file. A check against this value is only made if the parameter exportWithoutTimeLimit has been set to false in the configuration file goobi_config.properties. |
| Create process folder | | If this checkbox has been activated, Goobi will create a folder with the corresponding process name in each of the folders specified as the DMS export folder for XML file and DMS export folder for images. The exported files will then be placed in these folders. If the checkbox is not activated, the files will be placed directly into the stipulated folders. By default, this checkbox is not activated. |
| Timeout (ms) | 36000 | This setting defines the time interval in milliseconds before Goobi removes the exported files from the hotfolder. A check is only made if the parameter exportWithoutTimeLimit has been set to false in the configuration file goobi_config.properties. The default value is 300000 (i.e. 5 minutes). |

Mets parameters

The following screenshot shows the Mets parameters tab from Goobi’s project settings:

The configuration settings are explained below:

List of METS parameters for use when configuring the project

| Name | Typical value | Explanation |
|---|---|---|
| METS rights owner | Example Library | Defines the METS rights owner. |
| METS rights owner logo | | Defines a URL for a logo belonging to METS rights owner. |
| METS rights owner URL | | Defines a URL for the METS rights owner. |
| METS rights owner contact | [email protected] | Defines a contact email address for the METS rights owner. |
| METS Digiprov reference | METS Digiprov Referenz | Defines a link to the catalogue entry for the source material. |
| METS Digiprov presentation | | Defines a persistent link to the source material in the digital library. |
| METS Digiprov reference (anchor) | | Defines a link to the catalogue entry for the overarching source item where the material being exported is part of a multi-volume source. |
| METS Digiprov presentation (anchor) | | Defines a link to the overarching source item in the digital library where the material being exported is part of a multi-volume source. |
| METS pointer Path | | Defines a link to a METS resolver for the source material. This link can be used later to download the METS file. |
| METS pointer path (anchor) | | Defines a link to a METS resolver for the overarching source item where the material being exported is part of a multi-volume source. This link can be used later to download the METS file. |
| METS sponsor | Deutsche Forschungsgemeinschaft | Contains the name of the digitisation sponsor, e.g. Deutsche Forschungsgemeinschaft (DFG). |
| METS sponsor Logo | | Contains a URL to a logo of the digitisation sponsor. The logo is integrated into the design of the DFG Viewer, where it replaces the DFG logo. |
| METS sponsor URL | | Contains the sponsor’s website URL. The URL is linked in the DFG Viewer to the sponsor’s logo. |
| METS licence | CC-BY | Contains details of the licence under which the digitised material was published. |

The information about the METS rights owner, METS rights owner logo, METS rights owner URL and METS rights owner contact can be found in the exported METS file in the section amdSec in the namespace dv. It is intended to ensure compatibility with the DFG Viewer.

The three variables that can be used in the typical values above are explained below:

$(meta.CatalogIDDigital) is replaced during export by the CatalogIDDigital of the source material from the METS file.
$(meta.topstruct.CatalogIDDigital) is replaced during export by the CatalogIDDigital of the multi-volume source material from the METS file.
$REGEXP(s/PPN=PPN/PPN=/) applies the defined regular expression to the entire line. In this case, a search is performed for PPN=PPN, which is then replaced by PPN=.
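How these variables interact can be shown with a small Python sketch. It is an illustration under stated assumptions rather than Goobi's own code: the function `expand_export_value`, the metadata dictionary and the resolver URL on example.org are hypothetical.

```python
import re


def expand_export_value(template: str, metadata: dict[str, str]) -> str:
    """Expand $(meta.NAME) placeholders, then apply any $REGEXP(s/old/new/) rule."""
    # Collect the sed-style rules and remove them from the line.
    rules = re.findall(r"\$REGEXP\(s/(.*?)/(.*?)/\)", template)
    line = re.sub(r"\$REGEXP\(s/.*?/.*?/\)", "", template)

    # Replace $(meta.NAME) with the value from the metadata dictionary
    # (unknown names become an empty string in this sketch).
    def lookup(match: re.Match) -> str:
        return metadata.get(match.group(1), "")

    line = re.sub(r"\$\(meta\.([\w.]+)\)", lookup, line)

    # Apply each rule to the entire line.
    for pattern, replacement in rules:
        line = re.sub(pattern, replacement, line)
    return line


if __name__ == "__main__":
    value = "http://example.org/resolver?PPN=$(meta.CatalogIDDigital)$REGEXP(s/PPN=PPN/PPN=/)"
    print(expand_export_value(value, {"CatalogIDDigital": "PPN123456"}))
    # prints http://example.org/resolver?PPN=123456
```

The regular expression is useful when the identifier stored in the METS file already carries the PPN prefix and the URL would otherwise contain it twice.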

Export configuration in the Goobi configuration file

A further export setting can be configured in goobi_config.properties. The corresponding parameter is shown below:

exportWithoutTimeLimit=true|false

If this value is set to true, Goobi will produce a success message for the export once the exported files have been copied to the folders DMS export folder for XML file and DMS export folder for images (as defined in the project settings). Goobi will produce an error message if an error occurs during the copying process.

If the value is set to false (or if it is missing from the configuration file), Goobi will not produce a success message unless it can find a corresponding message in the folder DMS export success folder (as defined in the project settings). Goobi will produce an error message if it finds an error message in the folder DMS export error folder (as defined in the project settings).

If Goobi does not find a success or error message in these folders within the timeout period stipulated in the project settings, it will delete the exported items automatically.
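The decision logic described in this section can be summarised in a short Python sketch. It is a simplified model, not Goobi's implementation: the function names are hypothetical, and it assumes that the success or error message is a file with the same name as the exported XML file.

```python
import time
from pathlib import Path


def wait_for_export_result(success_dir: Path, error_dir: Path, file_name: str,
                           timeout_ms: int, poll_interval_s: float = 0.05) -> str:
    """Return "success", "error" or "timeout" for one exported file."""
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if (success_dir / file_name).exists():
            return "success"
        if (error_dir / file_name).exists():
            return "error"
        if time.monotonic() >= deadline:
            # The caller would now remove the exported files from the hotfolder.
            return "timeout"
        time.sleep(poll_interval_s)


def check_export(export_without_time_limit: bool, copy_succeeded: bool,
                 success_dir: Path, error_dir: Path, file_name: str,
                 timeout_ms: int) -> str:
    """Model the two modes controlled by exportWithoutTimeLimit."""
    if export_without_time_limit:
        # Only the result of the copy operation counts.
        return "success" if copy_succeeded else "error"
    # Otherwise wait for feedback from the downstream system.
    return wait_for_export_result(success_dir, error_dir, file_name, timeout_ms)
```

With exportWithoutTimeLimit=true the folders are never consulted, which is why the success folder, error folder and timeout settings only matter when the value is false.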

Working with the intranda Task Manager

Goobi communicates with the intranda Task Manager via the TaskClient. In Goobi, this command line program is configured as part of a workflow step. It can accept and process a number of parameters. The parameters required depend on the nature of the task to be performed by the intranda Task Manager. The parameters expected and validated for OCR, for example, are different from those required for SDB.

Details of the intranda Task Manager and how it works with Goobi can be found in the intranda Task Manager documentation.

Automatic workflow steps

Goobi allows you to mark individual tasks as automatic. These tasks are opened and performed automatically once the preceding workflow task has been completed. Whenever an error occurs in such an automatic workflow task, that task will remain paused and will not be processed any further.

To mark a workflow task as automatic, you need to activate the Automatic task checkbox. This is located in the Task details box as shown in the following screenshot:

Example combination for an automatic script task

In most cases, an automatic task is combined with a script task. Goobi then responds to the script output. Goobi will treat any messages issued on the default output console as simple status reports and will continue operation. However, Goobi will treat any messages issued on the error output console as errors and will interrupt workflow processing.
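The difference between the two output channels can be sketched in Python as follows. This is a simplified model with a hypothetical function name; it looks only at the error output, as described above, and ignores the exit code of the script.

```python
import subprocess
import sys


def run_script_task(command: list[str]) -> tuple[bool, str]:
    """Run a script and return (ok, message).

    Output on stdout is treated as a status report,
    any output on stderr is treated as an error.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.stderr.strip():
        return False, result.stderr.strip()
    return True, result.stdout.strip()


if __name__ == "__main__":
    # A script that only reports its status: the workflow would continue.
    print(run_script_task([sys.executable, "-c", "print('converted 12 images')"]))
    # A script that writes to the error channel: the workflow would be interrupted.
    print(run_script_task([sys.executable, "-c", "import sys; sys.stderr.write('file missing')"]))
```

Scripts written for automatic tasks should therefore send purely informational messages to the default output only, since anything on the error output stops the workflow.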

One example for the combination of an automatic workflow task and a script task is the conversion of images to TIFF/JPEG. Goobi automatically calls a script and converts the images in the specified folder to TIFF/JPEG format.

The following example shows how Goobi calls a script for the intranda image improver:

/opt/digiverso/goobi/scripts/iii.sh convert_images {tifpath}

The call involves providing two parameters. The first of these, convert_images, is defined in the script itself. Goobi replaces the other parameter {tifpath} dynamically by the path to the folder in which the image set is located.

Parameters can be enclosed in quotation marks (") so that several words are passed to the called process as a single argument. If a quotation mark itself is to be passed to the new process as part of an argument, it must be escaped by doubling it ("").
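The quoting rule can be illustrated with a small Python sketch. It is one possible interpretation of the rule, not Goobi's actual parser; the function name `split_parameters` and the sample paths are hypothetical.

```python
def split_parameters(command: str) -> list[str]:
    """Split a command line into arguments.

    Quoted sections are kept together; a doubled quotation mark inside
    a quoted section produces one literal quotation mark.
    """
    args: list[str] = []
    current: list[str] = []
    in_quotes = False
    started = False  # True once the current argument has begun (even if empty)
    i = 0
    while i < len(command):
        char = command[i]
        if in_quotes:
            if char == '"':
                if i + 1 < len(command) and command[i + 1] == '"':
                    current.append('"')  # doubled quote -> literal quote
                    i += 1
                else:
                    in_quotes = False  # closing quote
            else:
                current.append(char)
        elif char == '"':
            in_quotes = True
            started = True
        elif char.isspace():
            if started:
                args.append("".join(current))
                current, started = [], False
        else:
            current.append(char)
            started = True
        i += 1
    if started:
        args.append("".join(current))
    return args


if __name__ == "__main__":
    print(split_parameters('/opt/script.sh convert_images "/data/my images/tif"'))
    print(split_parameters('echo "say ""hello"" now"'))
```

In the first call the path containing a space arrives as one argument; in the second call the argument passed on is say "hello" now.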

Migration of technical data to METS files

Goobi automatically exports METS files to a pre-defined folder. This is usually the directory shown below:

/opt/digiverso/viewer/hotfolder/

A sub-folder is created in this directory for each process, and all the data being exported is copied within this new folder. As well as the actual METS file, this may include the anchor file, the images and OCR data.

Once the export has been completed, Goobi calls PostExport.jar. This program renames the exported METS file (so that the file name corresponds with the b-number) and merges it with the SDB-AMD file.

As a first step, this involves reading in the AMD file and for each element creating a mets:techMD element within the mets:amdSec section of the METS file. The second step involves checking all the FileGroups individually. The file name, checksum and Mime type are formed from the values in the AMD file. Finally, the structure elements of the physical structMap are linked to the individual techMD.

After this, the enriched METS file is written to the pre-defined folder, and from here the presentation can read in the data.

If the data includes multiple manifestation objects and one or more manifestations have already been exported, the anchor files are also merged. This involves entering into the logical structMap a new child element that contains the information for the new object. The sequence within the structMap is defined using the metadata item order number.

Configuration files

Goobi workflow is controlled by several configuration files. The most important of these are listed and explained in detail below.

goobi_digitalCollections.xml

The configuration file goobi_digitalCollections.xml is responsible for controlling the selection list of digital collections. It is usually located at the following path in the file system:

/opt/digiverso/goobi/config/goobi_digitalCollections.xml

In this configuration file you define which collections exist and in which projects they are available for selection.

goobi_digitalCollections.xml

<?xml version="1.0" encoding="UTF-8"?>
<DigitalCollections>
	<default>
		<DigitalCollection>General</DigitalCollection>
		<DigitalCollection>Biology</DigitalCollection>
		<DigitalCollection>Physics</DigitalCollection>
		<DigitalCollection>Mathematics</DigitalCollection>
	</default>
</DigitalCollections>

If collections are to be configured with sub collections, these are separated from each other by the # separator. This looks like the following example:

goobi_digitalCollections.xml

<?xml version="1.0" encoding="UTF-8"?>
<DigitalCollections>
	<default>
		<DigitalCollection>General</DigitalCollection>
		<DigitalCollection>Biology</DigitalCollection>
		<DigitalCollection>Biology#Animals</DigitalCollection>
		<DigitalCollection>Biology#Plants</DigitalCollection>
	</default>
</DigitalCollections>

If the collections are to be configured differently depending on the project, this can be done as follows:

goobi_digitalCollections.xml

<?xml version="1.0" encoding="UTF-8"?>
<DigitalCollections>
  <default>
    <DigitalCollection>General</DigitalCollection>
  </default>
  <project>
    <name>Archive Project</name>
    <DigitalCollection>Monographs</DigitalCollection>
    <DigitalCollection>Files</DigitalCollection>
    <DigitalCollection>Maps</DigitalCollection>
  </project>
  <project>
    <name>Newspapers</name>
    <DigitalCollection>1920 - 1930</DigitalCollection>
    <DigitalCollection>1930 - 1940</DigitalCollection>
    <DigitalCollection>1940 - 1950</DigitalCollection>
  </project>
</DigitalCollections>
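The structure of this file can be explored with a short Python sketch that reads the configuration and returns the collections offered for a project. The function names are hypothetical, and the fallback to the default list for projects without their own entry is an assumption made for this illustration, not a statement about Goobi's internal behaviour.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the format shown above.
CONFIG = """<DigitalCollections>
  <default>
    <DigitalCollection>General</DigitalCollection>
    <DigitalCollection>Biology#Animals</DigitalCollection>
  </default>
  <project>
    <name>Newspapers</name>
    <DigitalCollection>1920 - 1930</DigitalCollection>
    <DigitalCollection>1930 - 1940</DigitalCollection>
  </project>
</DigitalCollections>"""


def collections_for(root: ET.Element, project_name: str) -> list[str]:
    """Return the collections of a project, or the default list (assumed fallback)."""
    for project in root.findall("project"):
        if project.findtext("name") == project_name:
            return [c.text for c in project.findall("DigitalCollection")]
    return [c.text for c in root.findall("default/DigitalCollection")]


def split_collection(entry: str) -> list[str]:
    """Split an entry such as 'Biology#Animals' into collection and sub collection."""
    return entry.split("#")


if __name__ == "__main__":
    root = ET.fromstring(CONFIG)
    print(collections_for(root, "Newspapers"))
    print(collections_for(root, "Some other project"))
    print(split_collection("Biology#Animals"))
```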

goobi_exportXml.xml

The goobi_exportXml.xml file specifies technical details about the properties and associated XML namespaces used when generating docket files.

The file is usually located at the following location:

For example, this configuration file looks as follows:

General

This configuration file is used to specify additional properties to be included when exporting docket files. Since these properties can be gathered from different areas, it is necessary to specify for each property where it is defined. This is done with the help of namespaces, which are also defined beforehand.

Defining namespaces

The <namespace> elements are used to define namespaces that can be used to add further properties for exporting metadata. There are some projects and institutions that define such namespaces. In this configuration file, a <namespace> element is created for each required namespace definition.

Namespaces are identified by a name or abbreviation specified with the name parameter. This should be short and descriptive, as it will be used often later. Namespace names may not occur more than once. Usually the project name or an abbreviation of it is used. The project name is usually evident from the domain in the specified URL.

Accordingly, namespaces refer to a URL defined in the value parameter. This URL must point to an XML specification that contains further information about the respective namespace and the available property names. Further information can be found in the respective documentation for the namespace specifications.

For example, if the following namespace is defined,

then an element mets:xmlData can be used later. This tells Goobi to look up the element xmlData in the namespace mets, which in turn is defined by the specification at http://www.loc.gov/METS/. The meaning of the element is defined there by the respective project.

Define properties

Properties to be included in the export are specified in <property> elements. The title specified in the name parameter will be reused for the structure of the exported docket file. The value parameter contains one or more properties which are structured as follows.

Syntax for properties

All properties are separated with slashes (/). Additionally, the entire entry starts with two slashes. The basic syntax is as follows:

Since no default namespace can be defined in the list of namespaces, each property is assigned a namespace, resulting in the following syntax:

Any number of elements can be specified. Only namespaces that have been previously defined in the <namespace> elements can be used. According to the namespaces, only certain properties can be specified. For example, it looks like this:

It may happen that properties exist as lists. In this case the number of the list element is indicated with square brackets. It should be noted that the count starts at 0. In the following example the second element from the list dmdSec is used. (The first element would then have the number 0).

In addition, for more complex properties, it is possible to select which component of the property should be used. In the following example, the sub-element with the property name = PublicationYear is selected from the metadata element.

meta.xml or meta_anchor.xml

If only a meta.xml is used to describe the process, all <property> elements are created inside a <mets> element.

If an additional meta_anchor.xml is used to describe the process, then all <property> elements are created within an <anchor> element.

For an export, only the <property> elements from one of the two elements are ever used.

If a goobi_exportXml.xml file is to be used for different projects, some of which use only the meta.xml and some of which also use the meta_anchor.xml, both elements can be specified with their respective <property> subelements without any problems. These do not influence each other.

goobi_mail.xml

The configuration file goobi_mail.xml is used to configure mail dispatch for Goobi. The file is usually located here:

/opt/digiverso/goobi/config/goobi_mail.xml

For example, this configuration file looks as follows:

goobi_mail.xml

<?xml version="1.0" encoding="UTF-8"?>
<goobiMail>
    <configuration enabled="true">
        <smtpServer>mail.example.com</smtpServer>
        <smtpPort>5587</smtpPort>
        <smtpUser>[email protected]</smtpUser>
        <smtpPassword>PASSWORD</smtpPassword>
        <smtpUseStartTls>false</smtpUseStartTls>
        <smtpUseSsl>true</smtpUseSsl>
        <smtpSenderAddress>[email protected]</smtpSenderAddress>
    </configuration>
    <apiUrl>http://example.com/goobi/api/mails/disable</apiUrl>
</goobiMail>

| Field | Description |
|---|---|
| smtpServer | SMTP server for mail dispatch |
| smtpPort | Port for the SMTP server to use, in case it is not the default port |
| smtpUser | Username for authentication on the SMTP server |
| smtpPassword | Password for authentication on the SMTP server |
| smtpUseStartTls | Activation of the StartTLS encryption type |
| smtpUseSsl | Activation of the SSL encryption type |
| smtpSenderAddress | The sender address to be displayed to the recipient. |
| apiUrl | URL at which a user can deactivate mail reception |

To deactivate the global sending of emails, it is sufficient to delete the goobi_mail.xml file or set the enabled attribute to false.

Additionally, the content of the emails can be configured. The following keys are available in the messages files for this purpose:

| Field | Description |
|---|---|
| mail_notification_openTaskSubject | Subject of the email that is sent when a task is opened. |
| mail_notification_openTaskBody | Content of the email that is sent when a task is opened |
| mail_notification_inWorkTaskSubject | Subject of the email that is sent when a task is being processed |
| mail_notification_inWorkTaskBody | Content of the email that is sent as soon as a task is processed |
| mail_notification_doneTaskSubject | Subject of the email that is sent when a task is completed |
| mail_notification_doneTaskBody | Content of the email that is sent when a task is completed |
| mail_notification_errorTaskSubject | Subject of the email that is sent when a task changes to an error status |
| mail_notification_errorTaskBody | Content of the email that is sent when a task changes to an error status |

In addition to normal text or HTML elements, variables can also be used in the subject or body. The following variables are available for this purpose:

| Variable | Description |
|---|---|
| ${user} | Name of the user to whom the email will be sent |
| ${projectname} | Name of the project to which the task belongs |
| ${processtitle} | Name of the process to which the task belongs |
| ${stepname} | Name of the current task |
| ${url_cancelStep} | URL to unsubscribe from notifications for this type of task |
| ${url_cancelProject} | URL to unsubscribe from notifications for all tasks of the project |
| ${url_cancelAll} | URL to unsubscribe from all notifications |

For example, an email text could be configured as follows:

messages_de.properties

mail_notification_openTaskBody=<html><body><h3>Hallo ${user},</h3><br /><p>folgender Schritt wurde ge\u00F6ffnet und kann nun bearbeitet werden:<ul><li>Projekt: ${projectname}</li><li>Vorgang: ${processtitle}</li><li>Schritt: ${stepname}</li></ul></p><div><a href="${url_cancelStep}">Benachrichtigungen f\u00FCr Schritte mit diesem Namen abbestellen</a></div><div><a href="${url_cancelProject}">Benachrichtigungen f\u00FCr dieses Projekt abbestellen</a></div><div><a href="${url_cancelAll}">Alle Benachrichtigungen abbestellen</a></div></body></html>
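To preview how such a text looks once the variables have been filled in, the substitution can be imitated in Python. This sketch uses the standard library class string.Template, which happens to share the ${name} syntax; it is not the mechanism Goobi uses internally, and the function name and sample values are hypothetical.

```python
from string import Template


def render_mail(template_text: str, values: dict[str, str]) -> str:
    """Fill in ${...} variables; unknown variables are left untouched."""
    return Template(template_text).safe_substitute(values)


if __name__ == "__main__":
    body = ("<p>Hello ${user}, the task ${stepname} of process ${processtitle} "
            "in project ${projectname} can now be worked on.</p>"
            '<a href="${url_cancelAll}">Unsubscribe from all notifications</a>')
    print(render_mail(body, {
        "user": "Jane Doe",
        "stepname": "Quality control",
        "processtitle": "example_process_01",
        "projectname": "Archive Project",
    }))
```

Because safe_substitute is used, the placeholder ${url_cancelAll} stays visible in the preview when no value is supplied, which makes misspelled variable names easy to spot.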

goobi_normdata.xml

The configuration file goobi_normdata.xml specifies links to norm databases that can be used for a variety of purposes. A typical use is to obtain metadata for digitised objects. The file is usually located at the following location:

/opt/digiverso/goobi/config/goobi_normdata.xml

For example, this configuration file looks as follows:

goobi_normdata.xml

<?xml version="1.0" encoding="UTF-8"?>
<normdatabases>
    <normdatabase name="goobi" url="https://goobi.intranda.com/" abbreviation="" />
    <normdatabase name="kulturnav" url="https://kulturnav.org/" abbreviation="kulturnav" />
    <normdatabase name="GND" url="http://d-nb.info/gnd/" abbreviation="gnd" />
    <normdatabase name="geonames" url="http://www.geonames.org/" abbreviation="geonames" />
    <normdatabase name="dante" url="https://dante.gbv.de/" abbreviation="dante" />
    <normdatabase name="viaf" url="http://www.viaf.org/viaf/" abbreviation="viaf" />
    <normdatabase name="easydb" url="https://easydb.prizepapers.gbv.de/" abbreviation="easydb" />
<!--
    <normdatabase name="REFGEO" url="http://normdata.intranda.com/normdata/refgeo/" abbreviation="intranda Geo Datenbank" />
    <normdatabase name="REFBIO" url="http://normdata.intranda.com/normdata/refbio/" abbreviation="intranda PND" />
    <normdatabase name="RVK" url="http://rvk.uni-regensburg.de/index.php?option=com_rvko&amp;view=show&amp;mode=searchNotation&amp;rvkoNotationKey=" abbreviation="rvk" />
-->
</normdatabases>

The file goobi_normdata.xml contains a simple list of norm databases that can be used in Goobi, for example also by plugins. For each registered database there is a <normdatabase> element, which contains the further information.

The parameter url specifies the URL of the database. This should point to an API that can be queried by Goobi. Depending on the usage or plugin, the start page of the database can also be specified here, where a user can query object data in the browser.

The abbreviation parameter is used to specify an abbreviation that uniquely identifies this database. The abbreviation can be used, for example, for configurations and file imports and exports.

The additional name parameter is not currently used by Goobi workflow. It is included in the configuration file for completeness and contains the full, human-readable name of the database. Alternatively, a common abbreviation can be specified here. This parameter can also be used by plugins when they read the file themselves.
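A plugin or helper script that reads the file itself could proceed roughly as in the following Python sketch. The function names are hypothetical, the sample is shortened, and building a record URL by appending an identifier to the configured url is an assumption that holds only for databases whose URLs are structured that way.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the format shown above.
NORMDATA = """<normdatabases>
    <normdatabase name="GND" url="http://d-nb.info/gnd/" abbreviation="gnd" />
    <normdatabase name="viaf" url="http://www.viaf.org/viaf/" abbreviation="viaf" />
    <normdatabase name="goobi" url="https://goobi.intranda.com/" abbreviation="" />
</normdatabases>"""


def load_normdatabases(xml_text: str) -> dict[str, str]:
    """Map each non-empty abbreviation to the URL of its database."""
    root = ET.fromstring(xml_text)
    return {
        db.get("abbreviation"): db.get("url")
        for db in root.findall("normdatabase")
        if db.get("abbreviation")
    }


def record_url(databases: dict[str, str], abbreviation: str, identifier: str) -> str:
    """Build a record URL by appending the identifier (assumed URL scheme)."""
    return databases[abbreviation] + identifier


if __name__ == "__main__":
    databases = load_normdatabases(NORMDATA)
    # "123456789" is a made-up identifier used only for illustration.
    print(record_url(databases, "gnd", "123456789"))
```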

goobi_opacUmlaut.txt

The configuration file goobi_opacUmlaut.txt specifies the umlauts that are to be replaced when process titles are generated automatically. The file is usually located at the following location:

For example, this configuration file looks as follows:

Usually, Goobi Workflow automatically generates the process titles used. These contain the first letters of the author and an abbreviation of the corresponding work name.

Since the process title cannot contain umlauts or special characters for compatibility reasons, Goobi tries to replace them with compatible letters of similar meaning.

While it is intuitive for a user that an ä can also be replaced by an a, these two letters have no technical connection. The mapping must therefore be configured to match language usage. Usually the configuration shown above is already present and covers most languages.

The file goobi_opacUmlaut.txt specifies the replacement rules to be used. The first column contains the characters to be replaced, followed by the replacement characters. The replacement must not contain special characters or umlauts.

All special characters, which are not considered in this configuration file, are removed without replacement from the name when generating the process title.

The replacement algorithm also supports multi-letter strings. The following configuration is therefore also possible, whereby exactly one space character is always expected as a separator:

Please note that the process title is first generated from the original names and the umlauts are replaced afterwards. As a result, the length of the process title can change when a multi-letter substitution is applied.

For example, if the author's name is Björn and the first four letters are used, the result is the abbreviation björ (without capitalization). Since the ö now becomes oe, the abbreviation changes to bjoer and now has five letters.
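The order of these two steps can be reproduced with a small Python sketch. The rules shown are hypothetical examples in the format described above (characters to be replaced, one space, replacement), and the function names are not part of Goobi.

```python
# Hypothetical replacement rules, one per line: source, one space, replacement.
RULES_TEXT = "ä ae\nö oe\nü ue\nß ss"


def parse_rules(text: str) -> dict[str, str]:
    """Read the replacement rules into a dictionary."""
    rules = {}
    for line in text.splitlines():
        if line.strip():
            source, replacement = line.split(" ", 1)
            rules[source] = replacement
    return rules


def author_abbreviation(name: str, length: int, rules: dict[str, str]) -> str:
    """Cut the name first, then replace umlauts and drop unknown special characters."""
    abbreviation = name[:length].lower()
    # Apply longer source strings first so multi-letter rules take precedence.
    for source in sorted(rules, key=len, reverse=True):
        abbreviation = abbreviation.replace(source, rules[source])
    # Characters without a rule are removed without replacement.
    return "".join(ch for ch in abbreviation if ch.isascii() and ch.isalnum())


if __name__ == "__main__":
    rules = parse_rules(RULES_TEXT)
    print(author_abbreviation("Björn", 4, rules))  # prints bjoer
```

The abbreviation is cut to four letters before the substitution, which is why the result has five letters.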

Edit task details

On the page shown here, the details and plugins for tasks can be configured. The page does not differ significantly between tasks of process templates and those of existing processes. Differences are marked where appropriate.

Title

The title of the task is specified in this field. This can be freely selected. However, it should be taken into account that GoobiScript calls, for example, use the respective task titles to automatically execute background tasks across many processes.

Therefore titles should be short, meaningful and unique. Spaces and special characters are allowed.

Order

The position number is used to specify the position of a task in a process template or an existing process. Accordingly, an integer must be specified here.

The position number serves two purposes. First, it is used to display all tasks of a process in the correct processing sequence. Second, when a task is completed, the tasks that come next in the sequence are unlocked.

Several tasks can have the same position number. For Goobi, this means that the tasks concerned may be executed simultaneously. Parallel processing works with both automatic and manual tasks (for example, when several employees are working on a task in parallel).

If the task sequence of a process is configured with gaps (for example 1, 2, 3, 6), Goobi jumps directly to the task with the next highest number.

The order of tasks also plays an important role in GoobiScript calls. There, based on the number specified here, further administrative precautions can be taken.

Allow parallel tasks

This setting is only available when creating a new task within a process template.

Furthermore, this setting is only relevant when new tasks are to be inserted between already existing tasks.

If this option is set, the position number is set directly when the task is created. It may happen that another task already exists with the same number and both tasks can be processed in parallel later.

If this option is not set, then in the case of a duplication of the sequence number with that of another already existing task, the sequence numbers of the other and all subsequent tasks are shifted back by exactly one number when saving. This ensures that the new task inserted in the process does not duplicate any existing position number and that the subsequent tasks can still retain their defined numbering relative to each other (including missing and duplicated numbers).

If it turns out afterwards that this option was wrongly selected when creating the task and all subsequent tasks have a "wrong" position number, there are two possible solutions: For small adjustments, the position number can be adjusted at any time in the task overview of the process template using the buttons available for this purpose. For large adjustments, or if tasks already exist for the current process template, it is recommended to write a GoobiScript so that it adjusts all "wrong" numbers and is executed on all incorrectly numbered tasks.
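The numbering behaviour described in the sections Order and Allow parallel tasks can be modelled with a few lines of Python. This is a simplified sketch with hypothetical function names and task titles, not Goobi's implementation.

```python
def next_tasks(tasks: dict[str, int], current_order: int) -> list[str]:
    """Return the titles of the tasks with the next-highest position number."""
    later = [order for order in tasks.values() if order > current_order]
    if not later:
        return []
    next_order = min(later)  # gaps in the numbering are simply skipped
    return sorted(title for title, order in tasks.items() if order == next_order)


def insert_task(tasks: dict[str, int], title: str, order: int,
                allow_parallel: bool) -> dict[str, int]:
    """Insert a task; without allow_parallel, shift conflicting and later tasks by one."""
    updated = dict(tasks)
    if not allow_parallel and order in updated.values():
        updated = {t: (o + 1 if o >= order else o) for t, o in updated.items()}
    updated[title] = order
    return updated


if __name__ == "__main__":
    tasks = {"Scanning": 1, "Quality control": 2, "OCR": 3, "Export": 6}
    print(next_tasks(tasks, 3))                        # ['Export']
    print(insert_task(tasks, "Cropping", 2, False))    # later tasks move to 3, 4 and 7
    parallel = insert_task(tasks, "Cropping", 2, True)
    print(next_tasks(parallel, 1))                     # ['Cropping', 'Quality control']
```

Shifting every later task by the same amount keeps the gap before Export intact, so the relative numbering of the existing tasks is preserved.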

Priority

In this list it is possible to select a priority for the current task. It should be noted that the priorities Standard, Priority, High priority and Highest priority are only intended to visually represent the importance of tasks. They have no further technical effect on the corresponding process.

The Correction option also has no effect on the task, but is automatically set if the final result in a task to be completed is not to be accepted and a correction message is sent.

Metadata

This option can be activated if the purpose of the task is to edit metadata.

If a task is marked with the Metadata property, additional icons and options are displayed at various places in the user interface to access metadata. For example, the button for calling the metadata editor is also displayed if a task has this property.

If metadata in a task is to be uploaded, downloaded, validated or used in any other way, this option must be selected.

Read images

This option can be enabled if the user working on this task should get read access to image files in their user folder (on the Goobi server). This may be the case, for example, when images are to be downloaded or displayed for quality checking.

Write images

This option can be activated if the user working on this task should get write access to image files in their user folder (on the Goobi server). This is required whenever images are to be uploaded or edited.

Validate on exit

This option can be enabled to validate the metadata of the process when completing this task. This validation has nothing to do with the validation plugin (see below). The type of validation set here checks if all metadata, structure elements (DocStructs) and page counts have been applied according to the rule set.

Export

This option can be enabled if the purpose of the task is to export data for further processing in other systems. These can be, for example, other database formats, content management systems (CMS) or simply certain file formats. If this task is defined as an export task, an export plugin must be selected in the 'Step plugin' field. The names of export plugins usually start with the prefix intranda_export_.

If this task is an automatic task and an export task at the same time, the export will take place automatically. Regardless of this, the user will see an export button next to the corresponding task in the overview and can also export the dataset manually.

Skip this task

This option can be selected if the task should be skipped in the process. If a task has this property, it will be closed automatically as soon as a user accepts it. If this task is an automatic task, it will also be skipped and closed automatically.

Automatic task

Different types of tasks can be configured as automatic tasks. This allows, for example, plugins or GoobiScript calls to be executed either directly or in one of the available processing queues.

Specifically, the following types of tasks can be automated: Internal tasks (intranda_step_*), export tasks (intranda_export_*), script tasks, HTTP tasks and time delay tasks (intranda_delay_*). In each case, make sure that the appropriate plugin or script is selected as described in the corresponding chapter.

In order to use processing queues, they must first be enabled and set up in the corresponding configuration files (goobi_config.properties and goobi_activemq.xml). If this option is then activated, a drop-down menu appears below the checkbox from which the desired processing queue can be selected.

If the option Don't execute in processing queue is selected, the corresponding plugin or script will be executed directly as soon as the task is reached in the process. This option is recommended only for tasks that should be triggered by the user in real time, for example by completing the previous task.

The Processing queue for fast jobs and Processing queue for slow jobs options provide two independently operating processing queues that are normally the right choice for most automated tasks. The fast processing queue is intended for tasks that have a rather short execution time and should be completed promptly.

The slow processing queue, on the other hand, should be used for tasks that require a lot of processing time and for which it is not really relevant how quickly they are completed. For example, the slow processing queue can be used for large amounts of image exports, OCR analysis, 3D calculations, or other complex applications. As a result, this processing queue is also suitable for tasks that sometimes require a total computing time of hours, days or even longer over many thousands of processes.

However, by default, Goobi always prioritizes tasks that users execute in real time and tasks that are in the fast processing queue. For example, if Goobi is busy during the day because many employees are working and the slow processing queue is also well filled, the slow queue typically makes most of its progress at night.

In addition, there is an In queue for external processing option. This processing queue can be used by REST API requests. Suitable REST API requests are usually provided by plugins.

Generate thumbnails

If the task is to be used to generate thumbnails, this checkbox must be selected. A text input field then appears, in which an example configuration for the generation of thumbnails is given. This should be adapted for the project's own needs.

The text input field includes several lines in which a YAML-compatible notation of key-value pairs is expected. Key-value pairs are separated by a colon (:) and each line may contain exactly one key-value pair. The first line contains the string --- to indicate the beginning of the data set. Comments are marked with a hash (#) and may be placed in a separate line as well as at the end of a line used for content. They are ignored by the interpreter. Thus it is also possible to "comment out" certain parts in experimental configurations.

At the beginning the following example configuration is in the text input field:

---
Master: false  #use master image directory 
Media: false  #use media image directory 
Img_directory: "" #set path to custom image directory 
Custom_script_command: "" #command to execute custom thumbnail generation script 
Sizes: #define thumbnail sizes 
- 800

The variables have the following meaning (details, see below):

| Variable | Data type | Default value | Meaning |
| --- | --- | --- | --- |
| Master | Boolean | false | This value can be set to true to enable thumbnail generation for all image files in the master folder. |
| Media | Boolean | false | This value can be set to true to enable thumbnail generation for all image files in the media folder. |
| Img_directory | Text | "" | Here you can optionally specify another folder with image files. |
| Custom_script_command | Text | "" | Here you can optionally specify an alternative script or executable program to generate thumbnails. |
| Sizes | List of integers | 800 | A list of image file sizes (in pixels) in which the images will be generated must be specified here. |

Note that Sizes accepts a list. Each line starts with the string - and then contains an integer. The list entries must directly follow the line Sizes:. Text values must be enclosed in double quotes ". Boolean values can be used directly and can be set to either true or false to turn the feature on or off.

Master

Setting this value to true will generate thumbnails for all image files inside the master folder (usually /opt/digiverso/goobi/metadata/{processId}/images/{processId}_master/) and store them in the thumbnails folder (usually /opt/digiverso/goobi/metadata/{processId}/images/thumbs/{processId}_master_{size}/).

Media

Setting this value to true will generate thumbnails for all image files inside the media folder (usually /opt/digiverso/goobi/metadata/{processId}/images/{processId}_media/) and store them in the thumbnails folder (usually /opt/digiverso/goobi/metadata/{processId}/images/thumbs/{processId}_media_{size}/).

Img_directory

At this point, a third, additional folder can be specified for the generation of the thumbnails. The generation behaves here in exactly the same way as with the master folder and the media folder. However, this folder must be specified as a full folder path, for example /opt/digiverso/goobi/metadata/{processId}/images/{processId}_custom/. The destination folder for generation would then be accordingly: /opt/digiverso/goobi/metadata/{processId}/images/thumbs/{processId}_custom_{size}/. If the folder path specified here is empty or the parameter is missing, this option will be ignored during generation.

Custom_script_command

A path to an individual script file or to an executable program that is to be used to generate the thumbnail files can be specified here. The script or program should be able to independently recognize the image files and use the correct process-related folder paths. If the field is left empty, Goobi will use internal Java libraries for generation.

Note: This option is not currently supported. All thumbnails are generated using special image processing Java libraries, regardless of any script or program specified here.

Sizes

At this point a list of image file sizes (in pixels) can be specified in which the thumbnails should be generated. This can be done by specifying one or more lines, each of which starts with the YAML list entry prefix (- ) and then contains an integer. Note that each of the specified image file sizes will be used independently, and will always produce thumbnails that have the specified size on their longest side. For example, if the values 400 and 800 are specified, both thumbnails with a size of 400 pixels and thumbnails with a size of 800 pixels will be generated.
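
The longest-side rule described above can be illustrated with a small sketch. It assumes that the aspect ratio is preserved and that pixel values are rounded to the nearest integer; the function name thumbnail_dimensions is hypothetical, and the exact rounding used by Goobi's image libraries may differ.

```python
def thumbnail_dimensions(width: int, height: int, size: int) -> tuple[int, int]:
    """Scale (width, height) so that the longest side equals `size` pixels."""
    longest = max(width, height)
    scale = size / longest
    # Assumption: round to whole pixels and never go below 1 pixel.
    return max(1, round(width * scale)), max(1, round(height * scale))


# One result per configured entry in Sizes, each computed independently.
sizes = [400, 800]
dimensions = {size: thumbnail_dimensions(3000, 2000, size) for size in sizes}
```

For a 3000 x 2000 pixel source image, each entry in Sizes yields its own result, which mirrors the statement that every size is processed independently.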

Additional technical information

In the examples above, the placeholders {processId} stand for the respective process ID and {size} for the thumbnail size in pixels. These are inserted in folder names accordingly.
All settings (except Sizes) can be omitted, in which case the default values given above will be used.
If multiple thumbnail sizes are specified, all will be generated independently. All respective sizes are stored in their individually named folders (consisting of image size and source folder).
All generated thumbnail images are JPG files using a default color profile of the image file processing Java library.
Watermarks are not included in preview images.
Spaces in filenames of original files are replaced by %20 in thumbnail image file names for technical reasons.
If the folders for the preview images do not exist, they will be created automatically.
If the output folders including preview images already exist and if the source image files have not changed (measured by the last-modified value of all involved image files), the thumbnail images will not be regenerated.
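
As a rough sketch of the folder naming described above, the following lines derive a thumbnail folder from the process ID, the source folder suffix and the size. The metadata root is the "usual" path quoted in this chapter and may differ on a given installation; the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed default metadata root, taken from the "usually" paths quoted above.
METADATA_ROOT = PurePosixPath("/opt/digiverso/goobi/metadata")


def thumbnail_folder(process_id: int, source: str, size: int) -> PurePosixPath:
    """Return the thumbs folder for a source suffix such as 'master' or 'media'."""
    images = METADATA_ROOT / str(process_id) / "images"
    return images / "thumbs" / f"{process_id}_{source}_{size}"


def replace_spaces(file_name: str) -> str:
    """Replace spaces by %20, as described for thumbnail file names."""
    return file_name.replace(" ", "%20")
```

For example, thumbnail_folder(1234, "master", 800) yields the folder for 800 pixel thumbnails of the master images of process 1234.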

Script step

This option can be enabled to run one or more scripts on the server as part of the current process. For example, these can be Bash or Python scripts that perform background tasks or interact directly with the current task's dataset in the file system.

The numeric return values of the scripts play an important role in workflow control. Therefore, they are documented in detail following this chapter and should be strictly observed.

If the script step option is activated, a table appears below the checkbox in which up to five script files can be entered. For each script, a name and a path can be entered. The path is the absolute path (including the script file name) in the server file system. The name can be chosen freely and only helps the user to recognize the script.

Any fields in the table can be filled in or left blank. If this task is executed later in a process, Goobi will execute all filled script path fields from top to bottom.

Care must be taken to ensure that the specified scripts are executable. Otherwise, error messages in the journal and server log files will indicate the cause of non-execution. Common causes of errors are for example missing execution rights, wrong environment variables (especially PATH), missing interpreters, wrong file paths or missing parameters. Parameters can easily be specified in the script field.

If another script language is to be used, an interpreter must be specified if necessary. In this case, the interpreter is the executed program and then executes the script file specified as a parameter.

When the scripts are executed in a process, the script lines read from the table (but not the script files themselves) are processed by the variable replacer in Goobi workflow, and any variables are replaced. This can be used to pass metadata values to scripts: parameters can be written as variables, which the variable replacer substitutes with the values of the respective process.

A few examples of scripts:

/opt/digiverso/goobi/scripts/copyfiles.sh
bash /opt/digiverso/goobi/scripts/myExampleScript.sh --action convert
python /opt/digiverso/goobi/scripts/doAnything.py
/usr/bin/python /opt/digiverso/goobi/scripts/convertImages.py --type *.jpg *.png

If the available script fields are not sufficient, a kind of meta script file can be written instead, which in turn executes several scripts and occupies only one field in the table.

Return values of script files

The respective return value of a script is used by Goobi Workflow to determine the current status of the corresponding task, especially between the processing of multiple scripts within one task. A distinction is made between Success, Error, Reopen task and Continue task.

The return value, for example 99, is set at the end of the script with the command exit 99 in Bash scripts or with sys.exit(99) in Python. (In Bash, return is only valid inside functions or sourced scripts.) Every exit path, including errors caught in if blocks, should set an explicit return value; in particular, avoid exit statements without a value.

The following table gives an overview of the respective behavior of Goobi:

| Return value | Status | Action of Goobi |
| --- | --- | --- |
| 0 | Completed | Executes the next script of this task |
| 1 and all undefined values | Error | Cancels execution of this task |
| 98 | Open | Cancels execution and restarts the task |
| 99 | In process | Executes the next script of this task |

If a script outputs text in the error output stream (STDERR), this is also treated as an error condition (1). Errors are documented in the journal and server log files.
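
The rules from the table and the STDERR rule can be summarized in a short sketch. This is only an illustration of the documented behavior, not Goobi's actual implementation; the function name task_status and the wording of the action strings are hypothetical.

```python
def task_status(return_code: int, stderr_output: str = "") -> tuple[str, str]:
    """Map a script result to (status, action) following the table above."""
    # Any text on STDERR is treated like return value 1.
    if stderr_output:
        return_code = 1
    table = {
        0: ("Completed", "execute next script"),
        98: ("Open", "cancel and restart task"),
        99: ("In process", "execute next script"),
    }
    # 1 and every undefined value count as an error.
    return table.get(return_code, ("Error", "cancel task"))
```

Note that a script which exits with 0 but still writes to STDERR ends up in the Error branch in this sketch, which is why scripts should keep STDERR free of purely informational output.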

The input and output streams STDIN and STDOUT are currently not used.

Combination of multiple scripts

The value 0 can be used if a script is the only or last script in a task. If the script was successfully executed, the task is set to Completed status. If 0 is used and scripts follow afterwards, unwanted side effects may occur.

The value 1 (and all undefined values) can be used if a script fails. Goobi will then not execute any further scripts and set the current task directly to Error status. Error messages (STDERR output and goobi-internal errors) are logged in the journal and server log.

The value 98 can be used if a script could not be executed as desired and Goobi should restart the whole task. The return value 98 sets the status of the task back to Open; for an automatic task, the status is then set to In process and execution starts again with the first defined script. This return value is useful, for example, when a script detects and corrects errors itself and needs to be re-executed to benefit from these corrections on the next attempt. Validation scripts can also be used at this point to detect errors from previous scripts and cause the task to restart accordingly.

The value 99 can be used if a script could be executed successfully and another script is to be executed afterwards. The status of the task is not yet set to Completed with 99 (in contrast to 0), which makes sense if several scripts follow each other. The last script should then return 0 to complete the task.

If the provided script fields are not sufficient and multiple script calls are outsourced to an external script file, make sure that all return values are passed correctly and arrive back in Goobi to achieve the desired effects in task automation.
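
A meta script of this kind could look roughly like the following sketch. It assumes that the sub-scripts follow the return value conventions described above (0 or 99 for success, anything else stops the chain); the script paths in SCRIPTS are hypothetical placeholders.

```python
import subprocess
import sys

# Hypothetical sub-scripts; replace with the real calls.
SCRIPTS = [
    ["/bin/bash", "/opt/digiverso/goobi/scripts/first.sh"],
    ["/bin/bash", "/opt/digiverso/goobi/scripts/second.sh"],
]


def run_all(commands: list[list[str]]) -> int:
    """Run commands in order; stop at the first return value that is neither 0 nor 99."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode not in (0, 99):
            # Pass 1, 98 or any undefined value straight back to the caller.
            return result.returncode
    # Every sub-script succeeded, so the task may be completed.
    return 0


def main() -> None:
    """Entry point: hand the combined return value back to Goobi."""
    sys.exit(run_all(SCRIPTS))
```

When saved as a meta script, the file would end with a call to main(). Because the sub-scripts inherit the error output stream, anything they write to STDERR still reaches Goobi.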

HTTP step

This option can be selected if the current task is to perform an HTTP (or HTTPS) transaction with an API of another server. This application is mainly intended for POST, PUT and PATCH requests, i.e. for uploading data to the respective services. However, GET requests can also be sent conditionally to check the availability of resources. As soon as the corresponding checkbox is selected, several input fields appear in which the details of the request can be entered.

With the option Close step after successful HTTP call, you can configure whether the task is closed automatically after a successful HTTP request. For details, see below.

First, the HTTP method must be selected. The method to use is determined by the API endpoint used (usually within a REST API) and must match the method specified in the associated specification. Otherwise, the accessed server can be expected to return an error message (405 Method Not Allowed, see below).

In general, APIs should adhere to the following request method conventions, although deviations are possible:

| HTTP method | URL or URI | Action | Request content |
| --- | --- | --- | --- |
| POST | Created target address | Data set is created | All properties to create |
| PUT* | Existing target address | Data set is replaced | All unchanged and changed properties |
| PATCH | Existing target address | Data set is changed | Only changed properties |
| GET** | Requested target address | Data set is requested | - No content - |

* If unchanged properties are omitted, this may lead to unintended deletion of the data in question, since the API endpoint assumes that the properties should be overwritten with "empty" data. Details on the specific behavior of the API should always be found in the associated specification.

** Note that the GET method is only partially supported at this point. The received data is not stored or further processed. The GET method can be used to query the existence or accessibility of data.

The next step is to specify the HTTP URL. This is composed of the domain name or IP address of the server, optionally the port number, the API endpoint and the required URL parameters. If the default HTTP port 80 is used, the port number can be omitted.

Some examples of API request URLs:

http://localhost:8080/api/endpoint/delete?id=20&project=2
http://192.168.178.21:8888/api/endpoint/list
https://goobi.example.org/api/endpoint/list?project=1
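
The composition of such a URL can be sketched as follows. The helper build_url is hypothetical and only illustrates the parts named above; that port 443 is the default for HTTPS is general HTTP knowledge, not something specific to Goobi.

```python
from urllib.parse import urlencode


def build_url(scheme: str, host: str, endpoint: str, port=None, params=None) -> str:
    """Compose scheme://host[:port]/endpoint[?query]."""
    default_ports = {"http": 80, "https": 443}
    # Leave the port out when it is missing or is the scheme's default.
    netloc = host if port in (None, default_ports.get(scheme)) else f"{host}:{port}"
    query = f"?{urlencode(params)}" if params else ""
    return f"{scheme}://{netloc}{endpoint}{query}"
```

Calling build_url("http", "localhost", "/api/endpoint/delete", 8080, {"id": 20, "project": 2}) reproduces the first example URL above.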

After that the HTTP body is entered. This is only relevant for the POST, PUT and PATCH methods. The content of the request depends on the API used and the corresponding API endpoint, so we cannot go into more detail here. However, REST APIs usually expect the data in JSON format.

When the HTTP request is later performed for a process, the data specified in the HTTP body is post-processed by Goobi's internal variable replacer. This makes it possible to use variables for metadata fields in the request content, following the syntax rules of the variable replacer. These are then replaced by the actual data of the respective process before the request is sent.

Additionally, the Escape HTTP body as JSON option can be selected. It escapes control characters (for example, line breaks or tabs) in the request content with backslashes (\) so that the API endpoint can process them correctly. However, the exact handling of such control characters also depends on the API endpoint; consult the associated specification or ask the provider.
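
What such escaping looks like can be shown with Python's standard json module. This is only an illustration of JSON string escaping in general; whether Goobi escapes exactly the same set of characters is an assumption, and the helper escape_for_json is hypothetical.

```python
import json


def escape_for_json(value: str) -> str:
    """Escape control characters, quotes and backslashes for use inside a JSON string."""
    # json.dumps adds surrounding quotes; strip them to keep only the escaped content.
    return json.dumps(value, ensure_ascii=False)[1:-1]


# A metadata value containing a line break and a tab.
title = "Line one\nLine two\twith tab"
body = '{"title": "' + escape_for_json(title) + '"}'
```

Without the escaping step, the raw line break inside the string would make the request body invalid JSON.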

This completes the settings for the request.

When the request is sent later during the execution of the task within a process, the requested server responds with an HTTP status code. Goobi reads this code and uses it to decide whether to continue executing the process. In general, successfully answered requests lead to the completion of the respective task. If the next task in the process is an automatic task, it will be executed.

With the option Close step after successful HTTP call, the task is closed if the HTTP call was successful. If this option is not selected, the task remains in the status Open.

If an error status code is returned, the task remains in an error status and manual intervention is required. Errors are logged in the journal and server log files.

In the realm of HTTP status codes, normally all status codes < 400 are considered successes and all >= 400 are considered errors.

The following is a listing of the most common status codes associated with REST APIs:

Successful status codes:

200 OK - The request was processed successfully or the data set could be found
201 Created - The resource was created successfully
204 No Content - The request was processed successfully, the answer intentionally does not include data

Error status codes:

400 Bad Request - The request was not formatted correctly - The syntax should be checked
401 Unauthorized - Authentication is required for this request (or the specified authentication is invalid)
403 Forbidden - This resource exists, but may not be requested
404 Not Found - The resource does not exist - possibly the URL or URI is incorrect
405 Method Not Allowed - The wrong HTTP method was used (see above)
429 Too Many Requests - Too many requests were sent within a certain period of time
500 Internal Server Error - The server has detected an internal error for which the request is not directly responsible
501 Not Implemented - The API endpoint does not exist - Possibly an incorrect API version is used

In addition, some further status codes can be returned. The respective meaning can be found in the relevant RFC standards. In case of special problems, however, the responsible support should be contacted.
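
The general rule (below 400 is a success, 400 and above is an error) can be sketched with Python's standard http module, which also knows the standard reason phrases. The function names are hypothetical; this illustrates the rule of thumb and is not Goobi's own evaluation logic.

```python
from http import HTTPStatus


def is_success(status_code: int) -> bool:
    """Apply the rule from the text: below 400 is a success, 400 and above an error."""
    return status_code < 400


def describe(status_code: int) -> str:
    """Return e.g. '404 Not Found (error)' using the standard reason phrases."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        # Codes that the standard library does not know.
        phrase = "Unknown"
    outcome = "success" if is_success(status_code) else "error"
    return f"{status_code} {phrase} ({outcome})"
```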

Status

In this drop-down menu, the status of the task can be set manually. Typically, in a process template, all tasks are set to Locked and the first one to Open, so that later in each process, editing can start at the first task. In the course of editing, Goobi sets all open tasks to In queue or In process and finished tasks to Completed. In case of an error, the task is automatically set to Error.

Additionally, there is an option Deactivated to deactivate a task completely. This will be skipped by Goobi when editing the process. The Error status can also be used manually to indicate that an edit has not taken place as desired and may need to be repeated.

When changing the status of a task in an existing process that has already been worked on, take care to avoid unwanted side effects: manually opening or closing tasks can trigger automatic tasks in the workflow.

Batch step

This option can be enabled if the current task should be able to be grouped and executed together with similar tasks within a batch.

Step plugin

In this drop-down menu, a plugin is selected that is to be used for the current task. A plugin can serve several purposes here, which are described in more detail in other subchapters on this page.

Plugins can be used for automatic or semi-automatic tasks. In this case, time delay plugins (intranda_delay_*), validation plugins (intranda_validation_*), export plugins (intranda_export_*) or other types of plugins can be selected. Plugins that bring their own user interface and are not already described in one of the other categories on this page can also be selected here (e.g. intranda_step_*).

For tasks that should not use a plugin, this field is left empty.

Validation plugin

A task can be configured to run a validation plugin upon completion. The corresponding plugin can be set in this field. Validation plugins usually start with the prefix intranda_validation_.

Plugin for time delay

Time delay plugins can be included as tasks in the process to pause automatic processing for a certain time. If a task is to be used for time delay, this checkbox must be selected and a corresponding plugin must be chosen.

The corresponding plugin is selected in the Step plugin field. Plugins that can be used for time delay usually start with the prefix intranda_delay_.

Update metadata index when finishing

This option can be enabled if metadata has been modified by completing this task and the search index should be updated for the entire dataset of this process.

Download the docket as PDF file

This option can be enabled to offer generating and downloading a docket PDF file for the current intermediate status of the corresponding process in the task overview. This can be handy to manually check the current technical status of the process.

GoobiScript

When you apply an action to a group of processes, you also have an option to run Goobi scripts within that action. To do so, click on the button Execute GoobiScript in the Possible actions box. Goobi will display an overview of all the scripts that can be applied to the entire list of processes, to the processes listed on the current page or just to a selection of processes.

For every GoobiScript, you need to enter the name of the script you wish to run as well as the corresponding parameters. These parameters are shown when you click on the script in the list. Replace the parameters shown as examples with your desired settings.

After completing the GoobiScript, you can now apply it to selected hits or the entire hit list. Before the execution, however, a security query appears in which the number of operations to which the GoobiScript should be applied must first be confirmed again.

Please note that you may not have access to all the GoobiScripts that Goobi offers. Some of the GoobiScripts may be hidden. Your user group may also not have been granted access to all GoobiScripts. A more detailed explanation of how to assign permissions to GoobiScripts can be found here:

Syntax

The syntax for using GoobiScript is based on the markup language YAML. This allows each GoobiScript to have built-in documentation with small help texts, and syntax highlighting makes the individual parameters visually clear. Each parameter is placed on its own line and separated from its value by a colon and a space. In contrast to earlier versions of Goobi, parameter values containing spaces are now also possible within GoobiScript without having to enclose them in quotes. A simple GoobiScript is structured like this, for example:

---
# This GoobiScript allows to add a new workflow step into the workflow.
action: addStep

# Title of the workflow step to add
steptitle: Upload images

# This number defines where in the workflow this new step is ordered into.
order: 3

The lines starting with a hash # form comments for help purposes. They can also be omitted before submitting the GoobiScript. Accordingly, the GoobiScript is a bit more compact afterwards:

---
action: addStep
steptitle: Upload images
order: 3

Run multiple GoobiScripts together

Thanks to the new syntax, several GoobiScripts can now also be started together without difficulty. Each GoobiScript is separated from the previous one by a line containing ---. This way you can easily combine several commands and start them together, for example:

---
# This GoobiScript allows to add a new workflow step into the workflow.
action: addStep

# Title of the workflow step to add
steptitle: Upload images

# This number defines where in the workflow this new step is ordered into.
order: 3

---
# This GoobiScript allows to assign a user group to an existing workflow step.
action: addUserGroup

# Title of the workflow step to be edited
steptitle: Upload images

# Use the name of the user group to be assigned to the selected workflow step.
group: Photographers

---
# This GoobiScript allows to change the current status of a specific step in the workflow.
action: setStepStatus

# Title of the workflow step to be changed
steptitle: Upload images

# Value of the status. Possible values are `0` (locked), `1` (open), `2` (in work), `3` (done), `4` (error), `5` (deactivated)
status: 1

---
# This GoobiScript allows to add a plugin to a defined workflow step
action: addPluginToStep

# Title of the step to adapt
steptitle: Upload images

# Name of the plugin to be assigned to the workflow step
plugin: intranda_step_fileUpload

Here, too, the comments can be omitted for a shorter form, so that the entire call becomes clearer:

---
action: addStep
steptitle: Upload images
order: 3
---
action: addUserGroup
steptitle: Upload images
group: Photographers
---
action: setStepStatus
steptitle: Upload images
status: 1
---
action: addPluginToStep
steptitle: Upload images
plugin: intranda_step_fileUpload
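
If such combined calls are needed frequently, the compact form can also be generated programmatically. The following sketch renders a list of parameter dictionaries into GoobiScript text that can be pasted into the GoobiScript input field; the helper render_goobiscripts is hypothetical and not part of Goobi.

```python
def render_goobiscripts(scripts: list[dict]) -> str:
    """Render each dict as one GoobiScript, separated by --- lines."""
    blocks = []
    for script in scripts:
        lines = ["---"]
        # One "parameter: value" pair per line, colon followed by a space.
        lines += [f"{key}: {value}" for key, value in script.items()]
        blocks.append("\n".join(lines))
    return "\n".join(blocks)


text = render_goobiscripts([
    {"action": "addStep", "steptitle": "Upload images", "order": 3},
    {"action": "setStepStatus", "steptitle": "Upload images", "status": 1},
])
```

The dictionaries keep their insertion order, so action always appears first when it is listed first.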

After submission, multiple GoobiScripts are processed in the order in which they are listed, each one across all affected processes. In this example with three affected processes, addStep is first applied to all three processes, then addUserGroup, then setStepStatus and finally addPluginToStep.

Please avoid starting additional GoobiScripts while other GoobiScripts are still being processed. Otherwise, the processing of GoobiScripts may be interrupted. This limitation will be fixed in future versions of Goobi workflow.

Available GoobiScripts

You can choose from the following Goobi scripts:

GoobiScript: addUser

The GoobiScript addUser allows you to add a new user to a specific workflow step. Before you apply this Goobi script, you should ensure that you have the correct login name for the user you wish to add to that step. You also need to check the exact name of the step to which you want to add a new user. For the parameter steptitle you should select the name of the step to which you want to add the new user.

GoobiScript: addUserGroup

The GoobiScript addUserGroup is similar to addUser, as it gives additional user rights for workflow steps. For the parameter steptitle, enter the full name of the step to which you want to add a user group, and for the parameter entitled group enter the exact name of the user group you wish to add for that step.

GoobiScript: cloneProcess

The GoobiScript cloneProcess allows the duplication of one or more Goobi processes. The parameter content can be used to specify whether only the database contents and the METS file should be copied or whether all the associated directories (e.g. the images) should also be duplicated. The parameter title can be used to control the titles of the processes to be created. The variable system of Goobi is used here and thus allows a high degree of flexibility.

GoobiScript: renameProcess

The GoobiScript renameProcess allows you to rename one or more Goobi processes. A search string is defined with the search parameter, the new value with replace and the search method with type. The type can be contains (the search value must be contained in the process title) or full (the process title must match the search value exactly). Both the task title in the workflow and the image directories of the process in the file system are renamed.

GoobiScript: deleteTiffHeaderFile

For the GoobiScript deleteTiffHeaderFile there is no need to enter additional parameters. Running this GoobiScript will delete any previously created TIFF header files that can be used by a program that writes the TIFF headers into the images. This allows you, for example, to make centrally modified TIFF headers available for future use, since missing TIFF header files are automatically created on the basis of the configuration the next time the file is accessed.

GoobiScript: swapSteps

The GoobiScript swapSteps allows you to swap the order of two steps within the workflow of a number of processes. To perform a swap, you need to provide the details of each of the steps involved. Enter the workflow number and full name of the first and second step. Running this script will then swap the order of the steps you have specified. This makes it very easy to change workflows across a large number of processes.

GoobiScript: importFromFileSystem

The GoobiScript importFromFileSystem imports existing image sets from a defined output directory into processes that have already been created in Goobi. This can be useful if you want to import projects into Goobi that were created before Goobi was installed. Please note that all the image directories within the specified output directory must have the same name as the processes in Goobi. An automatic import from the file system can only be performed correctly if the folder name and process title are identical. For the parameter entitled sourcefolder, you need to specify the location of the individual directories containing the processes you wish to import.

GoobiScript: setRuleset

The GoobiScript setRuleset allows you to make a central change to the Goobi ruleset for a group of processes. This could be particularly important after detailed editing and testing of a ruleset (for safety reasons this is performed separately in a newly created ruleset), if you then wish to apply the new ruleset to the processes. For the parameter entitled ruleset, you need to specify the name of the ruleset using the name as it appears in the ruleset list in Goobi. The newly assigned ruleset will be entered when you run the GoobiScript, regardless of which ruleset is currently in place for the individual processes being changed.

GoobiScript: deleteStep

You can run the GoobiScript deleteStep if you want to delete a specific step from the workflow for a group of processes. Running the script will delete the workflow step (specified by its full name in the parameter steptitle) from the list of selected processes. Please note that this GoobiScript will also delete any production-related data being stored for that particular workflow step (e.g. project staff, processing date, status).

GoobiScript: addStep

The GoobiScript entitled addStep allows you to automatically create a new step with a specific name and a specific position in the workflow order. For the parameter steptitle, enter the name of the new step, and for the parameter order enter the required workflow order number. In addition to numerical values, the keyword end can also be used to add the step to the end.

GoobiScript: addStepAtOtherStepPosition

The GoobiScript addStepAtOtherStepPosition enables the creation of a workflow step with a defined title at a defined position within the workflow where another workflow step is already located. By inserting the new workflow step, all existing workflow steps with this or a subsequent position are moved so that the new workflow step can be inserted at the desired target position. The parameter newsteptitle allows you to define the title for the new workflow step to be inserted. The parameter existingsteptitle defines the name of the workflow step that determines the target position of the step to be inserted. The parameter insertionstrategy defines whether the new step is to be inserted before (before) or after (after) the specified existing step.
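
The shifting behavior described here can be simulated with a short sketch. It models a workflow as a list of (order, title) pairs with unique order numbers, which is a simplifying assumption; the function insert_step is hypothetical and only mirrors the parameters newsteptitle, existingsteptitle and insertionstrategy.

```python
def insert_step(steps, new_title, existing_title, strategy):
    """steps: list of (order, title) tuples; returns a new list with the step inserted."""
    if strategy not in ("before", "after"):
        raise ValueError("insertionstrategy must be 'before' or 'after'")
    orders = {title: order for order, title in steps}
    target = orders[existing_title] + (0 if strategy == "before" else 1)
    # Shift every step at or behind the target position by one.
    shifted = [(o + 1 if o >= target else o, t) for o, t in steps]
    return sorted(shifted + [(target, new_title)])


workflow = [(1, "Scanning"), (2, "Quality control"), (3, "Export")]
before = insert_step(workflow, "Upload images", "Quality control", "before")
after = insert_step(workflow, "Upload images", "Quality control", "after")
```

With before, the new step takes position 2 and Quality control moves to 3; with after, the new step takes position 3 and only Export moves.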

GoobiScript: setStepStatus

You can choose the GoobiScript setStepStatus to modify the workflow status for a group of processes at the same time. For the parameter steptitle, you need to enter the name of the workflow step whose status you wish to change. For the status parameter, you should enter the required numerical value using the system:

0 = locked
1 = open
2 = in progress
3 = closed
4 = error
5 = deactivated
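
When GoobiScripts are generated by other tools, the numeric values can be kept readable with a small enumeration. This is a sketch only; the class StepStatus and the helper status_line are hypothetical names, and the numbers are taken from the list above.

```python
from enum import IntEnum


class StepStatus(IntEnum):
    """Numeric status values accepted by setStepStatus."""
    LOCKED = 0
    OPEN = 1
    IN_PROGRESS = 2
    CLOSED = 3
    ERROR = 4
    DEACTIVATED = 5


def status_line(status: StepStatus) -> str:
    """Return the 'status:' line of a setStepStatus GoobiScript."""
    return f"status: {int(status)}"
```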

GoobiScript: setStepNumber

Using the GoobiScript setStepNumber you can modify the workflow order number of an individual step for a group of processes. For the parameter steptitle you need to enter the full name of the workflow step you wish to change. For the number parameter you should enter the workflow order number you want to apply to that step for all the selected processes.

GoobiScript: addShellScriptToStep

The GoobiScript addShellScriptToStep allows you to add shell scripts or other command-line calls to designated workflow steps in a group of processes. For the parameter steptitle you need to specify the full name of the steps you wish to change. For the script parameter, enter the full command that you wish Goobi to execute in the form of a command-line call whenever this step is activated.

Please note that shell commands at Linux level begin with /bin/bash.

In the parameter label you define the name for the shell script.

If parameters are to be grouped in the command so that they are passed as one argument to the new process, the quotes required for this must be escaped with a preceding quote each. An example for the script parameter would be accordingly:

script: /bin/bash /path/to/script.sh "parameter with blanks"

GoobiScript: setStepProperty

You can use the Goobi script setStepProperty to set individual options for a specific workflow step in a group of processes at the same time. For the parameter steptitle, you should enter the full name of the step you wish to select. For the property parameter, you will need to select one of the following values:

Parameter

Description

metadata

for changing the metadata property

readimages

for the property whether read access to the images should be possible

writeimages

for the property whether write access to the images should be possible

validate

for the property whether a validation should take place when the workflow step is completed

exportdms

for the property whether the workflow step should be able to perform an export to the presentation system

batch

for the property whether the workflow step should be executed together with all other workflow steps in batch mode

automatic

for the property whether the workflow step should be executed automatically

importfileupload

for the property whether a file upload should be used for the import in this workflow step (Please note that this function is no longer used in Goobi).

acceptandclose

for the property whether the workflow step should be accepted directly without action and closed again (Please note that this function is no longer used in Goobi)

acceptmoduleandclose

for the property whether a module of a work step should be accepted and executed and the workflow step should also be completed immediately. (Please note that this function is no longer used in Goobi).

script

for the property whether the step should execute a script

delay

for the property whether this workflow step is a delay workflow step that should wait a configured time

updatemetadataindex

for the property that the internal database index is to be updated in this workflow step

generatedocket

for the property whether the user should be able to download a docket in this workflow step

Use the value parameter to activate or deactivate the property you have specified by entering true or false.

For example, if you select Scanning as the steptitle, writeimages as the property and true as the value and apply this GoobiScript to a group of processes, a user who accepts the step entitled Scanning will have write access to the images in their working directory for that step.
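
Written out as a call, the Scanning example could look like the sketch below. It assumes the call layout used in the sample calls for the metadata GoobiScripts:

```yaml
---
# Grants write access to the images for the step "Scanning" (values taken from the example in the text).
action: setStepProperty
steptitle: Scanning
property: writeimages
value: true
```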

GoobiScript: export

The GoobiScript export allows you to export a large number of processes. The parameters exportImages and exportOcr can be used to specify whether the associated images and OCR data should be exported. If an export plugin has been configured in the workflow, that plugin will be loaded and used for the export; if not, Goobi will run the default export.
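
A possible call is sketched below. The layout follows the sample calls for the metadata GoobiScripts; that both parameters accept true or false is an assumption based on the description above:

```yaml
---
# Assumption: both parameters take true or false.
action: export
exportImages: true
exportOcr: false
```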

GoobiScript: runScript

Using the GoobiScript runScript, you can initiate a script for a particular workflow step outside the regular workflow. The parameter steptitle is used to enter the full title of the workflow step whose scripts you wish to run.

If the workflow step contains a number of scripts, you can specify which one you wish to run using the script parameter. If this parameter is left blank, all the scripts for that workflow step will be run in the specified sequence.
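
The following sketch shows one way such a call might be written, using the layout of the sample calls for the metadata GoobiScripts. The step title and script name are placeholders, and it is an assumption that the script parameter takes the name under which the script is registered in the step:

```yaml
---
# Placeholder step title and script name (assumptions).
# Leave out the script parameter to run all scripts of the step.
action: runScript
steptitle: Image conversion
script: My script
```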

GoobiScript: deleteProcess

As the name suggests, the GoobiScript deleteProcess is used to delete processes. You have to use the parameter contentOnly (value true or false) to specify whether Goobi should delete only the data from the file system or, additionally, all the information from the database.
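
A minimal sketch of a call, assuming the layout of the sample calls for the metadata GoobiScripts. Going by the parameter name, true is assumed to restrict the deletion to the data in the file system:

```yaml
---
# Assumption: true deletes only the content in the file system and keeps the database entry.
action: deleteProcess
contentOnly: true
```

Since deletions cannot easily be undone, it is worth trying such a call on a single test process first.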

GoobiScript: addPluginToStep

The GoobiScript addPluginToStep allows you to add plugins to workflow steps. You can use the parameter steptitle to specify the name of the workflow step and the parameter plugin for the identifier of the plugin that you wish to add.
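
For illustration, a call might look like this sketch, which uses the layout of the sample calls for the metadata GoobiScripts. Both values are hypothetical placeholders, not names of real steps or plugins:

```yaml
---
# Hypothetical step title and plugin identifier (assumptions).
action: addPluginToStep
steptitle: Quality control
plugin: my_plugin_identifier
```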

GoobiScript: updateImagePath

The GoobiScript updateImagePath updates the path to the image files within the METS files. No parameters are required to run this GoobiScript.

GoobiScript: updateContentFiles

The GoobiScript updateContentFiles updates the list of all image files within the METS files. No parameters are required to run this GoobiScript.

GoobiScript: addToProcessLog

The GoobiScript addToProcessLog allows adding messages to the process log. The type parameter determines how the message should be classified. The message parameter specifies the content of the message.

Possible types

Description

debug

Internal system messages, primarily for administrators

info

Information messages that every user should be able to see

warn

Warning messages that every user should see

error

Error messages that every user should see

user

User comments that users enter visibly for all other users
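
A call that writes an information message might look like the following sketch. The layout is assumed to match the sample calls for the metadata GoobiScripts, and the message text is a placeholder:

```yaml
---
# Placeholder message text (assumption); type is one of the values listed above.
action: addToProcessLog
type: info
message: Images were rescanned after quality control.
```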

GoobiScript: setProject

The GoobiScript setProject allows you to assign the selected processes to a defined project. The parameter project specifies which project should be used for this.
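
As a sketch, assuming the layout of the sample calls for the metadata GoobiScripts and a placeholder project name:

```yaml
---
# Placeholder project name (assumption); the project must already exist.
action: setProject
project: Manuscripts
```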

GoobiScript: runPlugin

The GoobiScript runPlugin allows the execution of a step plugin for the selected processes. The parameter steptitle determines the step of the affected processes whose plugin is to be executed.
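
A call could be sketched as follows, again assuming the layout of the sample calls for the metadata GoobiScripts; the step title is a placeholder:

```yaml
---
# Placeholder step title (assumption); the step must have a plugin assigned.
action: runPlugin
steptitle: Quality control
```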

GoobiScript: import

The GoobiScript import is not intended for execution by users from the user interface. Instead, it is started by the selected plugin during the execution of mass imports. It then performs a mass import in the form specified in the import plugin. The parameter plugin defines the unique name of the plugin. The identifiers parameter specifies the identifiers of the data records that are to be imported. The parameter template determines which production template is to be used for the import.

GoobiScript: metadataDelete

The GoobiScript metadataDelete allows you to delete metadata from a process. The field parameter specifies the type of metadata, where the internal rule set name must be used. The value parameter defines the content of the metadata. The parameter ignoreValue determines whether the content of the parameter value is to be ignored and whether the metadata is to be deleted independently of its value. The parameter type can be used to control whether the metadata to be deleted are present as normal metadata, whether they occur within a named metadata group, or whether an entire metadata group is to be deleted. The parameter group allows the naming of the group concerned. The following application scenarios apply:

If metadata is given as value within type (or no value is given) and no name is given within the parameter group, a normal metadata is deleted.
If metadata is specified as the value within type (or no value is specified) and a name is specified within the parameter group, the metadata is deleted within the named group.
If group is specified as the value within type, the group named within the parameter field is deleted.

The parameter position allows you to specify where the metadata should be found:

Position

Description

work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).

Sample calls:

Delete a metadata entry on second level:

---
action: metadataDelete
field: DocLanguage
value: deutsch
position: child
ignoreValue: false

Delete a metadata entry on top level, ignoring its current value:

---
action: metadataDelete
field: DocLanguage
value: deutsch
position: top
ignoreValue: true

GoobiScript: metadataAdd

The GoobiScript metadataAdd allows you to add new metadata to a process. The field parameter defines the type of metadata, where the internal ruleset name must be used. The value parameter defines what content the new metadata should contain. The parameter ignoreErrors determines whether, in the event of an error, the processing and saving of the METS file should be continued or the processing for the object should be aborted. The parameter type can be used to control whether the change is to be made to a simple metadata or within a metadata group. The parameter group defines the metadata group to which the metadata is to be added. The following application scenarios exist:

If metadata is given as the value within type (or no value is given) and no name is given within the parameter group, the metadata is added as normal metadata.
If metadata is given as the value within type (or no value is given) and a name is given within the parameter group, the metadata is added to the named group.
If group is specified as the value within type, a new metadata group is created with the name specified within the parameter group.

The parameter position allows you to specify where the metadata should be added:

Position

Description

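work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).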

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).

Any authority data can also be added using the authorityName and authorityValue parameters.

Sample calls:

Adding a metadata entry on top level:

---
action: metadataAdd
field: DocLanguage
value: deutschTop
position: top
ignoreErrors: false

Adding a metadata entry on second level:

---
action: metadataAdd
field: DocLanguage
value: deutschChild
position: child
ignoreErrors: false

Adding a metadata entry with authority data:

---
action: metadataAdd
field: DocLanguage
value: deutschChild
position: child
ignoreErrors: false
authorityName: gnd
authorityValue: 123456789X

GoobiScript: metadataReplace

The GoobiScript metadataReplace allows you to replace a metadata with a new value. The old value is thus replaced by another value and is therefore no longer available. The field parameter determines which type the metadata has, whereby the internal ruleset name must be used here. The search parameter defines the current content of the metadata. The replace parameter defines which content the metadata is to have instead. The parameter group specifies whether the change is to be made within a metadata group. If no value is given here, the change is made to a normal metadata. If, on the other hand, a value is given, the change of the metadata takes place within the named metadata group. The parameter position allows you to specify where the metadata to be replaced should occur and be replaced:

Position

Description

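work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).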

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).

Any authority data can also be added or changed using the authorityName and authorityValue parameters.

Sample calls:

Search for a value within a certain top-level metadata and replace it with something else:

---
action: metadataReplace
field: DocLanguage
search: deutschTop
replace: deutschNewTop
position: top

Find a value within a certain second level metadata and replace it with something else:

---
action: metadataReplace
field: DocLanguage
search: deutschChild
replace: deutschNewChild
position: child

GoobiScript: metadataReplaceAdvanced

The GoobiScript metadataReplaceAdvanced allows replacing a metadata with a new value. In contrast to metadataReplace, regular expressions can be used here to manipulate values. The field parameter determines what type the metadata has, whereby the internal ruleset name must be used here. The value parameter defines a regular expression that is applied to the content of the metadata. The parameter group specifies whether the change is to be made within a metadata group. If no value is given here, the change is made to a normal metadata. If, on the other hand, a value is given, the change of the metadata takes place within the named metadata group. Any authority data can be added or changed using the authorityName and authorityValue parameters.

Sample calls:

Finding a value within a specific top-level metadata and replacing it with something else:

---
action: metadataReplaceAdvanced
field: DocLanguage
value: s/deutsch/german/g
position: top

GoobiScript: metadataChangeValue

The GoobiScript metadataChangeValue allows the manipulation of existing metadata of a process. Prefixes or suffixes can be added to an existing metadata to extend its content. The field parameter specifies the type of metadata, where the internal ruleset name must be used. The content of the prefix parameter is placed in front of the current value of the metadata. The content of the suffix parameter is appended after the current value of the metadata. The parameter group specifies whether the change is to be made within a metadata group. If no value is given here, the change is made to a normal metadata. If, on the other hand, a value is given, the change of the metadata takes place within the named metadata group. The parameter position allows you to specify where the metadata should be present and adjusted:

Position

Description

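work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).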

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).

condition

This optional parameter specifies a condition for changing the metadata. If this parameter is used and the given value is not empty, the change is applied only to those data sets that contain the text given here in the specified metadata field.

Sample calls:

Add a prefix and suffix to a top-level metadata:

---
action: metadataChangeValue
field: DocLanguage
prefix: start_
suffix: _end
position: top

Add a suffix to a top-level metadata, but there must be a specific value in the metadata:

---
action: metadataChangeValue
field: DocLanguage
suffix: ist eine schwierige Sprache
position: top
condition: Deutsch

Add a prefix and suffix to a second-level metadata:

---
action: metadataChangeValue
field: DocLanguage
prefix: start_
suffix: _end
position: child

GoobiScript: metadataChangePersonType

The GoobiScript metadataChangePersonType changes the role type of a person. The call requires four parameters. The oldType parameter specifies what the old type should be, and the newType parameter specifies the type that the person should receive instead. The parameter ignoreErrors determines whether, in the event of an error, processing and saving of the METS file should continue or whether processing should be aborted for the object. The parameter position, on the other hand, controls where the person should be present to be changed:

Position

Description

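work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).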

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).
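
No sample call is given for this GoobiScript, so the following is only a sketch modelled on the sample calls for the other metadata GoobiScripts. The two role names are hypothetical; the internal names from your own ruleset must be used instead:

```yaml
---
# Hypothetical role names (assumptions); use the internal names from your ruleset.
action: metadataChangePersonType
oldType: Author
newType: Editor
position: work
ignoreErrors: false
```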

GoobiScript: metadataChangeType

The GoobiScript metadataChangeType changes the type of a metadata. The call requires four parameters. The oldType parameter specifies the old type of the metadatum. The newType parameter specifies the new type for the metadata. The parameter ignoreErrors controls whether, in case of an error, the processing and saving of the METS file should be continued or whether the processing for the object should be aborted. The parameter type can be used to control whether the change is to be made to a simple metadata or within a metadata group. The parameter group defines the metadata group for which the metadatum is to be changed. The following application scenarios exist:

If metadata is specified as value within type (or no value is specified) and no name is specified within the parameter group, a normal metadata is changed.
If metadata is specified as the value within type (or no value is specified) and a name is specified within the parameter group, the metadata is changed within the named group.
If group is specified as the value within type, the type of the named group is changed from the old value (oldType) to the new value (newType).

The parameter position allows you to specify where the metadatum should occur so that it is affected by the change:

Position

Description

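work

This parameter specifies that the metadata should be adjusted at the level of the physical work. This selection automatically chooses the main element (e.g. a monograph) or, in the case of an anchor record, the sub-element (e.g. the periodical volume).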

child

This parameter specifies that the metadata should be adjusted at the level of the sub-element of an anchor record (e.g. a periodical volume or volume).

top

This parameter specifies that the metadata should be adjusted at the level of the anchor record (e.g. at the level of a journal or a multi-volume work).

any

This parameter specifies that the metadata should be adjusted at all levels of the object (e.g. in the volume, in all chapters, title pages, illustrations, etc.).

physical

This parameter specifies that the metadata within the physical structure elements should be adjusted (e.g. metadata of the individual pages).
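
A sketch of a call that changes the type of a normal metadata, modelled on the sample calls for the other metadata GoobiScripts. The two type names are hypothetical stand-ins for internal ruleset names:

```yaml
---
# Hypothetical type names (assumptions); use the internal names from your ruleset.
# type: metadata and no group parameter means a normal metadata is changed.
action: metadataChangeType
oldType: OldMetadataType
newType: NewMetadataType
type: metadata
position: work
ignoreErrors: false
```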

GoobiScript: changeProcessTemplate

The GoobiScript changeProcessTemplate allows you to change the workflow for the affected processes. The templateName parameter defines which production template is to apply to the processes. If this GoobiScript is applied to processes, Goobi tries to set the steps already performed to the identical status in the updated workflow if possible. This can only succeed if the steps have the same titles.
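
As a sketch, assuming the layout of the sample calls for the metadata GoobiScripts and a placeholder template name:

```yaml
---
# Placeholder name of an existing production template (assumption).
action: changeProcessTemplate
templateName: Monograph_workflow
```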

GoobiScript: updateDatabaseCache

The GoobiScript updateDatabaseCache ensures that the internal database table of the Goobi database is updated with the status of the workflows and the associated media files as well as metadata. This is important if, for example, the metadata has been modified outside of Goobi, or if a new index field has been defined. Among other things, various statistics are based on these database tables and therefore require as up-to-date values as possible for the visualization of information.

No parameters are required to run this GoobiScript.

GoobiScript: propertySet

The GoobiScript propertySet allows you to add and change a process property. The parameter name specifies the name of the property. The value parameter specifies the value that the property should have. If a property already exists with the name specified here, its value is changed to the value specified here.
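
A call might be written as in the sketch below, which assumes the layout of the sample calls for the metadata GoobiScripts. The property name and value are placeholders:

```yaml
---
# Placeholder property name and value (assumptions).
action: propertySet
name: Scanner
value: Book scanner 2
```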

GoobiScript: propertyDelete

The GoobiScript propertyDelete allows the deletion of process properties. The parameter name specifies the name of the properties to be deleted.
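
The matching deletion could be sketched like this, with the same assumed layout and a placeholder property name:

```yaml
---
# Placeholder property name (assumption).
action: propertyDelete
name: Scanner
```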

GoobiScript: executeStepAndUpdateStatus

The GoobiScript executeStepAndUpdateStatus executes a selected step and then updates the workflow for further processing of the following workflow steps. The steptitle parameter determines which step is to be executed. After the call, Goobi checks whether this is a script step, an export step, a plugin step or an HTTP step and executes it accordingly. If the workflow step is marked as automatic, the workflow continues after the execution. If, on the other hand, the call triggers a wait mode, Goobi does not change the status itself but waits for the respective plugin or script to change it. If an error occurs while the started step is being executed, the status of the workflow step is set to error.
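
A minimal sketch of a call, assuming the layout of the sample calls for the metadata GoobiScripts; the step title is a placeholder:

```yaml
---
# Placeholder step title (assumption).
action: executeStepAndUpdateStatus
steptitle: Image conversion
```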

GoobiScript: exportDatabaseInformation

The GoobiScript exportDatabaseInformation exports all database contents of the selected Goobi processes to an internal XML file. This file is then located in the Goobi file system within the process folder and can be used to import the data into another Goobi instance. The path of such a file is e.g.

/opt/digiverso/goobi/metadata/123/123_db_export.xml

No parameters are required to run this GoobiScript.

GoobiScript: moveWorkflowForward

With the GoobiScript moveWorkflowForward the status of the workflow can be moved forward step by step. Each time this GoobiScript is executed, the status of the active workflow step is changed accordingly (e.g. from open to in progress).

GoobiScript: moveWorkflowBackward

With the GoobiScript moveWorkflowBackward the status of the workflow can be moved backwards step by step. Each time this GoobiScript is executed, the status of the active workflow step is changed accordingly (e.g. from closed to in progress).

GoobiScript: setPriority

The GoobiScript setPriority can be used to define the priority of individual or all process steps. The parameter priority determines which priority should be used. The following values are available: standard, high, higher, highest and correction. The parameter steptitle determines for which workflow step the priority is to be set. If the parameter steptitle is not specified, the priority is changed for all workflow steps of the selected processes.
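
For illustration, the following sketch raises the priority of a single step. It assumes the layout of the sample calls for the metadata GoobiScripts; the step title is a placeholder:

```yaml
---
# Placeholder step title (assumption); omit steptitle to change all steps of the selected processes.
action: setPriority
priority: high
steptitle: Scanning
```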

goobi_config.properties

The configuration file goobi_config.properties defines some fundamental settings for Goobi workflow. The file is usually located at the following file system path:

/opt/digiverso/goobi/config/goobi_config.properties

A sample configuration file looks as follows:

goobi_config.properties

# -----------------------------------
# application information
# -----------------------------------

ApplicationTitle=http://goobi.io
ApplicationHeaderTitle=Goobi workflow

# Text that describes the website
ApplicationHomepageMsg=info
ApplicationWebsiteMsg=info

# Developer mode (true) or production mode (false)
developing=false

# -----------------------------------
# directories
# -----------------------------------

# Main folder for Goobi incl. subfolders config, xslt, rulesets, metadata etc.
# Path configured here should end with path separator
# sample and default if missing: /opt/digiverso/goobi/
goobiFolder=/opt/digiverso/goobi/

# use this folder if metadata directory is not goobiFolder + metadata/
#dataFolder=

# parent folder for home directories, default is goobiFolder + users/
#dir_Users=/home/

#folder for debugging files, can be used by opac beautifier 
#debugFolder=

# folder for the process log
# for compatibility reasons, also folder_processlog_internal is requested
#folder_journal_internal=intern

# folder for mass upload functionality
#doneDirectoryName=fertig/

# path for swapping, this could for example be /tmp/unused/
#swapPath=

# naming rule for master folder 
#process.folder.images.master={processtitle}_master

# naming rule for media folder
#process.folder.images.main={processtitle}_media

# naming rule for source folder
#process.folder.images.source={processtitle}_source

# naming rule for image fallback folder - not needed any more, is replaced with thumbs directories
# default value is an empty string
#process.folder.images.fallback={processtitle}_jpeg

# naming rule for ocr text folder
#process.folder.ocr.txt={processtitle}_txt

# naming rule for pdf folder
#process.folder.ocr.pdf={processtitle}_pdf

# naming rule for ocr xml folder
#process.folder.ocr.xml={processtitle}_xml

# naming rule for ocr alto folder
#process.folder.ocr.alto={processtitle}_alto

# naming rule for import folder
#process.folder.import=import

# naming rule for export folder
#process.folder.export=export

# create master directory if it does not exist
createOrigFolderIfNotExists=true

# indicates whether the source folder should be created automatically or not, default is false
createSourceFolder=false

# -----------------------------------
# global user settings
# -----------------------------------

# set a default language, language can be changed by the user. If no language is set, the browser default is used
#defaultLanguage=

# anonymized statistics, displaying user on steps, etc
# possible values: true/false
anonymize=false

# enable or disable usage of gravatar icons
#enableGravatar=true

# The minimum password length for user accounts
# This value is also used for new generated passwords, they are generated with
# this length + 10
minimumPasswordLength=8

# Multiple additional user rights can be set here, default is an empty list
userRight=

# -----------------------------------
# user interface features
# -----------------------------------

ui_useIntrandaUI=true

# include the file accessibility.css in the template. Can be overwritten in user settings
renderAccessibilityCss=false

# show statistics box on startpage, default is true
showStatisticsOnStartPage=true

# enable or disable the finalize button in task/batch edition screens. The default value is to display the button
#TaskEnableFinalizeButton=true

#show button to link into home directory
#ui_showFolderLinkingInProcessList=false

#display confirmation dialogue when link into home directory is set from process list 
#confirmLinking=false

# A button to reimport already exported content can be activated
#renderReimport=false

# use this parameter to exclude user agents from session list
excludeMonitoringAgentName=Munin
excludeMonitoringAgentName=munin
excludeMonitoringAgentName=nagios-plugins
excludeMonitoringAgentName=monitoring-plugins
excludeMonitoringAgentName=ELB-HealthChecker/2.0
excludeMonitoringAgentName=python-requests

# -----------------------------------
# ldap
# -----------------------------------

# Use LDAP for logins
ldap_use=false

# -----------------------------------
# truststore
# -----------------------------------

# Keystore for LDAP and other services
# There is no default value, but it can look like this example:
# /opt/digiverso/goobi/scripts/mykeystore.ks
#truststore=
#truststore_password=

# -----------------------------------
# open id connect
# -----------------------------------

# Must be set to true to use OpenID Connect
#useOpenIdConnect=false
# Automatic redirection to OpenID Connect login
#OIDCAutoRedirect=false
# The OpenID Connect authentication server
#OIDCAuthEndpoint=
# API endpoint for logout command
#OIDCLogoutEndpoint=
# The issuer of OpenID Connect
#OIDCIssuer=
# The JWK set of OpenID Connect
#OIDCJWKSet=
# The client ID of this server
#OIDCClientID=
# The notifying method
#OIDCIdClaim=email
# Can be set to true to use SSO logout
#useOIDCSSOLogout=false

# -----------------------------------
# single sign on
# -----------------------------------

# Enables a login method via HTTP header field
#EnableHeaderLogin=false
# The login type via the HTTP header field
#SsoParameterType=header
# The name of the HTTP header field to login
#SsoHeaderName=Casauthn
# Enables a logout page for 
#showSSOLogoutPage=false

# -----------------------------------
# external users
# -----------------------------------

# enable an additional login area for external users, it allows a different UI and a self registration 
#EnableExternalUserLogin=false

# assign the self registered users to this institution, this could be "goobi" for example
#ExternalUserDefaultInstitution=

# assign the self registered users to this authentication type
#ExternalUserDefaultAuthentication=

# -----------------------------------
# database search
# -----------------------------------

# enable fulltext search mode for metadata searches. Doesn't work on h2 or older mysql/mariadb databases
useFulltextSearch=false

# configure fulltext search mode, possible values are 'NATURAL LANGUAGE MODE' or 'BOOLEAN MODE'
# see https://www.w3resource.com/mysql/mysql-full-text-search-functions.php
#FulltextSearchMode=BOOLEAN MODE

#truncation characters in sql queries
#DatabaseLeftTruncationCharacter=%
#DatabaseRightTruncationCharacter=%

# enable this to use a specific index for my tasks queries. The best table index might be different from database to database
# if commented out, no specific index is used
#SqlTasksIndexname=status_x_title

#generate one metadata index field for multiple metadata coming from the METS file
#index.ids=CatalogIDDigital,CatalogIDSource

# -----------------------------------
# processes and process log
# -----------------------------------

# Set this to true to automatically reset the process log if processes are cloned
#ProcessCreationResetJournal=false

# allow white spaces in directory names or replace them with __
#dir_allowWhiteSpaces=false

# allow import with plugin mechanism for mass imports
massImportAllowed=false

# allow process title duplication
#MassImportUniqueTitle=true

# maximum number of items per batch, if not configured the default is 100
batchMaxSize=500

# enables the option to see the last edition date, username and title of the last finished step
ProcesslistShowEditionData=false

# Defines the start times of the daily delay job, the daily vocab job and the daily history analyser job
# If missing or value is -1, the job is disabled.
# Every other number is interpreted as MILLISECONDS after midnight.
# These values are requested by org.goobi.production.flow.jobs.JobManager
dailyDelayJob=-1
dailyVocabJob=-1
dailyHistoryAnalyser=-1

# This is the upload frequency of the goobi authority server in MINUTES
# This value is requested by org.goobi.production.flow.jobs.JobManager
goobiAuthorityServerUploadFrequencyInMinutes=-1

# Activates additional columns for search result
downloadAvailableColumn=CatalogIdDigital
downloadAvailableColumn=TitleDocMain
downloadAvailableColumn=PublicationYear
downloadAvailableColumn=PlaceOfPublication

# Text templates for error reporting and problem solutions
#task.error.Missing\ pages=The following pages are missing: {}
#task.error.Blurred\ images=The images {} are unsharp. Please create these again.
#task.solution.Problem\ solved=The problem was solved. {}
#task.solution.The\ original\ print\ is\ blurred=The original pages are printed blurry. It is not possible to create sharper images. {}

# -----------------------------------
# scripts
# -----------------------------------

# These values can be set to paths to bash script files that do the concerning tasks
#script_createDirUserHome=
#script_createDirMeta=
#script_createSymLink=
#script_deleteSymLink=

# -----------------------------------
# s3 bucket
# -----------------------------------

# Can be set to true to enable S3 usage
#useS3=false

# Can be set to true to use a custom S3 service
#useCustomS3=false

# If useCustomS3 is enabled, the endpoint can be specified here
#S3Endpoint=

# The used S3 bucket is specified here
#S3bucket=

# The access id of the account of goobi against the S3 service
#S3AccessKeyID=

# The secret access key of that account
#S3SecretAccessKey=

# The number of retries if a connection does not succeed
#S3ConnectionRetry=10

# The timeout for any connection tries
#S3ConnectionTimeout=10000

# The timeout for socket concerning things
#S3SocketTimeout=10000

# -----------------------------------
# proxy server
# -----------------------------------

http_proxyEnabled=false
#http_proxyUrl=127.0.0.1
#http_proxyPort=3128
http_proxyIgnoreHost=127.0.0.1
http_proxyIgnoreHost=localhost

# -----------------------------------
# internal servers and interfaces
# -----------------------------------

# allow external programs to send commands to Goobi via WebAPI
useWebApi=false

# The token salt value can be used to make authentication on the Goobi REST API more secure
#apiTokenSalt=

#the jwtSecret is needed to (among others) authenticate mail delivery deactivation
#jwtSecret=

# goobi base url, can be used when url cannot be detected from user sessions
goobiUrl=http://localhost:8080/goobi

#The url of the plugin server of goobi
#pluginServerUrl=

# Base path for OCR (without parameters)
ocrUrl=

# TimeOut for GoobiContentServlet-Request via HTTP in ms (default value, if nothing defined here: 60000)
goobiContentServerTimeOut=30000

# The url, the user name, the password and the upload frequency can be set
# for the authority server here
goobiAuthorityServerUrl=
goobiAuthorityServerUser=
goobiAuthorityServerPassword=
goobiAuthorityServerUploadFrequency=

# account name for geonames api
#geonames_account=

# -----------------------------------
# message broker
# -----------------------------------

# Set this to true to let the message broker start
#MessageBrokerStart=false

# The default value is the configuration folder + "goobi_activemq.xml"
#ActiveMQConfig=

# The server IP or domain of the message broker
#MessageBrokerServer=localhost

# The port number of the message broker
#MessageBrokerPort=61616

# The password to access the message broker
#MessageBrokerPassword=

# The number of parallel messages can be set here
#MessageBrokerNumberOfParallelMessages=1

# External Queues can be enabled here
allowExternalQueue=false

# The type of the external queue, currently the possible values are "SQS" and "activeMQ"
externalQueueType=activeMQ

#set this to true if you want to test the SQS external queue with elasticMQ
useLocalSQS=false

# -----------------------------------
# mets editor
## mets editor / general properties
# -----------------------------------

# initialise all sub elements in Mets editor to assign default values, default value is true
MetsEditorEnableDefaultInitialisation=true

# create pagination when mets editor is opened 
#MetsEditorEnableImageAssignment=true

# use special pagination type for automatic default pagination (uncounted, roman, arabic)
MetsEditorDefaultPagination=uncounted

# configure the locking time for mets editor timeout in ms, default is 30 minutes
MetsEditorLockingTime=1800000

# use external ocr for text in mets editor or use existing files 
#MetsEditorUseExternalOCR=false

# The number of backups can be set here. 0 means that no backups are created
numberOfMetaBackups=0

# -----------------------------------
## mets editor / user interface
# -----------------------------------

# Show the OCR button for the selected structure element
showOcrButton=false

# Display the METS editor area for manipulation of the image set
MetsEditorDisplayFileManipulation=false

# Display archived folders
MetsEditorShowArchivedFolder=false

# display/hide metadata popup in structure tree
#MetsEditorShowMetadataPopup=true

# use a maximum of characters to display titles in the left part of mets editor, the default value is 0 (everything is displayed)  
MetsEditorMaxTitleLength=0

# -----------------------------------
## mets editor / images and thumbnails
# -----------------------------------

# enable to show image comments in METS editor, imageQA and LayoutWizzard
ShowImageComments=false

# Number of images in thumbnail view
MetsEditorNumberOfImagesPerPage=96

# This value can be set to true to use image tiling in the METS editor
MetsEditorUseImageTiles=true

# sorting of images
# At this time implemented sorting options:
# number (default): 1 is less than 002, compares the number of image names, characters other than digits are not supported
# alphanumeric: 1 is greater than 002, compares character by character of image names, all characters are supported
ImageSorting=number

# Prefix for image names as regex. Default is 8 digits \\d{8} and gets validated
ImagePrefix=\\w+
#ImagePrefix=\\d{8}
#ImagePrefix=.+

# define owner of images, when read access is provided. Default is root user 
#UserForImageReading=root

# This can be set to true to use image thumbnails
UseImageThumbnails=true

# Size of thumbnails in METS editor
MetsEditorThumbnailsize=200

# Maximum number of requested thumbnails to not overload the server
MaxParallelThumbnailRequests=100

# The maximum image size in pixels
MetsEditorMaxImageSize=15000

# The maximum image size in bytes, MaxImageFileSizeUnit must be set as factor
MaxImageFileSize=4000

# The unit for the maximum image size in bytes, MaxImageFileSize must be set as numeric value
MaxImageFileSizeUnit=MB

# Sizes for big images in METS editor to allow standard display and deep zoom
# This value can be set multiple times
MetsEditorImageSize=

# The size of image tiles
# This value can be set multiple times
MetsEditorImageTileSize=

# The scale of image tiles
# This value can be set multiple times
MetsEditorImageTileScale=

# A list of image file types that are used for process checks
historyImageSuffix=.tif

# -----------------------------------
## mets editor / validation
# -----------------------------------

# whether to perform the general metadata validation or not
useMetadatenvalidierung=true

# Validate the images in the METS editor
MetsEditorValidateImages=true

# regular expression to check if the process title is valid
validateProcessTitelRegex=[\\w-]*

# regular expression for all characters to remove in title generation
ProcessTitleGenerationRegex=[^\\w-]

# -----------------------------------
## mets editor / export
# -----------------------------------

# The path to the exif tool to export the images
ExportExiftoolPath=/usr/bin/exiftool

# set if Master-Images-Folder 'orig_' should be used at all
useOrigFolder=true

# if this parameter is missing or 'false' the old export mechanism is used, otherwise there is no time limit for export
exportWithoutTimeLimit=true

# Validate images on mets export. Default value is true
ExportValidateImages=true

# Defines the name of the metadata field where the project title gets exported.
# If the field is empty, missing or contains an unknown value, the project title is not written.
#ExportMetadataForProject=MetadataName

# Defines the name of the metadata field where the institution name gets exported.
# If the field is empty, missing or contains an unknown value, the institution name is not written.
#ExportMetadataForInstitution=MetadataName

# Defines the name of the metadata field where the dfg viewer link gets exported.
# If the field is empty, missing or contains an unknown value, the link is not written.
#ExportMetadataForDfgViewerUrl=MetadataName

# Define if files shall get exported if optional file groups for these files are configured
ExportFilesFromOptionalMetsFileGroups=false

# export in temporary file, move it to destination or export directly to destination
ExportInTemporaryFile=false

# Use UUID for each file id instead of incremental numbers
ExportCreateUUID=true

# Create premis elements for technical metadata for each exported file
ExportCreateTechnicalMetadata=false

# Define here if in the automatic export images shall be exported too or not
automaticExportWithImages=true

# Define here if in the automatic export OCR results shall be exported too or not
automaticExportWithOcr=true

# Allow the PDF generation as downloadable file instead of storing it into the users home directory
pdfAsDownload=true

General

This configuration file has been used in Goobi Workflow for a very long time. It therefore contains some obsolete settings that are still supported only for compatibility reasons. These are marked as such and, where applicable, their description names the setting that replaces them.

Because settings have been added over a long period of time, and new ones are still being added, the variable names do not follow a single naming convention. Pay particular attention to upper and lower case as well as to underscores and periods in the variable names.

This configuration file contains settings for many different topics. Some settings fit several topics and could, depending on the use case, be listed in different categories. If a setting cannot be found in the expected place, search the page with the browser's search function (usually Ctrl+F).

Many settings have default values that are suitable for most users. Such settings do not need to be specified in the configuration file, or they can simply be commented out.

Comments are marked with a hash symbol (#) at the beginning of the line. This is also useful to disable settings.

# This is a comment and is ignored
# setting=This is a setting that is also ignored
setting=This setting is used
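The effect of comment lines and of settings that occur several times can be illustrated with a minimal sketch. The function `read_settings` is a hypothetical helper written for this illustration; it is not the parser Goobi Workflow uses and it ignores the escape sequences that real properties files support.

```python
def read_settings(text):
    """Collect key=value lines; a key that occurs several times yields a list."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Blank lines and lines starting with '#' are skipped, so a
        # commented-out setting is treated exactly like a comment.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


example = """\
# This is a comment and is ignored
# setting=This is a setting that is also ignored
setting=This setting is used
"""

settings = read_settings(example)
print(settings)
```

Storing every value in a list is a design choice of this sketch: it keeps single-valued settings and repeatable settings (such as userRight) in the same structure.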

Data types used

Data types are specified for the settings in this configuration file. Unless otherwise specified, the following values are allowed:

Type

Description

Text

Any text can be specified here.

Boolean

Here the values 'true' or 'false' can be used to enable or disable a functionality.

Number

Any integer can be specified here. Decimal numbers are not allowed and are not used by any setting so far. Where necessary, the description of a setting specifies which range of values can be processed.

Time

This data type is actually intended for very large numbers and is used for time periods in this configuration file. The usage is described in more detail in the respective settings.

Basic information about Goobi Workflow

Some basic information about the installed Goobi Workflow instance and about Goobi in general is specified in the configuration and partially displayed on the login page in the web browser.

Property

Type

Default value

Description

ApplicationTitle

Text

http://goobi.io

The URL specified here leads to the general Goobi web page. There you can find more information about Goobi Workflow, the Goobi Viewer and Goobi-to-go. The documentation and the community section are also linked there.

ApplicationHeaderTitle

Text

Goobi workflow

The name of the software is entered here. It will be displayed on the Goobi Workflow login page.

ApplicationHomepageMsg

Text

Here you can optionally specify the URL of an associated home page for the currently installed Goobi Workflow instance.

ApplicationWebsiteMsg

Text

The URL of an associated website for the currently installed Goobi Workflow instance can optionally be specified here.

developing

Boolean

false

This switch indicates whether Goobi Workflow is in development mode. On production systems this value is always false.

Directories

Some directory paths can be set in the configuration file. Most of them are specified relative to the Goobi directory. The Goobi directory is usually located in /opt/digiverso/goobi/.

Some directories contain the text sequence {processtitle}. This is used in the respective directory names to insert the automatically generated title of a process. This ensures that the corresponding directories are created and used separately for each process.

Property

Type

Default value

Description

goobiFolder

Text

/opt/digiverso/goobi/

This is the root directory of Goobi. It contains common directories such as config/, plugins/, rulesets/ etc.

dataFolder

Text

metadata/

All files of processes are stored in this directory.

dir_Users

Text

users/

This directory contains information about Goobi users.

debugFolder

Text

This directory can be used to write an intermediate result to an XML file when importing OPAC data. The file is created in the directory specified here under the name opacBeautifyAfter.xml.

folder_journal_internal

Text

internal

Process log files are stored in this directory. This value replaces the value folder_processlog_internal and is preferred if both values are used.

folder_processlog_internal

Text

internal

Process log files are stored in this directory. This value is deprecated and was replaced with folder_journal_internal, but is still supported for compatibility reasons.

doneDirectoryName

Text

finished/

Files to be uploaded to other systems can be copied to this directory. The directory and all files in it will be deleted after the upload.

useSwapping

Boolean

false

This parameter specifies whether more space should be made available on external disks.

swapPath

Text

The "swap" directory is used to have more space available on the current system. The directory is usually located on a network hard disk or an external disk that must be mounted in the operating system before use. To use swapping, useSwapping=true must be set.

process.folder.images.master

Text

{processtitle}_master

This directory is used to store the original images for the respective process.

process.folder.images.main

Text

{processtitle}_media

In this directory additional data for the respective process will be stored.

process.folder.images.source

Text

{processtitle}_source

Additional resources for the respective process are stored in this directory.

process.folder.images.fallback

Text

This directory contains files that are created during the processing of the process and are to be processed later.

process.folder.images.[name]

Text

At this point any other [name] values can be used to provide further directories for different applications, for example also plugins.

process.folder.ocr.txt

Text

{processtitle}_txt

This directory is used to store TXT files for OCR processing.

process.folder.ocr.pdf

Text

{processtitle}_pdf

This directory contains PDF files for OCR processing.

process.folder.ocr.xml

Text

{processtitle}_xml

This directory contains XML files for OCR processing.

process.folder.ocr.alto

Text

{processtitle}_alto

This directory contains additional files for OCR processing.

process.folder.import

Text

import

Files for importing processes are stored in this directory.

process.folder.export

Text

export

Files for exporting processes are stored in this directory.

createOrigFolderIfNotExists

Boolean

true

This value can be set to true to automatically create the master directory for processes if it does not exist yet.

createSourceFolder

Boolean

false

This value can be set to true to create the directory named in process.folder.images.source if it does not exist yet.
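How the {processtitle} placeholder turns a configured pattern into a per-process directory name can be sketched as follows. The helper `resolve_folder_name` and the process title used here are hypothetical; the patterns are the default values listed in the table above, and the sketch only covers the name itself, not the location of the directory.

```python
# Default patterns taken from the table above.
FOLDER_PATTERNS = {
    "process.folder.images.master": "{processtitle}_master",
    "process.folder.images.main": "{processtitle}_media",
    "process.folder.ocr.alto": "{processtitle}_alto",
}


def resolve_folder_name(pattern, process_title):
    """Insert the process title where the pattern contains {processtitle}."""
    return pattern.replace("{processtitle}", process_title)


# Hypothetical process title, used for illustration only.
process_title = "example_process_0001"

resolved = {
    prop: resolve_folder_name(pattern, process_title)
    for prop, pattern in FOLDER_PATTERNS.items()
}
for prop, name in resolved.items():
    print(f"{prop} -> {name}")
```

A pattern without the placeholder, such as the default `import`, is returned unchanged, which matches directories that are named identically for every process.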

General user settings

All custom user settings are stored by Goobi Workflow in the internally used database. The settings specified here are relevant for pages where no user is logged in (for example, the login page) and where user-specific settings are therefore not available. Other settings in this category apply to all users and cannot be customized per account.

Property

Type

Default value

Description

defaultLanguage

Text

This setting can be used to specify the default language of the Goobi Workflow user interface. This language is used until a user logs in and that user's own language setting is loaded. Any language abbreviation available on the Goobi Workflow instance can be used. By default these are de, en, es, fr, it, iw, nl and pt.

anonymize

Boolean

false

This switch can be set to true if the names of other users should be hidden in statistics and step and process details.

enableGravatar

Boolean

true

This switch can be used to specify whether profile pictures are used for users. Profile pictures are specified in the user accounts. If this setting is enabled and a user has no profile picture, the Goobi logo will be used by default.

Minimum password length

Property

Type

Default value

Description

minimumPasswordLength

Number

8

This value specifies the minimum password length for users. If a length less than 1 is specified, this value is replaced by 1.

The password length is only checked when an account is created and when users change their own password. Existing passwords that are too short can still be used even if this value is increased afterwards.

Administrators can generate a new random password for users. Generated passwords are always 10 characters longer than the minimum password length.
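The two length rules described above amount to simple arithmetic, sketched here. Both function names are hypothetical, and the sketch assumes that the 10 extra characters are added to the effective minimum, that is, after values below 1 have been replaced by 1.

```python
def effective_minimum_length(configured):
    """A configured length below 1 is replaced by 1."""
    return max(1, configured)


def generated_password_length(configured):
    """Assumption: generated passwords are 10 characters longer than the
    effective minimum password length."""
    return effective_minimum_length(configured) + 10


for configured in (8, 0, 12):
    print(configured, effective_minimum_length(configured),
          generated_password_length(configured))
```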

Additional user permissions

Additional user permissions can be specified. For this purpose, a new entry with the 'userRight' property is made for each added permission. Thus, this value can occur multiple times and will be interpreted as a list by Goobi Workflow.

Property

Type

Default value

Description

userRight

Text

This setting can be used to add additional user permissions.

For example:

userRight=Statistics_Latest
userRight=Statistics_Most_Relevant
userRight=Statistics_Users

Note: You should not enter user permissions whose name already exists in Goobi Workflow. This can lead to wrong behavior of some functions.

Additional functionalities in the user interface

Goobi Workflow allows some optional functionalities for the user interface in the web browser. For this purpose there are the following switches that can be set to 'true' to enable functionalities:

Property

Type

Default value

Description

ui_useIntrandaUI

Boolean

true

This switch can be used to set whether the intranda web interface is used as user interface. Nowadays this switch is always true.

renderAccessibilityCss

Boolean

false

This switch sets whether an accessible design is loaded by default (for example on the login page). This is useful if some users depend on the accessible design and the user-specific accessible design is not yet loaded on the login page.

showStatisticsOnStartPage

Boolean

true

This switch can be set to show statistics on the start page (after login).

TaskEnableFinalizeButton

Boolean

true

This switch can be set if users should have a button in the user interface to complete steps themselves.

ui_showFolderLinkingInProcessList

Boolean

false

This switch can be set to show extended download buttons for tasks in the task list.

confirmLinking

Boolean

false

This switch can be set to interpose a confirmation prompt before executing scripts.

renderReimport

Boolean

false

This switch can be set to display a button when downloading a process that allows the process to be re-imported.

The following setting is used to filter the list of displayed sessions in the session overview. The setting itself can be used multiple times, each time specifying a client name (for example browser name) to be filtered from the session list.

Property

Type

Default value

Description

excludeMonitoringAgentName

Text

This setting specifies a name of a client which should not be displayed in the session list.

With multiple names, the configuration looks like this, for example:

excludeMonitoringAgentName=Munin
excludeMonitoringAgentName=munin
excludeMonitoringAgentName=nagios-plugins
excludeMonitoringAgentName=monitoring-plugins
excludeMonitoringAgentName=ELB-HealthChecker/2.0
excludeMonitoringAgentName=python-requests

Activate LDAP

The actual LDAP configuration is located in the database used by Goobi Workflow. In this database, there is a data record with numerous setting options for each LDAP group used. Whether LDAP is included can be controlled with the following parameter.

Property

Type

Default value

Description

ldap_use

Boolean

false

This value indicates whether an LDAP service should be used.

Configure truststore

The truststore is used in Goobi Workflow to manage certificates and SSH keys. These can be used, for example, for authentication to the LDAP server or to other servers. To use the truststore, the following values must be configured.

Property

Type

Default value

Description

truststore

Text

This value specifies where the truststore is located.

truststore_password

Text

This value specifies the password for authentication in the truststore.

OpenID Connect

Normally, user accounts are stored in the MariaDB database managed by Goobi Workflow. In addition, users can be authenticated with OpenID accounts.

Property

Type

Default value

Description

useOpenIdConnect

Boolean

false

Setting this value to true will enable the ability to login with OpenID Connect in Goobi Workflow.

OIDCAutoRedirect

Boolean

false

If this value is set to true, users who open the login page are redirected directly to the login page of the OpenID provider. After a successful login, the provider redirects directly back to Goobi Workflow.

OIDCAuthEndpoint

Text

This specifies the API endpoint (URL or URI) used to authenticate the user. Specifying this value also configures the provider of the OpenID service.

OIDCLogoutEndpoint

Text

Since a second API endpoint is used for logout, it must be explicitly specified here. This endpoint is also specified in the form of a URL or URI.

OIDCIssuer

Text

The issuer of the OpenID provider is configured here.

OIDCJWKSet

Text

The JWK service of the OpenID provider is configured here.

OIDCClientID

Text

This specifies the client id that Goobi Workflow uses against the OpenID service.

OIDCIdClaim

Text

email

The value specified here can be set in the user database in the ssoId column to allow Goobi Workflow to use or ignore the OpenID service depending on the account.

useOIDCSSOLogout

Boolean

false

If this value is set to true, the user will be redirected to an intermediate page after successfully logging out.

Single Sign On (SSO)

SSO can be used to allow and configure authentication via HTTP headers. This allows or disallows staying logged in to a session in the browser using HTTP header fields.

Property

Type

Default value

Description

EnableHeaderLogin

Boolean

false

Setting this value to true will enable SSO login in Goobi Workflow.

SsoParameterType

Text

header

This value determines where exactly the SsoHeaderName will be searched. If header is specified here, the value will be searched in the HTTP header fields. If attribute is specified here, the value will be searched in the URL parameters.

SsoHeaderName

Text

Casauthn

The text specified here will be requested as HTTP header field. The value sent in this field will be read and must match the SSO ID.

showSSOLogoutPage

Boolean

false

If this value is set to true, a corresponding intermediate page will be displayed after logout.

Set up external users

The following settings can be used to determine whether users with external accounts can be logged in to the system and, if so, how default values should be set that otherwise exist for normal accounts in the database.

Property

Type

Default value

Description

EnableExternalUserLogin

Boolean

false

If this value is set to true, people with external accounts can log in to the system.

ExternalUserDefaultInstitution

Text

Since external accounts are not assigned to any institution, an alternative institution name for all external accounts can be specified here.

ExternalUserDefaultAuthentication

Text

Here you can specify an LDAP group to which users with external accounts should be assigned by default.

Database settings

In Goobi Workflow it is possible to search for specific terms in processes, tasks and other records. Since the search is handled by the SQL database used in the background, there are some settings available to adapt the search to the needs of each project.

Property

Type

Default value

Description

useFulltextSearch

Boolean

false

If this switch is set to true, a full text search is performed that searches all database entries instead of a simple search using titles and a few metadata.

FulltextSearchMode

Text

BOOLEAN MODE

The search mode can be used to specify how to search the database. If BOOLEAN MODE is specified, the search term may contain special operators, for example + and - to require or exclude words. An alternative value is NATURAL LANGUAGE MODE. In this mode the search term (even with syntactically special characters) is searched directly in the text.

DatabaseLeftTruncationCharacter

Text

%

This character (or character string) is used as prefix in a database search query in connection with the SQL command LIKE. A % used as a prefix stands for any other text that can occur before the search term.

DatabaseRightTruncationCharacter

Text

%

This character (or character string) is used as suffix in a database search query in connection with the SQL command LIKE. A % used as suffix stands for any other text that can occur after the search term.

SqlTasksIndexname

Text

An SQL index can be used for the search. The name of the index used is specified here.

For the settings DatabaseLeftTruncationCharacter and DatabaseRightTruncationCharacter, % is specified in each case. This will cause the database search to be performed with SQL as follows (This example is for illustration purposes only and does not work this way in the actual database):

SELECT Title FROM Book WHERE Book.Title LIKE "%search term%";

This selects all records in which the search term occurs at any position. For example, if you omit the prefix % and search for search term%, you can search at the beginning of a record. This means, for example, that all books beginning with "The" can be searched for. If the prefix and suffix are omitted, only the search term is searched for and all records are selected that exactly match the search term.
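The three cases described above (prefix and suffix, suffix only, neither) can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in for the database behind Goobi Workflow; the table Book, its sample titles and the helpers `like_pattern` and `search` are made up for illustration and do not reflect the actual schema or query code.

```python
import sqlite3


def like_pattern(term, left="%", right="%"):
    """Wrap the search term in the left and right truncation characters."""
    return f"{left}{term}{right}"


# In-memory database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (Title TEXT)")
conn.executemany(
    "INSERT INTO Book (Title) VALUES (?)",
    [("The Atlas",), ("Around The World",), ("The",)],
)


def search(term, left="%", right="%"):
    """Return all titles matching the LIKE pattern, sorted by title."""
    rows = conn.execute(
        "SELECT Title FROM Book WHERE Title LIKE ? ORDER BY Title",
        (like_pattern(term, left, right),),
    )
    return [title for (title,) in rows]


both = search("The")                       # pattern '%The%': term anywhere
prefix_only = search("The", left="")       # pattern 'The%': term at the start
exact = search("The", left="", right="")   # pattern 'The': exact match only

print(both)
print(prefix_only)
print(exact)
```

The search term is passed as a bound parameter rather than concatenated into the SQL text, which avoids quoting problems in the sketch.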

To make database searches easier, aliases can be defined to group multiple properties together as one meta-property. For example, when searching for a person, it would be quite time-consuming to search for all authors, publishers, clerks, other persons, etc. For this reason, search terms can be listed and grouped under one name.

Property

Type

Default value

Description

index.***

Text

This value can be used to specify a comma-separated list of terms to be summarized under the term specified in ***. The value can be used multiple times to make multiple summaries.

For example, the following aliases could be configured:

index.Person=Author, OtherPerson, Publisher, Editor
index.Institution=University, Museum, Archive
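How such alias definitions could be turned into search term lists is sketched below. The helpers `parse_aliases` and `expand_search_field` are hypothetical and only illustrate the grouping idea; how Goobi Workflow evaluates the aliases internally is not described here.

```python
def parse_aliases(settings):
    """Turn 'index.<name>=a, b, c' entries into {name: [a, b, c]}."""
    aliases = {}
    for key, value in settings.items():
        if key.startswith("index."):
            name = key[len("index."):]
            aliases[name] = [t.strip() for t in value.split(",") if t.strip()]
    return aliases


def expand_search_field(field, aliases):
    """An alias expands to its grouped terms; any other field stays as it is."""
    return aliases.get(field, [field])


config = {
    "index.Person": "Author, OtherPerson, Publisher, Editor",
    "index.Institution": "University, Museum, Archive",
}

aliases = parse_aliases(config)
print(expand_search_field("Person", aliases))
print(expand_search_field("Title", aliases))
```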

Processes and process log

The following settings can be used to configure the process log and some details of how processes are edited.

Property

Type

Default value

Description

ProcessCreationResetJournal

Boolean

false

This value can be set to true to not copy the process log when duplicating a process. This value replaces the value ProcessCreationResetLog and is preferred if both are used.

ProcessCreationResetLog

Boolean

false

This value can be set to true to not copy the process log when duplicating a process. This value is deprecated and was replaced with ProcessCreationResetJournal, but is still supported for compatibility reasons.

dir_allowWhiteSpaces

Boolean

false

This value is set to true to allow spaces in file and directory names when creating or uploading files or directories. It is generally recommended to leave this setting at false because scripts in particular often cannot handle spaces in directory or file names and then the space is interpreted as a separator between two parameters. If false is set here, however, all spaces will be replaced internally by underscores to avoid the problem mentioned above.

massImportAllowed

Boolean

false

This value can be set to true to enable uploading of very large amounts of data.

MassImportUniqueTitle

Boolean

true

This value is set to true to allow only files with different names when uploading files. This simplifies later handling because then no name conflicts can occur.

batchMaxSize

Number

100

This value is used as limit for displaying batches and processes.

ProcesslistShowEditionData

Boolean

false

If this value is set to true, more information about process editing will be displayed in the process list. This includes the user who last worked on the process, when the process was last worked on, and which step was last finished.

Configuration of jobs

For automatically executed background processes, it is possible to specify when they should be executed. The first three properties in the following list, marked with daily, are executed every day. The number of milliseconds specified is the time between 0:00 and the time the task is executed. This has the advantage that certain tasks can be executed at night, for example, when the load on the server is low.

The second advantage of this configuration is that the times of day can be set according to the difference between server time and the most used user time of day. This can happen when a server is located in another country (or uses UTC) and employees worldwide from different time zones work together on the server.

If -1 is specified, the corresponding job will be disabled.

The number of milliseconds can be calculated as follows:

One second has 1000 milliseconds
One minute has 60 seconds and thus 60 000 milliseconds
One hour has 60 minutes and thus 3 600 000 milliseconds
One day has 24 hours and thus 86 400 000 milliseconds.

So the given time should be between 0 and 86 400 000 to avoid errors. A few examples:

For 0:00 (midnight), 0 is specified
For 3:00 (3:00 AM), 3 * 3 600 000 = 10 800 000 is specified
For 18:30 (6:30 PM), 18 * 3 600 000 + 30 * 60 000 = 66 600 000 is specified
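The calculation above can also be scripted. The following sketch reproduces the three examples; the function name `ms_after_midnight` is hypothetical, and which job is assigned which time is chosen arbitrarily for illustration.

```python
MS_PER_SECOND = 1000
MS_PER_MINUTE = 60 * MS_PER_SECOND
MS_PER_HOUR = 60 * MS_PER_MINUTE
MS_PER_DAY = 24 * MS_PER_HOUR


def ms_after_midnight(hour, minute=0):
    """Convert a time of day (server time) to milliseconds after 0:00."""
    if not (0 <= hour < 24 and 0 <= minute < 60):
        raise ValueError("time of day out of range")
    return hour * MS_PER_HOUR + minute * MS_PER_MINUTE


# Illustrative schedule only; the assignment of times to jobs is made up.
schedule = {
    "dailyDelayJob": ms_after_midnight(0),
    "dailyVocabJob": ms_after_midnight(3),
    "dailyHistoryAnalyser": ms_after_midnight(18, 30),
}
for job, value in schedule.items():
    print(f"{job}={value}")
```

The printed lines have the form used in the configuration file, so they can serve as a starting point for the actual settings.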

For the setting 'goobiAuthorityServerUploadFrequencyInMinutes' a time in minutes is specified.

Property

Type

Default value

Description

dailyHistoryAnalyser

Time

-1

This setting specifies when to back up the processing of events from the last 24 hours.

dailyDelayJob

Time

-1

This setting determines when steps are executed for which a delay is enabled.

dailyVocabJob

Time

-1

This setting determines when the locally managed vocabulary is synchronized with other servers.

goobiAuthorityServerUploadFrequencyInMinutes

Time

-1

This value is used for requesting the authority server. Server requests are made in the background every n minutes, where n is the number of minutes specified here. For example, if 2 is specified, a request will be made every 2 minutes.

Downloadable information

In Goobi Workflow, processes, templates, workpieces and metadata can not only be searched and displayed, but also exported to various file formats and downloaded. Some data from the database is included by default:

processes.Title: The title of a process
processes.processesID: The ID of a process
processes.creationdate: The creation date of a process
processes.sortHelperImages: The number of images in the process
processes.sortHelperMetadata: The number of metadata in the process
projects.title: The title of the project in which the process is located
log.lastError: The last detected error in the editing of this process

In addition, the downloadAvailableColumn setting can be used to include other properties in the exported files. For this purpose the mentioned setting can be used multiple times. All matching rows will be read in together by Goobi Workflow and processed as a coherent list.

Property

Type

Default value

Description

downloadAvailableColumn

Text

This specifies exactly one additional table column to be included in the export.

In each line exactly one table column name from the Goobi database is specified. The following table columns are currently available (Goobi version 22.08):

prozesseeigenschaften.prozesseeigenschaftenID
prozesseeigenschaften.Titel
prozesseeigenschaften.WERT
prozesseeigenschaften.IstObligatorisch
prozesseeigenschaften.DatentypenID
prozesseeigenschaften.Auswahl
prozesseeigenschaften.prozesseID
prozesseeigenschaften.creationDate
prozesseeigenschaften.container
vorlageneigenschaften.vorlageneigenschaftenID
vorlageneigenschaften.Titel
vorlageneigenschaften.WERT
vorlageneigenschaften.IstObligatorisch
vorlageneigenschaften.DatentypenID
vorlageneigenschaften.Auswahl
vorlageneigenschaften.vorlagenID
vorlageneigenschaften.creationDate
vorlageneigenschaften.container
werkstueckeeigenschaften.werkstueckeeigenschaftenID
werkstueckeeigenschaften.Titel
werkstueckeeigenschaften.WERT
werkstueckeeigenschaften.IstObligatorisch
werkstueckeeigenschaften.DatentypenID
werkstueckeeigenschaften.Auswahl
werkstueckeeigenschaften.werkstueckeID
werkstueckeeigenschaften.creationDate
werkstueckeeigenschaften.container
metadata.processid
metadata.name
metadata.value
metadata.print

Note that only the column name, without the table name, is specified in the configuration. Otherwise the columns cannot be found, because the column names specified here are combined directly with the expected table names before the lookup. As a side effect, specifying 'Titel', for example, includes the title of processes, the title of templates and the title of masterpieces.

For example, the following configuration can be made:

downloadAvailableColumn=Titel
downloadAvailableColumn=DatentypenID
downloadAvailableColumn=werkstueckeID
downloadAvailableColumn=name
downloadAvailableColumn=value
downloadAvailableColumn=print

This would result in the following additional table columns being used:

prozesseeigenschaften.Titel
prozesseeigenschaften.DatentypenID
vorlageneigenschaften.Titel
vorlageneigenschaften.DatentypenID
werkstueckeeigenschaften.Titel
werkstueckeeigenschaften.DatentypenID
werkstueckeeigenschaften.werkstueckeID
metadata.name
metadata.value
metadata.print
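
The matching of configured column names against the table columns can be sketched in a few lines of Python. This is only an illustration and not Goobi's actual implementation: the function name resolve_columns and the abbreviated AVAILABLE_COLUMNS list are assumptions made for this example.

```python
# Illustrative sketch only; not taken from the Goobi source code.
# Abbreviated excerpt of the available table columns listed above.
AVAILABLE_COLUMNS = [
    "prozesseeigenschaften.Titel",
    "prozesseeigenschaften.DatentypenID",
    "vorlageneigenschaften.Titel",
    "vorlageneigenschaften.DatentypenID",
    "werkstueckeeigenschaften.Titel",
    "werkstueckeeigenschaften.DatentypenID",
    "werkstueckeeigenschaften.werkstueckeID",
    "metadata.name",
    "metadata.value",
    "metadata.print",
]


def resolve_columns(configured_names, available=AVAILABLE_COLUMNS):
    """Return every 'table.column' entry whose column part was configured."""
    wanted = set(configured_names)
    return [entry for entry in available if entry.split(".", 1)[1] in wanted]


# A single configured name can match columns in several tables.
for column in resolve_columns(["Titel", "name"]):
    print(column)
```

Because only the part after the dot is compared, a name such as Titel matches one column in each of the three property tables.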

Success and error messages for tasks

Manual tasks can be either completed successfully or set to error state by users in Goobi. In order to provide more information about the cause and resolution of errors, error and success messages can be pre-configured to be available later as a drop-down menu when completing (or canceling) tasks.

The following examples show how such messages can be used:

task.error.Missing\ pages=The following pages are missing: {}
task.error.Blurred\ images=The images {} are blurred. Please create them again.
task.solution.Problem\ solved=The problem was solved. {}
task.solution.The\ original\ print\ is\ blurred=The original pages are printed blurry. It is not possible to create sharper images. {}

Note: Messages cannot be translated automatically and should be specified in a language that is understandable to as many users involved as possible. For international projects, English should be used.

Any number of messages can be specified. The respective drop-down menus are only available if at least one corresponding message is specified. All messages for error descriptions start with the prefix task.error. (including the trailing dot). All messages for problem fixes start with the prefix task.solution. (also including the trailing dot).

Each prefix is followed by a short text that is displayed as the drop-down item. Spaces in this text must be escaped with a backslash. The equals sign is followed by a detailed description, which is displayed later in the step details. The remark that the user enters in the text field below the drop-down menu is inserted at the placeholder {}.
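
How such message lines break down into a drop-down label, a description and a placeholder can be sketched as follows. This is an illustration under assumptions, not Goobi's own parsing code: the function names parse_messages and render are hypothetical, and the two configuration lines are taken from the examples above.

```python
# Illustrative sketch only; not taken from the Goobi source code.
import re

CONFIG_LINES = [
    r"task.error.Missing\ pages=The following pages are missing: {}",
    r"task.solution.Problem\ solved=The problem was solved. {}",
]


def parse_messages(lines):
    """Group messages by kind ('error' or 'solution') as {label: description}."""
    messages = {"error": {}, "solution": {}}
    for line in lines:
        # Split at the first '=' that is not preceded by a backslash.
        key, description = re.split(r"(?<!\\)=", line, maxsplit=1)
        for kind in messages:
            prefix = f"task.{kind}."
            if key.startswith(prefix):
                # Turn the escaped spaces back into normal spaces.
                label = key[len(prefix):].replace("\\ ", " ")
                messages[kind][label] = description
    return messages


def render(description, remark):
    """Insert the user's remark at the {} placeholder."""
    return description.replace("{}", remark)


messages = parse_messages(CONFIG_LINES)
print(render(messages["error"]["Missing pages"], "12, 13"))
```

The label "Missing pages" is what would appear in the drop-down menu, while the rendered description is what would appear in the step details.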

Scripts

Scripts can be configured to set up the file system. In a Goobi installation there are already suitable scripts included, but they are not set as default values here.

Property

Type

Default value

Description

script_createDirUserHome

Text

This script can be used to create and set up user directories. Example: script_createDirUserHome.sh

script_createDirMeta

Text

This script can be used to create and set up the metadata directory. Example: script_createDirMeta.sh

script_createSymLink

Text

This script can be used to create symbolic links (to other directories or files). Example: script_createSymLink.sh

script_deleteSymLink

Text

This script can be used to delete symbolic links (to other directories or files). Example: script_deleteSymLink.sh

Include S3 cloud

Amazon provides a cloud system where data objects can be stored in "buckets". This can be integrated by Goobi Workflow and used to exchange data with other servers.

To use S3, the access credentials for the S3 service in use are required. These must be specified in this configuration file.

Property

Type

Default value

Description

useS3

Boolean

false

This value specifies whether an S3 store should be used.

useCustomS3

Boolean

false

This value specifies whether the S3Endpoint, S3AccessKeyID and S3SecretAccessKey settings specified in this configuration file should be used for selecting the S3 service and the necessary authentication.

S3Endpoint

Text

This value specifies a domain name or IP address where the service can be reached. Usually a port number must be specified, for example: http://123.123.123.123:9000

S3bucket

Text

This value specifies the name of the bucket used.

S3AccessKeyID

Text

This value specifies the account ID that Goobi uses to access the service.

S3SecretAccessKey

Text

This value specifies the access key or password that Goobi uses to identify itself for the account used.

S3ConnectionRetry

Number

10

This value specifies how many times to retry a failed interaction.

S3ConnectionTimeout

Number

10000

This value specifies how long to wait for a server response when connecting. The value is specified in milliseconds.

S3SocketTimeout

Number

10000

This value specifies how long to wait for a server response when interacting. The value is specified in milliseconds.

Proxy server settings

A Goobi server can use a proxy server for certain transactions with other servers or clients. By default, no proxy server is configured. To use a proxy server, you must first enable its use.

Property

Type

Default value

Description

http_proxyEnabled

Boolean

false

This switch is set to true to enable the use of a proxy server.

http_proxyUrl

Text

This value specifies the URL of the proxy server.

http_proxyPort

Number

8080

This value specifies the port number of the proxy server.

http_proxyIgnoreHost

Text

Multiple IP addresses or URLs can be specified here to which Goobi connects without a proxy server. For this, this value may occur multiple times with one address each.

The default values for http_proxyIgnoreHost are predefined as follows. The list can be extended as needed:

http_proxyIgnoreHost=localhost
http_proxyIgnoreHost=127.0.0.1
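
The effect of these settings can be illustrated with a short Python sketch. It is not Goobi's implementation: the helper names read_settings and uses_proxy and the proxy address proxy.example.org are hypothetical, and the sketch assumes that ignored hosts are compared by exact match.

```python
# Illustrative sketch only; not taken from the Goobi source code.
from urllib.parse import urlparse

# proxy.example.org is a placeholder address chosen for this example.
CONFIG_TEXT = """\
http_proxyEnabled=true
http_proxyUrl=proxy.example.org
http_proxyPort=8080
http_proxyIgnoreHost=localhost
http_proxyIgnoreHost=127.0.0.1
"""


def read_settings(text):
    """Collect every value per key, because a key may occur several times."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def uses_proxy(url, settings):
    """Return True if a request to the URL would go through the proxy."""
    if settings.get("http_proxyEnabled", ["false"])[-1].lower() != "true":
        return False
    host = urlparse(url).hostname
    # Assumption: ignored hosts are matched exactly, without wildcards.
    return host not in settings.get("http_proxyIgnoreHost", [])


settings = read_settings(CONFIG_TEXT)
print(uses_proxy("http://localhost:8080/goobi", settings))
print(uses_proxy("https://example.org/api", settings))
```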

Server and API settings

This category includes some settings that can be used to configure URLs and credentials to specific web services. Settings are also available to determine the behavior of the internal REST API. Especially for URLs, make sure to use the correct protocol (HTTP or HTTPS) and port number if the corresponding server or service uses one other than 80.

Property

Type

Default value

Description

useWebApi

Boolean

false

This value can be set to true to enable Goobi Workflow's internal REST API. If it is disabled but still requested, a 404 Not Found Error will be returned with a corresponding error message.

apiTokenSalt

Text

Here, an additional text can be specified that should be used as salt value for encrypting the authentication data for the REST API.

jwtSecret

Text

The Goobi Workflow REST API uses a JSON Web Token (JWT) for authentication. This is specified here. To send an authenticated request to the REST API, the token must be specified in the request.

goobiUrl

Text

The Goobi URL where the Goobi Workflow Server can be reached on the Internet can be specified here. This URL is only specified for internal purposes and can therefore also be set to https://localhost:8080. The protocol and port number should also be specified to avoid possible errors due to incorrect default values.

pluginServerUrl

Text

The URL of the plugin server is specified here. The plugin server is also a REST API interface of the Goobi Workflow Server used to offer plugins for download. It is used by the integrated plugin management.

ocrUrl

Text

This URL specifies an OCR service used by Goobi Workflow or OCR plugins to perform OCR analysis on a document.

goobiContentServerTimeOut

Number

60000

This timeout (in milliseconds) specifies the time Goobi Workflow waits for a response from the content server.

goobiAuthorityServerUrl

Text

This specifies the URL of the authority server used to manage vocabulary data.

goobiAuthorityServerUser

Text

This specifies the user name used by Goobi Workflow to log in to the Authority server to retrieve vocabulary data.

goobiAuthorityServerPassword

Text

This specifies the password used by Goobi Workflow to authenticate to the Authority server to retrieve vocabulary data.

goobiAuthorityServerUploadFrequency

Number

0

This specifies the frequency with which Goobi Workflow makes requests to the Authority server.

geonames_account

Text

This value contains the authentication information that Goobi Workflow uses to authenticate itself to the Geonames web service for a query.

Configure message queues

Goobi Workflow uses multiple message queues to communicate with other processes on the same server (localhost) or other servers. ActiveMQ is always used for production use. For development purposes, SQS can be used in some cases. Since the entire constellation and configuration of the message queues is somewhat confusing, all configuration possibilities are documented here for the sake of completeness, even if individual constellations are not used on production systems. In each case, it is indicated which configurations are relevant for a production system.

Message queues used

Goobi Workflow uses one or more slow queues to transmit normal process communication notifications.
There is a fast queue for transmitting particularly small or time-critical information.
An internal DLQ (Dead Letter Queue) is used to catch undeliverable notifications on the same server.
An external DLQ is used to catch undeliverable notifications between multiple servers.
There is a separate queue for commands that can be sent either on the same server or between different servers. These can be executable scripts, for example.

Simple Queue Service (SQS)

In principle, all queues can be operated with ActiveMQ. The external DLQ and the queue for commands have the special feature that they can work either with ActiveMQ or with SQS (Simple Queue Service). As long as ActiveMQ is used for these queues, it is possible to switch between a localhost service and an external service. The localhost service is set with default parameters and does not need to be configured further. If an external service is to be used or other individual configurations are to be made, this is configured in the file goobi_activemq.xml. The SQS service, on the other hand, is always located on the same server (localhost) and does not need to be configured.

Configuration

Property

Type

Default value

Description

MessageBrokerStart

Boolean

false

This value is set to true to enable the use of message queues in Goobi Workflow.

ActiveMQConfig

Text

goobiFolder + config/goobi_activemq.xml

A configuration file for ActiveMQ can be specified here. This file also configures the URL and the port for communication with other servers. If this property is set, overriding the default value, an absolute path must be specified.

MessageBrokerServer

Text

localhost

This setting is used by Spring Framework and specifies the URL (or localhost) of the ActiveMQ service.

MessageBrokerPort

Number

61616

This setting is used by Spring Framework and specifies the port of the ActiveMQ service.

MessageBrokerUsername

Text

This is the account name that Goobi Workflow uses to register with ActiveMQ.

MessageBrokerPassword

Text

This is the password Goobi Workflow uses to authenticate itself with ActiveMQ.

MessageBrokerNumberOfParallelMessages

Number

1

Here you can specify the number of slow message queues. With a higher value more data transmissions can be executed in parallel. A speed advantage arises especially with large amounts of data, if the used server has many processor cores and these are released by the operating system for Goobi or ActiveMQ.

allowExternalQueue

Boolean

false

If this setting is set to true, Goobi Workflow will also use a DLQ for communication with other servers, i.e. a queue for notifications that could not be sent without errors.

externalQueueType

Text

activeMQ

This value can be set to activeMQ or SQS to switch the queue service for the command queue and the external DLQ. In production systems this is always activeMQ.

useLocalSQS

Boolean

false

If the queue service is set to SQS, this parameter can be set to true to use the default connection used by Goobi Workflow via http://localhost:9324. Otherwise, a default configuration used by SQS will be used. This setting is not used on production systems.

Additionally the names of the respective queues can be configured. This is normally not required and is documented here for completeness.

Property

Type

Default value

Description

GOOBI_INTERNAL_FAST_QUEUE

Text

goobi_fast

This is the queue for fast information exchange of small information units.

GOOBI_INTERNAL_SLOW_QUEUE

Text

goobi_slow

This is the queue (or several) for exchanging larger data packets.

GOOBI_EXTERNAL_JOB_QUEUE

Text

goobi_external

This is the queue for external communication with other servers.

GOOBI_EXTERNAL_JOB_DLQ

Text

goobi_external.DLQ

This is the queue for non-deliverable information generated by communication with other processes on other servers.

GOOBI_EXTERNAL_COMMAND_QUEUE

Text

goobi_command

This is the queue to send commands between servers.

GOOBI_INTERNAL_DLQ

Text

ActiveMQ.DLQ

This is the queue for non-deliverable information that is generated when communicating with other processes on the same server.

Metadata editor

The metadata editor has many setting options that can be sorted both technically and thematically. As a compromise, and because there is no "one" correct sorting, all settings are sorted by categories, such as "User Interface", "Export", etc. For OCR settings, for example, this means that showing/not showing the OCR button is configured in the "User Interface" section, while technical details about OCR are described in the "Export" section.

General settings

In the General Settings you can find settings concerning the editor itself and default values for new documents.

Property

Type

Default value

Description

MetsEditorEnableDefaultInitialization

Boolean

true

This value can be set to true to make default configurations in the document structure when loading a document in the metadata editor.

MetsEditorEnableImageAssignment

Boolean

true

This value can be set to true to enable automatic assignment of image files to a document structure.

MetsEditorDefaultPagination

Text

uncounted

This setting can be used to specify the default pagination system. Valid values are arabic (0, 1, 2, ..., 7, 8, 9), roman (I, V, X, L, C, D, M) and uncounted (no page numbers).

MetsEditorLockingTime

Time

1800000

This value specifies how long a document edited in the metadata editor is reserved for a user and locked for other users. The reservation is set to prevent multiple users from editing a document at the same time and overwriting changes made by other users unnoticed. The reservation is removed as soon as the user saves the document and leaves the metadata editor or the lock time has expired. The lock time is specified in milliseconds. The default value is half an hour, calculated as: lock time = 30 minutes * 60 seconds * 1000 milliseconds = 1800000 milliseconds.

MetsEditorUseExternalOCR

Boolean

false

This value can be set to true to use an external OCR service. This is configured with the ocrUrl=address setting. If the use of the external OCR service is disabled, the OCR equivalent text contents will be loaded directly from resource files.

numberOfMetaBackups

Number

0

This value specifies the number of backups of the meta.xml file and the meta_anchor.xml file. Note: If 0 is entered here, no backups will be stored.
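
Because MetsEditorLockingTime is given in milliseconds, it can help to compute the value from minutes. The following sketch shows the conversion and a simple expiry check. The function names are hypothetical, and the expiry logic is an assumption for illustration rather than Goobi's own code.

```python
# Illustrative sketch only; not taken from the Goobi source code.


def minutes_to_milliseconds(minutes):
    """Convert a lock time in minutes to the millisecond value used in the configuration."""
    return minutes * 60 * 1000


def is_lock_expired(locked_at_ms, now_ms, locking_time_ms):
    """Return True once more than locking_time_ms has passed since the lock was set."""
    return now_ms - locked_at_ms > locking_time_ms


locking_time = minutes_to_milliseconds(30)
print(f"MetsEditorLockingTime={locking_time}")
```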

User interface

The "User Interface" category documents settings that can be used to configure the display of content or the display/non-display of buttons.

Property

Type

Default value

Description

showOcrButton

Boolean

false

This value can be set to true to display a button in the metadata editor for running the OCR analysis on the currently selected structure element.

MetsEditorDisplayFileManipulation

Boolean

false

This value can be set to true to display unsaved modified documents in the metadata editor.

MetsEditorShowArchivedFolder

Boolean

false

This value can be set to true to show archived image files.

MetsEditorShowMetadataPopup

Boolean

true

This value can be set to true to show a button that can be used to display additional metadata for a document in a popup.

MetsEditorMaxTitleLength

Number

0

This value specifies the maximum number of characters to be displayed in names of document structure elements. If a name is longer than the value specified here, only the first characters are displayed according to the maximum length. This value can be set to 0 to disable the maximum length.
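
The behaviour of MetsEditorMaxTitleLength can be sketched as follows. The function name display_title is hypothetical, and the sketch assumes that the name is simply cut off without adding an ellipsis, which the description above does not specify.

```python
# Illustrative sketch only; not taken from the Goobi source code.


def display_title(title, max_length=0):
    """Shorten a structure element name to max_length characters (0 disables the limit)."""
    if max_length <= 0:
        return title
    # Assumption: the name is cut off without appending an ellipsis.
    return title[:max_length]


print(display_title("Chapter One: Introduction", 11))
print(display_title("Chapter One: Introduction", 0))
```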

Image files and thumbnails

This category documents settings for image files, thumbnails, the display of images, and the tiling of images in the user interface.

Property

Type

Default value

Description

ShowImageComments

Boolean

false

This value can be set to true to show image comments for individual pages of documents.

MetsEditorNumberOfImagesPerPage

Number

96

This value specifies how many thumbnails are displayed on a page in the metadata editor by default.

MetsEditorUseImageTiles

Boolean

true

This value can be set to true to load image files on the user interface tile by tile. This allows a smoother display of the images.

ImageSorting

Text

number

This value specifies the criterion by which image file names are sorted. With the value number file names are sorted numerically (for example 1, 10, 30, 100, 200, 1000). With the value alphanumeric file names are sorted lexicographically (for example 1, 10, 100, 1000, 200, 30).

ImagePrefix

Text

\\d{8}

This value can contain a regular expression (regex) and specifies which string image file names must start with to be accepted as valid files. The default prefix specifies that a file name must start with 8 digits (e.g. YYYYMMDD).

UserForImageReading

Text

root

Image files can be downloaded in read-only mode. These then do not belong to the user, but to another virtual user who provides the image files with read-only privileges. The username of this virtual user is configured with this value. Usually root is used.

UseImageThumbnails

Boolean

true

This value can be set to true to display thumbnails of document pages.

MetsEditorThumbnailsize

Number

200

This value specifies the size in pixels at which thumbnails are displayed in the metadata editor.

MaxParallelThumbnailRequests

Number

100

This value can be set to limit the number of thumbnails loaded simultaneously. This option is especially useful on weaker servers.

MetsEditorMaxImageSize

Number

15000

This value specifies the maximum image size in pixels that an image may have in order to be displayed in the metadata editor.
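
The difference between the two ImageSorting values and the effect of ImagePrefix can be illustrated with a short Python sketch. It is not Goobi's implementation: the function names are hypothetical, the numeric comparison is approximated with a natural sort key, and the sketch assumes that the prefix expression is matched at the start of the file name. Note that in the configuration file the backslash is doubled (\\d{8}) because the backslash is an escape character in properties files. The expression itself is \d{8}.

```python
# Illustrative sketch only; not taken from the Goobi source code.
import re


def natural_key(name):
    """Split a name into text and number parts so that numbers compare numerically."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


def sort_file_names(names, image_sorting="number"):
    """Sort numerically for 'number', lexicographically for 'alphanumeric'."""
    if image_sorting == "number":
        return sorted(names, key=natural_key)
    return sorted(names)


def has_valid_prefix(name, prefix_regex=r"\d{8}"):
    """Check whether the file name starts with the configured prefix pattern."""
    return re.match(prefix_regex, name) is not None


names = ["1.tif", "10.tif", "30.tif", "100.tif", "200.tif", "1000.tif"]
print(sort_file_names(names, "number"))
print(sort_file_names(names, "alphanumeric"))
print(has_valid_prefix("20240131_0001.tif"))
```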

The maximum size of image files (in bytes) is defined with two independently configurable values. The value MaxImageFileSize specifies a number, such as 1, 5 or 10. The additional value MaxImageFileSizeUnit specifies the unit of measurement. Multiplied together, they give the maximum number of bytes that an image file must not exceed. Important: Only integer numbers can be used.

Property

Type

Default value

Description

MaxImageFileSize

Number

4000

This value specifies the factor for the maximum image size in bytes.

MaxImageFileSizeUnit

Text

MB

This value specifies the unit of measurement by which the factor is multiplied to obtain the total image size in bytes.

Since units of measurement are often misunderstood, the following table lists all accepted values and their internally used numeric values.

Unit

Factor

Description

B

1

Byte

K or KB

1000

Kilobyte

KI or KIB

1024

Kibibyte

M or MB

1000*1000

Megabyte

MI or MIB

1024*1024

Mebibyte

G or GB

1000*1000*1000

Gigabyte

GI or GIB

1024*1024*1024

Gibibyte

T or TB

1000*1000*1000*1000

Terabyte

TI or TIB

1024*1024*1024*1024

Tebibyte

This can be used, for example, to make the following settings:

# 20 Megabyte -> 20 * 1000 * 1000 = 20 000 000 Byte
MaxImageFileSize=20
MaxImageFileSizeUnit=MB

# 1 Gibibyte -> 1 * 1024 * 1024 * 1024 = 1 073 741 824 Byte
MaxImageFileSize=1
MaxImageFileSizeUnit=GIB
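
The calculation behind these two settings can be reproduced with a short Python sketch. The factors are taken from the table above. The function name max_image_file_size_in_bytes is hypothetical, and the sketch assumes that units are written in upper case exactly as listed.

```python
# Illustrative sketch only; not taken from the Goobi source code.
# Unit factors as listed in the table above.
UNIT_FACTORS = {
    "B": 1,
    "K": 1000, "KB": 1000,
    "KI": 1024, "KIB": 1024,
    "M": 1000**2, "MB": 1000**2,
    "MI": 1024**2, "MIB": 1024**2,
    "G": 1000**3, "GB": 1000**3,
    "GI": 1024**3, "GIB": 1024**3,
    "T": 1000**4, "TB": 1000**4,
    "TI": 1024**4, "TIB": 1024**4,
}


def max_image_file_size_in_bytes(size, unit):
    """Multiply the integer MaxImageFileSize by the factor of MaxImageFileSizeUnit."""
    if not isinstance(size, int):
        raise ValueError("Only integer numbers can be used")
    return size * UNIT_FACTORS[unit]


print(max_image_file_size_in_bytes(20, "MB"))
print(max_image_file_size_in_bytes(1, "GIB"))
```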

The following values can be used to configure information about supported image and tile sizes for JSON API requests. The following API request can be used to retrieve this information for the images of a process:

/process/image/{process}/{folder}/{filename}/info.json

Each of these settings can be specified multiple times. The API response then contains all of the configured values.

Property

Type

Default value

Description

MetsEditorImageSize

Text

This value can be used to specify multiple sizes (in pixels) for common images.

MetsEditorImageTileSize

Text

This value can be used to specify multiple sizes (in pixels) for tiles.

MetsEditorImageTileScale

Text

This value can be used to specify multiple scaling sizes for tiles.

For example, a configuration could look like this:

MetsEditorImageSize=4096

MetsEditorImageTileSize=64
MetsEditorImageTileSize=128
MetsEditorImageTileSize=256

MetsEditorImageTileScale=1
MetsEditorImageTileScale=32

When completing steps, various internal data checks are performed. Among other things, the number of existing and processed image files is checked. The historyImageSuffix value can be used to specify one or more file types that will be considered for this count.

This setting is used for file extensions and can also contain general texts with which a file name should end, but no regular expressions are interpreted.

Property

Type

Default value

Description

historyImageSuffix

Text

.tif

This value can be used to specify one or more file types.

For example, if all *.tif, *.jpg and *.jpeg files are to be considered, the following list could be used:

historyImageSuffix=.tif
historyImageSuffix=.jpg
historyImageSuffix=.jpeg
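
Since the suffixes are compared as plain text at the end of the file name, the count can be sketched in Python as follows. The function name count_history_images is hypothetical, and the sketch assumes a case-sensitive comparison, which the description above does not specify.

```python
# Illustrative sketch only; not taken from the Goobi source code.


def count_history_images(file_names, suffixes=(".tif",)):
    """Count the file names that end with one of the configured suffixes."""
    # Plain string comparison; regular expressions are not interpreted.
    return sum(1 for name in file_names if name.endswith(tuple(suffixes)))


files = ["00000001.tif", "00000002.jpg", "00000003.jpeg", "notes.txt"]
print(count_history_images(files))
print(count_history_images(files, [".tif", ".jpg", ".jpeg"]))
```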

Validation

This category contains settings for validating processes, image files and metadata.

Property

Type

Default value

Description

useMetadatavalidation

Boolean

true

This value can be set to true to validate the metadata of the current document when saving and exiting the metadata editor.

MetsEditorValidateImages

Boolean

true

This value can be set to true to display a validate button in the metadata editor to allow user-initiated validation.

validateProcessTitleRegex

Text

[\\w-]*

This value specifies a regular expression (regex) to be used to check the validity of process titles.

ProcessTitleGenerationRegex

Text

[\\W]

This value specifies a regular expression (regex) to be used to remove invalid special characters from process titles.
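
The two default expressions can be tried out with a short Python sketch. It rests on several assumptions: the function names are hypothetical, the validation expression is assumed to have to match the whole title, and invalid characters are assumed to be removed without replacement. The re.ASCII flag is used so that \w and \W only refer to ASCII letters, digits and the underscore, which is the default behaviour of Java regular expressions. As in the configuration file, the doubled backslash (\\w) stands for a single backslash in the expression.

```python
# Illustrative sketch only; not taken from the Goobi source code.
import re

VALIDATE_PROCESS_TITLE_REGEX = r"[\w-]*"
PROCESS_TITLE_GENERATION_REGEX = r"[\W]"


def is_valid_title(title):
    """Return True if the whole title consists of word characters and hyphens."""
    return re.fullmatch(VALIDATE_PROCESS_TITLE_REGEX, title, flags=re.ASCII) is not None


def clean_title(title):
    """Remove every character matched by the generation expression."""
    return re.sub(PROCESS_TITLE_GENERATION_REGEX, "", title, flags=re.ASCII)


print(is_valid_title("goethe_faust-1808"))
print(is_valid_title("goethe faust"))
print(clean_title("Goethe: Faust (1808)"))
```

Note that a hyphen is accepted by the validation expression but is also matched by [\W], so in this sketch it would be removed during title generation.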

Export

This category describes settings that affect the export of processes to downloadable files or files that can be cached on the server.

Property

Type

Default value

Description

ExportExiftoolPath

Text

/usr/bin/exiftool

This specifies the file path and program name of the program used for extracting additional metadata (EXIF data) from image files.

useOrigFolder

Boolean

true

This value can be set to true to get image files directly from the master folder when importing.

exportWithoutTimeLimit

Boolean

true

This value can be set to true if export processes should not be subject to a time limit.

ExportValidateImages

Boolean

true

This value can be set to true to validate image files during export.

ExportMetadataForProject

Text

Here, the name of the metadata field is specified where the name of the project should be exported to.

ExportMetadataForInstitution

Text

Here, the name of the metadata field is specified where the name of the institution should be exported to.

ExportMetadataForDfgViewerUrl

Text

Here, the name of the metadata field is specified where the URL of the DFG viewer should be exported to.

ExportFilesFromOptionalMetsFileGroups

Boolean

false

This value can be set to true to consider optional file groups when exporting.

ExportInTemporaryFile

Boolean

false

This value can be set to true to store exports in temporary files.

ExportCreateUUID

Boolean

true

This value can be set to true to set the UUID (universally unique identifier) of files when exporting.

ExportCreateTechnicalMetadata

Boolean

false

This value can be set to true to include additional technical metadata related to the document structure when exporting.

automaticExportWithImages

Boolean

true

This value can be set to true to include image files in automatic export processes.

automaticExportWithOcr

Boolean

true

This value can be set to true to perform OCR analysis during automatic export processes.

pdfAsDownload

Boolean

true

This value can be set to true to make exported PDF documents available for download. If this value is set to false exported PDF documents will be stored in the user folder of the corresponding user on the server.
