July 2022
Developments and innovations to Goobi workflow in July 2022

Core

Generation of URLs for the SRU Interface and the IIIF Manifest

In the project settings there are now two additional fields in the tab METS Parameter to record the URL to the SRU interface of the Goobi viewer and the URL of the IIIF Manifest for the complete work.
New fields for SRU and IIIF
Example entries could look as follows:
http://viewer.example.org/sru/$(meta.CatalogIDDigital)
https://viewer.example.org/api/v1/records/$(meta.CatalogIDDigital)/manifest/
The result within a METS file created in this way is then accordingly as follows:
<dv:links>
<dv:iiif>https://viewer.example.org/api/v1/records/1234567889/manifest/</dv:iiif>
<dv:sru>http://viewer.example.org/sru/1234567889</dv:sru>
</dv:links>

Adjustments for the image comments

Not too long ago we introduced the new image comments here. With their first use at various institutions, however, we noticed that the internal storage location for the data was not ideal. For this reason, we have made a change for this. Those who have already used this functionality should therefore follow our update instructions provided here for the next update:
9.2. Update steps
Goobi workflow (English)

GoobiScript metadataChangeType fixed

In the GoobiScript metadataChangeType the parameter any did not work as expected. This has been corrected and now allows metadata to be changed at all levels of a METS file as desired.
GoobiScript metadataChangeType
For the sake of completeness, here is the link to the detailed documentation of GoobiScript:
7.4. GoobiScript
Goobi workflow (English)

Improved logging during application stop

Administrators love logging. It helps to better understand why a problem occurred in case of an error and what can be considered as the trigger for a failure. For reliable operation of productive applications, meaningful logging is not only helpful, but often simply mandatory. It is all the better if the developers of a software work closely together with the administrators who actually run the software.
The result of such cooperation in the context of Goobi is that we are constantly trying to improve the logging of Goobi workflow. This month we took a closer look at the logging of the application stop and made some improvements. After all, sometimes too much logging is not right.

Plugins

Processing newspapers

We had already reported on this at the Goobi Days. In recent months, however, our developments for processing newspapers have been used more and more frequently. That is why we would like to explicitly address this topic again together with the latest developments:
For some time now, it has been possible for Goobi to recognise newspaper editions semi-automatically and to index them with metadata. To this end, the workflow first automatically recognises newspaper editions on the basis of trained models. We have recently expanded this considerably and fed it with more ground truth data. Thus, the recognition accuracy of newspaper editions has improved many times over.
Workflow for the processing of newspapers
After this automatic recognition, a plugin is available in Goobi worklow for enriching the newspaper editions with edition-related metadata. This makes it relatively easy to record issue numbers, dates and publication days, including morning and evening editions, without having to do this individually for each issue. Instead, a semi-automatic mechanism takes effect here, which counts up the subsequent issues accordingly with a few entries from a user, enriches them with dates and also names them.
Recording of newspaper editions
Indexing supplements
In addition, supplements can also be created quickly here, so that a comprehensive METS file is subsequently generated, which has standardised date information and can be searched in the Goobi viewer, e.g. via a calendar entry.
Metadata editor for a newspaper
Research via a calendar in the Goobi viewer
To meet the requirements of the German Digital Library and the DFG Viewer, by the way, an additional export plugin was recently added that prepares and provides the data according to the specifications.
Researching a newspaper in the DFG Viewer
We have worked a lot on the individual components described here in the last few months and have repeatedly optimised their operation. Some things will continue to happen here in the coming weeks. We would be happy to present this in detail in a workshop at the upcoming Goobi Days.

Mass imports with archive stock enrichment

Up to now, we had not even mentioned the various newer import plugins that we had recently implemented for some specific mass imports. For these plugins, there is no detailed stand-alone documentation yet.
User interface of a mass import for files
However, it is important for us to point out that there are several new mass imports that not only create processes in Goobi workflow, but also increasingly create or enrich entire archive trees with tectonics.
Enriched tectocics based on the imported Excel files
Depending on the project, structured METS files can also be created here, as with other importers for Goobi workflow, where structural elements and associated metadata are imported together with authority data for the associated image files.
Generated structured metadata and images based on the imported Excel files.

Better full text extraction from PDF files

The step plugin intranda_step_pdf-extraction allows full texts to be extracted from PDF files and saved as ALTO files. However, we recently noticed an error in the processing, which was responsible for inappropriate gaps appearing within some words, among other things due to an existing hyphenation in the words.
We were now able to find and fix the cause of this problem. By the way, the plugin was already documented here:
Split PDFs, Extract Full Text and Read Table of Contents
Goobi workflow Plugins (English)
We have published the plugin at the following URL, where the corrected version can now also be downloaded:
GitHub - intranda/goobi-plugin-step-pdf-extraction: This is a Goobi workflow plugin to automatically read information from PDF files.
GitHub

Documentaries

A lot has happened again in the area of our documentation:

Inline documentation for GoobiScript

In the GoobiScript addToProcessLog the included explanations were not ideal. There was a small correction here, which now looks like this:
Better inline help

Small change in the update instructions

In order to make the update instructions for Goobi workflow more intuitive, they are now no longer structured by date, but by the respective published version. This change only applies to all future notes that are included there and not retroactively. It should hopefully make it a little clearer for administrators in the future.
Restructured update guide

Configuration files

We have again put a considerable amount of energy into documenting more configuration files. Almost all configuration files are now documented there in an exemplary manner. Some remaining work is still to be done. But if you want to have a look, here are some sample links to new documentation:
Configuration for dockets:
7.14 goobi_exportXml.xml
Goobi workflow (English)
Configuration of the queues:
7.13 goobi_activemq.xml
Goobi workflow (English)
Configuration for properties:
7.12 goobi_processProperties.xml
Goobi workflow (English)
Configuration for translations:
7.11 messages_xx.properties
Goobi workflow (English)
Configuration for the Web Api:
7.10 goobi_webapi.xml
Goobi workflow (English)
Configuration for the authority data:
7.9 goobi_normdata.xml
Goobi workflow (English)
Configuration for the field display in the metadata editor:
7.3 goobi_metadataDisplayRules.xml
Goobi workflow (English)

Version number

The current version number of Goobi workflow with this release is: 22.07. Within plugin developments, the following dependency must be entered accordingly for Maven projects within the pom.xml file:
<dependency>
<groupId>de.intranda.goobi.workflow</groupId>
<artifactId>goobi-core-jar</artifactId>
<version>22.07</version>
</dependency>
Copy link
On this page
Core
Generation of URLs for the SRU Interface and the IIIF Manifest
Adjustments for the image comments
GoobiScript metadataChangeType fixed
Improved logging during application stop
Plugins
Processing newspapers
Mass imports with archive stock enrichment
Better full text extraction from PDF files
Documentaries
Inline documentation for GoobiScript
Small change in the update instructions
Configuration files
Version number