4.12. Analysing images using intranda LayoutWizzard

Using the Goobi plugin LayoutWizzard, images from your digitised material can be analysed automatically for additional processing, and the results can be automatically validated. In addition, the intranda TaskManager allows you to outsource this CPU-intensive and time-consuming automatic analysis (as well as the subsequent task of storing the digitised material once it has been processed) to an external server.

This plugin is made up of two components: LayoutWizzardAnalyse and LayoutWizzardSave. The first of these carries out the actual image analysis, while the second stores the processed images in an output folder (usually the derivatives folder for the Goobi process). A separate description of each component is given below.

LayoutWizzardAnalyse

From a workflow step, the following instruction must be used to call LayoutWizzard’s image analysis function:

/usr/bin/java -jar /opt/digiverso/itm/bin/TaskClient.jar
-itm http://localhost:8080/itm/service
-t LayoutWizzardAnalyse
-n Left-To-Right,individual-pages
-s {origpath}
-d {tiffpath}
-gid {processid}
-i {stepid}
-T {processtitle}
-e

This step will use the analysis parameters from the analysis file in the process folders (if available):

imageData.ser

If not, it will use the parameters from the LayoutWizzard configuration file specified in the plugin configuration (see under the heading Configuration).

For every image in the input folder (-s), this step will calculate the page rotation and the position of the page edges and spine. This data is then stored in the above-mentioned analysis file for subsequent workflow steps.

If an error occurs during analysis, the process is cancelled, and an error message is sent to Goobi.

LayoutWizzardSave

From a workflow step, the following instruction must be used to tell LayoutWizzard to save the processed images:

/usr/bin/java -jar /opt/digiverso/itm/bin/TaskClient.jar
-itm http://localhost:8080/itm/service
-t LayoutWizzardSave
-n none
-s {origpath}
-d {tiffpath}
-gid {processid}
-i {stepid}
-T {processtitle}
-e

This step reads the analysis results from the analysis file in the process folder of the corresponding Goobi process:

imageData.ser

This file must have been generated in a previous analysis step, either in Goobi or in TaskManager. Straightened and cropped versions of the output images are then stored in the output folder (-d) as uncompressed Tiff files. If an error occurs due to invalid analysis values, the process is cancelled and the analysis file deleted so that new values can be generated in a correction step. If there is no error, the analysis file is retained in case the step is repeated.

The parameters used in the above instruction are explained in the following table:

Parameter

Possible Goobi variable

Meaning

-itm

http://localhost/itm/service

URL to the intranda TaskManager

-e, --returnError

-

If entered, the TaskClient will end with an error code to prevent the workflow from continuing automatically

-p

0 – 10

Priority to execute this job

-gid

{processid}

Goobi process ID

-i

{stepid}

ID for the workflow step that launches the call

-T, --title

{processtitle}

Goobi process title for which the call is launched

-t, --jobtype

LayoutWizzardAnalyse, LayoutWizzardSave

Job type

-n, --templatename

Left-To-Right, individual-pages

Comma-separated list of processing parameters (see under heading Templates) for LayoutWizzardAnalyse. Not relevant for LayoutWizzardSave.

-s, --source

{origpath}

Pfad Path to the master directory for the process

-d, --destination

{tifpath}

Path to the target directory in which the processed images are to be stored (only required for LayoutWizzardSave)

Templates

The analysis step accepts two parameters as a template. These can be read, for example, from process attributes of the corresponding Goobi process. They can be added to the parameter -t.

The parameters used in the above instruction are explained in the following table.

Parameter

Possible values

Meaning

Orientation

Left-To-Right

The digitised material has been digitised from left to right, i.e. in the direction in which most European languages are read.

Left-To-Right

The digitised material has been digitised from right to left, i.e. in the direction in which Semitic languages such as Arabic and Hebrew are read.

PageMode

individual-pages

The digitised material has been scanned only in single pages.

double-pages

The digitised material has been scanned only in double pages.

double-pages-with-individual-covers

The digitised material has been scanned in double pages except for the first and last page (usually the book covers).

If this method is used to read in parameters, all the possible parameters must be given as comma-separated lists in the order shown above:

<Orientation>,<PageMode>

Configuration

The plugin uses a configuration file that must be located in the #config# folder of the TaskManager plugin. In most cases the path will be:

/opt/digiverso/itm/plugins/config/

The file must be named as shown below:

plugin_LayoutWizzard.xml

The file contains a single parameter indicating the path to the LayoutWizzard configuration file:

<config_plugin>
<layout-wizzard-config-path>
/opt/digiverso/LayoutWizzard/config/layoutwizzard_config.xml
</layout-wizzard-config-path>
</config_plugin>

Stopping the plugin

Translation needed: Please notice that the following paragraph needs an English translation. This text gets updated over the next days.

Wenn das Plugin gestoppt wird, wird das aktuell bearbeitete Bild fertig analysiert bzw. gespeichert und eine Ergebnisdatei geschrieben. Nach einem Neustart des Plugins setzt die unterbrochene Bearbeitung nach dem zuletzt bearbeiteten Bild fort. Alle Bilder werden in einem eigenen Prozess im Tomcat-Server bearbeitet. Um die aktuelle Bearbeitung eines Bildes zu unterbrechen, muss daher der Tomcat selbst neu gestartet werden. Dann wird der so unterbrochene TaskManager-Job jedoch nicht weiter bearbeitet sondern muss manuell abgebrochen und neu gestartet werden. Nach einer in der LayoutWizzard-Konfiguration festgesetzten Zeit wird jedoch die Bearbeitung eines Bildes automatisch unterbrochen, so dass ein Tomcat-Neustart in der Regel nicht notwendig ist.