4.13. Distributed storage using Storage Balancer

The intranda TaskManager Storage Balancer plugin can be used to distribute data from Goobi to different network shares. The plugin is integrated into Goobi in an automated step and is usually scheduled at the end of a workflow. The Storage Balancer plugin is allocated a folder by Goobi, and this folder is then copied to the smallest suitable storage from a configurable list.

The plugin works by copying the Goobi process data to a network drive previously selected automatically by the plugin from a configured list. If the checksums for the copied data are successfully matched against the original data, the original data is deleted and a symbolic link is created between the original Goobi directory and the directory in which the copies are stored.

Starting the plugin

A sample TaskClient call is shown below:

/usr/bin/java -jar /opt/digiverso/itm/bin/TaskClient.jar
-itm http://localhost:8080/itm/service
-s {processpath}
-d dest
-e
-i {stepid}
-T {processtitle}
-t STORAGEBALANCER
-n template
-gid {processid}

Parameters

The command parameters are explained in the following table:

Parameter

Possible Goobi variable

Meaning

-itm

http://localhost/itm/service

URL to the intranda TaskManager interface

-e --returnError

-

If entered, the TaskClient will end with an error code to prevent the workflow from continuing automatically

-p

0 – 10

Priority to execute this job

-gid

{processid}

Goobi process ID

-i

{stepid}

ID for the workflow step that launches the call

-T, --title

{processtitle}

Goobi process title for which the call is launched

-t, --jobtype

STORAGEBALANCER

Job type

-n, --templatename

-

This parameter is not used by the plugin.

-s, --source

{processpath}

Path to main process directory

-d, --destination

-

This parameter is not used by the plugin because the target is automatically defined from a list of storage locations.

Configuration of the plugin

The storage mount points are configured in the following file:

/opt/digiverso/itm/plugins/config/storagebalancerconfig.xml

A typical configuration structure is shown below:

<storages>
<storage>
<path>/mnt/storage1</path>
<buffer>10G</buffer>
</storage>
<storage>
<path>/mnt/storage2</path>
<buffer>20M</buffer>
</storage>
<storage>
<path>/mnt/storage3</path>
<buffer>50G</buffer>
</storage>
</storages>

Within the configuration, the <buffer> element can end in either M or G. It will then be converted into MB or GB. The configured value is used when operating Storage Balancer to ensure that data is not written to a storage area of the required size. You can add and remove further storage areas to the configuration during actual operation.