Goobi Administration Plugin for resetting pagination for multiple processes
Identifier: intranda_administration_reset_pagination
Repository:
Licence: GPL 2.0 or newer
Last change: 25.07.2024 11:13:09
This documentation describes the installation, configuration and use of the Administration Plugin for automated pagination resetting within a large number of processes within Goobi workflow.
The plugin consists of the following files to be installed:
These files must be installed in the correct directories so that they are available in the following paths after installation:
If the plugin has been installed and configured correctly, it can be found under the menu item Administration. After opening it, the parameters described above can be adjusted individually in the interface.
Clicking the button Execute plugin starts the update of the METS files. A progress bar shows the progress. The table lists the processes already processed together with the status of each execution.
The plugin is configured via the configuration file plugin_intranda_administration_reset_pagination.xml
and can be adapted during operation. The following is an example configuration file:
filter
With this parameter, a filter can be set as the default. This is automatically pre-filled when entering the plugin, but can then be adjusted as desired each time the plugin is used within the user interface.
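As a sketch, the configuration file could look like the following minimal example. The surrounding element name and the filter value are assumptions for illustration; only the filter parameter is documented here:

```xml
<config_plugin>
    <!-- default filter that is pre-filled when the plugin is opened;
         it can still be changed in the user interface -->
    <filter>project:Archive_Project</filter>
</config_plugin>
```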
To use this plugin, the user must have the correct role authorisation. Therefore, please assign the role Plugin_administration_reset_pagination
to the user group.
After installing the plugin and the associated database, the plugin must also be configured. This is done in the configuration file plugin_intranda_administration_archive_management.xml, which is structured as in the following example:
The connection to the Goobi viewer is configured in the <export>
area. The location to which an export as EAD-XML is to be made and which inventories are to be exported are defined here. The export takes place automatically at regular intervals or can be started manually from the user interface.
In the second area <backup>
an automatic backup of the individual inventories can be configured. A separate file is created for each inventory. You can define how many backups should be kept and which tool should be used to create the backups. If a password is required for database access, this can also be configured here.
This is followed by a repeatable <config>
block. The repeatable <archive>
element can be used to specify which files the <config>
block should apply to. If there is to be a default block that applies to all documents, *
can be used.
The <processTemplateId>
is used to specify the production template on the basis of which the Goobi processes are to be created.
The parameters <lengthLimit>, <separator>, <useIdFromParent> and <title> are used to configure the naming of the process to be generated:
The value <lengthLimit>
sets a length limit for all tokens except the manually set prefix and suffix. The default setting is 0
, i.e. it does not limit the length.
The parameter <separator>
defines the separator to be used to combine all separate tokens. The default setting is _
.
The parameter <useIdFromParent> configures whose ID is used to create the process title. If it is set to true, the ID of the parent node is used; otherwise the ID of the current node is used.
The <title> parameter configures which metadata is used for title generation. The value attribute can contain a static text, or the name attribute can contain the name of a metadata field. The type attribute controls how the value is handled: NORMAL inserts the field unchanged, CAMEL_CASE removes spaces and starts each word with a capital letter, AFTER_LAST_SEPARATOR always inserts the field at the end, and BEFORE_FIRST_SEPARATOR always inserts it at the beginning. If no title has been configured, the process title is formed on the basis of the node ID.
The two parameters <nodeIdentifierField>
and <processIdentifierField>
are used to link the node and the process. The <nodeIdentifierField>
field contains the name of the field that contains the identifier of the node. Any configured field can be used. Unless otherwise specified, id
is used. The <processIdentifierField>
contains the metadata in which the identifier of the node is to be saved. This is usually NodeId
.
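Putting the elements described above together, a <config> block could be sketched as follows. All values are illustrative; the element names follow the descriptions above, but the exact file structure may differ:

```xml
<config>
    <!-- apply this block to all inventories -->
    <archive>*</archive>
    <!-- production template used when creating Goobi processes -->
    <processTemplateId>1</processTemplateId>
    <!-- no length limit for the tokens; join them with "_" -->
    <lengthLimit>0</lengthLimit>
    <separator>_</separator>
    <!-- use the ID of the current node, not of the parent node -->
    <useIdFromParent>false</useIdFromParent>
    <!-- static prefix followed by a CamelCase unit title -->
    <title value="arc" type="NORMAL" />
    <title name="unittitle" type="CAMEL_CASE" />
    <!-- link node and process via the node id and the NodeId metadata -->
    <nodeIdentifierField>id</nodeIdentifierField>
    <processIdentifierField>NodeId</processIdentifierField>
</config>
```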
If a new EAD file is imported or the 'Update references to processes' button is used, the configured metadata is searched for in all processes. The system then compares whether the metadata contains the value that is entered in the field in a node. If this is the case, a link is created between the node and the process. For all nodes for which no match was found, any existing links are removed.
This is followed by a list of <metadata>
elements. This controls which fields are displayed, can be imported, how they should behave and whether there are validation rules.
Each metadata field consists of at least the following mandatory information:
name
This value is used to identify the field. It must therefore contain a unique name. If the value has not been configured separately in the messages files, it is also used as the label of the field.
level
This defines the area in which the metadata is inserted. The numbers 1-7 are permitted: 1. identification, 2. context, 3. content and internal organisation, 4. conditions of access and use, 5. related documents, 6. notes, 7. directory control.
There are also a number of other optional details:
xpath
Defines an XPath expression that is used both for reading from existing EAD files and for writing the EAD file. In the case of the main element, the path is relative to the <ead>
element; for all other nodes, it is always relative to the <c>
element.
@xpathType
This defines whether the XPath expression returns an element
(default), an attribute
or a text
.
@visible
This can be used to control whether the metadata is displayed in the mask or hidden. The field may contain the two values true
(default) and false
.
@repeatable
Defines whether the field is repeatable. The field may contain the two values true
and false
(default).
@showField
Defines whether the field is displayed open in the detail display, even if no value is yet available. The field may contain the two values true
and false
(default).
@rulesetName
A metadata from the rule set can be specified here. When a Goobi process is created for the node, the configured metadata is created.
@importMetadataInChild
This can be used to control whether the metadata should also be created in Goobi processes of child nodes. The field may contain the two values true
and false
(default).
@fieldType
Controls the behaviour of the field. Possible values are input (default), textarea, dropdown, multiselect, vocabulary, nodelink, gnd, geonames and viaf.
value
This sub-element is only used for the two types dropdown and multiselect and contains the possible values that are to be available for selection.
vocabulary
This sub-element contains the name of the vocabulary to be used. It is only evaluated if vocabulary
, dropdown
or multiselect
is set in the field type and no <value>
elements have been configured. The selection list contains the main value of each record.
searchParameter
This repeatable subfield can be used to define search parameters with which the vocabulary is filtered; the syntax is fieldname=value
.
@validationType
Here you can set whether the field should be validated. Different rules are possible, which can be combined. unique
checks that the content of the field has not been used elsewhere, required
ensures that the field contains a value. The type regex
can be used to check whether the value filled in corresponds to a regular expression, date
checks whether the value corresponds to an EDTF date and list
tests whether the value used is contained in the configured list.
Several validation rules can also be combined, for example unique+required
, regex+required
, regex+unique
or date+required
. In this case, all specified rules must be fulfilled.
@regularExpression
The regular expression to be used for regex validation is specified here. IMPORTANT: the backslash must be escaped by a second backslash. A character class is therefore defined not by \w, but by \\w.
validationError
This subfield contains a text that is displayed if the field content violates the validation rules.
@searchable
This can be used to control whether the metadata should be offered as a field in the advanced search. The field may contain the two values true
and false
(default).
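As a sketch, a <metadata> definition combining several of the optional details described above could look like this. Field names and values are illustrative; the attribute names follow the @-prefixed names above, but the exact casing and structure of the real file may differ:

```xml
<!-- a validated identifier field, mapped to an EAD element -->
<metadata name="unitid" level="1" xpath="./ead:did/ead:unitid"
          xpathType="element" visible="true" repeatable="false"
          showField="true" rulesetName="CatalogIDDigital"
          importMetadataInChild="false" fieldType="input"
          validationType="unique+required" searchable="true">
    <validationError>The identifier must be unique and filled in.</validationError>
</metadata>

<!-- a dropdown filled from a vocabulary, filtered by a search parameter -->
<metadata name="accessrestrict" level="4" fieldType="dropdown">
    <vocabulary>AccessLicences</vocabulary>
    <searchParameter>type=licence</searchParameter>
</metadata>
```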
In the default setting, the individual sections 1 Identification
, 2 Context
, 3 Content and internal organisation
, 4 Access and usage conditions
, 5 Related documents
, 6 Notes
and 7 Directory control
are collapsed for reasons of space and are not displayed. The element <showGroup level="1" /> can be used so that they are already expanded and displayed when a node is selected; the ordinal number in the level attribute controls which area is expanded. The attribute showField="true" can be used within the <metadata> definition to display unfilled metadata immediately, without first adding it via a badge.
The two elements <eadNamespaceRead>
and <eadNamespaceWrite>
define which XML namespaces are to be used for reading and writing EAD documents. Usually both contain the same value. However, EAD2 documents can also be read and exported as EAD3 documents. In this case, the corresponding namespaces must be defined and care must be taken in the xpath expressions of the individual metadata to ensure that both variants are specified. It is therefore easier to use the enclosed converter and convert from EAD2 to EAD3 before importing the documents.
Namespace for ead2 (deprecated): urn:isbn:1-931666-22-9
Namespace for ead3 (current): http://ead3.archivists.org/schema/
Namespace for ead4 (in draft status): https://archivists.org/ns/ead/v4
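The namespace configuration described above, together with an expanded section, could be sketched as follows (here both namespaces use EAD3, the usual case):

```xml
<!-- read and write EAD3 documents -->
<eadNamespaceRead>http://ead3.archivists.org/schema/</eadNamespaceRead>
<eadNamespaceWrite>http://ead3.archivists.org/schema/</eadNamespaceWrite>

<!-- expand section 1 (identification) automatically when a node is selected -->
<showGroup level="1" />
```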
Goobi Administration Plugin for managing archive collections
Identifier: intranda_administration_archive_management
Repository:
Licence: GPL 2.0 or newer
Last change: 16.09.2024 13:07:31
This documentation describes the installation, configuration and use of the Administration Plugin for managing archive collections from within Goobi workflow. This allows the data from several archives to be managed and enables even small archives to record data in a standardised way without having to install third-party software that is subject to a charge. Export as standardised EAD files is possible at any time and can also be carried out automatically at regular intervals.
The plugin consists of the following files to be installed:
These files must be installed in the correct directories so that they are available in the following paths after installation:
The plugin also requires an additional configuration file, which must be located in the following location:
The plugin for editing archives can be found under the menu item Administration.
To use the plugin, the user must first have the Plugin_Administration_Archive_Management
right. If this right has not yet been assigned, the user will receive the following message:
The corresponding rights must therefore first be assigned to the respective user groups.
Once the required rights have been assigned and, if necessary, a new login has been created, the plugin can be used.
The user initially only has read access. In order to be able to change data, the following additional rights are available, which can be assigned if necessary:
Plugin_Administration_Archive_Management_Write
Write access to the data
Plugin_Administration_Archive_Management_Upload
Upload or import (new) EAD files
Plugin_Administration_Archive_Management_New
Creation of new inventories
Plugin_Administration_Archive_Management_Vocabulary
Authorisation to extend selection lists from vocabularies
Plugin_Administration_Archive_Management_Inventory_NAME
Access to individual selected inventories, whereby the suffix NAME must be replaced by the name of the inventory
Plugin_Administration_Archive_Management_All_Inventories
Access to all inventories
Plugin_Administration_Archive_Management_Delete
Delete the selected inventory
Plugin_Administration_Archive_Management_Process
Create processes
A detailed explanation of how to use the plugin and its functions can be found on this page:
After installation, the plugin and the associated interface are configured in the configuration file plugin_intranda_administration_archive_management.xml
. This is described in detail on the following page:
To start up the Goobi-to-Goobi mechanism, various plugins must be installed and configured on both the source and target systems. These are described in detail here.
First of all, the source system must be prepared for export. This includes first of all the installation of the correct plugin. Afterwards, only a permission for the appropriate users has to be configured to allow the export.
On the source system, the plugin plugin_intranda_administration_goobi2goobi_export
must first be installed to create the export directories. To do this, the following two files must be copied to the appropriate paths:
Please note that these files must be readable by the user tomcat
.
To enable the user to export data, the user must have the following roles:
These roles can be configured within the Goobi workflow user groups. To do this, simply select the roles on the right-hand side or enter them in the input field and then click on the plus icon.
With this configuration the preparation on the side of the initial system is already completed.
The target system must also be prepared for the import. After the installation of the corresponding plugin and the corresponding configuration files, some configurations have to be checked or made.
On the target system, the plugin plugin_intranda_administration_goobi2goobi_import
must first be installed to import the export directories. To do this, the following two files must be copied to the appropriate paths:
After the installation of the actual plugin, the corresponding configuration files must also be installed. These can be found under the following paths:
Again, please note that the installed files must all be readable for the user tomcat
.
To enable a user to perform the import, the user must have the following role:
This role can be configured within the Goobi workflow user groups by entering it in the input field on the right-hand side and clicking on the plus icon.
To influence the data to be imported during the import of the infrastructure, the configuration file plugin_intranda_administration_goobi2goobi_import_infrastructure.xml
can be adapted. This configuration can look like the following example:
In this configuration file, all fields are optional. If a field is missing, its value is not overwritten during the import. If the field is empty, it will be imported empty; otherwise it will be overwritten with the value from this configuration file. The fields for adding or removing are all repeatable.
To import the data to the target system, you can specify in the configuration file plugin_intranda_administration_goobi2goobi_import_data.xml where the data is located and how it should be processed during the import. This configuration can look like the following example:
In the upper part of the file, some general settings are made that apply to all imports. These general settings are followed by the individual configured rules.
General settings: globalConfig
The individual rules for the import operations will be defined within the <config>
element. The name of the rule is defined in <rulename>
. If no rule is explicitly selected during the import, it will be determined by the project name of the processes. The field is repeatable, so that several identical rules can be created, for example if the same workflow is used in different projects.
By means of <step>
individual steps of the process can be manipulated. All fields are optional. If they are not specified, the original value is used. Otherwise the field is overwritten with the configured field content. If the field is of type String, it can also be specified empty to empty it.
In this element, the assigned docket can be replaced. The xsl file to be used must exist on the server. If a docket has already been defined with the new specifications, it will be used, otherwise a new docket will be defined and stored in the database.
This rule can be used to change the assigned project. The project must already exist. Changes to the projects themselves can be made using Import infrastructure
.
| Element | Example | Meaning |
| :--- | :--- | :--- |
| @name | Project A | Old project |
| newProjectName | Project B | New project |
This rule is used to manipulate process properties.
| Element | Example | Meaning |
| :--- | :--- | :--- |
| @name | CollectionName | Name of the property to be adjusted. |
| oldPropertyValue | Digitised | Value of the property to be adjusted. If a value is specified, the property must contain this value. |
| newPropertyName | Collection | New name of the property. Optional. |
| newPropertyValue | default collection | New value of the property. Optional. |
This rule can be used to change the assigned rule set. If the ruleset does not yet exist, it is created and saved in the database. The file must exist on the server.
| Element | Example | Meaning |
| :--- | :--- | :--- |
| @name | Default | Name of the ruleset used so far. |
| newRulesetName | default ruleset | New name for the ruleset. |
| newFileName | ruleset.xml | New file name for the ruleset. This file must exist on the target system. |
With this rule the metadata can be changed. Values of existing metadata can be changed, new metadata added or existing metadata deleted.
Further general settings can be defined within a rule.
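Combined, a rule applying the manipulations described above could be sketched as follows. The element and attribute names are inferred from the tables and descriptions above and may differ in detail from the real configuration file:

```xml
<config>
    <rulename>Project A</rulename>
    <!-- replace the assigned project -->
    <project name="Project A" newProjectName="Project B" />
    <!-- rename a property and give it a new value -->
    <property name="CollectionName" oldPropertyValue="Digitised"
              newPropertyName="Collection" newPropertyValue="default collection" />
    <!-- switch to a ruleset that exists on the target system -->
    <ruleset name="Default" newRulesetName="default ruleset"
             newFileName="ruleset.xml" />
</config>
```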
The import of data on the target system takes place using two different plugins. These must first be installed and configured accordingly. More information about their installation and configuration can be found here:
After the successful installation, you can continue with the actual import. A distinction must be made here between the pure import of processes and the import of an exported infrastructure. Depending on the project, the import of the infrastructure may be necessary as the first step.
In the area for importing the infrastructure, the previously exported infrastructure of the source system can be imported. To do this, first open the plugin Goobi-to-Goobi Import - Infrastructure
in the Administration
menu.
At this point you can now upload a zip file that was previously created on the source system. After the successful upload, the file is unpacked on the server and analyzed. The user then receives a summary of the data to be imported.
If users, projects, groups, etc. with the same name as the data to be imported already exist in the target system, they do not count as new data to be imported and cannot be overwritten. After selecting the data to be imported, the import can be started by clicking on Execute import of infrastructure.
If desired, the data can be manipulated during the import. This is possible by adapting the configuration file plugin_intranda_administration_goobi2goobi_import_infrastructure.xml
. More details can be found in the section Configuration for importing the infrastructure
here:
To import the processes from the source system, they must first be successfully exported and transferred to the target system. How the transfer of the sometimes very large amounts of data can take place is described here:
Once the data has been completely transferred to the target system, you can start the import of the data. To do this, open the plugin Goobi-to-Goobi Import - Data
in the Administration
menu. There the configured rules for the import are displayed in the upper part of the user interface. If these rules are edited on the target system, they can be reloaded at any time by clicking on the Reload rules
button.
The actual import takes place in the lower area of the user interface. There the user can first search for the data to be imported by clicking on Reload files
If this search takes longer than 10 seconds due to the large amount of data, it continues in the background and the user is asked to refresh the page again after some time.
If files are successfully listed after searching for the data to be imported, they can now be selected, either individually or all at once by clicking on Select all. Next, the rule to be applied to the import must be selected. It can either be chosen directly or determined using Autodetect rule. In this case, the system checks whether there is a rule that corresponds to the name of the project to which the process was assigned.
A click on the button Perform import of data
then starts the actual import. During this import, an internal Goobi ticket is created for each selected process and sent to the internal queue (Message Queue). The individual tickets are processed in the background and the processes are thus imported successively.
You can configure the import and the underlying rules in detail in the configuration file plugin_intranda_administration_goobi2goobi_import_data.xml
. Further information about this configuration can be found in the section Configuration for import of data
:
Goobi Administration plugin for periodic updating of existing METS files with content from a data query
This documentation describes the installation, configuration and use of the Administration Plugin for automated repeated retrieval of data (e.g. from a catalog) to update records in Goobi workflow.
The plugin consists of the following files to be installed:
These files must be installed in the correct directories so that they are in the following paths after installation:
In addition, there is a configuration file that must be located at the following location:
The Data Poller plugin is automatically activated by Goobi. It runs at the configured start time and repeats at the configured interval in hours, e.g. every 24 hours, i.e. once a day.
If a user wants to have access to the plugin's user interface in addition to this automatic feature, he must belong to a user group that has been granted the following plugin-specific permission for this purpose:
To assign this right, the desired user group must first have the permission entered in the right pane.
If the permission has only just been assigned to the user group, the user must first log in to Goobi again before it takes effect. Afterwards, the user can open the plugin Data Poller in the menu Administration and manually re-trigger an update of the data records by means of a query at any time.
If the plugin finds updated metadata for an operation and therefore updates the METS file, it will first automatically create a backup of the current METS file meta.xml
and if relevant also of meta_anchor.xml
. The backup is stored next to the updated METS file.
The updates of the metadata by the plugin usually take place fully automatically in the background. So that it is nevertheless possible at any time to trace what has happened to a record, the events are logged. For each process that was changed by this plugin, detailed entries are automatically inserted into the journal. In addition to the timestamp, these contain, among other things, an exact listing of the changed metadata fields together with their contents. It is thus possible at any time to trace the previous as well as the new value.
The configuration of the plugin is done via the configuration file plugin_intranda_administration_data_poller.xml
and can be adjusted during operation. The following is an example configuration file:
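The documentation above mentions a configured start time, a repeat interval in hours and a filter for the processes to be checked. A heavily abbreviated sketch of such a configuration might look as follows; all element and attribute names here are assumptions, not the authoritative file format:

```xml
<config_plugin>
    <!-- one rule: which processes to check and when (names assumed) -->
    <rule startTime="22:00" delay="24">
        <!-- filter for the processes whose records should be updated -->
        <filter>project:Newspapers</filter>
        <!-- catalogue to query for updated metadata -->
        <catalogue>GBV</catalogue>
    </rule>
</config_plugin>
```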
Goobi Administration Plugin for copying an anchor file to all associated volumes
This documentation describes the installation, configuration and use of the Administration Plugin for the automated transfer of a central anchor file of a volume (e.g. from journals or multi-volume works) to other volumes within Goobi workflow.
To be able to use the plugin, the following files must be installed:
There is currently no configuration file for this plugin.
If the plugin has been installed and configured correctly, it can be found under the menu item Administration
.
Once the plugin has been fully set up, it can be used. To do this, first add the newly defined metadata InternalNote
within the volume that is to be marked as the master anchor and enter AnchorMaster
as the value. This is illustrated in the following screenshot:
The adapted journal volume was defined as the master with this change. From now on, the metadata of the parent work (e.g. the journal) used there will serve as the default for all other associated volumes. Changes that are to be made for all volumes within the anchor files will therefore be made within this data record from now on.
As soon as a volume has been defined as the master within a Goobi process, the plugin can be used to transfer all metadata from the master to all associated volumes. To do this, proceed as follows:
First open the plugin using the Administration
menu and then the Copy master anchor data
menu item.
Enter the catalogue identifier of the parent work in the input field of the plugin (e.g. the ID of the journal) and then click on the Start copying process
button. This starts the copying process, which automatically copies the metadata of the master anchor data record to all associated volumes (e.g. all volumes of the journal).
The plugin does not have its own configuration file. Nevertheless, customisation of the rule set used is a mandatory requirement for the operation of the plugin. This is shown by way of example using a rule set that can be found under the following path:
The metadata InternalNote
must be defined within the rule set:
This metadata must now be allowed within the definition of the volumes. This is done using a journal volume as an example:
With this adjustment to the rule set, the preparations for using the plugin are already complete.
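The two ruleset adjustments described above could be sketched as follows: a minimal excerpt of a Goobi ruleset that first defines the metadata type and then allows it within a journal volume. The document structure name PeriodicalVolume is an assumption for illustration:

```xml
<!-- definition of the new metadata type -->
<MetadataType>
    <Name>InternalNote</Name>
    <language name="en">Internal note</language>
    <language name="de">Interne Bemerkung</language>
</MetadataType>

<!-- allow the metadata within the definition of a journal volume -->
<DocStrctType>
    <Name>PeriodicalVolume</Name>
    <metadata num="*">InternalNote</metadata>
    <!-- further allowed metadata and child structures omitted -->
</DocStrctType>
```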
The following functions are available within the plugin for archive management:
Once the plugin has been opened, a list of available archive holdings is displayed. Here the user can select an archive inventory and start editing it.
Alternatively, a new archive inventory can also be created. In this case, a name must first be assigned to the file. The name must be unique as it is used for identification. In addition, no special characters such as :/\
should be used, as the name is also the basis for the file name of the EAD export.
The third option is to import an existing file. An EAD file can be selected and uploaded here. If no inventory with the name of the file exists yet, the file is imported as a new inventory and opened directly. If the name is already in use, the existing inventory can be overwritten with the content of the EAD-XML file after a query.
If the user has authorisation to create new inventories, a copy of an inventory can also be created using the corresponding button. This creates a new inventory and copies all nodes with all their metadata. The only exception is the ID of the nodes: new IDs are automatically created and assigned to the nodes.
After selecting the archive to be edited, the user is forwarded to the editing screen. The structure tree can now be edited in the left-hand area. The details of the selected node can be edited in the right-hand area.
By clicking on the buttons Cancel
(read rights) or Save and exit archive
(write rights), you will be redirected to the page for selecting an archive.
The structure of the archive file can be edited in the left-hand area of the editing screen. All nodes including their hierarchy can be viewed here at a glance. There is an icon in front of each element that can be used to display or hide the sub-elements of the node. To select a node, click on it. It is then highlighted in colour and the details of the selected node are displayed on the right-hand side. If a node has been selected in the left-hand area of the editing screen, the buttons on the right-hand edge of the left-hand box can also be used to change the node. The following options are available:
To generate several sub-nodes at once, the number of nodes to be created and their type must be defined. Various metadata can then be defined and entered in all new nodes. Either the same text can be used in all fields, an identifier can be generated or a text with a subsequent counter can be generated. The counter format and the start value can be defined here.
In the upper area of the hierarchy display, you can also search within the metadata of the nodes. The nodes found, including the hierarchy, are displayed and highlighted in colour. To reset the search, it is sufficient to empty the content of the search term again and perform an empty search accordingly. Alternatively, the button on the left-hand side of the search field can be used.
The advanced search can be used to the right of the field. Individual fields can be searched for here. Which fields are available can be controlled via the configuration file (attribute searchable="true"
within <metadata>
).
If a node has been selected in the left-hand area, the details of the selected node are displayed in the right-hand area.
The right-hand area is divided into several categories. The corresponding Goobi process is displayed at the top of the right-hand section, along with an option to create the docket. If no Goobi process has yet been created for the node, a new process can be created on the basis of the configured production template. The selected node type is used as the document type in accordance with the configuration. Depending on the configuration and the rule set used, the following options are available, for example:
Folder
File
Image / Picture
Audio
Video
Other / Miscellaneous
The individual metadata of the node is listed below the document type. They are divided into the following areas in accordance with the ISAD(G) standard:
Identification
Context
Content and internal organisation
Conditions of access and use
Related documents
Annotations
Cataloguing control
Each of these areas can be opened and closed individually. Even if an area is collapsed, it is very easy to recognise which metadata per area is possible and which is already filled in. The individual metadata are displayed as differently highlighted badges. A dark background indicates that a value has already been entered for this metadata. A light background indicates that this field is still empty. If a field can be created repeatedly, the badge contains a plus icon.
If the details of an area are expanded, the individual metadata is displayed. By default, only those fields that already have a value are displayed. Additional fields can be added by clicking on one of the badges. Fields can be removed again using the minus icon.
Both the Download as EAD file button and the Execute validation button check that the metadata are valid. The configured rules are applied and each value is tested against them. If a rule is violated, the affected nodes are highlighted in colour in the left-hand area. If such an invalid node is selected, the affected badges are displayed in red and the configured error text is shown next to the metadata field.
A failed validation does not prevent the archive from being saved or Goobi processes from being created.
Unless editing is carried out in read-only mode, data is always saved automatically when you insert or delete nodes, switch to another node, export the data, create a copy of it, create links, or end editing using Save and exit.
The two buttons Download as EAD file and Viewer export generate a new EAD based on the current state of the nodes. With the exception of the top node, each node is written as an independent <c> element. The data of the top node is written within the <archdesc> element below the <ead> element.
With the viewer export, the generated file is written to the Goobi viewer hotfolder, whereas with the download it can be saved locally.
The generated file contains all metadata in the form specified in the configuration file; the content of the xpath attribute of each metadata field is used. If a field has no such entry, it is an internal metadatum that is not exported to the EAD.
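Schematically, the exported file therefore looks something like this (attributes and further content omitted; the <dsc> wrapper comes from the EAD standard):

```xml
<ead>
  <archdesc>
    <!-- metadata of the top node -->
    <dsc>
      <c>
        <!-- metadata of a child node -->
        <c>
          <!-- each further node becomes its own <c> element -->
        </c>
      </c>
    </dsc>
  </archdesc>
</ead>
```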
dbExportPrefix (example: import/): This specification is required if the database information to be imported is not located as XML files in the respective process folder. It contains the path to the database information within an S3 bucket and is not required when importing into a local file system.

importPath (example: /opt/digiverso/goobi/metadata/): Target directory into which the data is to be imported.

bucket (example: example-workflow-data): Name of the S3 bucket in which the data to be imported is located. This value is not required for imports into a local file system.

createNewProcessIds (example: false): Defines whether the process identifiers from the old system should be reused or whether new IDs should be created.

temporaryImportFolder (example: /opt/digiverso/transfer/): Path to the folder containing the data to be imported. This value only needs to be configured if it differs from importPath.
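Put together, these parameters might appear in the import configuration roughly as follows; the parameter names and example values are taken from this documentation, while the exact element nesting is an assumption:

```xml
<dbExportPrefix>import/</dbExportPrefix>
<importPath>/opt/digiverso/goobi/metadata/</importPath>
<bucket>example-workflow-data</bucket>
<createNewProcessIds>false</createNewProcessIds>
<temporaryImportFolder>/opt/digiverso/transfer/</temporaryImportFolder>
```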
@name (example: Example task): Name of the step to be changed.

@type (example: delete): Type of manipulation. Possible values are delete, change, insertBefore and insertAfter.

NewStepName (example: New step name): New name of the step.

priority (example: 5): New priority of the step.

order (example: 10): Order of the step.

useHomeDirectory (example: 0): Controls whether to link to the user's home directory.

stepStatus (example: 0): Sets the step status. Allowed values are 0 (locked), 1 (open), 2 (inwork), 3 (done), 4 (error) and 5 (deactivated).

types (example: automatic="true"): Contains the various settings of a step as attributes.

scriptStep (example: scriptStep="true" scriptName1="script 1" scriptPath1="/bin/true"): Defines scripts for the workflow step.

httpStep (example: httpStep="true" httpMethod="POST" httpUrl="http://itm.example.com/itm/service"): Defines the configuration of the HTTP call for the step.
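A rule for manipulating a workflow step could combine these parameters roughly as follows; the element nesting is an assumption based on the parameter names above, and the values are the documented examples:

```xml
<step name="Example task" type="change">
    <NewStepName>New step name</NewStepName>
    <priority>5</priority>
    <order>10</order>
    <useHomeDirectory>0</useHomeDirectory>
    <stepStatus>1</stepStatus>
    <types automatic="true" />
    <usergroup>Administration</usergroup>
</step>
```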
usergroup (example: Administration): Name of the assigned user group. This value can be repeated to define multiple user groups.

@name (example: Default docket): Name of the previously used docket. The change is only made if the process previously used a docket with this name.

newDocketName (example: docket): New name of the docket.

newFileName (example: docket.xsl): New file name for the docket.
@name (example: CatalogIDDigital): Internal name of the metadata.

@type (example: change): Type of change. Allowed values are add, change and delete.

position (example: top): Position at which the change is to be made. Allowed values are all, anchor, top and physical.

valueConditionRegex (example: /PPN\d+\w?(?:_\d+)?/): Checks whether the previous field content matches a defined value. This specification can be a fixed value or a regular expression.

valueReplacementRegex (example: s/^PPN(.+)$/$1/g): If @type is set to change, this parameter contains a regular expression for manipulating the previous metadata value. If @type is set to add, the field content is used as the value of the new metadata.

skipProcesslog (example: true): Determines whether the process log of the source system should be transferred (false) or ignored (true).

skipUserImport (example: true): Specifies whether the users of imported tasks in a workflow within Goobi should be created as deleted users (false) or whether the information about execution by specific persons should be ignored and thus anonymised (true).
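A metadata rule using the parameters above could be sketched like this; the values are the documented examples, while the element nesting is an assumption:

```xml
<metadata name="CatalogIDDigital" type="change">
    <position>top</position>
    <valueConditionRegex>/PPN\d+\w?(?:_\d+)?/</valueConditionRegex>
    <valueReplacementRegex>s/^PPN(.+)$/$1/g</valueReplacementRegex>
</metadata>
```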
Identifier
intranda_administration_data_poller
Repository
Licence
GPL 2.0 or newer
Last change
07.10.2024 13:54:01
type: Determines the type of the rule. You can choose between hotfolder and filter. Depending on the type, additional parameters must be specified within the rule; these are described in the subsections below this table.

title: An internal name, mainly used in the user interface to distinguish the different rules.

startTime: Sets the start time at which the plugin should execute this rule.

delay: Specifies how often the plugin should be executed, in hours.

enabled: The rule is only executed if the enabled attribute is set to true.

catalogue: Defines which catalogue is to be used for querying new data. This is the name of a catalogue as defined within the global Goobi catalogue configuration goobi_opac.xml. catalogue carries the attributes fieldName and fieldValue.

fieldName: Attribute of the catalogue element that controls which catalogue field is queried. Often this value is 12.

fieldValue: Attribute of the catalogue element. Defines the metadata from the METS file that is to be used for querying the catalogue. Usually this is the identifier that was also used for the initial catalogue query, typically stored in the metadata ${meta.CatalogIDDigital}.

exportUpdatedRecords: If this value is set to true, a new data export is performed after the catalogue query for all records that were actually updated by it. The export in this case is the step that was defined as the first export step within the workflow of the process; this usually means the export and thus the publication of the record within the Goobi viewer. Note that records are only exported if mergeRecords is also set to true.

mergeRecords: If set to true, the existing METS file is updated with the current data from the catalogue; individual metadata can be excluded from the update, and the logical and physical structure tree within the METS file remains unchanged. If set to false, the existing METS file is completely replaced by a new METS file generated from the catalogue query.

analyseSubElements: Defines whether metadata for structural elements already existing within the METS files should also be queried from the catalogue. For this purpose, the specified metadata for the identifier to be queried must be available for each sub-element.

fieldList: The modes blacklist and whitelist are available. In whitelist mode, the metadata fields that may be updated by a catalogue query are defined here. In blacklist mode, metadata fields can be defined that must not be changed by a catalogue query under any circumstances; this is especially useful for fields that do not come from a catalogue query and were recorded in addition to the catalogue data. Typical examples include singleDigCollection, accesscondition and pathimagefiles. Please note that this parameter only applies if mergeRecords is set to true.

alwaysExecuteStepList: Titles of automatic steps that are to be executed with each run of the data poller. Each title is placed in a step element; several steps can be specified.

filter: One or more Goobi projects for which the rule should apply. Using * the rule applies to all projects. Spaces within the filter must be enclosed in quotation marks, just as within the Goobi interface.

path: Path of the hotfolder containing the files to be imported.

createMissingProcesses: If this switch is activated, new processes are created for files that cannot be assigned to an existing process.

workflow: Specifies which production templates can be used for new processes.

fileHandling fileFilter: A regex filter for the file names in the hotfolder, e.g. .*\.xml to ensure that only XML files in the folder are processed.
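Taken together, a filter-type rule might be sketched as follows; the element and attribute layout is an assumption based on the parameters described above, and the catalogue name and step title are placeholders:

```xml
<rule type="filter" title="Catalogue update" enabled="true" startTime="21:00" delay="48">
    <filter>*</filter>
    <catalogue fieldName="12" fieldValue="${meta.CatalogIDDigital}">GBV</catalogue>
    <exportUpdatedRecords>false</exportUpdatedRecords>
    <mergeRecords>true</mergeRecords>
    <analyseSubElements>false</analyseSubElements>
    <fieldList mode="blacklist">
        <field>singleDigCollection</field>
        <field>accesscondition</field>
        <field>pathimagefiles</field>
    </fieldList>
    <alwaysExecuteStepList>
        <step>Create derivatives</step>
    </alwaysExecuteStepList>
</rule>
```

A hotfolder-type rule would instead use type="hotfolder" together with path, createMissingProcesses, workflow and a fileHandling fileFilter such as .*\.xml.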
Identifier
intranda_administration_copymasteranchor
Repository
Licence
GPL 2.0 or newer
Last change
20.07.2024 19:04:29
Insert new node
This button can be used to add a new node as a sub-node to the end of the existing sub-nodes.
Insert several subnodes at this point
Opens a pop-up in which any number of nodes can be created.
Update references
Checks whether processes exist for the nodes in the inventory. This action updates the references if necessary.
Create missing processes
Generates processes for the selected node and all child nodes if no processes exist for these nodes.
Delete node
This allows you to delete the selected node including all sub-nodes. Attention: This function cannot be used at the level of the main node.
Perform validation
This function can be used to validate the selected node. Violations of the configured validation specifications are listed accordingly.
Move upwards
This button allows you to move the selected node upwards within the same hierarchy level.
Move downwards
This button allows you to move the selected node down within the same hierarchy level.
Move down the hierarchy
This button can be used to move the selected node to a lower hierarchy level.
Moving up the hierarchy
This button can be used to move the selected node to a higher hierarchy level.
Move node to another position
This function opens another editing screen that allows you to move the currently selected node to a completely different position in the hierarchy tree. The entire hierarchy is displayed so that the node within which the selected node is to be inserted as a sub-node can be selected.
Duplicate node
Opens a pop-up in which a prefix or suffix can be specified for selected metadata (attributes visible and showField). The action duplicates the selected node including all child elements and adds the specified prefixes and suffixes to the new metadata.
After the export directories have been created, the task folders can be copied from the source system to the target system. Depending on the amount of data involved, different methods can be used for the transfer.
If an external hard disk is to be used for the transfer, the cp command can be used to copy from the source system to the hard disk and later from the hard disk to the target system.
Example call for the copy operation from the source system to the external hard disk:
Example call for the copy operation from the external hard disk to the target system:
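The two copy operations can be sketched as follows. The paths below are placeholders created for this demonstration; on a real system the source would typically be /opt/digiverso/goobi/metadata and the disk a mount point such as /media/transfer:

```shell
# placeholder directories standing in for source system, external disk and target system
SOURCE=/tmp/goobi-demo/source/metadata
DISK=/tmp/goobi-demo/external-disk
TARGET=/tmp/goobi-demo/target
mkdir -p "$SOURCE/42" "$DISK" "$TARGET"
echo "<mets/>" > "$SOURCE/42/meta.xml"

# copy operation from the source system to the external hard disk
cp -r "$SOURCE" "$DISK/"

# copy operation from the external hard disk to the target system
cp -r "$DISK/metadata" "$TARGET/"
```

Using cp -r keeps the process folder tree intact; -a could be used instead to also preserve timestamps and permissions.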
If a network connection can be established between the source system and the target system, data can be transferred using the commands scp or rsync. The advantage of rsync is that an interrupted transfer can be resumed without having to start the entire transfer again from the beginning.
An example of such a call is as follows:
If the call should only transfer certain directories, use a maximum bandwidth and also exclude other data, such a call could also become a bit more extensive:
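Such calls might look as follows; the host name, paths, bandwidth limit and exclude pattern are assumptions for illustration:

```bash
# simple transfer of all process folders (resumable thanks to -P)
rsync -avP /opt/digiverso/goobi/metadata/ goobi@target.example.com:/opt/digiverso/goobi/metadata/

# more extensive variant: selected process folders only, limited bandwidth, thumbnail folders excluded
rsync -avP --bwlimit=10000 --exclude="*_thumbs" \
    /opt/digiverso/goobi/metadata/1 \
    /opt/digiverso/goobi/metadata/2 \
    goobi@target.example.com:/opt/digiverso/goobi/metadata/
```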
To export to an S3 bucket on AWS, the script s3sync.py can be used.
This is a plugin for Goobi workflow that allows you to edit all the important configuration files of Goobi workflow.
Identifier
intranda_administration_config_file_editor
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:43:22
This plugin is used to edit the various configuration files of Goobi workflow directly from the user interface within the web browser.
The plugin consists in total of the following files to be installed:
These files must be installed in the correct directories so that they are available under the following paths after installation:
This plugin has its own permission level for use. For this reason, users must have the necessary rights.
Therefore, please assign the following right to the user group of the corresponding users:
After installation, the plugin can be found in its own entry in the Administration
menu, from where it can be opened.
After opening, all Goobi configuration files are listed on the left-hand side. These can be opened by clicking on the respective icon in order to edit them.
Please note that the configuration file of this plugin does not appear in the list by default for security reasons and is editable only by super administrators.
Also, no hidden files and no files from hidden folders are displayed.
If you open a file, a text editor appears on the right-hand side in which the file can be edited. If you edit and save a file, a backup is automatically created in the defined backup directory.
According to the value set in the configuration file, a certain number of older backups are retained here before they are replaced by newer ones.
If a file has been changed and an attempt is made to change to another file without saving it, the operator is asked how to proceed with the changes.
Within Goobi, help texts can be defined for configuration files, which can be helpful when editing in this editor. The stored help texts are displayed in the left-hand area depending on the file currently open and may also contain formatting.
The plugin is configured via the configuration file plugin_intranda_administration_config_file_editor.xml and can be adapted during operation. The following is an example configuration file:
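An example configuration could be sketched like this; the directory path and regex are illustrative, and whether the backup settings are expressed as attributes of directory is an assumption:

```xml
<config_plugin>
    <configFileDirectories>
        <directory backupFolder="backup/" backupFiles="8" fileRegex=".*\.(xml|properties)">/opt/digiverso/goobi/config/</directory>
    </configFileDirectories>
</config_plugin>
```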
The parameters within this configuration file have the following meanings:
configFileDirectories
This is the list that contains all selected configuration file paths. The configuration file path preset in Goobi Workflow is always used.
directory
Configuration files from the absolute path specified here are displayed in the user interface. The path is ignored if it does not exist.
backupFolder: Specifies a relative path in directory where the backup files are stored. By default, backup/ is used if the parameter is not specified. To store the backup files in the same directory as the configuration files themselves, override the value with backupFolder="".
backupFiles
This integer value specifies how many backup files are kept per configuration file before they are overwritten by new backups. The default value is 8.
fileRegex
This parameter enables filtering of the displayed configuration files in the corresponding folder. Any regex expression can be entered. If this parameter is not used or an empty text is specified, all files are displayed.
If help texts for individual configuration files are to be displayed, they must be stored within the messages files. For each configuration file, a value like the following can be entered in the respective file.
German version within the file messages_de.properties
:
English version within the file messages_en.properties
:
Note that the prefix plugin_administration_config_file_editor_help_ is added before the name of the configuration file.
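The two entries could then look like this; the help text wording is illustrative:

```properties
# messages_de.properties
plugin_administration_config_file_editor_help_goobi_config.properties=Hilfetext zur Datei goobi_config.properties ...

# messages_en.properties
plugin_administration_config_file_editor_help_goobi_config.properties=Help text for the file goobi_config.properties ...
```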
Administration plugins for migrating from one Goobi workflow system to another Goobi workflow system
Identifier
intranda_administration_goobi2goobi_export intranda_administration_goobi2goobi_import_infrastructure intranda_administration_goobi2goobi_import_data
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:11:13
The two plugins described here can be used to transfer data from one Goobi workflow system to another (Goobi-to-Goobi). This documentation explains how to install, configure and use the associated plugins.
Before the export and import mechanism can be used, various installation and configuration steps must be completed. These are described in detail here:
The mechanism for transferring data from one Goobi workflow system to another (Goobi-to-Goobi) is divided into three major steps.
These three steps are as follows:
The first step involves enriching the data within the file system on the source system with the information that Goobi has stored internally in the database for each process. When this step is performed, an additional xml file containing the database information on the workflow and some other necessary data is written to the folder for each Goobi process.
Creation of the export directories
After the complete creation and enrichment of the export directories on the source system, they can be transferred to the server of the target system. This can be done in different ways; due to the amount of data involved, a transfer using rsync has proven to be the most suitable.
Transfer of the export directories
After the export directories have been successfully transferred to the target system, the data can be imported there. To do this, the data must be stored in the correct place in the system and some further precautions regarding the infrastructure must also be prepared.
Documentation for the plugins of the Open-Source-Software Goobi workflow from intranda
On the following pages you will find documentation for various plug-ins and extensions for Goobi workflow. Please select the desired plugin from the table of contents on the left to access the documentation.
Please note that within Goobi workflow there are different types of plugins for the respective application scenarios.
Export plugins are used to export data from Goobi workflow to another system. They are executed either automatically as part of the workflow or manually by clicking on the corresponding icon in the process list. They are usually installed within this path:
Export plug-ins within Goobi are set up by selecting them from the list of step plug-ins for a workflow step and additionally activating the Export checkbox. Usually, the checkbox Automatic task is also selected so that the exports are executed automatically in the course of the workflow.
Some export plugins have their own configuration file. This file is generally named like the plugin itself and is usually located at the following path:
Step plugins are used to extend tasks within the Goobi workflow. Such plugins can be used, for example, to integrate individual functionality into the workflow that Goobi does not provide out-of-the-box. Examples of such plug-ins include special conversion plug-ins, entry masks, image manipulations, etc.
Such step plugins are installed in the folder:
If a plugin also has a user interface in addition to the actual functionality, the part of the user interface must also be installed in this folder:
Step plugins in Goobi are set up in such a way that they are selected as plugins within a task.
Please note that there are currently three different types within Step Plugins:
No GUI
The plugin does not have its own user interface and is executed in the background on the server side. Example: A plugin for the automatic conversion of images into another file format.
Part GUI
The plugin brings along a part for a user interface and is visually integrated within a processed task as if it were part of the Goobi core. Here the user can interact with the user interface. Example: A plugin for uploading images within a task.
Full GUI
The plugin comes with a complete user interface. This is not directly integrated into the task; instead, the user is offered a button to enter the plugin and interact with it. Example: a plugin for image control.
Some Step Plugins have their own configuration file. This file is generally named like the plugin itself and is usually located at the following path:
Opac plugins are used for communication with external data sources. Typical examples are plugins for the connection of library catalogues or databases. Depending on the data source, different implementations exist for this in order to correctly address the respective interface to be used.
Opac plugins are usually installed in this path:
After installing such a plugin, it is available in the Search in Opac field within the screen for creating processes in Goobi.
Import plugins are used for larger mass imports. Unlike Opac plugins, they do not query a single data source process by process; instead, import plugins usually import hundreds or thousands of records at the same time, often in different formats. Common examples include import plugins for SQL dumps, Excel tables or other proprietary data sources.
The import plugins are installed in the folder:
These plugins are used in a separate mask for mass imports in which you select the different import mechanism and the desired plugin before selecting the data.
Some import plugins have their own configuration file. This is generally named like the plugin itself and is usually located at the following path:
Administration plugins are available for some special use cases. The special feature is that these plugins are not functionally restricted. They are not explicitly integrated at a given point within the workflow nor are they executed at a defined moment. Instead, they usually have their own user interface and offer independent functionality as an extension of Goobi. Examples of this include administrative intervention in process data or the administration of controlled vocabularies.
The installation of the administration plugins takes place in the folder:
Since most administration plugins have a user interface in addition to the actual functionality, this must also be installed into the following folder:
Some administration plugins have their own configuration file. This file is generally named like the plugin itself and is usually located at the following path:
The workflow plugins are technically very similar to the administration plugins. They can also offer an independent user interface for the provision of additional functionality. In contrast to the administration plug-ins, however, access to these plug-ins is also possible without administrative rights within Goobi, so that a larger group of users usually has access to these functions.
The workflow plug-ins are installed in the folder:
Since most workflow plugins have a user interface in addition to the actual functionality, it must also be installed in the following folders:
Some workflow plugins have their own configuration file. This file is generally named like the plugin itself and is usually located at the following path:
With the Dashboard Plugins it is possible to provide a special Dashboard with additional functionality instead of the standard start page. This could, for example, already display some statistical information that shows integration with other systems and also give an insight into the current monitoring.
The Dashboard Plugins are installed in the folder:
The user interface of the dashboards must also be installed in the following folders:
Some Dashboard plugins have their own configuration file. This is generally named like the plugin itself and is usually located at the following path:
Please also note that individual dashboards must always be activated within the main configuration file goobi_config.properties. This can be done as follows:
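Such an activation in goobi_config.properties could look like this; the property name and the plugin identifier should be checked against the respective dashboard plugin's documentation:

```properties
dashboardPlugin=intranda_dashboard_extended
```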
The statistics plugins are available for the provision of individual statistics. Depending on which of these plugins are installed, a wide variety of statistical evaluations can be carried out, either as diagrams, tables or downloads in various formats.
The installation of the statistic plugins takes place in the folder:
The user interface of the statistic plugins must also be installed in the following folders:
In Goobi, the validation plug-ins are used to ensure that data is available as required before a step is completed. If the validation is not successful, the user cannot complete the task and therefore cannot remove it from their task list.
The validation plugins are installed in the folder:
The validation plug-in must then be selected in the Validation plug-in field within the task of the required workflow step.
Some validation plugins have their own configuration file. This is generally named like the plugin itself and is usually located at the following path:
With the REST plugins, Goobi has another way for external systems to communicate with Goobi. In contrast to the Web API, however, communication here is via REST and takes place largely via JSON.
REST plugins are installed in the following folder:
Like the Web-API plugins, the REST plugins do not have their own user interface. Also the access permission is controlled by the same configuration file and controls the access from selected IP addresses and authentication. For the REST Plugins the configuration is done in the following file:
The export from the source system consists of up to three sub-steps. However, before the export can take place, it must first be specified within the role system of Goobi workflow that the user must have export permissions. Information on the configurations to be made can be found here:
After configuring the required user rights, the actual export can begin. In most cases, only the first of the following three steps will be necessary.
For most purposes, only this sub-step is required to generate the export files for all desired processes. For all selected processes within the file system, an xml file with all relevant information about the process is generated from the database in the folder of each selected process.
To perform such an export for several processes together, you can start it using GoobiScript. The following GoobiScript command is required for this:
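Presumably this is the GoobiScript action exportDatabaseInformation; in the current YAML-style GoobiScript syntax the call would look roughly like this (the action name is an assumption):

```yaml
---
action: exportDatabaseInformation
```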
When you run this GoobiScript, you will find the relevant export XML file (e.g. 5_db_export.xml) in each process folder.
To perform such an export for a single process, it is possible to start it within the details of a process. To do this, simply click on the corresponding icon for the export.
Unlike exporting via GoobiScript, this starts a download of the xml file that contains the database information.
Notice: This substep is optional and is only required in rare cases.
If you want to transfer more than just processes from one Goobi workflow to another, you can also generate export data for process templates. However, as GoobiScript is not available within the process template area, this export can be done from the provided Goobi-to-Goobi Export plugin within the Administration menu.
Now click on the button Generate database files for process templates. This also saves an XML file with the database information for each process template in the file system, which can be used for the transfer to the target system.
Notice: This substep is optional and is only required in rare cases.
If, in addition to the actual Goobi processes, you also want to transfer more detailed information about the infrastructure from one Goobi workflow to another, this can also be exported within the export plugin. To do this, select the appropriate checkboxes within the Goobi-to-Goobi Export plugin to control the export in a targeted manner. The following parameters are available:
LDAP groups
Exports the existing LDAP groups.
Users
Export of active users.
Include inactive users
In addition to the active users also export the deactivated users.
Create new passwords
Determines whether the existing passwords of the users should be exported as well. If the checkbox is set, new passwords must be set on the target system for the imported users after the import.
User groups
Export of user groups, permissions and additional roles.
User group assignments
Export all groups assigned to the user.
Projects
Export of the projects.
Project assignments
Export of all projects assigned to the user.
Rulesets
Export of rule set information.
Dockets
Export of the docket information.
Include files
Determines whether the exported zip file should include the rulesets and dockets.
Once you have selected the desired information and clicked on the Download infrastructure as a zip file button, Goobi generates a zip file named goobi-to-goobi-export.zip and offers it for download. This zip file contains all the information selected from the Goobi database for transfer to the target system.
Goobi Administration Plugin for checking ruleset compatibility for multiple processes
Identifier
intranda_administration_ruleset_compatibility
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:16:33
This documentation describes the installation, configuration and use of the Administration Plugin for automated checking of a large number of processes within Goobi workflow with the assigned rule set. Any incompatibilities with the respective rule sets are identified and a corresponding message about the specific incompatibility is displayed.
The plugin consists of the following files to be installed:
These files must be installed in the correct directories so that they are available in the following paths after installation:
To use this plugin, the user must have the correct role authorisation. Therefore, please assign the role Plugin_administration_ruleset_compatibility to the user group.
If the plugin has been installed and configured correctly, it can be found under the menu item Administration. After opening it, the parameters described above can be adjusted again in the interface.
After clicking on the button Execute plugin, the check of the METS files starts. A progress bar informs about the progress. The processes already processed are listed within the table and any incompatibilities are displayed immediately. In addition, it is possible to jump directly into the metadata editor of individual processes.
The plugin is configured via the configuration file plugin_intranda_administration_ruleset_compatibility.xml and can be adapted during operation. The following is an example configuration file:
filter: Sets a default filter. It is automatically pre-filled when entering the plugin and can then be adjusted as required each time the plugin is used within the user interface.
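An example configuration file could be sketched like this; the surrounding structure is an assumption and the filter value is a placeholder:

```xml
<config_plugin>
    <filter>project:Manuscripts</filter>
</config_plugin>
```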
This is an Administration Plugin for Goobi workflow that allows ruleset XML files to be edited directly from the user interface.
Identifier
intranda_administration_ruleset_editor
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:09:29
This plugin is used to edit the ruleset files of Goobi workflow directly from the user interface within the web browser.
The plugin consists in total of the following files to be installed:
These files must be installed in the correct directories so that they are available under the following paths after installation:
This plugin has its own permission level for use. For this reason, users must have the necessary rights.
Therefore, please assign the following right to the user group of the corresponding users:
After installation, the plugin can be found in its own entry in the Administration
menu, from where it can be opened.
After opening, all Goobi ruleset files are listed on the left-hand side. These can be opened by clicking on the respective icon in order to edit them.
If you open a file, a text editor appears on the right-hand side in which the file can be edited. If you edit and save a file, a backup is automatically created in the defined backup directory.
According to the value set in the configuration file, a certain number of older backups are retained here before they are replaced by newer ones.
If a file has been changed and an attempt is made to change to another file without saving it, the operator is asked how to proceed with the changes.
The plugin is configured via the configuration file plugin_intranda_administration_ruleset_editor.xml
and can be adapted during operation. The following is an example configuration file:
The parameters within this configuration file have the following meanings:
rulesetBackupDirectory
This sets the path for the backup files where the backups of the ruleset files are to be saved after editing.
numberOfBackupFiles
This integer value specifies how many backup files remain stored per ruleset file before they are overwritten by new backups.
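Using the two parameters described above, a minimal configuration sketch could look like this. The parameter names come from the documentation; the surrounding structure and the example values are assumptions and should be verified against the shipped default configuration:

```xml
<config_plugin>
    <!-- directory where backup copies of edited ruleset files are stored (example path) -->
    <rulesetBackupDirectory>/opt/digiverso/goobi/rulesets/backup/</rulesetBackupDirectory>
    <!-- number of backups kept per ruleset file before older ones are overwritten -->
    <numberOfBackupFiles>10</numberOfBackupFiles>
</config_plugin>
```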
Goobi Administration Plugin for restoring image folders from external storage
Identifier
intranda_administration_restorearchivedimagefolders
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:16:07
This plugin for Goobi workflow restores image folders that were previously archived with the plugin goobi-plugin-step-archiveimagefolder
.
The plugin consists of the following files to be installed:
These files must be installed in the correct directories so that they are available in the following paths after installation:
To use this plugin, the user must have the correct role authorisation. Therefore, please assign the role Plugin_administration_restorearchivedimagefolders
to the user group.
The plugin offers a graphical user interface that can be opened via the menu Administration
. There, a search filter can be used, as in other parts of Goobi workflow (e.g. in the task list). Clicking on 'Run Plugin' then restores the images for the processes found via the entered filter. The user interface updates automatically.
The configuration file is empty at the moment, but must still be present.
The information about where the data is to be fetched from is stored by the archiving plugin in an XML file in the respective process folder.
For authentication on ssh servers, public keys are searched for in the usual places ($USER_HOME/.ssh
). Other authentication methods such as username/password are not provided.
Dashboard plugin for the automatic acceptance or completion of workflow steps and for changing location details using a barcode scanner
Identifier
intranda_dashboard_barcode
Repository
Licence
GPL 2.0 or newer
Last change
21.09.2024 11:37:28
This dashboard plugin was developed to facilitate the use of a barcode scanner in the Goobi Workflow. On the right side is a form for various actions, such as accepting and completing tasks or tracking the location of objects.
In order to use the plugin, the following files must be installed:
To configure how the plugin should behave, various values in the configuration file can be adjusted. The configuration file is usually located here:
To use this plugin, the user must select the intranda_dashboard_barcode
value within the dashboard settings.
To use this dashboard plugin, you first need to activate it via Settings
-> General
-> Dashboard
and then log in again. If the plugin is correctly installed and configured, it should already be activated under the Dashboard
menu item.
On the right side, there is a form with various actions. You can select one by clicking on it. If the action Change Location Only
is chosen, an additional input field will appear, expecting the name of the new location.
For all actions, there is a mandatory input field where the title of the Goobi process is expected. This field is automatically focused after loading to facilitate the use of a barcode scanner. By clicking the Execute
button, the selected action will be performed, and messages regarding success will be displayed. The performed action and the input location are saved to facilitate further applications. They remain unchanged until a manual change is made.
If location changes are recorded in this way, they can also be traced at any later point in time within the journal.
In addition, the current location of the object is stored in a dedicated property.
The plugin is configured in the file plugin_intranda_dashboard_barcode.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
tasks-latestChanges-size
This parameter defines how many completed tasks should be displayed in the left table.
show-accept-option
This parameter determines whether the action button for accepting tasks should be enabled. Default is false
.
show-finish-option
This parameter determines whether the action button for finishing tasks should be enabled. Default is false
.
show-accept-and-finish-option
This parameter determines whether the action button for accepting tasks and completing them should be enabled. Default is false
.
show-change-location-option
This parameter determines whether the action button for changing the location should be enabled. Default is false
.
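Putting the parameters described above together, a configuration sketch might look like this. The parameter names are taken from the table above; the root element and the example values are assumptions and should be checked against the default configuration file:

```xml
<config_plugin>
    <!-- number of completed tasks shown in the left table -->
    <tasks-latestChanges-size>10</tasks-latestChanges-size>
    <!-- enable/disable the individual action buttons (all default to false) -->
    <show-accept-option>true</show-accept-option>
    <show-finish-option>true</show-finish-option>
    <show-accept-and-finish-option>false</show-accept-and-finish-option>
    <show-change-location-option>true</show-change-location-option>
</config_plugin>
```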
This export plugin for Goobi workflow generates a specific export of individual pages as several METS files per process. Each structural element results in a separate METS file.
This plugin is used for a special export of multiple METS files per process. From a single METS file within Goobi workflow, a separate METS file with the associated image files is created during the export for each structural element contained.
This plugin was developed for the Federal Office for the Protection of Monuments in Austria and is functionally geared to their needs and therefore may not be directly applicable to other use cases.
The plugin consists of the following files to be installed:
This file must be installed in the following directory:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_export_singleImage
from the list of installed plugins.
Since this plugin should usually be executed automatically, the workflow step should be configured as automatic
. In addition, the task must be marked as an export step.
Once the plugin has been fully installed and set up, it is usually run automatically within the workflow, so there is no manual interaction with the user. Instead, calling the plugin through the workflow in the background does the following:
For each structural element within the METS file, an independent METS file is created during the export, together with the respective image files. The METS file as a whole is not exported. The number of METS files created in this way therefore differs from the number of Goobi processes and corresponds to the number of existing structural elements.
This plugin does not have its own configuration file.
Dashboard plugin for extended information display
This dashboard plugin provides an improved overview through detailed display options. For example, the most recently edited tasks or relevant statistics can be shown.
In order to use the plugin, the following files must be installed:
To use this plugin, the user must select the value intranda_dashboard_extended
within the dashboard settings.
If the plugin is installed correctly and users have set it as their dashboard, it will be visible after logging into Goobi workflow instead of the start page.
The plugin is configured in the file plugin_intranda_dashboard_extended.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
This is a technical documentation for the Heris Export Plugin. It enables the export of selected images and their associated metadata to an SFTP server.
This documentation describes the installation, configuration and use of the Heris export plugin in Goobi.
This plugin for Goobi workflow can be used to export the images and associated metadata selected in a previous step to a JSON file. The export is then carried out via SFTP to an external server.
The following files must be installed in order to use Heris Export:
An automatic step must be inserted in the workflow in which the intranda_export_heris
plugin has been selected. This step must be carried out after the step with the "Image Selection" plugin, in which the images to be exported are selected.
When the plugin is executed, it first checks whether at least one image has been selected in the step with the Image Selection
plugin. If this is the case, the following tasks are carried out:
Copying the selected images to a temporary folder
Checking whether there is already an older export for the current HERIS ID
If so, creating a backup of the old JSON file and downloading that file
Checking whether the selected files correspond to the image names in the old JSON file
For each image that remains the same, the old image identifier is determined so that it can be reused in the new JSON
Every image that has already been exported and is no longer present in the new export is deleted remotely
Each new image that did not exist in the old export is treated as a completely new export
Determining the metadata for the selected images
Creating the JSON file from the determined metadata, retaining the old image IDs where applicable
Copying the generated data to the destination
Deleting the temporary data
The configuration takes place in the file plugin_intranda_export_heris.xml
as shown here:
The <config>
area can be repeated, allowing different exports for different projects or steps.
The <propertyName>
field defines the property in which the selected images are saved. This value must match the configuration of the image selection plugin.
The JSON file is then described. The <herisId>
field contains the metadata in which the HERIS ID is saved and the <jsonRootElement>
is used to configure the name of the JSON object in which the individual images are described.
The individual fields of the image objects are described in the <field>
list. Each field has three entries.
The name
attribute defines the name of the element within the JSON file.
The value is described in the element itself.
The type
is used to specify what type it is. The value is interpreted differently depending on the type.
The following specifications are possible:
static
: The value is written unchanged as text in the JSON.
filename
: The image name is saved here.
representative
: Can contain the values true/false
. The first image in the list is used as the representative.
identifier
: Contains the identifier of the image from the HERIS database. The previous identifier is reused during a re-export. The field remains empty for new exports.
metadata
: The value is interpreted as a metadata name and resolved from the metadata. The metadata is first searched for in the sub-element "photo" that was assigned to the image. If it does not exist there, the metadata is expected in the main element "Document".
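A sketch of the <field> list using the types described above might look as follows. The attribute names name and type and the type values come from the documentation; the concrete field names and values are hypothetical examples:

```xml
<!-- hypothetical field list; value is written into the element body -->
<field name="copyright" type="static">CC-BY 4.0</field>
<field name="filename" type="filename"></field>
<field name="representative" type="representative"></field>
<field name="imageId" type="identifier"></field>
<field name="photographer" type="metadata">Photographer</field>
```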
The SFTP connection is configured in the last block. Options are available here for authentication using user name and password, user name and key or user name and password-protected key.
This export plugin allows a very flexible export of Goobi processes based on individual configuration.
This documentation describes how to install, configure and use an export plugin in Goobi.
Using this export plugin for Goobi, Goobi operations can be exported to multiple locations simultaneously within one operation.
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a work step of the workflow, it should be configured as shown in the screenshot below.
The plugin must first be copied into the following directory:
In addition, there is a configuration file that must be located in the following place:
The plugin is configured via the configuration file plugin_intranda_export_configurable.xml
and via the project settings. The configuration can be adjusted during operation. The following is an example configuration file:
The block <config>
is repeatable and can thus define different metadata in different projects. The block with <project>*</project>
is applied if no block with the project name of the project exists.
The includeFolders
block is located inside each config
element. It controls which directories are to be taken into account for the export.
If the attribute enabled
is set to false
, then no export of the corresponding folder will take place.
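A sketch of such an includeFolders block could look as follows. The element names and the enabled attribute are taken from the documentation; the selection of folders shown is only an example:

```xml
<includeFolders>
    <!-- folders with enabled="false" are skipped during the export -->
    <media enabled="true" />
    <master enabled="true" />
    <ocr enabled="false" />
</includeFolders>
```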
The configuration of the destination folder can be done within the project settings in the Goobi workflow user interface. If the checkbox for Create task folder
is set there, the process will be stored in the target folder in a subfolder named after its title.
Export plugin for exporting PDF files with special folder and file naming for the National Library of Israel.
This documentation explains the plugin for exporting PDF files with special folder and file naming for the National Library of Israel. The plugin creates any required subfolders in a defined directory and saves an existing PDF file from the master folder with the desired name within the created folder structure.
In order to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be selected within the workflow for the respective workflow steps and thus executed automatically. A workflow could look like the following example:
To use the plugin, it must be selected in a workflow step:
The plugin is executed automatically during the workflow and reads the parameters from the configuration file. On this basis, the plugin then determines metadata from the respective process. The information thus determined is then used to generate a directory path for the export and to create the directory if it does not already exist. The plugin then generates a file name that ends with a counter. The file name generated in this way is checked to see whether it is already in use, so that the counter is adjusted if necessary to obtain a file name that is not yet in use. The first PDF document is then determined from the master directory of the Goobi process and saved under the previously generated file name within the directory path.
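The file-name counter logic described above can be sketched as follows. This is a hypothetical illustration only, assuming an underscore-separated three-digit counter; the actual naming scheme is defined by the plugin and its configuration:

```python
from pathlib import Path

def next_free_name(folder: Path, base: str, ext: str = ".pdf") -> Path:
    """Return the first file name of the form <base>_<counter><ext> that is
    not yet in use in the target folder (hypothetical counter scheme)."""
    counter = 1
    # Increment the counter until a name is found that is not taken yet.
    while (candidate := folder / f"{base}_{counter:03d}{ext}").exists():
        counter += 1
    return candidate
```

Called twice after the first file has been written, the function would yield `doc_001.pdf` and then `doc_002.pdf`.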
The plugin is configured in the file plugin_intranda_export_nli_pdf_to_folder_structure.xml
as shown here:
The parameters used are described here:
Goobi Export Plugin to create the METS structure for import into the DDB newspaper portal
The plugin is used to create the METS structure for the import into the newspaper portal of the German Digital Library. A METS anchor file is created for the overall record of a newspaper; for each exported volume, another METS anchor file is created and linked within the overall record. The year contains further structures for month and day.
Each issue is created as an individual METS file and linked in the METS anchor file of the year. An issue may contain further structural data such as article descriptions or supplements. The digitised images are also referenced here.
The plugin consists of the following file:
This file must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
Once the plugin has been fully installed and set up, it is usually run automatically within the workflow, so there is no manual interaction with the user. Instead, the workflow invokes the plugin in the background and performs the following tasks:
A separate METS file is created for each issue, linking the images and OCR data associated with that issue. The issue can have further sub-elements such as articles or inserts.
The individual issues are then combined into one METS file for the year. The METS files of the issues are linked within a structure for month and day.
The last step is to check whether a record with the metadata of the overall newspaper exists in the target directory. If not, a METS file is created; otherwise, the year is entered into the structural data of the overall record.
To put the plugin into operation, it must be activated for a task in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_export_newspaper
from the list of installed plugins.
Since this plugin is usually to be executed automatically, the task should be configured as automatic in the workflow. Furthermore, the task must be marked as an export step.
In addition, there must be another regular export step so that the linked images and ALTO files can be delivered via the Goobi viewer interfaces.
The configuration of the plugin is done via the configuration file plugin_intranda_export_newspaper.xml
and can be adjusted during operation. The following is an example configuration file:
In the first section <export>
some global parameters are set. Here it is determined whether images are to be exported in addition to the METS files (<images>
true
/false
), whether these are to be exported per issue or per year and linked in the data sets (<subfolderPerIssue>
true
/false
), to which directory the export should be made (<exportFolder>
) and which resolvers should be written for the METS file (<metsUrl>
) and the link to the published record (<resolverUrl>
).
In the second section <metadata>
a set of metadata is defined. These fields must exist in the ruleset and are partly copied from the overall record to the individual issues during the export.
The third section <docstruct>
defines some structural elements to be generated. These must also be configured in the ruleset.
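The global export section described above might be sketched like this. The element names are taken from the documentation; the paths and URLs are example values:

```xml
<export>
    <!-- export images in addition to the METS files -->
    <images>true</images>
    <!-- one image subfolder per issue instead of per year -->
    <subfolderPerIssue>true</subfolderPerIssue>
    <!-- target directory of the export (example path) -->
    <exportFolder>/opt/digiverso/export/</exportFolder>
    <!-- resolver URLs written into the METS file (example values) -->
    <metsUrl>https://viewer.example.org/sourcefile?id=</metsUrl>
    <resolverUrl>https://viewer.example.org/piresolver?id=</resolverUrl>
</export>
```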
Goobi plugin for exporting Goobi processes to a Fedora repository
This documentation describes the installation, configuration and use of the Fedora Export Plugin in Goobi workflow.
The plugin must be installed in the following folder:
There is also a configuration file, which must be located at the following location:
An export step must be configured:
Export DMS
Automatic task
Plugin for step: FedoraExport
When the step is executed, the Goobi process is exported (in the same way as it is exported to the file system) to the configured Fedora Repository, taking into account the configuration (see above).
The process data can then be retrieved from the repository using the following URL pattern:
The configuration is done via the configuration file intranda_export_fedora.xml
and can be adapted during operation.
Export plugin for Goobi workflow to create special export formats in the software Imagen Media Archive Management
This documentation describes the installation, configuration and use of the export plugin for the creation of special export packages for the Imagen Media Archive Management software. The plugin currently takes five special publication types into account and processes each of them individually.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be selected within the workflow for the respective work steps and thus executed automatically. A workflow could look like the following example:
To use the plugin, it must be selected in one step:
This plugin is automatically executed as an export plugin within the workflow and generates the required data within a configured directory. Depending on the publication type, these are:
Image files
Plain text files with OCR results
ALTO files with OCR results
METS files
METS anchor files
XML export files
The structure of the XML export files in particular varies greatly depending on the publication type. Here is an example of a Generic Print
publication type:
The plugin is configured in the file plugin_intranda_export_adm_bsme.xml
as shown here:
The parameters used are detailed here:
For easier commissioning, the install
folder of the plugin contains a directory with the two matching rulesets as a reference; these correspond to the configuration file listed here.
Goobi plugin for exporting Goobi processes to a Fedora repository for the Victoria Public Record Office
This documentation describes the installation, configuration and use of the Fedora Export Plugin in Goobi workflow.
An export step must be configured:
Export DMS
Automatic task
Plugin for step: FedoraExport
When the step is executed, the Goobi process is exported (in the same way as it is exported to the file system) to the configured Fedora Repository, taking into account the configuration (see above).
The following process properties are used to create container URLs or additional container attributes (and are mandatory):
barcode (containing either a 10-character barcode or a 36-character PID)
unit_Item_code (only mandatory when using a 10-character barcode)
full_partial
The process data can then be retrieved from the repository using the following URL pattern:
Example with a 10-character barcode (barcode="barcode123"):
Example with a 36-character PID (barcode="DB0027DB-F83B-11E9-AE98-A392051B17E6"):
The configuration is done via the configuration file intranda_export_fedora.xml
and can be adapted during operation.
The block config
is repeatable and can define different metadata in different projects. The workflow
sub-element is used to check whether the current block applies to the current step. The system checks whether there is an entry that contains both the workflow name and the current step. If this is not the case, the system falls back to the block containing <workflow>*</workflow>
.
Goobi Plugin for the Export of Goobi Processes to the Stanford University Digital Library
The present documentation describes the installation, configuration, and use of the Stanford Export Plugin in Goobi workflow.
To be able to use the plugin, the following files must be installed:
In addition, there is a configuration file that must be located at the following location:
To use the plugin, it must be selected in a workflow step:
An export step must be configured:
Export DMS
Automatic task
Plugin for workflow step: intranda_export_stanford
During the execution of the step, an export of the Goobi process (similar to exporting to the file system) is performed into the configured directory.
Within this directory, subfolders are created based on the identifier. For example, the identifier qx797sg1405
would generate the following structure: /path/to/folder/qx/797/sg/1405
. Within this folder, two additional folders are created: metadata
and content
.
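The identifier-to-path mapping described above can be illustrated with a short sketch. The split pattern (two letters, three digits, two letters, four digits) is inferred from the single example in this documentation, so treat it as an assumption:

```python
import re

def druid_tree(identifier: str) -> str:
    """Split an identifier such as qx797sg1405 into the nested folder
    structure qx/797/sg/1405 (pattern inferred from the example above)."""
    m = re.fullmatch(r"([a-z]{2})(\d{3})([a-z]{2})(\d{4})", identifier)
    if m is None:
        raise ValueError(f"unexpected identifier format: {identifier}")
    return "/".join(m.groups())

print(druid_tree("qx797sg1405"))  # qx/797/sg/1405
```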
In the content
folder, all generated images, and if available, the ALTO files and single-page PDFs are stored. Additionally, a complete PDF file is generated from the single pages. The metadata
folder contains an XML file with information about the files within the content folder.
Finally, the configured URL to the REST API is called to initiate the ingest into the system.
The plugin is configured in the file plugin_intranda_export_stanford.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
This is a technical documentation for the VLM Export Plugin. It enables the export to a VLM instance.
This documentation describes the installation, configuration and use of the VLM export plugin in Goobi.
Using this plugin for Goobi, Goobi operations can be exported to the configured location for VLM within one step.
This plugin is integrated into the workflow in such a way that it is executed automatically. For use within a workflow step, it should be configured as shown in the screenshot below.
The plugin must first be copied to the following directory:
In addition, there is a configuration file that must be located in the following place:
The plugin is configured via the configuration file plugin_intranda_export_vlm.xml
. The configuration can be adjusted during operation. The following is an example configuration file:
Currently, only one type of condition is supported: the variablematcher
condition. This condition type reads the variable defined as field
and matches it against the regular expression defined in matches
.
A sample condition
could look like:
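A sketch of such a condition, using the values discussed in this section, might look like this. Whether type, field and matches are expressed as attributes (rather than sub-elements) is an assumption and should be checked against the shipped example configuration:

```xml
<condition type="variablematcher" field="{meta.singleDigCollection}" matches="\d{20}" />
```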
This condition
has the type variablematcher
. It checks the field {meta.singleDigCollection}
, which corresponds to the singleDigCollection
value of the metadata file. The condition tries to match this field against the regular expression \d{20}
, i.e. it checks whether the field consists of exactly 20 digits.
This is a technical documentation for the Goobi Export plugin to export to different directories for the Klassik Stiftung Weimar.
This documentation describes the installation, configuration and use of an export plug-in in Goobi, as required for the Klassik Stiftung Weimar within the digitization project. With the help of this Goobi export plug-in, Goobi processes can be simultaneously exported to several locations within a single workflow step. The special features of the Klassik Stiftung Weimar, such as the use of EPNs as identifiers and the merging of covers and sheet cuts into a common structural element, remain unchanged.
The plugin must first be installed in the following directory:
In the Goobi configuration directory, the additional plug-in configuration file must be made available under the following path during installation:
The content of the configuration file is structured as follows:
The exportFolder
list can be used to define various locations to which the export is to be made. Any number of folders can be defined. However, at least one folder must be defined at this point.
In order to be able to use the export plug-in within the workflow after successful installation, a work step must be defined in which the Export DMS function was activated. In addition, the value HaabExport
must be entered as the step plug-in.
Identifier
intranda_export_singleImage
Repository
Licence
GPL 2.0 or newer
Last change
05.09.2024 06:56:43
Identifier
intranda_dashboard_extended
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 10:37:29
<itm-show>
This parameter defines whether the currently running jobs of the intranda Task Manager should be displayed.
<itm-cache-time>
This value is specified in milliseconds and defines how often the values from the intranda Task Manager should be updated.
<itm-url>
The URL at which the intranda Task Manager can be accessed is specified here.
<rss-show>
This parameter defines whether news that can be retrieved via RSS feed should be displayed.
<rss-cache-time>
This value is specified in milliseconds and indicates how often the RSS feed should be updated.
<rss-url>
This parameter specifies the website from which the RSS feed is to be loaded.
<rss-title>
The title that is to appear above the news items is defined here.
<search-show>
This parameter determines whether the Search
form should be displayed.
<tasks-show>
This parameter defines whether the Recently completed tasks
area should be displayed.
<tasks-show-size>
Here you can specify how many of the recently completed tasks should be displayed.
<tasks-history>
This can be used to display the history of the last tasks.
<tasks-history-title>
This parameter can be used to specify which task type is to be displayed.
<tasks-history-period>
This parameter defines the maximum length of time (in days) that may have passed since the last edit for it to still be displayed.
<tasks-latestChanges>
Here you can specify whether the most recently processed tasks should be displayed.
<tasks-latestChanges-size>
This parameter specifies the number of the last changes to be shown.
<statistics-show>
Here you define whether statistics are to be displayed.
<batches-show>
This parameter specifies whether the batches are to be displayed.
<batches-timerange-start>
Here you specify how many months ago the batches were started to be processed so that they are displayed.
<batches-timerange-end>
Here you can specify how many months after the start of processing the batches are displayed.
<processTemplates-show>
This parameter defines whether the production templates are to be displayed.
<processTemplates-show-statusColumn>
Here you can specify whether the status column should be displayed.
<processTemplates-show-projectColumn>
Here you can specify whether the project column should be displayed.
<processTemplates-show-massImportButton>
Here you can specify whether the bulk import button should be displayed.
<queue-show>
This parameter defines whether the dashboard should display how many processes are currently in the queue and what their status is.
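A configuration sketch combining a few of the parameters described above could look as follows. The parameter names are taken from this section; the root element, the URLs and the values are assumptions or examples:

```xml
<config_plugin>
    <!-- intranda Task Manager panel (example URL, cache time in milliseconds) -->
    <itm-show>true</itm-show>
    <itm-cache-time>30000</itm-cache-time>
    <itm-url>https://itm.example.org/itm/</itm-url>
    <!-- RSS news panel (example feed, cache time in milliseconds) -->
    <rss-show>true</rss-show>
    <rss-cache-time>3600000</rss-cache-time>
    <rss-url>https://www.intranda.com/en/feed/</rss-url>
    <rss-title>News</rss-title>
    <!-- recently completed tasks -->
    <tasks-show>true</tasks-show>
    <tasks-show-size>5</tasks-show-size>
</config_plugin>
```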
Identifier
intranda_export_heris
Repository
Licence
GPL 2.0 or newer
Last change
13.08.2024 14:38:49
Identifier
intranda_export_configurable
Repository
Licence
GPL 2.0 or newer
Last change
07.09.2024 08:52:10
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. The <config>
block with the project
*
is always used if no other block matches the project name.
target
This parameter has 3 mandatory attributes: In the key
parameter, a Goobi variable of the form {meta.metadata name}
should be used. The attribute value
can then be used to specify the desired value. If value=""
is set, the condition will be met if the metadata is empty or not set. The attribute projectName
should contain the name of the export project with whose settings the export is to take place. If an empty string is assigned to the attribute projectName=""
, the settings of the project of the operation will be used for export. If no target condition is set, a normal export will be performed. An export is triggered for each target condition that applies.
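A single target condition using the three mandatory attributes described above might be sketched like this. The metadata name, value and project name are hypothetical examples:

```xml
<!-- hypothetical example: export with the settings of project "ExportProject"
     whenever the metadata singleDigCollection has the value "varia" -->
<target key="{meta.singleDigCollection}" value="varia" projectName="ExportProject" />
```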
includeMarcXml
This parameter determines whether any existing MARC-XML data should be embedded in the exported metafile. The default value is false
.
media
Here you can define whether and how the media folder should be exported.
master
Here you can define whether and how the master folder should be exported.
ocr
Here you can define whether and how the ocr folder should be exported.
source
Here you can define whether and how the source folder should be exported.
import
Here you can define whether and how the import folder should be exported.
export
Here you can define whether and how the export folder should be exported.
itm
Here you can define whether and how the TaskManager folder should be exported.
validation
Here you can define whether and how the validation folder should be exported.
genericFolder
Here you can define a freely configurable folder that should be exported.
sourceFolderSuffix
This sub-element of the ocr
element is needed when using OCR folders with different suffixes. It specifies which OCR folders should be exported. If not specified, then all OCR folders will be exported.
destinationFolder
This sub-element of all folder elements except ocr
allows you to configure which files are to be exported to which folder using its two attributes name
and exportFileRegex
.
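A sketch combining the destinationFolder and sourceFolderSuffix sub-elements described above could look as follows. The element and attribute names come from the documentation; the folder name and regular expression are hypothetical examples:

```xml
<!-- hypothetical example: export only TIFF files from the master folder
     into a subfolder "tiff", and only the "alto" OCR folder -->
<master enabled="true">
    <destinationFolder name="tiff" exportFileRegex=".*\.tif" />
</master>
<ocr enabled="true">
    <sourceFolderSuffix>alto</sourceFolderSuffix>
</ocr>
```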
Identifier
intranda_export_nli_pdf_to_folder_structure
Repository
Licence
GPL 2.0 or newer
Last change
18.07.2024 01:42:57
exportFolder
Main directory for the export (e.g. /opt/digiverso/export
)
metdataPublicationDate
Metadata for the publication date; the syntax for the VariableReplacer can be used here (e.g. $(meta.DateOfOrigin)
)
metdataPublicationCode
Metadata for the publication code; the syntax for the VariableReplacer can be used here (e.g. $(meta.Type)
)
dateReadPattern
Pattern for reading the publication date (e.g. yyyy-MM-dd
)
dateWritePattern
Pattern for writing the current and publication date (e.g. ddMMyyyy
)
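The interplay of dateReadPattern and dateWritePattern can be illustrated with the example patterns above. The sketch below uses the Python strftime equivalents of the Java date patterns (yyyy-MM-dd becomes %Y-%m-%d, ddMMyyyy becomes %d%m%Y):

```python
from datetime import datetime

def rewrite_date(value: str) -> str:
    """Parse a date with the read pattern yyyy-MM-dd and re-emit it with
    the write pattern ddMMyyyy (Python strftime equivalents)."""
    return datetime.strptime(value, "%Y-%m-%d").strftime("%d%m%Y")

print(rewrite_date("2024-07-18"))  # 18072024
```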
Identifier
intranda_export_newspaper
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:52
Identifier
intranda_export_fedora
Repository
Licence
GPL 2.0 or newer
Last change
13.08.2024 14:26:51
fedoraUrl
REST Endpoint of the Fedora application
useVersioning
If true
, the versioning of Fedora is used. In this case, each time the export step is executed, a new version of the process is created in the repository. The default value is true
.
ingestMasterImages
If true
is set, the master images of the process are exported to the subcontainer /master
. The default value is true
.
ingestMediaImages
If true
, the derivatives of the process are exported to the /media
subcontainer. The default value is true
.
ingestMetsFile
If true
is set, a METS/MODS file is created and exported to the container. Default value is true
.
exportMetsFile
If true
is set, a METS/MODS file is created and written to the usual export folder (e.g. /hotfolder
). Default value is true
.
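The Fedora export parameters above might be combined as follows. This is only a sketch: the root element is assumed and the URL is a placeholder; the values shown correspond to the documented defaults.

```xml
<config_plugin>
    <!-- REST endpoint of the Fedora application (placeholder URL) -->
    <fedoraUrl>http://localhost:8080/fedora/rest</fedoraUrl>
    <useVersioning>true</useVersioning>
    <ingestMasterImages>true</ingestMasterImages>
    <ingestMediaImages>true</ingestMediaImages>
    <ingestMetsFile>true</ingestMetsFile>
    <exportMetsFile>true</exportMetsFile>
</config_plugin>
```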
Identifier
intranda_export_adm_bsme
Repository
Licence
GPL 2.0 or newer
Last change
14.10.2024 09:27:36
targetDirectoryNewspapers
Target directory for Newspapers
targetDirectoryMagazines
Target directory for Magazines
targetDirectoryPositives
Target directory for Positives
targetDirectoryNegatives
Destination directory for Negatives
targetDirectorySlides
Target directory for Slides
targetDirectoryGeneric
Target directory for Generic Prints
pdfCopyNewspapers
Target directory for generating PDF files for Newspapers
pdfCopyMagazines
Target directory for generating PDF files for Magazines
viewerUrl
URL for the Goobi viewer
rightsToUse
Indication of rights of use
rightsDetails
Details about the rights of use
source
Indication of the source of the digitised material
mediaType
Type of media
sourceOrganisation
Organisation responsible for the content
frequency
Frequency of publication
eventName
Naming the documented event
eventDate
Indication of the date when the event took place
eventTime
Indication of the time when the event took place
subject
General keywords
subjectArabic
Indication of keywords in Arabic
subjectEnglish
Specification of keywords in English
photographer
Information about the photographer of the picture
personsInImage
People shown in the picture
locations
Information on the location of the recording
description
Explanations and descriptions of the recording
editorInChief
Responsible Editor
format
Format information
envelopeNumber
Identifier of the envelope in which the documents are stored
backprint
Information about contents on the back
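A fragment combining some of the parameters above could look like this. All values and the root element are assumptions; only the element names come from the list above, and the remaining elements follow the same pattern.

```xml
<config_plugin>
    <!-- target directories per media type -->
    <targetDirectoryNewspapers>/opt/digiverso/export/newspapers/</targetDirectoryNewspapers>
    <targetDirectoryMagazines>/opt/digiverso/export/magazines/</targetDirectoryMagazines>
    <pdfCopyNewspapers>/opt/digiverso/export/newspapers_pdf/</pdfCopyNewspapers>
    <!-- viewer and rights information -->
    <viewerUrl>https://viewer.example.org</viewerUrl>
    <rightsToUse>Free to use</rightsToUse>
    <rightsDetails>Public domain</rightsDetails>
    <!-- further elements (source, mediaType, frequency, eventName, …) follow the same pattern -->
</config_plugin>
```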
Identifier
prov_export_fedora
Repository
Licence
GPL 2.0 or newer
Last change
13.08.2024 14:28:03
fedoraUrl
REST Endpoint of the Fedora application
useVersioning
If true
, the versioning of Fedora is used. In this case, each time the export step is executed, a new version of the process is created in the repository. The default value is true
.
userName, password
Optional Basic HTTP Authentication. Both values must be set for authentication to take place.
ingestMaster
If true
is set, the master images of the process are exported. The default value is true
.
ingestMedia
If true
is set, the derivatives of the process are exported. The default value is true
.
ingestJp2
If true
, the JPEG2000 images of the process are exported to the /media
subcontainer. The default value is true
.
ingestPdf
If true
, the PDFs of the process are exported to the /media
subcontainer. The default value is true
.
ingestMetsFile
If true
is set, a METS/MODS file is created and exported to the container. Default value is true
.
exportMetsFile
If true
is set, a METS/MODS file is created and written to the usual export folder (e.g. /hotfolder
). Default value is true
.
externalLinkContent
External URL using a 10-character barcode and the unit item code.
externalLinkContentPID
External URL using a 36-character PID.
fullPartialContent
availableMetadataQuery
Optional SPARQL query to add the publication date to the root container attribute of the work. The process property available
must be set for this.
imagesContainerMetadataQuery
Optional SPARQL query to add additional attributes and links to the /images
container.
filesContainerMetadataQuery
Optional SPARQL query to add additional attributes and links to the /files
container.
imageFileMetadataQuery
Optional SPARQL query to write additional attributes for all image files in the repository (under e.g. ../00000001.tif/fcr:metadata
).
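A configuration sketch for this plugin, assembled from the parameters above, might look as follows. The root element, URL and credentials are placeholders; the optional SPARQL query elements are left empty here.

```xml
<config_plugin>
    <fedoraUrl>http://localhost:8080/fedora/rest</fedoraUrl>
    <useVersioning>true</useVersioning>
    <!-- optional basic authentication; both values must be set together -->
    <userName>fedoraAdmin</userName>
    <password>secret</password>
    <ingestMaster>true</ingestMaster>
    <ingestMedia>true</ingestMedia>
    <ingestJp2>false</ingestJp2>
    <ingestPdf>false</ingestPdf>
    <ingestMetsFile>true</ingestMetsFile>
    <exportMetsFile>true</exportMetsFile>
    <!-- optional SPARQL queries, left empty in this sketch -->
    <availableMetadataQuery></availableMetadataQuery>
    <imagesContainerMetadataQuery></imagesContainerMetadataQuery>
</config_plugin>
```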
Identifier
intranda_export_standford
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 08:59:05
tempDestination
If this element is present and not empty, the metadata will be written to this folder as dor_export_{objectId}.xml
.
destination
Root directory for the exported data.
metadataFileName
Name of the metadata file, containing entries for each exported file.
dela
If this element is present and contains a number greater than 0, the plugin waits the configured number of seconds after a successful export before calling the REST API.
apiBaseUrl
Base URL for the REST API.
endpoint
Endpoint for the REST API.
accessToken
Contains the token required for authenticating the REST API.
queryParameter
Contains a query parameter in the attributes name
and value
, which is appended to the URL as &name=value
. This field is repeatable.
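The parameters above could be combined into a sketch like the following. The root element, paths, token and query parameter values are invented for illustration; only the element names come from this documentation.

```xml
<config_plugin>
    <tempDestination>/opt/digiverso/stanford/temp</tempDestination>
    <destination>/opt/digiverso/stanford/export</destination>
    <metadataFileName>content.xml</metadataFileName>
    <!-- seconds to wait after a successful export before the REST call -->
    <dela>10</dela>
    <apiBaseUrl>https://api.example.org</apiBaseUrl>
    <endpoint>/v1/objects</endpoint>
    <accessToken>TOKEN</accessToken>
    <!-- repeatable; appended to the URL as &name=value -->
    <queryParameter name="lane" value="default" />
</config_plugin>
```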
Identifier
intranda_export_vlm
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:37
identifier
This parameter determines which metadata is to be used as the folder name. It has two optional attributes @anchorSplitter
and @volumeFormat
which will be used for the case when the value of this identifier
itself contains both main folder's name as well as volume's name, separated by this configured @anchorSplitter
. The attribute @volumeFormat
is used in this case as the left padding for the volume's name.
volume
This parameter controls with which metadata the subdirectories for volumes are to be named.
path
This parameter sets the export path where the data is to be exported. An absolute path is expected.
condition
This element is optional and can be present multiple times to define additional conditions under which this configuration can be used. The format of condition
elements is described below. A configuration section can only be processed if all conditions apply. If multiple configuration sections exist and more than one applies, the configuration section with the highest number of conditions is selected (more specialised conditions have a higher priority). If this is still not unique, any of the applying configurations can be chosen; in this case, an error message is shown to the user.
subfolderPrefix
This parameter describes the prefix to be placed in front of each volume of a multi-volume work in the folder name (example T_34_L_
: here T_34
triggers the creation of a structure node of the type volume
and the L
indicates that text follows it).
sftp
This parameter determines whether to use SFTP for the export process or not.
useSshKey
This parameter determines whether to use an SSH key file for the connection to the remote host.
knownHosts
This parameter determines where the file known_hosts
is. If left empty, then the default setting {user.home}/.ssh/known_hosts
will be used. Otherwise, an absolute path is expected here.
username
This parameter determines the user name to log into the remote host.
hostname
This parameter determines the name of the remote host or its IP address.
port
This parameter determines the port number of the remote host that is to be used for the connection. The default value for this is 22.
password
This parameter determines the password to be used to log into the remote host as username
@hostname
.
keyPath
This parameter determines the path to the SSH key file to be used to log into the remote host as username
@hostname
.
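Combining the parameters above, a configuration sketch might look like this. It is hypothetical: the root element, the VariableReplacer expressions and all connection values are assumptions, and attribute placement on identifier follows the description above.

```xml
<config_plugin>
    <!-- metadata used as the folder name; both attributes are optional -->
    <identifier anchorSplitter="_" volumeFormat="0000">$(meta.CatalogIDDigital)</identifier>
    <volume>$(meta.CurrentNoSorting)</volume>
    <path>/opt/digiverso/vlm/export</path>
    <subfolderPrefix>T_34_L_</subfolderPrefix>
    <!-- SFTP connection settings -->
    <sftp>true</sftp>
    <useSshKey>false</useSshKey>
    <knownHosts></knownHosts>
    <username>vlmuser</username>
    <hostname>vlm.example.org</hostname>
    <port>22</port>
    <password>secret</password>
    <keyPath></keyPath>
</config_plugin>
```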
Identifier
HaabExport
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:45:20
This is a technical documentation for the plugin for exporting selected images. It enables the export of selected images to the configured location in the file system or via SCP.
Identifier
intranda_export_selected_images
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:45
This documentation describes the installation, configuration and use of the plugin for exporting selected images in Goobi workflow.
With the help of this plugin for Goobi workflow, Goobi processes can export the previously selected images and, if desired, the associated METS file to a configured location either in the file system or via SCP within a single work step.
This plugin is integrated into the workflow in such a way that it is executed automatically. For use within a workflow step, it should be configured as shown in the screenshot below.
The plugin must first be copied to the following directory:
In addition, there is a configuration file that must be located in the following place:
The plugin is configured via the configuration file plugin_intranda_export_selected_images.xml
. The configuration can be adjusted during operation. The following is an example configuration file:
Import plugin for importing legacy data for the Federal Monuments Office in Austria
Identifier
intranda_import_bka_bda
Repository
Licence
GPL 2.0 or newer
Last change
26.08.2024 11:04:47
This documentation describes the installation, configuration and use of the plugin for the mass import of existing legacy data of the Federal Monuments Office in Austria. The starting point for the import are existing Excel files as well as provided directories with image files. The special structure of the Excel file required a major revision of the standard Excel import plugin, so that this plugin differs considerably from it.
To be able to use the plugin, the following files must be installed:
To use the import, the mass import area must be opened in the production templates and the plugin intranda_import_bka_bda
must be selected in the File Upload Import tab. An Excel file can then be uploaded and imported.
The import is then carried out line by line. A new process is created for each object and the configured rules are applied. If a valid data record has been created and the generated process title has not yet been assigned, the process is created and saved. Depending on the configuration, subsequent rows of the Excel file that belong to the generated Goobi process are created as structure elements of the desired type. Associated images are also automatically transferred and assigned to the generated structure elements and processes.
The configuration is done via the file plugin_intranda_import_bka_bda.xml
. This file can be adapted during operation.
It is possible to create a global configuration for all production templates as well as individual settings for single production templates. To do this, the config
element can be repeated in the XML file. If mass import is selected in Goobi, the system searches for the configuration block that contains the name of the selected production template in the template
element. If such an entry does not exist, the default
configuration is used. This is indicated by *
.
The following parameter can be used to globally define the publication type to be used:
Every process that is created in Goobi with this plugin receives the application type defined here.
The special feature of this plugin is that structural elements are to be generated from the partially repeating Excel table rows, which are to be created as sub-elements for the previously created publication type. The type to be used for this is specified with this parameter:
With the optional element collection
it is possible to define a collection to be inserted in all records. In addition, collections can also be selected from the interface, or the collection can be imported as part of the Excel file or from the catalogue.
The following elements describe the structure of the Excel file to be imported.
In rowHeader
it is defined in which row the column headers relevant for the later mapping are entered. Usually this is the first row, but it can differ in the case of multi-line headers.
The elements rowDataStart
and rowDataEnd
describe the area that contains the data. Usually these are the lines that directly follow the rowHeader
, but in the case of special formatting there may also be empty rows that can be excluded this way.
The entry identifierHeaderName
contains the heading of the column that contains an identifier. This field is used internally to identify the rows, and its value is used in an OPAC query. In addition, this value is also used for generating the process title if no other rule for process titles has been specified.
The processTitleRule
element is used to generate the process title. The same options are available here that can be used in the Goobi configuration file goobi_projects.xml
.
With the help of the elements imageFolderHeaderName
, imageFolderPath
and moveImages
, images can be imported in addition to the metadata. In imageFolderHeaderName
the column name is entered for this purpose, in which the folder names containing the images can be found in the Excel file. Either an absolute path or a relative path can be entered. If a relative path is specified, the element imageFolderPath
must contain the root
path to the images.
The element moveImages
can be used to control whether the images are to be copied or moved.
To import images from an S3 storage, the <imageFolderHeaderName>
parameter described above must also be set. The other two elements for image import relate to file system operations and are therefore not required here. The following area is used instead:
The element runAsGoobiScript
controls whether an import should be processed asynchronously in the background via the GoobiScript queue or whether the import should be processed directly within the user session. Here you have to decide which setting makes sense. If an import is to include images or if the Excel file contains a large number of data records, it is probably more sensible to perform this import as a GoobiScript.
Attention: If the column identifierHeaderName
does not contain a unique identifier or has not been configured, the option runAsGoobiScript
cannot be used.
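The configuration options discussed above can be sketched as a single file. Note that this is an assumption-laden example: the element names for the publication and structure type (publicationType, structureType) are guesses, as the documentation names only the concepts; the other element names come from the text above and all values are invented.

```xml
<config_plugin>
    <config>
        <template>*</template>
        <!-- element names for the two types are assumed -->
        <publicationType>Monograph</publicationType>
        <structureType>Picture</structureType>
        <collection>Denkmalamt</collection>
        <!-- structure of the Excel file -->
        <rowHeader>1</rowHeader>
        <rowDataStart>2</rowDataStart>
        <rowDataEnd>20000</rowDataEnd>
        <identifierHeaderName>Identifier</identifierHeaderName>
        <processTitleRule>Identifier</processTitleRule>
        <!-- image import from the file system -->
        <imageFolderHeaderName>Images</imageFolderHeaderName>
        <imageFolderPath>/opt/digiverso/images/</imageFolderPath>
        <moveImages>true</moveImages>
        <runAsGoobiScript>true</runAsGoobiScript>
    </config>
</config_plugin>
```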
The fields metadata
, person
and group
can be used to import individual columns as metadata or as process properties. For this purpose, each field contains a number of attributes and sub-elements.
The element metadata
is used to create descriptive metadata.
headerName
Attribute
Column titles in the Excel file
ugh
Attribute
Name of the metadata
property
Attribute
Name of the property
docType
Attribute
anchor
or child
normdataHeaderName
Attribute
Column title of a column with corresponding identifiers
opacSearchField
Attribute
Definition of which search field is to be used for the catalogue query. This is necessary for the use of the JSON opac plugin.
The attribute headerName
contains the column title. The rule only applies if the Excel file contains a column with this title and the cell is not empty. At least one of the two attributes ugh
and name
must exist. The field ugh
can contain the name of a metadatum. If this is the case (and the metadatum is allowed for the configured publication type), a new metadatum is created. A property with this name is created using name
.
The attribute docType
becomes relevant if a multi-volume work or a journal has been imported from the catalogue. It can be used to control whether the field should belong to the complete record or to the volume.
If, in addition to the content, another column with standard data identifiers or URIs exists, this column can be added in the attribute normdataHeaderName
.
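Putting the attributes described above together, a single metadata rule could look like this. All attribute values are purely illustrative; only the attribute names come from the table above.

```xml
<!-- metadata rule: maps the Excel column "Title" to the metadatum TitleDocMain
     of the volume, with authority identifiers from the column "TitleGND" -->
<metadata headerName="Title" ugh="TitleDocMain" docType="child"
          normdataHeaderName="TitleGND" opacSearchField="12" />
```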
This is a technical documentation for the ZOP Export Plugin. It enables the export into the ZOP instance of the ZB Zürich.
Identifier
intranda_export_zop
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:26
This documentation describes the installation, configuration and use of the ZOP export plugin in Goobi.
Using this plugin for Goobi, Goobi operations can be exported to the configured location for ZOP within one step.
This plugin is integrated into the workflow in such a way that it is executed automatically. For use within a workflow step, it should be configured as shown in the screenshot below.
The plugin must first be copied to the following directory:
In addition, there is a configuration file that must be located in the following place:
The plugin is configured via the configuration file plugin_intranda_export_zop.xml
. The configuration can be adjusted during operation. The following is an example configuration file:
identifier
This parameter determines which metadatum is to be used as the folder name.
volume
This parameter controls with which metadata the subdirectories for volumes are to be named.
path
This parameter sets the export path where the data is to be exported. An absolute path is expected.
sftp
This parameter determines whether to use SFTP for the export process or not.
username
This parameter determines the user name to log into the remote host.
hostname
This parameter determines the name of the remote host or its IP address.
keyPath
This parameter determines the private key to be used to log into the remote host as username
@hostname
.
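Assembled from the parameters above, a configuration sketch might look as follows. The root element, metadata expressions and connection values are assumptions for illustration only.

```xml
<config_plugin>
    <!-- metadatum used as the folder name -->
    <identifier>$(meta.CatalogIDDigital)</identifier>
    <volume>$(meta.CurrentNoSorting)</volume>
    <path>/opt/digiverso/zop/export</path>
    <!-- SFTP connection -->
    <sftp>true</sftp>
    <username>zopuser</username>
    <hostname>zop.example.org</hostname>
    <keyPath>/opt/digiverso/keys/id_rsa</keyPath>
</config_plugin>
```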
Import plugin for card catalogues from folder structures of the KatZoom system
Identifier
intranda_import_katzoom
Repository
Licence
GPL 2.0 or newer
Last change
13.07.2024 14:38:25
This documentation describes the installation, configuration and use of the plugin for transferring card catalogues from the KatZoom system to Goobi workflow.
The following files must be installed in order to use the plugin:
The archive management plugin must also be installed and configured. Instructions for this can be found at the following address: https://docs.goobi.io/goobi-workflow-plugins-en/administration/intranda_administration_archive_management
This plugin is a so-called Import plugin
. If you open the mass import area, you can then select the plugin in the Import from folder
tab.
The plugin expects the following structure for its execution within the configured import folder:
The user can now select the main folder to be imported in the lower area of the plugin in order to import the corresponding card catalogue. Please note that a complete card catalogue is always imported at once; partial imports are not supported.
A *.ind
file and optionally a *.lli
file are expected within the selected folder. The ind file contains the number of the first data record for each letter. The lli file, on the other hand, contains the number of the first data record for a drawer. As drawers do not exist for all card catalogues, this file is optional. Furthermore, a folder structure with up to 3 subfolders is expected, in which the individual files are located. The files always begin with a letter followed by a consecutive number and the file extension. Different derivatives can exist for each object, which then have the same name apart from the file extension. An exception is a downsampled preview image, which starts with a different letter.
All file names are collected and sorted in ascending order by number. For each card catalogue, you can specify whether only the front (e.g. hhn HHStA Nominal
) or the front and back were scanned (e.g. ank to 45 Nominal
). In the first case, a data record is generated from each number, in the second case from each odd number. The following even number is then the reverse of the data record.
For each data record, the corresponding letter and, if available, the drawer are determined, as well as the positions within the catalogue, the letter and the drawer. This information is saved together with the original folder structure as metadata.
In addition, a fonds is created in archive management for each card catalogue. The main node corresponds to the catalogue, followed by the letters. Within the letters there are, optionally, individual drawers, followed by the individual data records.
The fonds are named after the individual catalogues.
The plugin is configured in the file plugin_intranda_import_katzoom.xml
as shown here:
Firstly, the production templates for which the import is to apply are defined within <template>
.
The archive stock is then configured within the archive management plugin and the import folder in which the folders for the individual card catalogues are expected is specified. The element <backsideScan>
contains the names of the card catalogues for which the backside has also been digitised. If a catalogue is missing from this list, the import assumes that only the front side exists.
The name of the collection can be specified in the <collection>
element. This information is written to each data record. The <doctype>
element contains the structure type to be generated and the other information contains the names of the individual metadata.
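The elements named above could be combined as in the following sketch. The element names template, backsideScan, collection and doctype come from the text; the import folder element name and all values are assumptions.

```xml
<config_plugin>
    <template>*</template>
    <!-- element name for the import folder is assumed -->
    <importFolder>/opt/digiverso/import/katzoom/</importFolder>
    <!-- catalogues for which the backside was also digitised -->
    <backsideScan>ank to 45 Nominal</backsideScan>
    <collection>KatZoom</collection>
    <doctype>Picture</doctype>
</config_plugin>
```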
This import plugin for Goobi workflow allows data to be imported without a catalogue query, as is required for ETH Zurich, especially for multi-volume works.
Identifier
intranda_import_eth_no_catalogue
Repository
Licence
GPL 2.0 or newer
Last change
16.02.2025 11:25:03
This import plugin allows data to be imported without a previous catalogue query. It inserts data into the user interface that has previously been copied from an Excel file and where the columns are separated from each other using TAB
.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be accessed from the overview of production templates by using the second blue button next to the selected production template.
Once the plugin has been entered, a user interface is available in which the data to be imported can be selected or uploaded.
After selecting the correct plugin, the data that is either available as TAB-separated CSV data or copied from an Excel file can be inserted into the ‘Data records’ field in the user interface. The data has the following structure:
If four columns are used, they have the following structure:
1
MMS-ID
If this contains an underscore, a multi-volume work is created, otherwise a monograph. This is a mandatory entry.
2
Signature
This is a mandatory entry.
3
Collection
Specification of the collection to be assigned. This is a mandatory entry.
4
Title
This is an optional specification.
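To illustrate the four-column variant, a purely hypothetical record (all values invented) could look like this, with the columns separated by TAB; the underscore in the first column would trigger the creation of a multi-volume work:

```
12345678_1	AB 1234:5	Sammlung Alt	Example title
```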
If two columns are used, they have the following structure:
1
Signature
This is the mandatory signature information.
2
Date
This is the mandatory entry of the digitisation date.
If eight columns are used, they have the following structure:
1
Identifier
This is the mandatory specification of the identifier.
2
Signature
This is the mandatory signature information.
3
Collection
This is the mandatory information for the collection.
4
Date
This is the mandatory entry of the digitisation date.
5
Units
This is the mandatory specification of the units.
6
Scans
This is the mandatory information for the scans.
7
dpi
This is the mandatory information for the resolution.
8
Remarks
This is the mandatory information with comments.
Immediately after inserting the data and clicking on ‘Save’, the creation of the processes starts without a catalogue being requested.
The plugin is configured in the file plugin_intranda_import_eth_no_catalogue.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
template
This can be used to define which production template the respective config
block should apply to.
runAsGoobiScript
This parameter can be used to specify whether the import should take place as GoobiScript in the background.
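The two parameters above could appear in the configuration file as follows; the root element and nesting are assumed.

```xml
<config_plugin>
    <config>
        <!-- production template this block applies to; * is the default -->
        <template>*</template>
        <!-- run the import in the background as GoobiScript -->
        <runAsGoobiScript>true</runAsGoobiScript>
    </config>
</config_plugin>
```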
This is a technical documentation for the import plugin of archive data from a hierarchically organised Excel file.
Identifier
intranda_import_crown
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:18
This documentation describes the installation, configuration and use of the import plugin for archive data from a hierarchically organised Excel file.
This plugin can be used to import data from an Excel file. The individual lines are converted to Goobi processes and images can be imported automatically. In addition, a hierarchical EAD tectonics is also created.
To be able to use the plugin, the following files must be installed:
In addition, the XML database BaseX
must be running in the background and set up correctly. Its installation is described in detail in the separate documentation.
To use the import, the mass import area must be opened in the production templates and the plugin intranda_import_crown
selected in the file upload import tab. An Excel file can then be uploaded and imported.
The Excel file to be imported must contain the following structure as an example:
Shelfmark
Comment
CR_1
Reichskrone
CR_1
comment
CR_1_A-H
Kronreif
CR_1_A-H
another comment
CR_1_A
Platte A, Stirnplatte
CR_1_A_GrPl
Grundplatte
CR_1_A_GrPl_1
Riss in Grundplatte (?)
CR_1_A_GrPl_2
Riss in Grundplatte und Grundplattenperldrahtumsäumung
CR_1_A_GrPl_3
Riss in Grundplatte
CR_1_A_GrPl_4
Riss in Grundplatte und Grundplattenperldrahtumsäumung
CR_1_A_GrPl_5
Deformierung von Grundplatte
CR_1_A_GrPl_6
Steg durch Öffnung in Grundplatte hinter Fa_4
CR_1_A_GrPl_7
4 Löcher in Grundplatte
CR_1_A_GrPl_8
Löcher in Grundplatte
CR_1_A_GrPl_9
4 Löcher in Grundplatte
CR_1_A_GrPl_10
angelöteter Span auf Grundplatte
CR_1_A_SchS
Scharnierstift
CR_1_A_SchR
Scharnierrohre
CR_1_A_SchR_1
Scharnierrohr
CR_1_A_SchR_2
Scharnierrohr
CR_1_A_SchR_3
Scharnierrohr
CR_1_A_GrUm
Grundplattenperldrahtumsäumung
CR_1_A_GrUm_1
Grundplattenperldrahtumsäumung
CR_1_A_GrUm_2
Grundplattenperldrahtumsäumung
CR_1_A_GrFi
Grundplattenfiliigrandekor
CR_1_A_RoeG
Röhrchen mit Granalien
CR_1_A_RoeG_1
Röhrchen mit Kugelpyramide
CR_1_A_RoeG_2
Röhrchen mit Kugelpyramide
This Excel file is read and analysed line by line during the import. It first checks how deeply the current row has been indented. If there is no indentation, the root element of the tectonics is present. Otherwise, these are sub-elements. The parent element of each row is the last element with a lower indentation.
Next, the content of the cells is read. Both the hierarchically indented cells and any existing fixed columns are taken into account.
Which content is imported for which EAD or metadata field is defined in the corresponding configuration file.
If the first piece of information within the Excel file is formatted in bold, a process is also created for this row and a search is carried out for associated images. These images are expected within a configured folder in subfolders named after the inventory number. These can either be organised flat in a folder list or follow the same hierarchical structure as the tectonics.
If a folder is found, all the files it contains are listed and checked according to the following rules:
ignore all files that are not a tif
, jpg
or wmv
.
ignore all files that contain the word compressed
.
if a file without the suffix _edited
is found, check if there is a file with the same name and the suffix _edited
. If so, ignore the current file and use the version with _edited
.
if a jpg
file was found, check if there is a tif
with the same name, if yes, ignore the jpg
file and use the tif instead.
The configuration is done in the file plugin_intranda_import_crown.xml
:
The <template>
field defines the production template for which the current configuration is to be used. As the <config>
element is repeatable, different configurations are possible for different production templates. For example, there may be a different configuration for the imperial crown than for the imperial orb.
The <runAsGoobiScript>
field controls whether the import is executed directly in the user session or in the background as GoobiScript. The use of GoobiScript is recommended for larger Excel files.
<startRow>
determines which row is the first data row of the Excel file. This allows further information such as headers, descriptions or help texts to be specified above it, which are then ignored by the import.
The <basex>
area defines where the EAD tectonics are saved. The sub-element <database>
contains the name of the BaseX database, which must already exist. The name of the EAD file is defined in <filename>
. If this name is already used, existing data is overwritten.
The root folder of the images is defined in the <images>
element. <metadata>
contains the metadata to be used. The structure type is defined using <doctype>
and the fields <title>
, <identifier>
and <description>
contain the names of the metadata for title, inventory number and description text.
The mapping of the metadata takes place within the <metadata>
block. The publication type to be used for the individual METS files is defined here in <doctype>
.
The node type to be used can then be defined if it is available as an Excel column. This is done in <nodetype>
. If this is not the case, the field can be left empty. In this case, file
is used for all nodes for which a process has been created; all other nodes are assigned the type folder
.
The generation of task titles is configured in <title>
. The same rules apply here as in the normal creation mask. In addition, the two keywords first
and second
are available to access the content of the two hierarchical fields.
The metadata mapping to EAD and METS/MODS is then configured. The first hierarchical field is defined in <firstField>
, <secondField>
optionally contains the content of the second field. If only one field is used, it can be deactivated using enabled="false"
. Additional, permanently defined columns can be configured using <additionalField>
. Here, the heading of the column must be specified in the column
attribute. The other configuration options are identical to the other two. The metadataField
field defines the metadata to be used within the METS/MODS file. The corresponding field in the EAD node is defined in eadField
and level
specifies the area in which the metadata is located.
In addition, a field must be marked as identifier="true"
. The content of this field must be unique for each line within the document and is used for the id
of the EAD nodes and the metadata NodeId
. It is used to link EAD nodes and Goobi processes.
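The configuration described above could be sketched as follows. Element names come from the text; the attribute placement on the field elements and all values are assumptions for illustration.

```xml
<config_plugin>
    <config>
        <template>*</template>
        <runAsGoobiScript>true</runAsGoobiScript>
        <!-- first data row of the Excel file -->
        <startRow>2</startRow>
        <!-- target of the EAD tectonics -->
        <basex>
            <database>CrownImport</database>
            <filename>crown.xml</filename>
        </basex>
        <!-- root folder of the images -->
        <images>/opt/digiverso/import/crown/</images>
        <metadata>
            <doctype>Artwork</doctype>
            <!-- may be empty; then "file"/"folder" are used automatically -->
            <nodetype></nodetype>
            <!-- process title generation; "first"/"second" access the hierarchical fields -->
            <title>first</title>
            <firstField metadataField="TitleDocMain" eadField="unittitle" level="1" />
            <secondField enabled="false" />
            <!-- exactly one field must carry identifier="true" -->
            <additionalField column="Shelfmark" metadataField="shelfmarksource"
                             eadField="unitid" level="1" identifier="true" />
        </metadata>
    </config>
</config_plugin>
```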
This import plugin for Goobi workflow allows data to be imported with a subsequent catalogue query from CMI, as required for the Zentralbibliothek Zürich.
Identifier
intranda_import_zbz_cmi
Repository
Licence
GPL 2.0 or newer
Last change
23.08.2024 11:12:49
This import plugin allows you to import data with a CMI catalogue query. Data that was previously copied from an Excel file is inserted into the user interface.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be accessed from the overview of production templates by using the second blue button next to the selected production template.
Once the plugin has been entered, a user interface is available in which the data to be imported can be selected or uploaded.
After selecting the correct plugin, the data, which is either available as TAB-separated CSV data or copied from an Excel file, can be inserted into the Records
field in the user interface. The data has the following structure:
1
CMI-ID
If this contains an underscore, a multi-volume work is created, otherwise a monograph. This is a mandatory entry.
Immediately after inserting the data and clicking on Save
, the creation of the processes starts and the configured catalogue is queried for each record.
The plugin is configured in the file plugin_intranda_import_zbz_cmi.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
template
This can be used to define which production template the respective config
block should apply to.
runAsGoobiScript
This parameter can be used to specify whether the import should take place as GoobiScript in the background.
catalogue
The catalogue to be used for the query is defined here. This must be defined within the configuration file goobi_opac.xml
.
searchField
This parameter defines in which field of the catalogue the search for the identifier should take place.
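The four parameters above could appear in the configuration file as follows; the root element, nesting and sample values are assumptions.

```xml
<config_plugin>
    <config>
        <template>*</template>
        <runAsGoobiScript>true</runAsGoobiScript>
        <!-- catalogue name as defined in goobi_opac.xml -->
        <catalogue>CMI</catalogue>
        <!-- catalogue field used to search for the identifier -->
        <searchField>12</searchField>
    </config>
</config_plugin>
```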
This import plugin for Goobi workflow allows data to be imported with a subsequent catalogue query from ALMA, as required for the Zentralbibliothek Zürich.
Identifier
intranda_import_zbz_alma
Repository
Licence
GPL 2.0 or newer
Last change
23.08.2024 11:12:32
This import plugin allows you to import data with an ALMA catalogue query. Data that was previously copied from an Excel file is inserted into the user interface.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be accessed from the overview of production templates by using the second blue button next to the selected production template.
Once the plugin has been opened, a user interface is available in which the data to be imported can be entered or uploaded.
After selecting the correct plugin, the data, which is either available as TAB-separated CSV data or copied from an Excel file, can be inserted into the Records
field in the user interface. The data has the following structure:
Column 1 (MMS-ID): If this contains an underscore, a multi-volume work is created, otherwise a monograph. This is a mandatory entry.
Immediately after inserting the data and clicking on Save, the creation of the processes starts without a catalogue being requested.
The plugin is configured in the file plugin_intranda_import_zbz_alma.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
template
This can be used to define which production template the respective config
block should apply to.
runAsGoobiScript
This parameter can be used to specify whether the import should take place as GoobiScript in the background.
catalogue
The catalogue to be used for the query is defined here. This must be defined within the configuration file goobi_opac.xml.
searchField
This parameter defines in which field of the catalogue the search for the identifier should take place.
This is a technical documentation for the plugin to import newspaper articles including merging with existing processes.
Identifier
intranda_import_endnote
Repository
Licence
GPL 2.0 or newer
Last change
15.08.2024 06:16:52
This documentation describes the installation, configuration and use of the plugin to import journal articles from an Excel file exported from Endnote.
The plugin must be installed in the following folder:
There is also a configuration file, which must be located at the following location:
To use the import, the mass import area must be opened in the process templates and the intranda_import_endnote
plugin must be selected in the File upload import
tab. An Excel file can then be uploaded and imported.
The import takes place line by line. For each line, the process title is generated from the configured fields and checked to see whether the volume already exists in Goobi. If this is not the case, a new process is created and the configured metadata for anchor
and volume
is imported.
Now it is checked whether an issue should be created. This is done on the basis of the Issue
column. If the field is empty, the article is appended directly to the year, otherwise the correct issue is searched for. If it does not exist yet, it will also be created. The sorting of the issues is based on the number in the Issue
column.
The article is then created and added to the issue or volume. If several articles exist, they are sorted by the start page from the Pages
column.
The configuration is done via the configuration file plugin_intranda_import_endnote.xml
and can be adapted during operation.
The configuration allows different settings for different process templates. For this purpose, only the name of the desired template must be entered in the template
field. The entry with the value *
is used for all templates for which no separate configuration exists.
The processTitleGeneration
element defines the rules with which the process title is to be generated. The same conventions apply as in goobi_projects.xml
. The two values ATS
(author-title-key) and TSL
(title-key) are automatically generated from the available metadata; to use further metadata, the column names from the Excel file can be referenced.
The elements anchorDocType
, volumeDocType
, issueDocType
and articleDocType
define the structural elements to be used for the elements journal, volume, issue and article. They must exist in the ruleset.
This is followed by the mapping of the metadata. The metadata
element is used for this purpose. Three attributes are allowed, in ugh
the metadata name from the ruleset is stored, in headerName
the heading of the column from the Excel file, and in docType it is defined whether the metadata should be added to the journal (anchor), the volume (volume) or the article (child).
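Combining the elements described above, a hypothetical configuration block could look like this; all structure type names, column headings and the title rule are illustrative examples:

```xml
<config>
    <template>*</template>
    <!-- process title generation, same conventions as goobi_projects.xml -->
    <processTitleGeneration>ATS+'_'+Year</processTitleGeneration>
    <!-- structure types; all of them must exist in the ruleset -->
    <anchorDocType>Periodical</anchorDocType>
    <volumeDocType>PeriodicalVolume</volumeDocType>
    <issueDocType>PeriodicalIssue</issueDocType>
    <articleDocType>Article</articleDocType>
    <!-- metadata mapping: Excel column heading -> ruleset metadata -->
    <metadata ugh="TitleDocMain" headerName="Journal" docType="anchor" />
    <metadata ugh="PublicationYear" headerName="Year" docType="volume" />
    <metadata ugh="TitleDocMain" headerName="Title" docType="child" />
</config>
```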
Import plugin for translating MAB2 and SGML data into METS-MODS
Identifier
intranda_import_mab
Repository
Licence
GPL 2.0 or newer
Last change
25.08.2024 10:43:08
The program examines the provided MAB2 file and translates the fields into metadata for a METS-MODS file. If available, an SGML file is also examined to specify the structural data.
To use the plugin, the following files must be installed:
The goobi-plugin-import-mab.jar file contains the program logic and is an executable file.
The goobi-plugin-import-mab.xml file is the configuration file.
The mappings mapMVW and mapChildren are generated. To do this, the jar file is started with the path to the configuration file as the first parameter, and the path(s) to the MAB files to be processed as additional parameters. This generates and saves the mapping files. This only needs to be done once unless new MAB files are added.
The program is then run as a JAR with the path to the config file as the only parameter. The paths to the MAB2 file, etc., are read from the configuration file, and the MAB2 file is processed.
For each dataset in the file, a MetsMods document is generated with appropriate metadata. The translation of individual fields occurs using the tags file.
If withSGML
is set to true
, the program searches the sgmlPath
folder for SGML files named after the CatalogID. The METS document then receives the structure from these files.
For each page in the document, the program searches for images in the imagePathFile
folder in subfolders named after the CatalogID. These are then copied to the image folder, and references are created in the StructMap.
NOTE: Currently, the images are NOT copied with the correct permissions. This means that before importing into Goobi, all generated folders and files must be assigned to the tomcat8
user using sudo chown -R tomcat8 *.
Afterward, the processes can be imported using the Goobi Folder Import.
The configuration of the plugin is done in the goobi-plugin-import-mab.xml file as shown below:
The following table contains a summary of the parameters and their descriptions:
project
Define for which project this configuration shall be used.
rulesetPath
Provides the path to the ruleset for the METS files.
imagePathFile
Specifies the path to the image files, which are located in a subfolder named after the CatalogID.
outputPath
Specifies where the finished MetsMods folders are copied, with subfolders named after the CatalogID.
mabFile
Specifies the MAB2 file to be read.
tags
Specifies the translation file that translates MAB2 codes into METS metadata.
withSGML
If set to true
, the program searches the sgmlPath
folder for SGML files named after the CatalogID. These files are used to give structure to the METS document.
defaultPublicationType
Specifies the type of the document in METS if it has no children or parents. A document with children is imported as a MultiVolumeWork, and the children are imported as Volumes.
singleDigCollection
Specifies the singleDigCollection
metadata for the METS files.
mapMVW
Specifies the path to the JSON file where the MultiVolumeWork IDs along with a list of all associated volume IDs are stored.
mapChildren
Specifies the path to the JSON file where each volume ID is stored together with the ID of its parent MultiVolumeWork.
importFirst
Specifies how many processes should be created. If set to 0
, all are created.
listIDs
Specifies the path to a text file containing a list of IDs. If the file exists and is not empty, ONLY processes with these IDs will be created.
allMono
Set this to true
for the special case where all documents to be imported should be stored as "Monograph" and not as Volume, even if they are children.
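Putting the parameters together, a goobi-plugin-import-mab.xml might be sketched as follows; all paths and values are illustrative:

```xml
<config_plugin>
    <project>Archive_Project</project>
    <rulesetPath>/opt/digiverso/goobi/rulesets/ruleset.xml</rulesetPath>
    <imagePathFile>/import/images/</imagePathFile>
    <outputPath>/import/output/</outputPath>
    <mabFile>/import/mab/export.mab</mabFile>
    <tags>/opt/digiverso/goobi/config/tags.txt</tags>
    <withSGML>false</withSGML>
    <sgmlPath>/import/sgml/</sgmlPath>
    <defaultPublicationType>Monograph</defaultPublicationType>
    <singleDigCollection>Digitised holdings</singleDigCollection>
    <mapMVW>/import/mapMVW.json</mapMVW>
    <mapChildren>/import/mapChildren.json</mapChildren>
    <!-- 0 = create all processes -->
    <importFirst>0</importFirst>
    <!-- if this file exists and is not empty, only the listed IDs are created -->
    <listIDs>/import/ids.txt</listIDs>
    <allMono>false</allMono>
</config_plugin>
```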
This is technical documentation for the plugin to import Sisis SunRise files to processes in Goobi workflow.
Identifier
intranda_import_sisis_sunrise_files
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:03:00
This documentation describes the installation, configuration and use of the plugin to import Sisis SunRise files.
The plugin must be installed in the following folder:
There is also a configuration file, which must be located at the following location:
Additionally there is a tags
file, whose location is specified in the configuration file:
To use the import, the mass import area must be opened in the process templates and the intranda_import_sisis_sunrise_file
plugin must be selected in the File upload import
tab. A Sisis SunRise file can then be uploaded and imported.
The import takes place in several steps. First the whole file is read, and the maps child-parent and parent-children are created and saved (as JSON files) in the Goobi temp
folder for the current user. These maps are used to create anchor files in the next step.
The Sisis SunRise file is then broken into individual records. For each record, the process title is generated from the Catalogue Identifier (and any prefix specified in the configuration file) and checked to see whether the process already exists in Goobi. If this is not the case, the process is created and the configured metadata for anchor
and volume
is saved temporarily in a folder in the output path specified in the configuration. Any images are copied into an images subfolder.
In the next step all these folders, containing the MetsMods files and the images, are imported into Goobi workflow as processes, and moved to the appropriate folders in Goobi.
The configuration is done via the configuration file plugin_intranda_import_sisis_sunrise_file.xml
and can be adapted during operation.
The configuration file allows different settings for different process templates. For this purpose, only the name of the desired template must be entered in the template
field. The entry with the value *
is used for all templates for which no separate configuration exists.
rulesetPath
This is the path to the ruleset for the MetsMods files.
imagePathFile
This parameter defines the path to the image files, which are located either in the folder itself or in subfolders with the name of the Catalogue identifier.
tags
This parameter defines the translation file that translates the codes into metadata.
withSGML
If this parameter is set to true
, then SGML files are used. Note that this is currently not in use, but intended for a later version.
sgmlPath
If SGML files are used, this is the folder in which they are found.
defaultPublicationType
With this parameter the type of the document is defined if it has no children or parents. A document with children is imported as a MultiVolumeWork; the children are imported as Volumes.
collection
This specifies the metadata singleDigCollection
for the MetsMods files, the name of the collection to which the works belong.
listIDs
Here you define the path to a text file containing a list of Catalogue Identifiers. If this field is not empty, then only datasets with these Catalogue Identifiers will be imported from the Sisis SunRise file.
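A configuration file following the parameters above might look like this sketch; the wrapper elements, paths and values are examples only:

```xml
<config_plugin>
    <config>
        <template>*</template>
        <rulesetPath>/opt/digiverso/goobi/rulesets/ruleset.xml</rulesetPath>
        <imagePathFile>/import/images/</imagePathFile>
        <tags>/opt/digiverso/goobi/config/tags.txt</tags>
        <!-- SGML support is reserved for a later version -->
        <withSGML>false</withSGML>
        <sgmlPath>/import/sgml/</sgmlPath>
        <defaultPublicationType>Monograph</defaultPublicationType>
        <collection>Digitised holdings</collection>
        <!-- leave empty to import all records -->
        <listIDs></listIDs>
    </config>
</config_plugin>
```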
A tags file may look something like this:
Each line contains a code, followed by the name of the metadata which it should be translated to. Every metadata type in the list must be defined in the ruleset used for the project into which the file is to be imported, and the CatalogIDDigital
must be defined, as it is used to create the process ID.
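For illustration, such a tags file might contain lines like the following; the pairings are invented, only the line format (a code followed by the metadata name) is as described above:

```text
0000 CatalogIDDigital
0100 Author
0331 TitleDocMain
0425 PublicationYear
```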
This import plugin for Goobi workflow allows data to be imported without a catalogue query, as is required for the Zentralbibliothek Zurich, especially for multi-volume works.
Identifier
intranda_import_eth_no_catalogue
Repository
Licence
GPL 2.0 or newer
Last change
23.08.2024 11:12:56
This import plugin allows data to be imported without a previous catalogue query. It inserts data into the user interface that has previously been copied from an Excel file and in which the columns are separated from each other using TAB.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be accessed from the overview of production templates by using the second blue button next to the selected production template.
Once the plugin has been opened, a user interface is available in which the data to be imported can be entered or uploaded.
After selecting the correct plugin, the data, which is either available as TAB-separated CSV data or copied from an Excel file, can be inserted into the Records
field in the user interface. The data has the following structure:
Column 1 (MMS-ID): If this contains an underscore, a multi-volume work is created, otherwise a monograph. This is a mandatory entry.
Column 2 (Shelfmark): This is a mandatory entry.
Column 3 (Title): This is an optional specification.
Immediately after inserting the data and clicking on Save, the creation of the processes starts without a catalogue being requested.
The plugin is configured in the file plugin_intranda_import_zbz_no_catalogue.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
template
This can be used to define which production template the respective config
block should apply to.
runAsGoobiScript
This parameter can be used to specify whether the import should take place as GoobiScript in the background.
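With only these two parameters, a config block in plugin_intranda_import_zbz_no_catalogue.xml might be sketched as follows; the wrapper elements are assumptions:

```xml
<config_plugin>
    <config>
        <!-- production template this block applies to; * matches all templates -->
        <template>*</template>
        <!-- run the import as GoobiScript in the background -->
        <runAsGoobiScript>true</runAsGoobiScript>
    </config>
</config_plugin>
```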
This is the technical documentation for the plugin for importing Excel files.
Identifier
intranda_import_excel
Repository
Licence
GPL 2.0 or newer
Last change
13.08.2024 14:33:43
This documentation describes the installation, configuration and use of the plugin for mass importing data sets from Excel files.
The plugin must be installed in the following folder:
There is also a configuration file, which must be located at the following place:
To use the import, the mass import area must be opened in the production templates and the plugin intranda_import_excel
selected in the File upload import tab. An Excel file can then be uploaded and imported.
The import then takes place line by line. A new process is created for each line and the configured rules are applied. If a valid data record has been created and the generated process title has not yet been assigned, the process is actually created and saved.
The configuration is done via the file plugin_intranda_import_excel.xml
. This file can be adapted during operation.
It is possible to create a global configuration for all production templates as well as individual settings for individual production templates. The element config
can be repeated in the XML file. If mass import has been selected in Goobi, the system always searches for the configuration block with the name of the selected production template in the template
element. If such an entry does not exist, the default
configuration is used. This is marked with *
.
With the optional element collection
it is possible to define a collection to be inserted into all records. In addition, collections can also be selected from the interface, or the collection can be imported as part of the Excel file or from the catalog.
The next four elements useOpac
, opacName
, opacHeader
and searchField
control whether a catalogue query should be performed during the import. If useOpac
contains the value true
, such a query is performed. The catalogue and the search field configured in the fields are used for this. The name of the catalogue must correspond to an entry in the Goobi configuration file goobi_opac.xml
. It can either be permanently defined in the opacName
parameter or used dynamically from a line of the relevant record (the opacHeader
). The structure type is automatically recognised by the OPAC data.
However, if no OPAC is used, the structure type of the operations to be created must be specified in the publicationType
field. The name used here must exist within the ruleset. If the OPAC is to be used, this field is not evaluated.
The following elements describe the structure of the Excel file to be imported.
rowHeader
defines the row in which the column headings that are later relevant for the mapping are entered. This is usually the first line, but it can differ for multi-line headers.
rowDataStart
and rowDataEnd
describe the area that contains the data. Usually, these are the lines that follow the rowHeader
directly, but specially formatted files may also contain blank lines that can be excluded in this way.
The identifierHeaderName
entry contains the heading of the column in which an identifier is contained. This field is used internally to identify the rows. The value is also used for an OPAC query and, if no other rule has been specified, to generate the process title.
The element processTitleRule is used to generate the process title. The same options are available here that can also be used in the Goobi configuration file goobi_projects.xml.
The processTitleRule can be provided with the additional parameter replacewith. All special characters in the title are then replaced with the character specified here (e.g. replacewith="_").
The elements imageFolderHeaderName
, imageFolderPath
and moveFiles
can be used to import images in addition to metadata. In imageFolderHeaderName
the column name is entered, in which the folder names containing the images can be found in the Excel file. Either an absolute path or a relative path can be specified there.
If a relative path is specified, the element imageFolderPath
must contain the root path to the images. The element moveFiles
can be used to control whether the images should be copied or moved.
The element runAsGoobiScript
controls whether an import should be processed asynchronously in the background via the GoobiScript queue or whether the import should be processed directly within the user session. Here you have to consider which setting makes sense. If an import including images is to take place or if the Excel file contains a lot of data records, it probably makes more sense to perform this import as GoobiScript.
Note:
If the identifierHeaderName
column does not contain a unique identifier or has not been configured, the runAsGoobiScript
option cannot be used.
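The general elements described so far might be combined into a config block like the following sketch; all values, column names and paths are examples:

```xml
<config>
    <template>*</template>
    <!-- optional collection added to every record -->
    <collection>Digitised holdings</collection>
    <!-- catalogue query during the import -->
    <useOpac>true</useOpac>
    <opacName>GBV</opacName>
    <opacHeader></opacHeader>
    <searchField>12</searchField>
    <!-- only evaluated when no OPAC is used; must exist in the ruleset -->
    <publicationType>Monograph</publicationType>
    <!-- layout of the Excel file -->
    <rowHeader>1</rowHeader>
    <rowDataStart>2</rowDataStart>
    <rowDataEnd>20000</rowDataEnd>
    <identifierHeaderName>Identifier</identifierHeaderName>
    <processTitleRule replacewith="_">Identifier</processTitleRule>
    <!-- optional image import -->
    <imageFolderHeaderName>images</imageFolderHeaderName>
    <imageFolderPath>/import/images/</imageFolderPath>
    <moveFiles>true</moveFiles>
    <runAsGoobiScript>true</runAsGoobiScript>
</config>
```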
The fields metadata
, person
and group
can be used to import individual columns as metadata or as process properties. Each field contains a number of attributes and sub-elements.
The metadata
element is used to generate descriptive metadata.
headerName
Attribute
Column header in the Excel file
ugh
Attribute
Name of the metadata
property
Attribute
Name of the property
docType
Attribute
anchor
or child
normdataHeaderName
Attribute
Column header of a column with associated identifiers
opacSearchField
Attribute
Definition of which search field should be used for the catalogue query. This is necessary for the use of the JSON-Opac-Plugin.
The headerName
attribute contains the column header. The rule only applies if the Excel file contains a column with this title and the cell is not empty. At least one of the two attributes ugh
and property must exist. The ugh field can contain the name of a metadata type. If this is the case (and the metadata is allowed for the configured publication type), a new metadata is created. property creates a process property with this name.
The docType
attribute becomes relevant if a multi-volume work or journal has been imported from the catalogue. It can be used to control whether the field should belong to the overall work (anchor) or to the volume.
If, in addition to the content, there is another column with standard data identifiers or URIs, this column can be added to the normdataHeaderName
attribute.
The person
element can be used to automatically create persons.
ugh
Attribute
Name of the person role
docType
Attribute
anchor
or child
normdataHeaderName
Attribute
Column header of a column with associated identifiers
firstnameFieldHeader
Element
Column header of field for first name
lastnameFieldHeader
Element
Column header for surnames
nameFieldHeader
Element
Column header for the complete name
splitName
Element
Defines whether the value in nameFieldHeader
should be split.
splitChar
Element
Character or character sequence at which splitting takes place. Default is the first space character.
firstNameIsFirstPart
Attribute
Defines the order in which the data was entered.
Persons differ from normal metadata in that they consist of first and last names. This specification can be in two different columns, then the elements firstnameFieldHeader
and lastnameFieldHeader
are used. If the names are only in one column, the field nameFieldHeader
is used. In this case, the system checks whether the specifications should only contain the surname or whether the content must be split. With splitChar
you can set the character/sequence at which the splitting should take place. The attribute firstNameIsFirstPart
contains the information whether the name is to be imported as First name Last name
or Last name First name.
Metadata groups can be created using the group
element.
ugh
Attribute
Name of the metadata group
docType
Attribute
anchor
or child
metadata
Element
Metadata within the group
person
Element
Person within the group
A metadata group consists of several metadata and persons. The configuration of the individual sub-elements is identical to that of the individual metadata and persons.
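Combining the attributes and sub-elements described above, hypothetical field definitions inside a config block could look like this; all column headings and metadata type names are examples:

```xml
<!-- metadata from the column "Title", with a normdata column -->
<metadata headerName="Title" ugh="TitleDocMain" docType="child"
          normdataHeaderName="Title-URI" />
<!-- person with first and last name in separate columns -->
<person ugh="Author" docType="child">
    <firstnameFieldHeader>Firstname</firstnameFieldHeader>
    <lastnameFieldHeader>Lastname</lastnameFieldHeader>
</person>
<!-- person with the complete name in one column, split at ", " -->
<!-- the firstNameIsFirstPart attribute additionally defines the name order -->
<person ugh="Editor" docType="child">
    <nameFieldHeader>Editor</nameFieldHeader>
    <splitName>true</splitName>
    <splitChar>, </splitChar>
</person>
<!-- metadata group with sub-metadata and a person -->
<group ugh="Publishing" docType="child">
    <metadata headerName="Place" ugh="PlaceOfPublication" />
    <person ugh="Printer">
        <nameFieldHeader>Printer</nameFieldHeader>
    </person>
</group>
```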
OPAC Plugin for data transfer of XML data records from an OPAC
This documentation describes the installation, configuration and use of the plugin. This plugin can be used to retrieve data from an external system and transfer it to Goobi. The catalog must have an API that can be used to deliver data records as XML.
The plugin consists of two files:
The file plugin_intranda_opac_xml-base.jar
contains the program logic and must be installed in the following directory, readable for the user tomcat:
The file plugin_intranda_opac_xml.xml
must also be readable by the user tomcat
and be installed in the following directory:
Once the plugin has been fully installed, it is available in the creation screen.
When an identifier is searched for in Goobi, a request is made to the configured URL or to the filesystem in the background:
If a valid record is found, it is searched for the field in which the document type is located. If no such query is defined, the document type is read from the configuration file instead. The required structure element is then created with the determined type.
All XPath expressions that have been configured are then evaluated. If data is found with an expression, the corresponding metadata is generated. For persons, the system checks whether the value contains a comma. In this case, first and last names are separated by commas, otherwise the value is interpreted as last name.
The configuration is done in the following files, located in the directory /opt/digiverso/goobi/config/.
In the file goobi_opac.xml
the interface to the desired catalogue system must be made known. This is done by an entry that looks like this:
The attribute title
contains the name under which the catalog can be selected in the user interface, address
the URL to the API endpoint and opacType
the plugin to be used. In this case the entry must be intranda_opac_xml
.
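An entry of this kind might be sketched as follows; only the attributes mentioned above are shown, and a real entry will typically need further attributes:

```xml
<catalogue title="XML import">
    <!-- address points to the API endpoint; URL is a placeholder -->
    <config address="https://example.org/api/record/{pv.id}"
            opacType="intranda_opac_xml" />
</catalogue>
```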
Only one search query can be configured. Therefore the other search options can be hidden. This happens within the block <searchFields>
. In the configuration described above, only one identifier can be searched for.
The value of the address attribute must contain the string {pv.id} so that the plugin inserts the search value at the right place, e.g. /import/hotfolder/{pv.id}.xml to select a file in a hotfolder based on the file name.
If needed, the plugin can also read files from the file system, for example from a hotfolder where files are stored. In this case the string in address must begin with file://, and the file must have a unique name that corresponds, for example, to the process title.
The contents of the XML record are mapped to Goobi metadata in the plugin_intranda_opac_xml.xml
file:
The first step is to define the XML namespaces that are required to read the XML document. This is done in the <namespaces>
area, which contains all the namespaces used in <namespace>
elements. Each namespace is defined by the two attributes prefix
and uri
. If the XML can be read without namespaces, the area can remain empty or missing. The configuration shown here as an example refers to the conversion of EAD files obtained via an OAI interface.
The type to be used can be specified in the <docstructs>
area. This is done by using <documenttype>
. If the document type is to be configurable, there must be an element with the attribute isanchor="false"
. If multi-volume works or journals are to be created, a second element isanchor="true"
is required, in which the anchor type is defined.
Alternatively, the document type can also be read from the XML record. In this case the element <documenttypequery>
is used, in which an XPath expression is defined that describes which field is to be used. In addition, there are a number of <docstruct>
elements that describe possible field contents. The attribute xmlName
contains the value from the XML document, rulesetName
contains the structure type to be created. If it is a multi-volume work, anchorName
must also be specified with the name of the higher-level structure type.
The mapping is then configured for persons and metadata in the <element>
area. Here is a list of <element>
with the attributes xpath
, level, xpathType
and name
. In xpath
an XPath expression is configured, which describes in which part of the XML document the content is expected, in name
the name of the metadata is defined, in which the content is to be written afterwards. The specification in level
can be used to control whether the metadata for multi-volume works is to be written to the data record of the anchor or the volume. xpathType
specifies the type of the result of the XPath query. This can be an Element, Attribute or String.
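Putting the areas described above together, a plugin_intranda_opac_xml.xml could be sketched roughly as follows; the namespace, structure types and metadata names are illustrative assumptions:

```xml
<config_plugin>
    <namespaces>
        <namespace prefix="mods" uri="http://www.loc.gov/mods/v3" />
    </namespaces>
    <docstructs>
        <!-- configurable document type plus an anchor type for multi-volume works -->
        <documenttype isanchor="false">Volume</documenttype>
        <documenttype isanchor="true">MultiVolumeWork</documenttype>
    </docstructs>
    <!-- metadata mapping via XPath expressions -->
    <element xpath="//mods:titleInfo/mods:title"
             xpathType="Element" level="volume" name="TitleDocMain" />
    <element xpath="//mods:name/mods:displayForm"
             xpathType="String" level="volume" name="Author" />
</config_plugin>
```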
Plugin for changing the publication type in the Goobi workflow
This plugin allows the modification of the publication type within the metadata editor of Goobi workflow.
To use the plugin, the following files must be installed:
After installation, the functionality of the plugin is available within the REST API of Goobi workflow.
Once the plugin is installed, a new function will appear in the metadata editor's menu, listing all installed and configured plugins. To use the plugin for changing the publication type, templates must first be created in the configured project. These templates need to be pre-populated with the desired metadata, and the process property for the label must be assigned. Once the templates are created, they will be available in a selection list.
When the user selects the plugin, a dialog window will open, listing the available templates for different publication types. The user can select the desired publication type and save the change.
When the publication type is switched, a backup of the existing metadata file is created first. Then, the metadata from the selected template is copied into the process. If the old record already contains pagination and page assignments, this data will also be transferred.
Finally, each configured metadata field is checked to see if it existed in the old record. If so, this metadata, including persons or groups, will be transferred to the new record. If a corresponding field with a default value already exists in the new record, it will be overwritten with the original data.
The plugin is configured in the file plugin_intranda_metadata_changeType.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
OPAC plugin for data transfer from Ariadne
This documentation describes the installation, configuration and use of the plugin. With the help of this plugin, data from the Mecklenburg-Vorpommern Ariadne archive portal can be retrieved and transferred to Goobi. The portal has an OAI interface through which the plugin obtains the data in a special EAD Goobi format.
The plugin consists of two files:
The files must be installed readable for the user tomcat
at the following paths:
A normal OPAC query can now be carried out in Goobi. To do this, the catalogue Ariadne
must be selected and the desired identifier entered. Please note that the identifier needs a prefix obj-
, e.g. obj-5602376
.
In the file goobi_opac.xml
the interface to the desired catalogue system must be made known. This is done by an entry that looks like this:
The mapping of the metadata takes place in the file plugin_intranda_opac_ariadne.xml:
In the field <ariadneUrl>
the URL to the OAI interface is configured.
The field <doctype>
contains the name of the structure element. The name used must be defined in the file goobi_opac.xml
. If the collection is to be generated from the EAD document, then it can be configured in the <collection>
element. To do this, the generate
attribute must be set to true
. Within prefix
a fixed prefix can be set, which will be prefixed to the collection name. Alternatively, the collection can be defined like a normal metadata.
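The fields described above might be sketched in plugin_intranda_opac_ariadne.xml like this; the URL, type name and prefix are placeholders:

```xml
<config_plugin>
    <!-- URL to the OAI interface of the Ariadne portal -->
    <ariadneUrl>https://example.org/ariadne/oai</ariadneUrl>
    <!-- structure element; the name must be defined in goobi_opac.xml -->
    <doctype>SingleRecord</doctype>
    <!-- generate the collection from the EAD document, with a fixed prefix -->
    <collection generate="true" prefix="Ariadne" />
</config_plugin>
```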
Metadata is defined within the <metadatalist>
. There the repeatable <metadata>
element is allowed. This can have the following attributes:
Metadata extension for the creation of structural elements per image
This documentation describes the installation, configuration and use of the plug-in for creating structural elements per image within the metadata editor.
The following files must be installed in order to use the plug-in:
This plug-in is a so-called metadata editor plugin
. It can be selected in the metadata editor in the menu item for plug-ins under the name Generate structure elements
.
When it is selected, a pop-up opens in which the desired type of structural elements to be generated can be selected. All structural elements that are permitted in the rule set for the publication type in question are automatically available here.
You can also define how many images should be assigned to the respective structure element before the next structure element is created and whether a title should be created for the structure element. If this option is activated, the file name without extension is entered as the title for each structure element, provided that the main title is permitted in the selected type.
The generation of the structure elements will overwrite all existing elements.
The plug-in is configured in the file plugin_intranda_metadata_createStructureElements.xml
as shown here:
The configuration can be restricted to projects or to specific publication types. The fields <project>
and <doctype>
can be used for this purpose. In <defaultType>
you can define which structural element should already be preselected in the list. If the element defined here does not exist in the list of the current publication type or is empty, no element is preselected. In <numberOfImagesPerElement>
a value for the number of images per structure element can also be preset. This must be a positive, whole number. Both values can be changed by the user in the interface.
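The restrictions and presets described above might look like this in plugin_intranda_metadata_createStructureElements.xml; the wrapper elements and values are examples:

```xml
<config_plugin>
    <config>
        <!-- restrict to a project and/or publication type; * for all -->
        <project>*</project>
        <doctype>*</doctype>
        <!-- structure element preselected in the list -->
        <defaultType>Chapter</defaultType>
        <!-- preset number of images per structure element; positive whole number -->
        <numberOfImagesPerElement>1</numberOfImagesPerElement>
    </config>
</config_plugin>
```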
OPAC Plugin for the data transfer of MARC data records
This documentation describes the installation, configuration and use of the plugin. You can use this plugin to retrieve data from an external system and transfer it to Goobi. The catalogue must have an API or URL through which records can be delivered in MARC format.
The plugin consists of one file:
This file must be installed readable for the user tomcat
at the following path:
When you search for an identifier in Goobi, a request is sent to the configured URL in the background.
After retrieving the actual record from the catalog, the metadata is mapped according to the rules configured in the rule set.
The plugin itself does not have its own configuration. Instead, all configuration is carried out by making adjustments within Goobi workflow or the associated rule sets.
In the file goobi_opac.xml
, the interface to the desired catalogue system must be made known. This is done by means of an entry that looks like this:
The title
attribute contains the name under which the catalog can be selected in the user interface, address
the URL to the API endpoint and database
the database to be used. The attribute opacType
must be set to the value GBV-MARC
.
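An entry of roughly this shape, with placeholder URL and database (only the attributes described above are shown; real entries in goobi_opac.xml may carry further attributes):

```xml
<catalogue title="My MARC catalogue">
    <config address="https://sru.example.org/sru"
            database="1.1"
            opacType="GBV-MARC" />
</catalogue>
```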
The content of a MARC data record is mapped within the Goobi workflow ruleset used in each case. For more information on how to configure this mapping, see the UGH documentation here:
OPAC plugin for data transfer from Kalliope
This documentation describes the installation, configuration and use of the plugin. With the help of this plugin, data from the Kalliope database can be retrieved and transferred to Goobi. To transfer the data, the data from the Kalliope database is retrieved in MODS format and translated into Goobi's data format using a dedicated mapping file.
The plugin consists of a Java jar file, a Goobi configuration file and a metadata mapping file:
These files must be installed readable for the user tomcat
at the following paths:
When searching for an identifier in Goobi, a request is made in the background to the URL configured in the file goobi_opac.xml
. After retrieving the record in MODS format, the mapping of the metadata is done according to the rules configured in the file mods_map_kalliope.xml
.
The configuration file of the plug-in has the following structure:
The option <charset>
specifies the character set in which the data is delivered by the Kalliope interface. <mapping>
designates the file path to the metadata mapping file. The fields <defaultDocType>
and <defaultPicaType>
specify the publication type to be used for the document.
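A sketch of the configuration file using the four options described above (all values are illustrative):

```xml
<config_plugin>
    <!-- character set delivered by the Kalliope interface -->
    <charset>UTF-8</charset>
    <!-- path to the metadata mapping file -->
    <mapping>/opt/digiverso/goobi/config/mods_map_kalliope.xml</mapping>
    <!-- publication type to be used for the document -->
    <defaultDocType>Monograph</defaultDocType>
    <defaultPicaType>Aa</defaultPicaType>
</config_plugin>
```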
In addition to the configuration file of the plugin, the Kalliope catalogue must be made known in the file goobi_opac.xml
. This is done by an entry that looks like this:
OPAC Plugin for the data transfer of JSON data records
This documentation describes the installation, configuration and use of the plugin. You can use this plugin to retrieve data from an external system and transfer it to Goobi. The catalog must have an API that allows records to be delivered as JSON.
The plugin consists of three files:
The file plugin_intranda_opac_json-base.jar
contains the program logic and must be installed readable for the user tomcat8
at the following path:
The file plugin_intranda_opac_json-gui.jar
contains the user interface and must be installed readable for the user tomcat8
at the following path:
The file plugin_intranda_opac_json.xml
must also be readable by the user tomcat8
and must be located under the following path:
When you search for an identifier in Goobi, a request is sent to the configured URL in the background.
According to the configuration described above, this corresponds approximately to the following URL:
If further fields are defined for the catalogue query, these are also displayed in the user interface:
If a valid record is found under this URL, it is searched for the fields defined within recordType
to determine the document type. If no such fields are defined or none of them is found, the type from the configured element defaultPublicationType
is used instead. The required structure element is then created with the determined type.
The configured expressions of the metadata
and person
are then evaluated in sequence. If data is found with an expression, the corresponding specified metadata is generated.
The configuration of the plugin is done in the following files located in the directory /opt/digiverso/goobi/config/
.
In the file goobi_opac.xml
the interface to the desired catalog system must be made known. This is done with an entry that looks like the following:
The attribute title
contains a unique name and opacType
the plugin to be used. In this case the entry must be intranda_opac_json
. The other fields are not required.
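A minimal entry could therefore look like this (the title is a placeholder):

```xml
<catalogue title="JSON catalogue">
    <config opacType="intranda_opac_json" />
</catalogue>
```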
The mapping of the contents of the JSON dataset to Goobi metadata is done within the file plugin_intranda_opac_json.xml
. The definition of the fields within the JSON record is done using JSONPath
, the XPath equivalent for JSON.
The available catalogues are defined in individual <config name="XYZ">
blocks. The attribute name
contains the name under which the catalogue can be selected.
Different field types can be used within the catalogue:
The element <field>
is identified by the attribute id
. Within the entries, the element <type>
can be used to define which fields are available in the input mask. There are the different types text
, select
and select+text
. The type text
creates a simple input field, select
a selection list and select+text
both. The element <label>
contains the name under which the field is displayed in the interface and the entries in <select>
define which contents are contained in the selection list. Optionally, a default value can be specified. This is done with the element <defaultText>
.
The element is repeatable, so that the input mask can also contain several input fields.
One of the fields must contain the URL to the catalogue. This is defined within the element <url>
. To access the user input, the variables {id.select}
and {id.text}
are available, whereby id
must be replaced by the desired field identifier.
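Put together, a field definition and the catalogue URL could look like this (the URL and the select values are placeholders):

```xml
<!-- repeatable input field shown in the mask; identified by its id attribute -->
<field id="id">
    <!-- text, select or select+text -->
    <type>select+text</type>
    <label>Identifier</label>
    <select>barcode</select>
    <select>identifier</select>
    <defaultText>identifier</defaultText>
</field>
<!-- {id.select} and {id.text} are replaced by the user input of the field "id" -->
<url>https://api.example.org/records?{id.select}={id.text}</url>
```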
By means of <authentication>
it can be defined how authentication against the catalogue takes place. The element can be omitted or left empty if the catalogue allows anonymous access.
Otherwise two types are available. If only <username>
and <password>
are specified, a basic authentication takes place.
The second possibility is a login. In this case, the API defined in the field <loginUrl>
is called to obtain a valid session ID. The session ID is read from the response field configured in <sessionid>
. It is then passed as a header parameter with the actual request; the parameter is set in <headerParameter>
.
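Both variants sketched with placeholder credentials (the field names are those described above):

```xml
<!-- variant 1: basic authentication -->
<authentication>
    <username>goobi</username>
    <password>secret</password>
</authentication>

<!-- variant 2: login with a session ID -->
<authentication>
    <username>goobi</username>
    <password>secret</password>
    <loginUrl>https://api.example.org/login</loginUrl>
    <!-- field of the login response that contains the session ID -->
    <sessionid>sessionId</sessionid>
    <!-- header parameter used to pass the session ID with the actual request -->
    <headerParameter>Authorization</headerParameter>
</authentication>
```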
The element <recordType>
contains the attributes field
, docType
and anchorType
. In field
a JSONPath expression is specified that is applied to the record. If the type is a multi-volume work or newspaper/magazine, the anchor
type to be used must be specified in the anchorType
field. If a field with such an expression exists, the document type defined in docType
is created. If not, the next configured recordType
will be checked.
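For example (the JSONPath expressions and type names are placeholders):

```xml
<!-- checked in the configured order; the first matching expression wins -->
<recordType field="$.record.volumeId" docType="Volume" anchorType="MultiVolumeWork" />
<recordType field="$.record.monographId" docType="Monograph" />
```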
There are a number of characters that must be escaped in this file. This includes characters such as < > & "
, which have a special meaning in XML and must therefore be written as &lt; &gt; &amp; &quot;
. Also affected is the comma, which must be escaped as \,
using a backslash.
If none of the definitions apply, a document can be created with the type from <defaultPublicationType>
. If this field is missing or empty, no record is created.
The two fields <metadata>
and <person>
are used to import individual content from the JSON record into the respective metadata. A number of attributes are available for this purpose:
The following URLs could be of further help for the installation or especially for the configuration of the plugin:
OPAC Plugin for the data transfer of EAD records using the example of the university archive of HU Berlin
This documentation describes the installation, configuration and use of an XML-based database to manage EAD files and integrate them into Goobi.
An EAD store was chosen as the technical solution for the transfer of EAD files. This is an XML database that can be repeatedly supplied with updated EAD files and subsequently serves as a data source similar to a queryable catalogue. It can be queried both by Goobi workflow and by the Goobi viewer in order to retrieve information about the tectonics as well as detailed information about an individual record.
The interposition of this EAD store ensures that the EAD files can be updated at any time and that the individual records are always displayed with their current context, even if this has changed since the first data transfer.
BaseX is an XML database in which the EAD files can be managed, analyzed and queried. Java 1.8 is required to install BaseX.
First you have to download the database:
To install BaseX on a Linux system, first download the zip file and install it on the server. For example, this could be done in this path:
The Jetty configuration must then be adapted so that the application can only be accessed on localhost. To do this, make sure in the configuration file /opt/digiverso/basex/webapp/WEB-INF/jetty.xml
that the host
is set to 127.0.0.1
:
Then the Systemd Unit File is installed to this path:
This has the following structure:
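A unit file along these lines (the unit name basex.service, the service user and the stop command are assumptions; the installation path matches the one above):

```ini
[Unit]
Description=BaseX HTTP server
After=network.target

[Service]
Type=simple
User=tomcat
ExecStart=/opt/digiverso/basex/bin/basexhttp
ExecStop=/opt/digiverso/basex/bin/basexhttp stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
```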
The daemon must then be reloaded, the unit file activated and the database restarted:
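Assuming the unit is called basex.service, this corresponds to:

```
systemctl daemon-reload
systemctl enable basex.service
systemctl restart basex.service
```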
To make the admin interface externally accessible, it can be configured in Apache
with the following section, for example:
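A sketch of such a section (the hostname is a placeholder; 8984 is the default BaseX HTTP port and an assumption here):

```apache
<VirtualHost *:80>
    ServerName basex.example.org
    ProxyPreserveHost On
    ProxyPass / http://127.0.0.1:8984/ retry=0
    ProxyPassReverse / http://127.0.0.1:8984/
</VirtualHost>
```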
Afterwards the Apache module proxy_http
must be activated and Apache must be restarted for the adjustments to take effect:
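On a Debian/Ubuntu system this is typically:

```
a2enmod proxy_http
systemctl restart apache2
```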
The XML database can be accessed under the following URL after installation:
The access data is admin
/admin
. After the first login, a new password should be assigned. The menu entry Users
must be opened for this. There you can click on the account name and set the new password.
Then a new database for the EAD files can be created. For this purpose the menu entry Databases
must be selected. Click Create
to open the dialog. Here you have to assign a title for the database. All other settings can keep their default values.
After the database has been created, EAD XML documents can now be added. The created database can be selected under Databases
. This opens a window in which the files belonging to the database can be managed. New files can be selected and uploaded via the Add
dialog. Here you can select an EAD file in the Input
field. With Add
the file is added and the overview page is loaded. Files can also be removed here. To do this, they must be marked with a checkbox and then deleted with Delete
. Updating an EAD file is only possible by deleting and adding it again.
Next, a new file eadRequest.xq
must be created in the directory /opt/digiverso/basex/webapp/
.
This xquery module is executed when requests are sent via GET
to /search/{$identifier}
. If another endpoint
is to be used, this can be adjusted in the declare
area. When a request is made, the page:getRecord
function is executed. In the first line of the function, the database name to be used must be defined. If the information has been split between several databases, several files must be used with this function. The variable rest:path
must be uniquely defined.
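The original file is not reproduced here, but a minimal RESTXQ sketch with the entry points described above could look like this (the module namespace, database name and the actual query logic are placeholders):

```xquery
module namespace page = 'http://www.basex.org/examples/ead';

declare namespace ead = "urn:isbn:1-931666-22-9";

(: executed for GET requests to /search/{$identifier};
   adjust %rest:path to change the endpoint :)
declare
  %rest:path("/search/{$identifier}")
  %rest:GET
function page:getRecord($identifier as xs:string) {
  (: first line: name of the database to be used :)
  let $db := db:open("ead")
  return $db//ead:c[@id = $identifier]
};
```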
Whether the configuration is correct can be tested with a query to the database:
Changes to the files or the databases can be made at any time during operation.
Once the database has been set up, it can be configured in Goobi. Since the metadata differs significantly from the bibliographic metadata of libraries, Goobi should use its own project and rule set. In addition, the OPAC plugin goobi-plugin-opac-ead
must be installed.
The file goobi_opac.xml
must be extended by two more entries. On the one hand, the document type to be used must be defined. This happens in the doctypes
area:
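An entry of roughly this shape (the attribute and label names follow the usual structure of goobi_opac.xml entries and are assumptions here):

```xml
<doctypes>
    <type isContainedWork="false" isMultiVolume="false" isPeriodical="false"
          rulesetType="SingleRecord" tifHeaderType="File" title="SingleRecord">
        <label language="de">Akte</label>
        <label language="en">File</label>
    </type>
</doctypes>
```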
In this example, the document type File (SingleRecord
in the ruleset) is used.
The data source must also be defined:
The title
attribute contains the name under which the data source can be selected in Goobi. The config
element contains the URL to the previously defined REST interface in address
and the name of the database in database
.
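For example (the port and the opacType value are assumptions; the /search endpoint matches the RESTXQ interface defined earlier):

```xml
<catalogue title="EAD database">
    <config address="http://localhost:8984/search/"
            database="ead"
            opacType="intranda_opac_ead" />
</catalogue>
```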
This file is located in the /opt/digiverso/goobi/config/
folder and contains the mapping of the EAD elements to Goobi metadata.
The available namespaces are defined in the upper area, followed by the structure type to be generated. The attribute isanchor="true/false"
can be used to define whether a multi-volume object or an independent object is to be created. The metadata is then mapped in the mapping
area. Since EAD does not distinguish between persons and other metadata, only normal metadata can be created here. Each metadata
element specifies in name
the metadata type as defined in the rule set. In level
, you specify where the metadata is to be created. Possible values are physical
, topstruct
and anchor
.
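A sketch of such a mapping file under these assumptions (apart from metadata, name, level and isanchor, the element names and the XPath expressions are placeholders):

```xml
<mapping isanchor="false">
    <!-- namespaces used by the XPath expressions -->
    <namespaces>
        <namespace prefix="ead" uri="urn:isbn:1-931666-22-9" />
    </namespaces>
    <!-- name: metadata type as defined in the ruleset; level: physical, topstruct or anchor -->
    <metadata name="TitleDocMain" level="topstruct" xpath="./ead:did/ead:unittitle" />
    <metadata name="CatalogIDDigital" level="topstruct" xpath="./ead:did/ead:unitid" />
</mapping>
```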
The file goobi_projects.xml
needs a new definition for the publication type and the new metadata.
Once this configuration has been completed, a new data source is available within Goobi within the creation mask for processes. This can now be queried using identifiers in the same way as other data sources and catalogs.
In the case of String
, manipulations such as concat, substring can also be used. The possible functions are described here:
JSONPath Online Evaluator:
JSONPath Description:
In order to set up the query interface for Goobi, the database must be made aware of what a query looks like, what is to be done with it, and what the result should look like. BaseX offers several options for this; RESTXQ was chosen because, unlike the REST interface, it does not require authentication.
The xpath
attribute contains the expression that is applied to the record to determine the value of the metadata. The xpathType
attribute describes the return value of the XPath expression. This can be either Element
, Attribute
, or String
.
Identifier
intranda_opac_xml
Repository
Licence
GPL 2.0 or newer
Last change
15.08.2024 06:23:25
Identifier
intranda_metadata_changeType
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 10:07:12
<section>
is repeatable and thus allows different configurations for various projects.
<project>
Specifies for which project(s) the current section applies. The field is repeatable to allow a common configuration for multiple projects.
<titleProperty>
Contains the name of the process property where the label to be used is stored.
<templateProject>
Name of the project from which the templates should be read. All processes from the project that have a label will be listed.
<metadata>
List of metadata to be transferred from the original file to the new file.
Identifier
Ariadne
Repository
Licence
GPL 2.0 or newer
Last change
14.08.2024 18:40:13
ruleset
Name of the field in the ruleset
xpath
XPath expression with which the value can be found in the EAD document.
element
Name of the field in which the XPath expression is applied. Allowed are c
, did
, parentC
, parentDid
and record
.
doctype
Defines where the value is entered, possible assignment is logical
or anchor
.
xpathType
Determines whether the value is in an attribute or element.
replace
Regular expression to manipulate the found value
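Taken together, a single field definition could look like this (the element name metadata is an assumption; the attributes are those listed above, with a placeholder XPath expression):

```xml
<metadata ruleset="TitleDocMain" element="did" doctype="logical"
          xpath="./ead:unittitle" xpathType="Element" />
```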
Identifier
intranda_metadata_createStructureElements
Repository
Licence
GPL 2.0 or newer
Last change
13.07.2024 14:38:34
Identifier
intranda_opac_marc
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:02:11
Identifier
goobi-plugin-opac-kalliope
Repository
Licence
GPL 2.0 or newer
Last change
14.08.2024 18:45:15
Identifier
intranda_opac_json
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:02:24
field
This configuration can be used to define additional query fields to be listed within the user interface.
authentication
Enter the access data for accessing the catalogue interface here.
recordType
This type is used to detect the document type of the JSON record.
defaultPublicationType
This type is used when no document type has been detected before.
metadata
This type is used to map JSON fields to metadata.
person
This type is used to map JSON fields to persons.
showResultList
This parameter can be used to specify that a selection list should be displayed after a catalogue query, allowing selection of the subrecord to be imported from a list.
urlForSecondCall
The URL specified here is used for the second query; the ID of the selected sub-record is appended to it.
metadata
Contains the name of the metadata or person
field
Path to the content within the JSON object
docType
May have the value anchor
or volume
. The default value is volume
. Fields marked with anchor
are only checked and imported for multi-volume works.
validationExpression
Regular expression, which checks if the found value matches the defined expression. If this is not the case, the value is ignored.
regularExpression
A regular expression to manipulate the value. This is applied after the validationExpression
check.
firstname
A regular expression that determines the first name of a person from the field contents.
lastname
A regular expression that determines the last name of a person from the field contents.
followLink
Defines whether the contained value is imported directly or contains a link to another data record.
templateName
Contains the name of the <config>
block to be used to analyse the new record.
basisUrl
Contains the base URL to be used if the link to the record is a relative path.
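Combining these attributes, hypothetical entries could look like this (all JSONPath and regular expressions are placeholders):

```xml
<metadata metadata="TitleDocMain" field="$.title" docType="volume" />
<person metadata="Author" field="$.author"
        validationExpression=".+"
        lastname="^(.*?),.*$" firstname="^.*?,\s*(.*)$" />
```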
Identifier
intranda_opac_ead
Repository
Licence
GPL 2.0 or newer
Last change
14.08.2024 18:41:40
Statistics plugin for visualising user throughput
Identifier
intranda_statistics_user_througput
Repository
Licence
GPL 2.0 or newer
Last change
23.08.2024 13:53:00
This documentation describes the installation and usage of the User Throughput Plugin.
To install the plugin, the following files need to be installed:
This plugin does not require any additional configuration.
To limit the evaluation period, you can use the Start Date
and End Date
fields to specify the start and end dates. A date in the format YYYY-MM-DD
can be entered. Both fields are optional. If the start date is not specified, the date when the first step was completed will be used. If the end date is not specified, the current time will be used.
In the Unit
field, you can specify the time intervals in which the evaluation should be summarized. You can choose from Years
, Months
, Weeks
, or Days
.
In the Display
field, you can specify which figures should be displayed. You can choose from Pages
or Processes
.
After clicking the Calculate Statistics
button, the user throughput will be displayed in detailed tables. Below each table, there is also a link to download the table as an Excel file.
OPAC Plugin for the data transfer of PICA data records
Identifier
intranda_opac_pica
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:02:03
This documentation describes the installation, configuration and use of the plugin. You can use this plugin to retrieve data from an external system and transfer it to Goobi. The catalogue must offer an API or URL through which records can be delivered in PICA format.
The plugin consists of one file:
This file must be installed readable for the user tomcat
at the following path
When you search for an identifier in Goobi, a request is sent to the configured URL in the background.
After retrieving the actual record from the PICA-catalog, the metadata is mapped according to the rules configured in the rule set.
The plugin itself does not have its own configuration. Instead, all configuration is carried out by making adjustments within Goobi workflow or the associated rule sets.
In the file goobi_opac.xml
, the interface to the desired catalogue system must be made known. This is done by means of an entry that looks like this:
The title
attribute contains the name under which the catalog can be selected in the user interface, address
the URL to the API endpoint and database
the database to be used.
The content of a PICA record is mapped within the Goobi workflow ruleset used in each case. For more information on how to configure this mapping, see the UGH documentation here:
Plugin for Automatic Update of the HERIS Vocabulary
Identifier
intranda_quartz_heris
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 09:40:43
This documentation describes the installation, configuration, and usage of the plugin for the automatic, regular update of the HERIS vocabulary.
A prerequisite is Goobi version 23.03 or newer. Additionally, the following two files must be installed:
After installation, the functionality of the plugin is available within the REST API of Goobi workflow.
The import occurs regularly at the times specified in the goobi_config.properties
file. Alternatively, the import can also be manually triggered at any time. To do this, an administrator can open the Scheduled Tasks
section and execute the HERIS Import once.
When the plugin is executed, it connects to the SFTP server and searches for a JSON file. If multiple files exist, the file with the latest timestamp is used. The file is downloaded, opened, and the JSON array is split into individual objects. For each object, the identifier is searched and compared with existing records. If the identifier already exists in a record, the record is updated; otherwise, a new record is created.
Subsequently, the configured fields are iterated over, and the individual values are imported.
Finally, the downloaded file is deleted by the Goobi system. No data is changed on the SFTP system.
The plugin is configured in the file plugin_intranda_quartz_heris.xml
as shown here:
The following table contains a summary of the parameters and their descriptions:
<username>
The username for SFTP access.
<password>
The password for SFTP access.
<hostname>
The hostname of the SFTP server.
known_hosts
File with the server's fingerprint, required for authentication.
sftpFolder
Path to the JSON file on the SFTP server (use .
if stored in the home directory).
<herisFolder>
Local folder where the JSON file is downloaded.
<vocabulary>
Name of the vocabulary to be updated.
fieldName
Name of the field in the vocabulary to be overwritten.
jsonPath
JSONPath expression for extracting the field from the JSON file.
identifier
Identifier field for matching with the vocabulary.
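Based on the parameters listed above, the configuration could be sketched as follows (the nesting and whether individual entries are elements or attributes are assumptions; all values are placeholders):

```xml
<config_plugin>
    <!-- SFTP access -->
    <username>heris</username>
    <password>secret</password>
    <hostname>sftp.example.org</hostname>
    <known_hosts>/opt/digiverso/goobi/config/known_hosts</known_hosts>
    <!-- use . if the JSON file is stored in the home directory -->
    <sftpFolder>.</sftpFolder>
    <!-- local folder where the JSON file is downloaded -->
    <herisFolder>/opt/digiverso/goobi/tmp/heris</herisFolder>
    <!-- vocabulary to be updated -->
    <vocabulary>HERIS</vocabulary>
    <!-- mapping of vocabulary fields to JSONPath expressions -->
    <field fieldName="HERIS-ID" jsonPath="$.herisid" identifier="true" />
    <field fieldName="Title" jsonPath="$.title" />
</config_plugin>
```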
To ensure the update is automatically executed, the execution time must be configured in the goobi_config.properties
file. This is done by specifying the cron syntax for when it should run. For a daily execution at midnight, the following can be used:
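For example (the property key is assumed to match the plugin identifier; the cron expression follows Quartz syntax):

```properties
# daily execution at midnight
intranda_quartz_heris=0 0 0 * * ?
```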
Time-controlled plugin for the repeated import of folder structures from an S3 storage for the import of housing subsidy files in Austria.
Identifier
intranda_quartz_bka_wohnbau
Repository
Licence
GPL 2.0 or newer
Last change
24.07.2024 19:50:31
This documentation describes the installation, configuration and use of the time-controlled plugin for importing housing subsidy files in Austria into Goobi workflow. The metadata is taken from a provided JSON file and the associated PDF files are extracted. The files are provided via an S3 storage in several deliveries, each of which is taken into account within the METS files.
To be able to use the plugin, the following files must be installed:
After installation, the plugin is available under the menu item Administration
- Periodic tasks
.
This plugin is a so-called Quartz plugin
for repeated automatic execution. With each call, the plugin assumes that configured Collections
within an S3 bucket contain directories. Each directory corresponds to a delivery
for an existing file
. The following example corresponds to the "second delivery" for the file ST-1431
.
Such a delivery contains several types of data:
- a json file with metadata
- one or more PDF files and full text files for each document of a delivery
When the plugin is executed, all existing 'deliveries' are run through and a check is made to see whether they have already been imported into Goobi. If they have not yet been imported, the file is created as a new process if it does not already exist. The process is created on the basis of the configured 'production template' and within the configured 'project'. All metadata is transferred from the json file to the METS file as defined in the configuration file.
A new structural element is created for the respective delivery within the existing or newly created file, to which the metadata of the delivery is then assigned. Within the 'delivery', a 'document' is then created for each PDF file provided, to which the document metadata is assigned. Each 'document' is converted from the supplied PDF file into image files and the full texts are extracted in ALTO format. The image files imported are given a prefix to indicate the delivery number and a suffix for the respective page number within the PDF file from which they originate.
The image file is saved within the master
directory of the process. The full text files are stored in the alto
directory in the ocr
folder. The json file
is saved within the import
directory.
The configuration covers two areas. On the one hand, the function of the plugin is defined in its configuration file. On the other hand, a central Goobi configuration is used for time control, which defines when this plugin should be started regularly in order to run automatically.
The plugin is configured in the file plugin_intranda_quartz_bka_wohnbau.xml
as shown here:
The plugin can be repeated automatically or executed manually. Manual execution is possible by calling it within the menu item Administration
- Periodic tasks
. Automatic execution, on the other hand, must take place within the configuration file goobi_config.properties
. To do this, the configuration must look like this if the plugin is to be executed once every hour:
As an example, some further configurations for a different execution time are listed here (cron syntax):
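For example (the property key is assumed to match the plugin identifier; cron expressions in Quartz syntax):

```properties
# once every hour
intranda_quartz_bka_wohnbau=0 0 */1 * * ?
# once a day at midnight
intranda_quartz_bka_wohnbau=0 0 0 * * ?
# every 15 minutes
intranda_quartz_bka_wohnbau=0 0/15 * * * ?
```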
This statistics plugin determines the activity of edits to translations within specific metadata fields.
Identifier
intranda_statistics_sudan_memory_activity_by_user
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 13:57:17
This statistics plugin enables statistical recording of the activity of translators and editors who edit specific metadata fields within the METS file. In particular, the translation work in the metadata fields Title (Arabic)
, Title (English)
, Description (English)
and Description (Arabic)
is taken into account.
To install the plugin, the following two files must be installed:
In addition, the following function must be created within the database:
A UTF8-encoded text can be passed to this function. The text is checked character by character. If the current character is an alphanumeric character (letters, numbers, full stop, comma, letters with diacritics, brackets) but the previous character is not (nothing, space, newline, tab), a new word starts at this point and the word counter is incremented. At the end, the word counter is returned.
To use this plugin, the user must have the correct role authorisation.
Therefore, please assign the role view_translation_activity
to the group.
Afterwards, the menu item Translation and Editing Activity
can be selected in the section Management
.
In order to limit the period of the evaluation, the two fields Period from
and Period to
can be used for the start date and end date. A date in the form YYYY-MM-DD
can be entered here. Both entries are optional. If the start date is not filled in, the date on which the first step was completed applies. If the end date is missing, the current date is used.
In the Unit
field, you define the time periods in which the evaluation is to be summarised. Here you can choose from the values days
, months
, quarters
or years
.
After specifying the required information, two different evaluations can be generated by this plugin:
The evaluation Overview
lists for each period within the start and end date which user has processed how many work steps Translation of Arabic content to English
or Translation of English content to Arabic
. It also shows how many words were entered in the fields Title (Arabic)
, Title (English)
, Description (English)
and Description (Arabic)
in these steps.
The Detailed View
lists each workflow step Translation of Arabic content to English
or Translation of English content to Arabic
that was completed within the specified start and end date. For each step, the user, the associated process, and the content and number of words from the four fields Title (Arabic)
, Title (English)
, Description (English)
and Description (Arabic)
are also displayed.
The two evaluations can also be downloaded as Excel files.
The following are some SQL statements that may be useful for working with the data in the context of this plugin.
SQL query via a general overview:
SQL query for a detailed report:
This is a technical documentation for the plugin that automatically creates a basic structure and pagination based on image file names.
Identifier
intranda_step_imagename_analyse
Repository
Licence
GPL 2.0 or newer
Last change
15.08.2024 06:28:53
This documentation describes the installation, configuration and use of the plugin. This plugin can be used to automatically prepare METS-files, create a basic structure and set a pagination.
The plugin consists of two files:
The file plugin_intranda_step-imagename-analyse-base.jar
contains the program logic and must be installed readable for the user tomcat
in the following directory:
The file plugin_intranda_step_imagename_analyse.xml
must also be readable by the tomcat
user and installed into the following directory:
Once the plugin has been installed and configured, it can be used by Goobi within a single step.
To do this, the intranda_step_imagename_analyse
plugin must be selected within the desired task. In addition, the Automatic task
checkbox must be set.
The way the plugin works within the correctly configured workflow looks like this:
If the plugin was called within the workflow, it opens the METS file and first checks whether a pagination already exists.
If this is the case, based on the configured value in skipWhenDataExists
the step is either completed without further changes or the existing pagination and structuring is removed from the METS file.
Then the files are read from the master folder and sorted alphanumerically.
For each file it is now checked whether it corresponds to the configured regular expression.
If this is the case, a new page is created. The physical order corresponds to the sorting in the file system, the logical page number is taken from the first group of the regular expression.
If the regular expression does not apply, the system then runs through the list of configured items and checks whether the file name ends with the expression followed by an optional number and an optional recto-verso specification (r or v). If this is the case, the configured structural element is created and the page is assigned to this element. By specifying a count, new structural elements of the same type can be defined. If two or more files have no count or the same count, they are assigned to the same structural element.
If neither the regular expression nor the list of structural elements apply to the file names, a page with the logical sorting "uncounted" is created and an entry is written in the process log.
The configuration file plugin_intranda_step_imagename_analyse.xml
is used to configure the plugin and must have the following structure:
The element skipWhenDataExists
defines how the plugin behaves if a pagination already exists. With the value true
the execution is skipped, with false
the existing structure and pagination is removed and a new one is created.
The element paginationRegex
contains a regular expression which tries to extract the logical page number from the filename. The value from the first group is copied to the METS file.
If the regular expression was not successful, the system then checks whether the file name describes a special structure such as Cover
, Titlepage
or Contents
. This structure is defined within structureList
. A (partial) string, which must occur in the file name, is defined within the item element in the filepart
attribute. In the docstruct
attribute, the structural element is defined that is to be created in this case.
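A configuration consistent with the examples below could look like this (the root element name, the regular expression and the exact docstruct names are assumptions):

```xml
<config_plugin>
    <config>
        <!-- true: skip the step if a pagination already exists;
             false: remove the existing structure and pagination and create a new one -->
        <skipWhenDataExists>false</skipWhenDataExists>
        <!-- the first group of the match is used as the logical page number -->
        <paginationRegex>.*_(\d+[rv]?)\.\w+</paginationRegex>
        <!-- filepart: string the file name must end with (optionally followed by a
             count and r/v); docstruct: structural element to create in that case -->
        <structureList>
            <item filepart="NS" docstruct="Postscript" />
            <item filepart="VS" docstruct="Endsheet" />
            <item filepart="SV" docstruct="FrontSection" />
        </structureList>
    </config>
</config_plugin>
```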
The following examples are based on the configuration defined above:
BxSem-A02_010v.tif
Page 010v
BxSem-A02_146r.tif
Page 146r
BxSem-A02_NSr.tif
first page of the Postscript structural element
BxSem-A02_NSv.tif
second page of the Postscript structural element
BxSem-B04_Farbkarte_Einband.jpg
not configured, therefore no assignment is possible; the page is taken over as "uncounted"
BxSem-A22_VS1r.jpg
first end sheet, first page
BxSem-A22_VS1v.jpg
first end sheet, second page
BxSem-A22_VS2.jpg
second end sheet
BxSem-B08_SV.jpg
single picture of the FrontSection
OPAC Plugin for data transfer from Soutron data sets
Identifier
intranda_opac_soutron
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:01:54
This documentation describes the installation, configuration and use of the plugin. This plugin can be used to retrieve data from a Soutron system and transfer it to Goobi. Access to the Soutron catalogue must be available for this purpose.
The plugin consists of two files:
The file plugin_intranda_opac_soutron-base.jar
contains the program logic and must be installed readable for the user tomcat
at the following path:
The file plugin_intranda_opac_soutron.xml
must also be readable by the user tomcat
and must be located under the following path:
When you search for an identifier in Goobi, a request is sent to the configured URL in the background:
If a valid record is found, the record is searched for the field /soutron/catalogs_view/ct/cat/rt/@name
. The value is compared with the configured <docstructs>
list. If there is a match, the required structure element is created.
The XPath expressions configured for <metadata>
and <person>
are then evaluated.
The expressions apply to the element /soutron/catalogs_view/ct/
. If data is found with an expression, the specified metadata is generated. For persons, the system checks whether the value contains a comma. If so, the value is split at the comma into surname and first name; otherwise the entire value is interpreted as a surname.
The configuration of the plugin is done in the following files located in the directory /opt/digiverso/goobi/config/
.
In the file goobi_opac.xml
the interface to the desired catalog system must be made known. This is done with an entry that looks like the following:
The attribute title
contains the name under which the catalog can be selected in the user interface, address
the URL to the GetCatalogue endpoint and opacType
the plugin to be used. In this case the entry must be plugin_intranda_opac_soutron
.
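Such an entry could be sketched as follows, assuming the usual goobi_opac.xml syntax (the catalogue title and URL are placeholders):

```xml
<catalogue title="Soutron">
    <config address="https://soutron.example.com/API/GetCatalogue"
            opacType="plugin_intranda_opac_soutron" />
</catalogue>
```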
Only a search for an identifier is possible, so the other search options can be hidden. This is done within the <searchFields>
block.
The contents of the Soutron record are mapped to the metadata in Goobi in the plugin_intranda_opac_soutron.xml
file:
In the area <docstructs>
the mapping of the individual document types is defined. For each value that can occur in Soutron, a <docstruct>
must exist. In the attribute soutron
the name that is contained in the soutron record is entered, in ruleset
the corresponding structure element from the ruleset is entered.
Then the mapping for persons and metadata is configured in <metadata>
and <person>
. Here there is a list of <element>
with the two attributes xpath
and metadata
. In xpath
an XPath expression is configured, which describes in which part of the XML document the content is expected. In metadata
the name of the metadata is defined, in which the content should be written afterwards.
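Putting the described elements together, a mapping sketch might look like this (the XPath expressions and the concrete document type and metadata names are hypothetical examples):

```xml
<config>
    <docstructs>
        <!-- soutron: value from the record; ruleset: structure element in the ruleset -->
        <docstruct soutron="Book" ruleset="Monograph" />
    </docstructs>
    <metadata>
        <!-- xpath is evaluated relative to /soutron/catalogs_view/ct/ -->
        <element xpath="cat/fields/field[@name='Title']/values/value" metadata="TitleDocMain" />
    </metadata>
    <person>
        <element xpath="cat/fields/field[@name='Author']/values/value" metadata="Author" />
    </person>
</config>
```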
Goobi Step plugin for updating existing METS files with content from a catalogue query
Identifier
intranda_step_catalogue_request
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:00:56
This documentation describes the installation, configuration and use of the Step plugin for the catalogue query to update records in Goobi workflow.
The plugin consists of the following file:
This file must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
The plugin is usually executed fully automatically within the workflow. It first determines whether there is a block in the configuration file that has been configured for the current workflow with regard to the project name and work step. If this is the case, the other parameters are evaluated and the catalogue query is executed with the field content of the METS file specified within the configuration file as identifier.
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a workflow step, it should be configured as shown in the screenshot below.
The plugin is configured via the configuration file plugin_intranda_step_catalogue_request.xml
and can be adapted during operation. The following is an example configuration file:
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
catalogue
Here it is possible to define which catalogue is to be used for querying new data. This is the name of a catalogue as defined within the global Goobi catalogue configuration within goobi_opac.xml
.
catalogueField
This determines which field is to be used for the catalogue identifier query.
catalogueIdentifier
Definition of the metadata from the METS file that is to be used for the catalogue query. Usually this is the identifier that was used for the initial catalogue query; it is typically stored within the metadata ${meta.CatalogIDDigital}
.
mergeRecords
If the value true
is set, the existing METS file is updated with the current data from the catalogue. Any additional metadata can be excluded from the update. The logical and physical structure tree within the METS file also remains unchanged. If the value is set to false
, then the existing METS file is completely replaced by a new METS file generated using the catalogue query.
ignoreMissingData
This parameter can be used to define whether the workflow step of the plugin should continue in the case of missing catalogue data or switch to an error status.
ignoreRequestIssues
Here you can define how the plugin should behave in the event of a query error, for example in the event of network problems. In this way, it can be defined that the workflow should be interrupted or nevertheless continued.
analyseSubElements
This parameter can be used to define whether metadata for structural elements already existing within the METS files should also be queried by the catalogue. For this, the specified metadata for the identifier to be queried must be available for each sub-element.
skipField
Several metadata fields can be defined here that are not to be changed by a catalogue query under any circumstances. This is particularly useful for those fields that do not come from a catalogue query and were therefore previously recorded in addition to the catalogue data. Typical examples of such fields include singleDigCollection
, accesscondition
and pathimagefiles
. Please note that this parameter only applies when the value for mergeRecords
is set to true
.
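An example configuration combining the parameters described above might look like this (the catalogue name and the field number are placeholders):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <catalogue>K10plus</catalogue>
        <catalogueField>12</catalogueField>
        <catalogueIdentifier>${meta.CatalogIDDigital}</catalogueIdentifier>
        <mergeRecords>true</mergeRecords>
        <ignoreMissingData>false</ignoreMissingData>
        <ignoreRequestIssues>false</ignoreRequestIssues>
        <analyseSubElements>false</analyseSubElements>
        <!-- these fields are never overwritten by a catalogue query -->
        <skipField>singleDigCollection</skipField>
        <skipField>accesscondition</skipField>
        <skipField>pathimagefiles</skipField>
    </config>
</config_plugin>
```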
Step plugin for assigning the process to an existing or new batch
Identifier
intranda_step_batch_assignment
Repository
Licence
GPL 2.0 or newer
Last change
25.02.2025 11:03:03
This documentation explains the plugin for assigning a single process to a batch. This assignment is made directly from an accepted task. A new batch can either be created there or selected from a list of existing waiting batches.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be selected within the workflow for the respective workflow steps. Please note that two workflow steps must be scheduled in the workflow:
One workflow step is used by the user as the step in which the batch assignment takes place.
Another workflow step serves as a kind of ‘waiting zone’ in which all processes already assigned to a batch remain and only switch to the subsequent step when the batch is complete.
A workflow could therefore look like the following example:
To use the plugin, it must be selected in the first of the two steps:
Once the user has accepted the task, they can first decide in the plugin whether a new batch should be created or whether a selection should be made from the existing batches that are still waiting. If the user wants to define a new batch, they can define the title for the batch here and, if required, also enter properties that were defined via the configuration:
Alternatively, the user can select a batch from the list of currently waiting batches. Once the desired batch has been selected, the task can be completed as normal.
After assignment to an existing or newly created batch, the workflow for the process moves on to the subsequent workflow step, which can be regarded as a kind of ‘waiting zone’. All processes in a batch initially remain there and do not yet pass through the subsequent steps.
If the user decides in the workflow step of a process that the batch with this process is now complete, they can click on the button ‘Close batch’. This opens a dialogue window in which a batch docket can be downloaded and the batch can be closed:
By closing the batch, both the step waiting for all processes of the batch to be complete and the currently open workflow step are finished. This means that all processes assigned to the batch simultaneously switch to the next workflow step so that they can be processed further together.
The plugin is configured in the file plugin_intranda_step_batch_assignment.xml
as shown here:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
In addition to these general parameters, the following parameters are available for further configuration:
batchWaitStep
Name of the workflow step in which the processes are to remain until the last process is added to the batch
property
Names of those properties of the process that are to be editable when the batch is created and that are to be adopted for all associated processes
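A configuration sketch based on these parameters (the step name and property names are placeholders):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- step in which all processes of the batch wait until the batch is closed -->
        <batchWaitStep>Waiting for batch completion</batchWaitStep>
        <!-- process properties that can be edited when a new batch is created -->
        <property>Shelfmark</property>
        <property>Location</property>
    </config>
</config_plugin>
```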
Goobi Step Plugin for copying image folders to external storage
Identifier
intranda_step_archiveimagefolder
Repository
Licence
GPL 2.0 or newer
Last change
26.08.2024 10:44:14
This step plugin for Goobi workflow copies image folders to an external storage connected via sftp (ssh) and creates a file that causes Goobi workflow to display a warning in the task. The warning is displayed in the task details and in the metadata editor. As an alternative to sftp, an s3 bucket can also be used.
The plugin consists of the following file:
This file must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
To put the plugin into operation, it must be activated for one or more desired automatic tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_archiveimagefolder
from the list of installed plugins.
The plugin copies the files from the configured folder to the SSH server or the S3 bucket, writes an XML file to the process folder recording where the files are now located, and then (if so configured) deletes the images from the Goobi storage and closes the step.
The images can be restored afterwards with the plugin goobi-plugin-administration-restorearchivedimagefolder
.
The configuration of the plugin is done via the configuration file plugin_intranda_step_archiveimagefolder.xml
and can be adjusted during operation. The following is an example configuration file:
For authentication on the ssh server, public keys are searched for in the usual places ($USER_HOME/.ssh
). Other authentication methods such as username/password are not yet provided.
When using s3, the s3 endpoint and the bucket name to be used must be specified. Optionally, a prefix can be set if archiving is not to take place directly in the root of the bucket. S3AccessKeyID and S3SecretAccessKey contain the credentials to access the bucket.
The setting <deleteAndCloseAfterCopy>false</deleteAndCloseAfterCopy>
is intended for the case that the files on the SSH server are first stored in a buffer and then written to a tape. In this case, the step can remain open and be closed by a callback of the tape storage system. The callback has not yet been implemented for any tape storage, but can be mapped using the standard Goobi workflow REST API.
This step plugin for Goobi workflow lets all processes of a batch reach the same progress, triggers a REST call, and then lets all processes continue their work in parallel.
Identifier
intranda_step_batch_progress
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:01:04
This step plugin for Goobi workflow allows multiple Goobi processes that belong to a batch but have different progress in their workflows to all wait for each other at a defined workflow step. Only when the last associated process reaches the defined workflow step does a call to a specified REST URL take place, so that all processes can then continue with their respective next workflow steps.
The initial purpose of this plugin is to call AEON REST URLs to log the progress of Goobi workflows. Other uses for this plugin are possible, but may require customisation of the plugin.
The plugin consists in total of the following files to be installed:
These files must be installed in the correct directories so that they are available under the following paths after installation:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_batch_progress
from the list of installed plugins.
Since this plugin should usually be executed automatically, the workflow step should be configured as automatic
.
Once the plugin has been fully installed and set up, it is usually run automatically within the workflow, so there is no manual interaction with the user. Instead, calling the plugin through the workflow in the background does the following:
First, it is checked whether the process belongs to a batch. If this is not the case, the workflow step is closed and the further workflow is started.
Otherwise, it is checked whether the current workflow step has already been reached in all processes of the batch (the status must not be Locked
). If this is not yet the case, the step remains in the status In Work
.
However, if all other processes in the batch have reached the workflow step or there is only the current process in the batch, a new status is set in AEON if this has been activated with the parameter updateQueue
. To do this, a search is made in the properties of the process for the transaction identifier
property with which the processes were initially created. This record is then called up in AEON to set the configured queueName
as the new status.
The current workflow step in all processes of the batch is then closed and the rest of the workflow continues.
The plugin is configured via the configuration file plugin_intranda_step_batch_progress.xml
and can be adapted during operation. The following is an example configuration file:
Various parameters can be configured within the configuration file. The file is divided into two areas. In the <global>
area, generally valid information such as the access data to AEON is managed. The following parameters are available here:
url
Enter the URL for the API of AEON here.
apiKey
A key can be specified here that is to be used instead of the login and password.
username
Define the user name to be used here.
password
Enter the password for accessing the API here.
In addition, there is the second area <config>
, in which different specifications can be made for individual workflow steps. Here it can be specified for individual projects and steps into which queue the data set is to be written.
The block <config>
can be repeated for different projects or workflow steps in order to be able to carry out different actions within different workflows and also to be able to set a different status in AEON for different steps. The other parameters within this configuration file have the following meanings:
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which workflow steps the <config>
block should apply to. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
<updateQueue>
Here you can define whether an update of a queue in AEON should take place or not. If the parameter is missing, false
is assumed.
<queueName>
Name of the AEON queue to be updated
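The parameters described above can be combined into an example configuration like this (URL, credentials and queue name are placeholders):

```xml
<config_plugin>
    <global>
        <url>https://aeon.example.com/api</url>
        <!-- either an apiKey or username/password -->
        <apiKey></apiKey>
        <username>aeon-user</username>
        <password>secret</password>
    </global>
    <config>
        <project>*</project>
        <step>*</step>
        <updateQueue>true</updateQueue>
        <queueName>Digitization complete</queueName>
    </config>
</config_plugin>
```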
This is a technical documentation for the integration of Libsafe long-term archiving.
Identifier
intranda_step_bagcreation,intranda_step_bagsubmission
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:50:00
This documentation describes the installation, configuration and use of the plugin for ingesting into the Libsafe long-term archiving system.
Using this plugin for Goobi, the metadata objects available in Goobi and additional descriptive documents can be combined into an E-ARK-BagIt and transferred to the Libsafe server.
The following files must be installed in order to use Libsafe Ingest:
Two new steps must be added to the workflow. Firstly, an automatic step that creates the E-ARK-based BagIt Submission Information Package (SIP)
, where intranda_step_bagcreation
must be selected as the plugin. A second automatic step is then required to handle the actual data delivery. The intranda_step_bagsubmission
plugin is required for this.
This plugin is integrated into the workflow so that it is executed automatically. Manual interaction with the plugin is not necessary. To use it within a workflow step, it should be configured as shown in the screenshot below.
Long-term archiving consists of several sub-steps:
Firstly, the file and folder structure required for the SIP is created.
A metadata
folder and a representations
folder are created within a root folder. Within the metadata
folder there are the subfolders descriptive
and other
to store MODS files and other formats such as the DFG viewer extensions. Within representations
there are subfolders for different formats, each containing a subfolder data
in which the files are located.
Each format has a METS file in which the files are listed in the data
folder. Each format is described in its own METS file, each of which contains a fileGroup
and a structMap
.
The metadata is described in MODS. There is a separate file for each structural element in the descriptive
folder. This file contains all metadata for which an export mapping has been defined in the rule set. As there may also be metadata that should not be exported in the regular export but must also be archived during long-term archiving, there is the option of defining additional export parameters in the configuration file that are only used for the Libsafe export.
Technical or administrative metadata is stored in the other
folder. A METS file is then created in the root folder, which refers to the other created METS, MODS and AMD.
The prepared data is now combined into a SIP BagIt
. For this purpose, all files are provided with a checksum and listed in the file manifest-sha256.txt
. bagit.txt
contains information about the bag version and the encoding and bag-info.txt
contains information about the creator of the bag, the size, payload and the creation date, as well as some information about the transmission of the ingest status back to Goobi.
Finally, the tagmanifest-sha256.txt
file is created. This contains the names and checksums of the three files mentioned above.
The previously prepared folders and files are combined into a tar file and saved in the process folder.
Data is delivered via SFTP upload. For this purpose, the previously created SIP file is uploaded to the remote server. Alternatively, the data can be exported to a local directory on the server or a network share. The file name corresponds to the bag name and the suffix _bag.tar
.
The status message back to Goobi is sent via Rest API calls. There are various endpoints for providing the individual pieces of information. The Rest API can handle XML or JSON. To do this, the Accept
header must be set for GET requests and Content-Type
must be set to application/xml
or application/json
for other requests. If this is not specified, the default JSON is used.
Authentication can be carried out in two ways. The necessary methods can be enabled in goobi_rest.xml
for an IP address, in which case the requests from this one server work, or an API token can be generated. Individual methods can then be authorised for this API token without IP address restrictions. Authentication then takes place via the HTTP header Authorization: Basic <TOKEN>
.
The processid
is required for all requests. This information is transmitted in two places. Firstly, it is part of the metadata and can be found in the MODS file in the field <mods:identifier type="GOOBI">
, alternatively it is transmitted in the field Process-ID
in bag-info.txt
.
To make the generated Libsafe ID known in Goobi, a POST
request must be sent to /process/<process id>/metadata
.
A message in the process journal can be created via a POST
request to /process/<process id>/journal
.
The variables USERNAME
and MESSAGE
can contain any text, TYPE
must be a value from the list error
, warn
, info
or debug
.
To complete the ingest process in Goobi, the ID of the step to be closed must be known. This ID can be determined via the Rest API by making a GET
request after all steps of the process.
The correct step and its ID can be found from the response using either steptitle
or status
. A PUT
request can then finalise the step:
The plugin is configured in the file plugin_intranda_step_bagcreation.xml
, which is explained here:
The <config>
area can be repeated as often as required and therefore allows different metadata configurations or ingest to different destinations for different projects.
The sub-elements <project>
and <step>
are used to check whether the current block should be used for the current step. The system first checks whether there is an entry that contains both the project name and the step name. If this is not the case, the system searches for an entry for any projects marked with *
and the step name used. If no entry is found for this either, a search is carried out for the project name and any steps. If there is still no match, the default block is used, in which both <project>
and <step>
contain *
.
The various <mets:fileGrp>
elements are defined here. Each filegroup
corresponds to a file format that is taken into account during delivery. Each defined element contains the attributes folder
, fileGrpName
, prefix
, suffix
and mimeType
, as well as useOriginalFileExtension
.
The folder to be used is specified in folder
. First, a check is made to see whether the folder exists and contains files. If this is the case, a folder is created in the SIP folder structure that corresponds to the fileGrpName
. This specification is also used as USE
within the METS file. The individual <mets:file>
specifications within the fileGroup
are composed of prefix
, the actual file name and suffix
:
Optionally, useOriginalFileExtension="true"
can be used to specify that the file extension
and MIMETYPE
are automatically determined individually for each file. This works both for files directly in the specified folder and for files in subfolders.
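The attributes described above could be combined as in the following sketch (the surrounding element name and the concrete values are assumptions):

```xml
<filegroups>
    <filegroup folder="media" fileGrpName="DEFAULT"
               prefix="" suffix=".tif" mimeType="image/tiff" />
    <!-- determine extension and MIMETYPE individually per file -->
    <filegroup folder="ocr" fileGrpName="OCR"
               useOriginalFileExtension="true" />
</filegroups>
```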
The individual parameters, which are also known from the project configuration, are then configured. As different entries may be required here than in the regular export to the Goobi viewer, different entries can be made here:
The individual parameters and their function are described in the Goobi workflow manual.
The <submissionParameter>
section contains information about the owner of the data, which is written to bag-info.txt
.
In addition to these fields, the bag-info.txt
file also contains a range of other information, such as creation date, size of the set and Oxum, which do not need to be configured as these are determined automatically.
The <additionalMetadata>
section is used to extend the rule set. A mapping can be added here for metadata, corporate bodies, persons or groups for which no export mapping is provided in the rule set because this information should not be published in the regular export to the Goobi viewer.
The syntax is identical to the MODS mapping in the rule set.
The last step is to configure the access data for the SFTP transfer.
Authentication can be carried out using either a username and password or a private/public key. To authenticate using a password, the <keyfile>
field remains empty. Otherwise, the key configured there is used.
<hostname>
and <port>
describe the access to the remote server. A target folder on the server can be specified using <remoteFolder>
if the upload is not to take place in the root directory. <knownHostsFile>
contains the path to a known_hosts file, which must contain a fingerprint of the host.
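The SFTP section could therefore be sketched as follows (the username element and the concrete values are assumptions; leave <keyfile> empty to authenticate with a password):

```xml
<sftp>
    <hostname>libsafe.example.com</hostname>
    <port>22</port>
    <username>ingest</username>
    <password></password>
    <keyfile>/opt/digiverso/goobi/config/ssh/id_rsa</keyfile>
    <remoteFolder>incoming/</remoteFolder>
    <knownHostsFile>/opt/digiverso/goobi/config/ssh/known_hosts</knownHostsFile>
</sftp>
```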
This step plugin allows an automatic selective deletion of content from a process.
The plugin is used to automatically delete data from a process. For this purpose, a configuration file can be used to define very granularly which data exactly should be deleted.
To install the plugin, the following file must be installed:
To configure how the plugin should behave, various values can be adjusted in the configuration file. The configuration file is usually located here:
To use the plugin, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_deleteContent
from the list of installed plugins.
Since this plugin is usually to be executed automatically, the step in the workflow should be configured as automatic.
Once the plugin is fully installed and set up, it is usually executed automatically within the workflow, so there is no manual interaction with the user. Instead, the workflow calls the plugin in the background and starts the deletion of the configured data. The configured folders and files are deleted if they exist; data that does not exist is skipped. If the plugin has been configured to deactivate the process, all workflow steps are checked to see whether they were already closed regularly within the workflow; steps that were not are deactivated.
When the deletion is complete, a message is added to the process log to inform you that this plugin has been called and the data was deleted.
The configuration of the plugin is structured as follows:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
In addition to these general parameters, the following parameters are available for further configuration:
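As a rough sketch, such a configuration block could look like this (the names of the deletion switches are assumptions and must be checked against the shipped configuration file):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- which content to delete -->
        <deleteAllContentFromImageDirectory>true</deleteAllContentFromImageDirectory>
        <deleteAllContentFromOcrDirectory>true</deleteAllContentFromOcrDirectory>
        <deleteMetadataFiles>false</deleteMetadataFiles>
        <!-- deactivate all steps that were not closed regularly -->
        <deactivateProcess>true</deactivateProcess>
    </config>
</config_plugin>
```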
This plugin was originally implemented to communicate with ALMA and process returned responses. However, thanks to its general design, it can also be used to connect to other systems via REST.
This plugin is used to send requests to REST APIs, e.g. ALMA, and process the returned responses. Multiple commands can be configured to compose a complex task. The plugin executes these commands one after the other in a defined order.
To use the plugin, the file plugin_intranda_step_alma_api-base.jar
must be saved in the following location:
The configuration file plugin_intranda_step_alma_api.xml
is expected under the following path:
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a workflow step, it should be configured as shown in the screenshot below.
The configuration of the plugin is structured as follows as an example:
The plugin is configured as described here:
The configuration within the command blocks is carried out as described here:
This is the technical documentation for the Goobi plugin for automatically creating PDF files out of images.
This documentation describes how to install and configure this plugin to create PDF files out of images.
To use the plugin, it must be copied to the following location:
The configuration of the plugin is expected under the following path:
Once the plugin has been installed correctly, it can be configured in the user interface for use within the workflow for the desired work step. To do this, the value intranda_step_createfullpdf
must be selected as the plugin and the step should be set to run automatically.
The plugin is configured in the file plugin_intranda_step_createfullpdf.xml
as shown here:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
In addition to these general parameters, the following parameters are available for further configuration:
Step Plugin for managing the delay of workflow status changes
This documentation explains the installation, configuration, and use of the plugin. This plugin checks if a workflow has reached a specific status. Only if this is the case, a defined work step will be closed, and the next step will be opened.
To use the plugin, the following files must be installed:
To use the plugin, it must be selected in a workflow step with the following settings:
When the process reaches the configured step, a check is performed to see if the conditions are met. If this is the case, the step is closed immediately, and the next task can be processed. If not, the task remains in progress. The condition is checked again every night until it is fulfilled.
The condition is only considered met if all configured rules have been fulfilled.
The plugin is configured in the file plugin_intranda_step_delay_workflowstatus.xml
as shown here:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
In addition to these general parameters, the following parameters are available for further configuration:
The <condition>
field contains the rules to be checked. Both properties and steps can be checked. The fields within are repeatable to define multiple rules. In this case, all rules must be met for the condition to be considered fulfilled.
In the <property>
field, the properties to be checked are defined. The name
attribute specifies the property name, and value
specifies the value to be checked. The type of check can be defined in type
. There are four types available:
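A sketch of such a condition block, assuming the attributes described above (the concrete type value and property name are illustrative assumptions):

```xml
<condition>
    <!-- the process property "Ingest status" must have exactly this value -->
    <property name="Ingest status" value="approved" type="is" />
</condition>
```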
Delay Plugin for pausing the workflow
This documentation explains the plugin that allows a workflow to be paused for a certain period of time.
To be able to use the plugin, the following files must be installed:
After installing the plugin, it can be selected within the workflow for the respective steps and will be executed automatically.
To use the plugin, it must be selected in a workflow step:
This plugin pauses the workflow for as long as specified in the configuration file. Once the configured time has been reached, the relevant workflow step is automatically closed and the workflow continues.
The plugin is configured in the file plugin_intranda_delay_configurable.xml
as shown here:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
In addition to these general parameters, the following parameters are available for further configuration:
This is the technical documentation for the Goobi plug-in for displaying any metadata in a workflow task.
This documentation describes the installation, configuration and use of a plug-in to display metadata in a workflow step. The plugin can display any metadata in one step. The configuration of prefixes and suffixes is also possible.
To use the plugin, the two artifacts must be copied to the following locations:
The configuration of the plugin is expected at the following path:
The plugin must then be configured in the workflow within Goobi. To do this, select intranda_step_displayMetadata
as the step plug-in in the step configuration.
If the step is then opened after successful configuration, all metadata - if available in the process - are displayed:
Several metadata can be configured for display, additionally a prefix and a suffix can be displayed. The key
attribute is used for the translation of the labels of the metadata:
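A minimal sketch of such a configuration, assuming a repeatable metadata element with the attributes described above (the metadata names and the prefix value are illustrative):

```xml
<config>
  <project>*</project>
  <step>*</step>
  <!-- "key" is used to translate the label of the metadatum;
       prefix and suffix are optional display additions -->
  <metadata key="TitleDocMain" prefix="" suffix="" />
  <metadata key="shelfmarksource" prefix="Shelfmark: " suffix="" />
</config>
```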
This is the technical documentation for the Goobi plugin for automatically modifying workflows based on task properties.
This documentation describes the installation, configuration and use of a plugin for automatically changing workflows at runtime. The plugin can open, close or deactivate (depending on configuration) steps. User groups can be assigned and production templates can also be completely exchanged. The decision as to what exactly should happen in each case is made on the basis of process properties.
To use the plugin, it must be copied to the following location:
The configuration of the plugin is expected under the following path:
After the plugin has been installed and configured, it can be configured in the user interface in a workflow step. Make sure that the name of the step is the same as in the configuration file. In addition, a check mark should be set for Automatic task
.
The following is a sample configuration with comments:
Each config
block is responsible for a certain project and a certain step, whereby wildcards *
and multiple answers of processes or steps are also possible. If a step in the workflow is executed with this plugin, the system searches for a config
block that matches the currently opened step. For example, if in the project "PDF Digitization" the step with the title "Change workflow after PDF extraction" is configured and executed with this plugin, the plugin looks for a config
block that looks like this:
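Sketched with the project and step names from the example above (the content of the change element is only indicated):

```xml
<config>
  <project>PDF Digitization</project>
  <step>Change workflow after PDF extraction</step>
  <change>
    <!-- property checks and actions as described below -->
  </change>
</config>
```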
In each <change>
element it is then configured which process property is checked (<propertyName>
) and which value is expected (<propertyValue>
). Please note that the specification for defining which property is to be used for checking a value must be specified with the syntax for the so-called variable replacer. Accordingly, when defining the field to be checked, the specification must be as in the following examples:
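As a hedged illustration of the variable-replacer syntax (the property name Urgency is hypothetical), a process property is referenced like this:

```xml
<propertyName>{process.Urgency}</propertyName>
```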
Further explanations about the use of variables can be found here:
After defining how the properties are to be evaluated, the action to be performed is determined. The following possibilities exist here:
Depending on existing properties, the status of defined steps within the workflow can be changed automatically. Workflow steps can be opened type="open"
, deactivated type="deactivate"
, closed type="close"
or locked type="lock"
.
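A minimal sketch of a status change, using the type and title parameters described below (the property and step names, and the element structure around them, are illustrative assumptions):

```xml
<change>
  <propertyName>{process.DocType}</propertyName>
  <propertyValue>Manuscript</propertyValue>
  <steps type="open">
    <title>Manuscript description</title>
  </steps>
</change>
```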
Depending on existing properties, the priority of defined steps within the workflow can be changed automatically. Possible choices for priority are Standard value="0"
, Priority value="1"
, High priority value="2"
, Highest priority value="3"
or Correction value="10"
. If any title is configured with *, its priority value is applied to all steps of this process. If more than one title is configured with *, only the value of the first one in the order 0, 1, 2, 3, 10 is used.
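A minimal sketch of a priority change, using the value and title parameters described below (property name and value are illustrative assumptions):

```xml
<change>
  <propertyName>{process.Urgency}</propertyName>
  <propertyValue>rush</propertyValue>
  <priority value="2">
    <!-- apply the priority to all steps of the process -->
    <title>*</title>
  </priority>
</change>
```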
Depending on existing properties, the responsible user groups can be defined for several workflow steps. The configuration is done as shown here:
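A minimal sketch of such an assignment, using the step and usergroup parameters described below (the property, step and group names are illustrative assumptions):

```xml
<change>
  <propertyName>{process.Material}</propertyName>
  <propertyValue>Fragile</propertyValue>
  <usergroups step="Scanning">
    <usergroup>Special collections team</usergroup>
  </usergroups>
</change>
```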
With a configuration like the following example, the process template can be exchanged while the workflow is running. Depending on existing properties, a workflow can thus be replaced by another workflow during execution. Workflow steps that are also present in the new workflow are automatically set to the correct status.
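A minimal sketch of a template exchange, using the workflow parameter described below (property name, value and template name are illustrative assumptions):

```xml
<change>
  <propertyName>{process.DocType}</propertyName>
  <propertyValue>Newspaper</propertyValue>
  <workflow>Newspaper digitisation</workflow>
</change>
```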
Goobi Step Plugin for the creation of Archival Resource Keys (ARK) with metadata according to the DataCite schema.
This documentation describes the installation, configuration and use of the Step Plugin for the generation of ARK identifiers in Goobi workflow.
The plugin consists of the following file:
This file must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
The plugin is usually executed fully automatically within the workflow. It first determines whether an Archival Resource Key (ARK) already exists. If there is no ARK yet, a new ARK is registered. If an ARK already exists in the metadata, it attempts to update the metadata of the ARK.
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a workflow step, it should be configured as shown in the screenshot below.
The configuration of the plugin is done via the configuration file plugin_intranda_step_ark.xml
and can be adjusted during operation. The following is an example configuration file:
ATTENTION: As of 01.01.2024, this plugin can only be used to a very limited extent. ARKs were generated via the REST API of arketype.ch, a service of the Geneva University of Applied Sciences that will be shut down in the course of 2024. Arketype is a fork of EZID and could theoretically still be operated independently. However, running a local ARK service based on the script published by the Lucerne Central Library is recommended instead. This Python script generates local identifiers, registers them with the global resolver and writes them into the corresponding processes in Goobi workflow.
Identifier
intranda_step_deleteContent
Repository
Licence
GPL 2.0 or newer
Last change
06.09.2024 11:40:45
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
deleteAllContentFromImageDirectory
Specify whether to delete all data from the images
folder.
deleteMediaDirectory
Specify whether to delete the media
folder. This option is not evaluated if deleteAllContentFromImageDirectory
is enabled.
deleteMasterDirectory
Specify whether to delete the master
folder. This option is not evaluated if deleteAllContentFromImageDirectory
is enabled.
deleteSourceDirectory
Specify whether to delete the source
folder. This option is not evaluated if deleteAllContentFromImageDirectory
is enabled.
deleteFallbackDirectory
Specify whether to delete the configured fallback folder. This option is not evaluated if deleteAllContentFromImageDirectory
is enabled.
deleteAllContentFromThumbsDirectory
Specify whether to delete all data from the thumbs
folder.
deleteAllContentFromOcrDirectory
Specify whether to delete all data from the ocr
folder.
deleteAltoDirectory
Specify whether to delete the alto
folder. This option is not evaluated if deleteAllContentFromOcrDirectory
is enabled.
deletePdfDirectory
Specify whether to delete the pdf
folder. This option is not evaluated if deleteAllContentFromOcrDirectory
is enabled.
deleteTxtDirectory
Specify whether to delete the txt
folder. This option is not evaluated if deleteAllContentFromOcrDirectory
is enabled.
deleteWcDirectory
Specify whether to delete the wc
folder. This option is not evaluated if deleteAllContentFromOcrDirectory
is enabled.
deleteXmlDirectory
Specify whether to delete the xml
folder. This option is not evaluated if deleteAllContentFromOcrDirectory
is enabled.
deleteExportDirectory
Specify whether to delete the export
folder.
deleteImportDirectory
Specify whether to delete the import
folder.
deleteProcesslogDirectory
Specify whether to delete the folder where the files uploaded in the operation log are managed.
deleteMetadataFiles
Specify here whether the metadata and associated backups should be deleted.
deactivateProcess
When this option is enabled, all steps of the process are disabled if they have not been completed previously.
deleteMetadata
A specific metadatum at the level of the work can be deleted from the metadata file here. The element is repeatable and must use a valid metadata type name from the ruleset.
deleteProperty
A specific process property can be deleted here. The element is repeatable and must list the name of the property.
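The parameters above can be sketched in a minimal configuration block. The boolean values and the property name OCRDone are illustrative assumptions; only a few of the documented parameters are shown:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <deleteAllContentFromImageDirectory>false</deleteAllContentFromImageDirectory>
  <deleteMediaDirectory>true</deleteMediaDirectory>
  <deleteMasterDirectory>false</deleteMasterDirectory>
  <deleteAllContentFromOcrDirectory>true</deleteAllContentFromOcrDirectory>
  <deleteMetadataFiles>false</deleteMetadataFiles>
  <deactivateProcess>true</deactivateProcess>
  <!-- repeatable: delete a named process property -->
  <deleteProperty>OCRDone</deleteProperty>
</config>
```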
Identifier
intranda_step_alma_api
Repository
Licence
GPL 2.0 or newer
Last change
07.09.2024 08:46:07
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur multiple times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur multiple times per <config>
block.
url
The base URL of the REST API is specified here.
api-key
The API key for the connection to the REST API is configured here.
variable
This tag can be used to define a variable that can be used by all subsequent commands. This tag has two attributes, where @name
defines the name and @value
the value. @value
expects a simple text value or a Goobi variable.
command
A command block defines a command that is to be executed in the job. It has two mandatory attributes itself, where @method
specifies the method to be used and @endpoint
specifies the path to the endpoint, where all placeholders are not replaced. It also has two optional attributes, @accept
and @content-type
, which are used to specify the request parameters accept
and content-type
. Both expect either json
or xml
. If one of the two parameters is omitted, the default value json
is used. Further details can be found in the table below and in the example configuration above.
save
An optional save
element defines a value to be saved after all commands have been executed. It has three mandatory attributes, where type
specifies whether the value is to be saved as an operation property or as a metadata. The attribute @name
defines the name of the process property or metadata type. The @value
attribute determines the value, which can be a simple text value or a previously defined variable. It has two optional attributes, where @choice
specifies which value should be saved if several are found, and @overwrite
determines whether a previously created process property or a metadata of the same name should be reused.
filter
This specifies which parts of the JSON response should be used to search for the target
values. It has four attributes, where @key
and @value
are mandatory, while @fallback
and @alt
are optional. Further details can be found in the comments in the sample configuration.
target
This specifies which values are to be saved as variables for later use. The parameter has two attributes, where @var
specifies the variable name and @path
specifies the JSON path to retrieve the values.
parameter
A parameter is specified here that is to be sent to the REST API together with a request. It has two attributes, where @name
is used for the parameter name and @value
for the parameter value, which may only contain plain text values.
body
The request body is defined here. It has three attributes, whereby one of @src
and @value
must be specified. If @src
is set, @wrapper
is also applicable. The file whose content is to be used as the request body is specified with @src
, while @value
specifies the value of a variable that has been received from previous commands. When using @wrapper
, it is advisable to consider the comments in the sample configuration.
update
This element is used to save the JSON object of the response as a variable. It has an attribute @var
that specifies the name of the variable. Each command
tag can have at most one update
sub-element. Within the update
subelement, there can be multiple entry
subelements, each of which specifies a change to the JSON response object.
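The interplay of these elements can be pictured in a minimal sketch. The endpoint path, the JSON path syntax and the way a variable is referenced in save are illustrative assumptions; only the sample configuration shipped with the plugin is authoritative:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <url>https://api-eu.hosted.exlibrisgroup.com</url>
  <api-key>API_KEY</api-key>
  <!-- define a variable usable in all subsequent commands -->
  <variable name="mmsId" value="{meta.CatalogIDDigital}" />
  <command method="get" endpoint="/almaws/v1/bibs/{mmsId}" accept="json">
    <!-- save a value from the JSON response as a variable -->
    <target var="title" path="title" />
  </command>
  <!-- persist the collected value as a process property -->
  <save type="property" name="AlmaTitle" value="{title}" />
</config>
```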
Identifier
intranda_step_createfullpdf
Repository
Licence
GPL 2.0 or newer
Last change
17.09.2024 16:58:45
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
imageFolder
This parameter expects the name of the image folder. Possible values are media
and master
, anything else will be regarded as media
.
singlePagePdf
The enabled
attribute of this parameter determines whether single-page PDFs are to be generated.
fullPdf
The enabled
attribute of this parameter determines whether a complete PDF is to be generated. The mode
attribute controls how this PDF is generated. With the value mets, the PDF is generated based on the METS file; with the value singlepages, the overall PDF is built from the previously created single-page PDFs. In the latter case, the single-page PDFs are generated temporarily if they have not already been enabled in the configuration. The pdfConfigVariant
attribute, on the other hand, is optional and determines which configuration variant is to be used. If it is not set, default
is used.
exportPath
This optional parameter can be used to specify a path for exporting the PDF files. If it is used, an absolute path is expected. If it is not specified, the PDF files are created within the ocr
directory of the process.
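The parameters above can be sketched as a minimal configuration block (the export path and the chosen values are illustrative):

```xml
<config>
  <project>*</project>
  <step>*</step>
  <imageFolder>media</imageFolder>
  <singlePagePdf enabled="false" />
  <fullPdf enabled="true" mode="mets" pdfConfigVariant="default" />
  <!-- optional: absolute path for the export; defaults to the ocr directory -->
  <exportPath>/opt/digiverso/export/pdf</exportPath>
</config>
```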
Identifier
intranda_step_delay_workflow_status
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 09:02:52
Automatic Task
Yes
Plugin for Workflow Step
intranda_step_delay_workflowstatus
Plugin for Delay
Yes
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
is
The status of the step must match the configured status
not
The step must not be in the configured status
atleast
The step must have at least reached the configured status. This option does not work with deactivated
or error
.
Identifier
intranda_step_delay
Repository
Licence
GPL 2.0 or newer
Last change
21.11.2024 11:45:52
Automatic Task
Yes
Plugin for Workflow Step
intranda_step_delay
Plugin for Delay
Yes
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
delayInDays
Pause the workflow for the specified days.
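A minimal sketch of the configuration described above (the delay value is illustrative):

```xml
<config>
  <project>*</project>
  <step>*</step>
  <!-- pause the workflow for 14 days before closing the step -->
  <delayInDays>14</delayInDays>
</config>
```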
Identifier
intranda_step_displayMetadata
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:59:59
Identifier
intranda_step_changeWorkflow
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:00:43
type
Determine which status the workflow steps are to receive.
title
Define here the name of the workflow steps that are to be set to the desired status.
value
Determine which priority the workflow steps are to receive.
title
Define here the name of the workflow steps that are to be set to the desired priority. Use *
if the value should be applied to all steps of this process.
step
Determine for which workflow step you want to enter the user groups.
usergroup
Define here the name of the user group that is to be entered as responsible for the configured step.
workflow
Define here the name of the process template to be used for the process.
Identifier
intranda_step_ark
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 12:01:12
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
uri
The URL of the API must be stored in this parameter. As a rule, the standard entry https://www.arketype.ch
can be used.
naan
NAAN is an acronym for Name Assigning Number Authority. It is a unique identifier to which the account is assigned.
apiUser
Name of the API user
apiPassword
Password of the API user
shoulder
Name of the sub-namespace in which the new ARKs are to be created.
metadataCreator
Corresponds to the datacite.creator
field and should name the persons who created the data. Usually the default value {meta.CreatorsAllOrigin}
can be kept.
metadataTitle
Corresponds to the datacite.title
field and should contain the name by which the publication is known. As a rule, the default value {meta.TitleDocMain}
can be retained.
metadataPublisher
Corresponds to the datacite.publisher
field. As a rule, the default value {meta.PublisherName}
can be retained.
metadataPublicationYear
Corresponds to the datacite.publicationyear
field. As a rule, the default value {meta.PublicationYear}
can be retained.
metadataResourceType
Corresponds to the datacite.resourcetype
field. Only the values Audiovisual
, Collection
, Dataset
, Event
, Image
, InteractiveResource
, Model
, PhysicalObject
, Service
, Software
, Sound
, Text
, Workflow
, and Other
are allowed. In addition, specific subtypes can be specified. An example would be Image/Photo
. The subtype, i.e. the part after the /
, is not subject to any restriction.
publicationUrl
URL under which the digitised work will be available in the future. As a rule, the publication URL will follow a pattern, e.g. https://viewer.example.org/{meta.CatalogIDDigital}
. In this case, it is assumed that the works will be published in the future under a URL containing the 'identifier' metadatum.
metadataType
Specifies the metadata type under which the URN is to be recorded. The default should not be changed here.
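The parameters above can be sketched as a minimal configuration block. The NAAN, credentials, shoulder and viewer URL are illustrative placeholders; the {meta.*} default values are taken from the descriptions above:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <uri>https://www.arketype.ch</uri>
  <naan>99999</naan>
  <apiUser>user</apiUser>
  <apiPassword>password</apiPassword>
  <shoulder>x1</shoulder>
  <metadataCreator>{meta.CreatorsAllOrigin}</metadataCreator>
  <metadataTitle>{meta.TitleDocMain}</metadataTitle>
  <metadataPublisher>{meta.PublisherName}</metadataPublisher>
  <metadataPublicationYear>{meta.PublicationYear}</metadataPublicationYear>
  <metadataResourceType>Text</metadataResourceType>
  <publicationUrl>https://viewer.example.org/{meta.CatalogIDDigital}</publicationUrl>
  <!-- metadataType: keep the default from the shipped configuration -->
</config>
```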
This step plug-in allows you to automatically duplicate a work step within the workflow several times according to a process property.
Identifier
intranda_step_duplicate_tasks
Repository
Licence
GPL 2.0 or newer
Last change
15.08.2024 11:20:05
This plugin reads the value of a process property and can automatically duplicate a defined workflow step several times depending on the contents of the property. In addition, the originally analysed property can be split and saved as separate new properties that refer to this duplication.
To install the plugin, the following file must be installed:
The configuration file is usually located here:
After a successful installation, the plugin is integrated into the workflow as shown in the following screenshot.
The plugin retrieves the value of the configured process property and splits it into parts using the configured separator (or a default separator if none is configured).
For each part of the original property, the configured process step (or, if none is configured, the step that follows the current one in the workflow) is duplicated. Each duplicated step receives the name of the original step plus an incremented counter.
A new process property or metadatum is created for each duplicated task, depending on how the @target
attribute is configured. The value of this new property or metadatum corresponds to the part of the original property on the basis of which the step was duplicated.
Once duplicates have been created for each part of the original property, the original step is deactivated.
The plugin retrieves the value of the configured process property and splits it into parts using the configured separator (or a default separator if none is configured).
A new process property or metadatum is created for each part of the original property, depending on how the @target
attribute is configured. The value of this new property or metadatum corresponds to that part of the original property.
The content of this configuration file looks as follows as an example:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows.
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
property
This value determines which process property should be used to check the desired duplication. It accepts four attributes, whereby only @name
is mandatory. Details of the possible configuration are listed in the sample configuration.
stepToDuplicate
This optional parameter can be used to specify the name of the work steps that are to be duplicated. If this value is not configured, the work step that follows next in the workflow is used for the duplication. The parameter also accepts an optional attribute @enabled
with a default value true
, which controls whether there is a work step to be duplicated.
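The two parameters can be sketched as a minimal configuration block. The property name Volumes and the step name are illustrative; the further attributes of the property element (of which only @name is mandatory) are omitted here:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <!-- the process property whose parts drive the duplication -->
  <property name="Volumes" />
  <!-- the step to duplicate; if omitted, the following step is used -->
  <stepToDuplicate enabled="true">Create volume</stepToDuplicate>
</config>
```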
This step plug-in allows you to ingest processes into the EWIG long-term archive.
Identifier
intranda_step_lza_ewig
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:50:43
This documentation describes the installation, configuration and use of a plugin for creating a METS file for long-term archiving EWIG.
The plugin consists of two files:
The file plugin_intranda_step_LZA_EWIG-base.jar
contains the program logic and must be installed in the following directory so that it is readable for the tomcat user:
The plugin_intranda_step_lza_ewig.xml
file must also be readable by the tomcat user and installed in the following directory:
After the plugin has been installed and configured, it can be used within one step. To do this, the intranda_step_lza_ewig
plugin must be selected within the desired task. The Automatic Task checkbox must also be selected.
This step within Goobi workflow exports all the files required for EWIG Ingest. The upload itself is carried out via the intranda TaskManager. This is useful to avoid several upload processes running in parallel having conflicts with each other and slowing down the system. For uploads, see chapter 4.17 in the intranda TaskManager documentation.
The configuration file plugin_intranda_step_lza_ewig.xml
must be structured as follows:
The <config>
block is repeatable and can therefore define different parameters in different projects. The subelements <project>
and <step>
are used to check whether the present block should be used for the current step. First, the system checks whether there is an entry containing both the project name and the step name. If this is not the case, an entry for any project marked by the arbitrary project name and the step name used is searched for. If no entry is found, a search is performed for the project name and any steps, otherwise the default block is used, which contains both <project>
and <step>
.
The element <exportFolder>
defines where in the file system the exported METS files are stored.
With <exportXmlLog>
you can determine whether the XML log should also be exported and written to the METS file. The log contains information about the workflow.
The <createManifest>
element controls whether a submission manifest should be created. If this is the case, the <manifestParameter>
must also be configured.
Each <manifestParameter>
consists of two parts, the name
attribute, which contains the name of the parameter, and the text in which the desired field contents are configured. Both static texts and all variables known in Goobi can be used. Several parameters can be specified separated by semicolons. If the first value is not known because, for example, the configured metadata has not been filled in, the next value is then tried.
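The elements above can be sketched as a minimal configuration block. The export path, parameter name and fallback values are illustrative; note the semicolon-separated fallback chain in the manifest parameter:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <exportFolder>/opt/digiverso/export/ewig/</exportFolder>
  <exportXmlLog>true</exportXmlLog>
  <createManifest>true</createManifest>
  <!-- if {meta.Institution} is empty, the static fallback text is used -->
  <manifestParameter name="SubmittingOrganization">{meta.Institution};Example Library</manifestParameter>
</config>
```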
This step plugin allows the registration of handle and DOI as persistent identifiers via the ePIC service of the GWDG.
Identifier
intranda_step_epic_pid
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:59:30
The plugin allows digital copies to be registered with the ePIC Service of the GWDG. Both the generation of Handle IDs and the registration of DOIs are possible. The Handles can be generated for each logical and physical element of a METS file and stored as metadata in each case.
To install the plugin, the following file must be installed:
In order to configure how the plugin should behave, the following two configuration files must also be installed:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_epic_pid
from the list of installed plugins.
Since this plugin is usually to be executed automatically, the work step should be configured as automatic in the workflow.
After the plugin has been fully installed and set up, it is usually executed automatically within the workflow, so there is no manual interaction with the user. Instead, the plugin is called by the workflow in the background and starts generating an identifier depending on the selected configuration. The plugin proceeds as follows:
The way the plugin works within the correctly configured workflow is as follows:
First, the plugin opens the METS file of the operation.
For each logical and physical element of this METS file, a handle in the form PREFIX-CLIENT-OBJECTID
is created. If the planned OBJECTID
is already assigned as a handle, an incrementing suffix (e.g.: -1
, -2
, etc.) is added at the end.
Finally, the generated handle is stored within the METS file as a metadatum for the respective logical or physical structural element. The metadata type _urn
is usually used for this purpose.
If the registration of DOIs has been activated, a new DOI identifier is generated for each logical top-level element in addition to the handle generation and stored within the METS file.
The configuration of the file plugin_intranda_step_epic_pid.xml
is structured as follows:
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
project
This parameter determines for which project the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
certificate
The certificate
element defines the path to the private key used for authentication.
user
This parameter sets the user name for authentication.
base
This sets the base name for the generation of the handles.
url
This parameter defines the final URL for the handle resolver. The beginning of the URL is defined at this point. The subsequently formed handle ID is placed at the end, so that the final URL will be structured as follows: url
+ handle ID
prefix
The actual handle is composed of several parts and usually has this structure: prefix
+ separator
+ name
+ separator
+ objectId
. The parameter prefix
defines the prefix with which the handle should begin. This parameter is optional.
name
The parameter name
defines the content of the handle to which the object IDs are subsequently appended. This parameter is optional.
separator
This parameter defines the separator to be used between the individual elements of the generated handle.
doiGenerate
This parameter determines whether a DOI identifier should also be generated in addition to the handle.
doiMapping
At this point a mapping file is named where the mappings of the metadata from the METS file to the DOI metadata are defined.
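The parameters above can be sketched as a minimal configuration block. Paths and the account name are illustrative placeholders; prefix, name and separator are chosen to match the example handle go-goobi-... mentioned further below:

```xml
<config>
  <project>*</project>
  <step>*</step>
  <certificate>/opt/digiverso/goobi/config/private_key.pem</certificate>
  <user>USER-ACCOUNT</user>
  <base>BASE</base>
  <url>https://hdl.handle.net/</url>
  <prefix>go</prefix>
  <name>goobi</name>
  <separator>-</separator>
  <doiGenerate>true</doiGenerate>
  <doiMapping>/opt/digiverso/goobi/config/plugin_intranda_step_epic_pid_mapping.xml</doiMapping>
</config>
```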
The configuration of the file plugin_intranda_step_epic_pid_mapping.xml
is structured as follows:
This configuration file defines how the available metadata from the METS file are used to register the DOI. For each DOI field to be used, at least one metadatum is defined.
field
This parameter defines the DOI metadatum to be generated.
metadata
This parameter names the metadatum to be read from the METS file in order to use its value for the creation of the defined DOI field.
altMetadata
If the metadata defined with the metadata
parameter is not available, an alternative metadata can be defined here to be used instead. This parameter is optional and repeatable.
default
If the metadata defined by metadata
and altMetadata
cannot be found, a default value can be set here.
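A minimal sketch of one mapping entry, using the elements described above (the surrounding root element and the concrete metadata names are illustrative assumptions):

```xml
<map>
  <!-- the DOI field to generate -->
  <field>title</field>
  <!-- primary metadatum from the METS file -->
  <metadata>TitleDocMain</metadata>
  <!-- optional, repeatable alternative -->
  <altMetadata>TitleDocMainShort</altMetadata>
  <!-- fallback if neither metadatum is found -->
  <default>unknown</default>
</map>
```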
If a handle is registered, the following contents result from the communication with the ePIC service:
This information is then used by the GWDG's ePIC service to automatically generate a DOI identifier with the same ID: BASE/go-goobi-1296243265-17
.
This step plugin makes it possible to download files and verify them with checksums that exist as process properties. The validation result is saved within the journal.
Identifier
intranda_step_download_and_verify_assets
Repository
Licence
GPL 2.0 or newer
Last change
07.09.2024 14:12:25
This plugin reads URLs or hash values from several configured process properties, downloads the files from the defined URL and then compares them with the corresponding hash value. Finally, several responses can be given, depending on whether the status is success
or error
. These responses can be sent to another system via REST or simply logged within the journal.
To install the plugin, the following file must be installed:
The configuration file is usually located here:
The content of this configuration file looks like the following example:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows.
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
maxTryTimes
This value defines the maximum number of attempts to be made before feedback must be given. This parameter is optional and has the default value 1
.
fileNameProperty
This parameter controls the part for downloading and verifying the files. It accepts three attributes. @urlProperty
defines the name of the process property that contains the URL of the file. @hashProperty
defines the name of the process property that contains the checksum of the file. The attribute @folder
is optional and has the default value master
. It controls where the downloaded files are to be saved.
response
This optional parameter can be used to provide multiple responses after downloading and verifying the files. It accepts four attributes and a JSON text for REST requests with a JSON body. More details and examples can be found in the comments of the sample configuration file.
This is a Goobi Step plug-in to enable the registration of digital objects with the DataCite DOI service.
Identifier
intranda_step_doi
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:59:52
This documentation describes the installation, configuration and use of a plugin for registering DOIs via the DataCite API.
ATTENTION: It should be noted that this plugin is a new implementation of the datacite-doi plugin, which works using XSLT. This implementation has so far been limited to allowing DOIs to be registered for stand-alone works (e.g. monographs and journal volumes). Registering DOIs for structural elements (e.g. for journal articles) is not yet possible with this plugin.
The plugin consists of the following files:
The file plugin_intranda_step_doi-base.jar
contains the programme logic. It must be installed under the following path:
The file doi.xsl
is the transformation file that represents the basic framework of the DataCite metadata, into which the plugin inserts the individual metadata of the respective transaction in order to subsequently register the DOIs with it. It must be installed under this path:
The file plugin_intranda_step_doi.xml
is the main configuration file for the plugin. It must be installed under this path:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_doi
from the list of installed plugins.
Since this plugin should usually be executed automatically, the workflow step should be configured as automatic in the workflow. Since the plugin writes the DOI to the metadata file of the operation, the checkbox for Update metadata index when finishing
should also be activated.
This plugin first reads its configuration file and tries to fill the field variables with those contents of the METS file that were defined in the configuration. The field variables are run through from top to bottom. As soon as a value has been determined in a defined field, it is assigned to the variable. If no value was determined in any of the fields, the default value is used instead. If no default value is defined for a field variable, it remains empty.
After the creation of the field variables, they are transferred to the transformation file as an xml file. The transformation file uses the defined field variables to insert the contents from the METS file. The DataCite xml file generated in this way is then used for registering or updating the DOIs at DataCite, using the access data and URL information from the configuration file.
The configuration is done via the configuration file plugin_intranda_step_doi.xml
and can be adjusted during operation. It is structured as follows:
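A shortened sketch of such a configuration (root element, placeholder values and the attribute form of the field parameters are assumptions; the individual parameters are explained below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- URL of the DataCite service (test or production) -->
        <serviceAddress>https://api.test.datacite.org/dois</serviceAddress>
        <debugMode>false</debugMode>
        <draft>false</draft>
        <!-- DOI base registered for the institution -->
        <base>10.12345</base>
        <username>USER</username>
        <password>PASSWORD</password>
        <prefix>go</prefix>
        <name>goobi</name>
        <separator>-</separator>
        <metadata>DOI</metadata>
        <xslt>doi.xsl</xslt>
        <!-- field variables for the transformation; <data> entries are tried top-down -->
        <field name="TITLE" default="unknown" repeatable="false">
            <data>
                <content>{meta.TitleDocMain}</content>
            </data>
        </field>
    </config>
</config_plugin>
```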
The block <config>
can occur several times for different projects or workflow steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meaning:
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
serviceAddress
This parameter defines the URL for the DataCite service. In the example above, it is the test server.
debugMode
With this parameter, the debug mode can be activated. This allows the XML file with the defined field variables (doi_in.xml
) as well as the transformed DataCite XML file (doi_out.xml
) to be stored within the tmp
directory of Goobi workflow. This allows insight into the actual metadata used or customised for DOI registration.
draft
This parameter can be used to specify that the DOIs are reserved as drafts but not yet officially registered. They are therefore not yet publicly accessible and are not yet invoiced by DataCite.
base
This parameter defines the DOI base for the facility registered with DataCite.
viewer
username
This is the username used for DataCite registration.
password
This is the password used for DataCite registration.
prefix
This is the prefix to be given to the DOI before the name and ID of the document.
name
This is the name to be given to the DOI before the ID of the document.
separator
Define here a separator to be used between the different parts of the DOI.
metadata
This parameter specifies under which metadata name the DOI should be stored in the METS-MODS file. Default is DOI
.
xslt
This parameter sets the transformation file to be used for DOI registration.
field
- name
The parameter name
can be used to name a field variable that is to be available for mapping.
field
- default
This parameter can be used to specify a value that the field variable should receive if none of the listed metadata can be found from the elements data
.
field
- repeatable
This can be used to control that values that occur more than once (queried e.g. by using {metas.SubjectTopic}
instead of {meta.SubjectTopic}
) are separated by a semicolon and used as single values.
field
- data
- content
Within this element, metadata or even static texts can be defined that are to be assigned as values of the field variable. The order of the listed data
elements is decisive here. As soon as a field with the content could be found, the following data
elements are skipped. This is therefore a descending priority of the listed elements.
The transformation file doi.xsl
looks something like this:
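A heavily shortened sketch of such a transformation file, assuming the DataCite kernel-4 schema and showing only the DOI identifier element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <resource xmlns="http://datacite.org/schema/kernel-4">
            <!-- the plugin fills //GOOBI-DOI with the DOI to be registered -->
            <identifier identifierType="DOI">
                <xsl:value-of select="//GOOBI-DOI" />
            </identifier>
            <!-- further DataCite elements (creators, titles, publisher, ...) are
                 filled from the field variables defined in the configuration file -->
        </resource>
    </xsl:template>
</xsl:stylesheet>
```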
Within this transformation file, the DataCite XML is listed as the basic framework. Contents of the individual XML elements are automatically inserted from the field variables defined in the main configuration file. Besides these freely definable field variables, some additional variables are also available:
//GOOBI-ANCHOR-DOCTYPE
This variable contains the internal name of the publication type of the parent anchor element (e.g. Periodical).
//GOOBI-DOCTYPE
This variable contains the internal name of the publication type of the work (e.g. Monograph).
//GOOBI-DOI
This variable contains the DOI to be used.
DataCite documentation: https://support.datacite.org/docs/getting-started
Metadata schema overview: https://schema.datacite.org/
Metadata schema for version 4.4 with sample files: https://schema.datacite.org/meta/kernel-4.4/
Admin area for DataCite customers: https://doi.datacite.org/
Admin area in the test system for Datacite customers: https://doi.test.datacite.org/
Example of a DataCite XML file from Goobi:
This Step Plugin enables the enrichment of metadata within a METS file based on data from an Excel file.
Identifier
intranda_step_excelMetadataenrichment
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:59:22
This plugin allows metadata to be read from an Excel file and added to existing structural elements.
To install the plugin, the following file must be installed:
To configure how the plugin should behave, various values can be adjusted in the configuration file. The configuration file is usually located here:
To put the plugin into operation, it must be activated for a task in the workflow. This is done as shown in the following screenshot by selecting the plugin plugin_intranda_step_excelMetadataenrichment
from the list of installed plugins.
Since this plugin should usually be executed automatically, the work step should be configured as automatic in the workflow.
After the plugin has been fully installed and set up, it is usually executed automatically within the workflow so that there is no manual interaction with the user. Instead, the workflow invokes the plugin in the background and performs the following tasks:
First, the configured path is searched for a suitable Excel file. If a single Excel file exists there, it is opened regardless of its name. If there are several Excel files, the file named after the process name is expected.
If an Excel file is found, the metadata is read. All existing structure elements are listed and checked to see whether they contain a metadatum that corresponds to the configured value in the field <docstructIdentifier>
. If this is the case, the Excel file is searched for a row in which this metadatum is used in the column configured in the field <excelIdentifierColumn>
. If such a row is found, its metadata is added to the structure element.
The configuration of the plug-in is structured as follows:
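A minimal sketch of such a configuration (the root element, the metadata name and the folder value are assumptions; the individual elements are described below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- folder containing the Excel file(s); Goobi variables may be used -->
        <excelFolder>{processpath}/import</excelFolder>
        <!-- metadatum and Excel column used to match rows to structure elements -->
        <docstructIdentifier>CatalogIDDigital</docstructIdentifier>
        <excelIdentifierColumn>identifier</excelIdentifierColumn>
    </config>
</config_plugin>
```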
The block <config>
can occur repeatedly for different projects or work steps in order to be able to carry out different actions within different workflows.
The field <excelFolder>
defines where the Excel file is searched for. The Goobi internal variables can be used to define e.g. the process folder or the master folder. Alternatively, an absolute path can be specified where all Excel files to be imported are located. If there is more than one Excel file in the configured directory, a file 'PROCESSNAME.xlsx' is expected.
The fields <docstructIdentifier>
and <excelIdentifierColumn>
are used to define the name of the metadatum and the Excel column via which the individual lines of the Excel file can be assigned.
The configuration of the metadata and personal data to be imported is already described here:
https://docs.goobi.io/goobi-workflow-plugins-en/import/intranda_import_excel#import-metadata
https://docs.goobi.io/goobi-workflow-plugins-en/import/intranda_import_excel#import-of-persons
This is the technical documentation for the plugin that automatically enriches a process with images, based on metadata containing the file names.
Identifier
intranda-step-fetch-images-from-metadata
Repository
Licence
GPL 2.0 or newer
Last change
30.01.2025 09:40:27
This documentation describes the installation, configuration and use of the plugin. This plugin can be used to copy or move images from a configured folder or from specific URLs to the desired folder in the process using the file name stored in the process.
The plugin consists of two files:
The file plugin_intranda_step_fetch_images_from_metadata-base.jar
contains the programme logic and must be installed in the following directory so that it can be read by the tomcat
user:
The configuration file plugin_intranda_step_fetch_images_from_metadata.xml
must also be readable for the tomcat
user and must be installed in the following directory:
This plugin is integrated into the workflow so that it is executed automatically. Manual interaction with the plugin is not necessary. To use it within a workflow step, it should be configured as shown in the screenshot below.
The plugin is usually executed fully automatically within the workflow. It first determines whether the metadata specified in the configuration exists and then analyses it. The file specified in the metadata is then copied or moved to the media folder of the process based on its name and file extension. The plugin checks the existing images in the media
folder of the process to see whether the desired image has already been imported and, if not, imports it.
In the following two cases, the order of the imported images is updated and saved in the METS file:
if useUrl
is set to true
, the plugin will download the image from the specified URL
if useUrl
is set to false
or not at all, the name of each file is checked to determine whether it should be treated as the first file in the directory, while the other images are simply sorted by their names.
The plugin is configured via the configuration file plugin_intranda_step_fetch_images_from_metadata.xml
and can be customised during operation. An example configuration file is listed below:
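A minimal sketch of such a configuration, with placeholder values for the metadata name and the import folder (the individual parameters are explained below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- true: download images from URLs in the METS file;
             false or absent: copy them from the configured folder -->
        <useUrl>false</useUrl>
        <!-- delete existing images (plus pagination and page assignment) first -->
        <clearExistingData>false</clearExistingData>
        <!-- metadata field containing the file name -->
        <filenameMetadata>ImageFileName</filenameMetadata>
        <!-- @mode is "copy" or "move" -->
        <fileHandling mode="copy" ignoreFileExtension="true" folder="/opt/digiverso/import/" />
        <export enabled="true" exportImages="true" />
    </config>
</config_plugin>
```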
The individual parameters have the following function:
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
useUrl
This parameter determines the source location of the images to be retrieved. If it is set to true
, the images are retrieved from the registered URLs in the mets file, if it is set to false
or not set at all, the images are retrieved from the following configured folder.
clearExistingData
This parameter determines whether existing images should be deleted before a run. In addition to the images, the pagination and page assignment are also removed.
filenameMetadata
The name of the metadata field (usually from the METS file) that contains the file name of the file to be imported is specified here.
fileHandling
The @mode
attribute defines whether the images are to be imported by copying or moving. The @ignoreFileExtension
attribute controls whether the file extension should be ignored for the copying process or must be exactly correct. The @folder
attribute specifies the folder in which the files to be imported are located.
export
The @enabled
attribute defines whether the process is to be exported or not, while the @exportImages
attribute defines whether the images are to be taken into account.
This is a Goobi step plugin to allow the registration of digital objects at the DataCite DOI service.
Identifier
intranda_step_datacite_doi
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:46:41
This documentation describes the installation, configuration and use of the plugin for registering DOIs.
ATTENTION: There is a newer plugin for this functionality that allows a higher degree of freedom for DOI registration by means of XSL transformation. Documentation of the new plugin can be found here: https://docs.goobi.io/goobi-workflow-plugins-en/step/intranda_step_doi
The plugin consists of these files:
The file plugin_intranda_step_datacite_doi-base.jar
contains the program logic. It needs to be installed at this path:
The file plugin_intranda_step_datacite_mapping.xml
is the mapping file, defining how local metadata should be translated to the form required for the DOI registration. It needs to be installed at this path:
The file plugin_intranda_step_datacite_doi.xml
is the main configuration file for the plugin. It needs to be installed at this path:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_datacite_doi
from the list of installed plugins.
Since this plugin should usually be executed automatically, the workflow step should be configured as automatic in the workflow. Since the plugin writes the DOI to the metadata file of the operation, the checkbox for 'Update metadata index when finishing' should also be activated.
The programme examines the metadata fields of the METS/MODS file from the Goobi operation. If a <typeForDOI>
is specified, then it goes through every structure element of that type in the file. If not, it takes the top structure item. From this it creates the data for a DOI, using the mapping file to translate it. Then it registers the DOI using DataCite's MDS API, specifying the DOI by <base>
along with any <prefix>
and <name>
and the ID of the document (its CatalogIDDigital
) plus an incremented counter if more than one DOI was created for the given document. The record is given a registered URL defined by <url>
followed by the DOI. The generated DOI is stored in the METS/MODS file under the metadata specified in <doiMetadata>
. For example, if the value for <typeForDOI>
is Article
, then each article in the METS/MODS file will have a DOI stored in the metadata under <doiMetadata>
.
The configuration is done via the configuration file plugin_intranda_step_datacite_doi.xml
and can be adapted during operation. It is structured as follows:
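A shortened sketch of such a configuration (root element, endpoint and all values are placeholders; the parameters are explained below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- DataCite MDS endpoint (test server here) -->
        <serviceAddress>https://mds.test.datacite.org/</serviceAddress>
        <base>10.12345</base>
        <!-- resolver URL placed in front of the DOI -->
        <url>https://viewer.example.org/resolver?id=</url>
        <username>USER</username>
        <password>PASSWORD</password>
        <prefix>go</prefix>
        <name>goobi</name>
        <separator>-</separator>
        <doiMetadata>DOI</doiMetadata>
        <!-- path to the installed mapping file -->
        <doiMapping>/path/to/plugin_intranda_step_datacite_mapping.xml</doiMapping>
        <!-- empty: only the top structure element receives a DOI -->
        <typeForDOI></typeForDOI>
    </config>
</config_plugin>
```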
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
project
This parameter determines for which project the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the workflow step is used here. This parameter can occur several times per <config>
block.
serviceAddress
This parameter defines the URL for the Datacite service. In the example above, it is the test server.
base
This parameter defines the DOI base for the institution, which has been registered with Datacite.
url
username
This is the username that is used for the DataCite registration.
password
This is the password that is used for the DataCite registration.
prefix
This is the prefix that may be given to the DOI before the name and ID of the document.
name
This is the name that may be given to the DOI before the ID number of the document.
separator
Define here a separator that shall be used between the different parts of the DOI.
doiMetadata
This parameter specifies under which metadata name the DOI is to be saved in the METS-MODS file. Default is DOI
.
doiMapping
In this parameter the path to the mapping file for the DOI registration is defined.
typeForDOI
With this parameter the DocStruct type can be defined which will be given DOIs. If this is empty or missing, only the top DocStruct element will be given a DOI. If the parameter contains the name of a sub-DocStruct, then these will be given DOIs.
The mapping configuration file looks something like this:
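A minimal sketch of such a mapping (the root element name, the metadata names and the exact child structure of <listMap> are assumptions; the element roles are described below):

```xml
<mapping>
    <!-- single-value DOI element: <metadata>/<altMetadata> tried in turn -->
    <map>
        <field>title</field>
        <metadata>TitleDocMain</metadata>
        <altMetadata>TitleDocMainShort</altMetadata>
        <default>unkn</default>
    </map>
    <!-- #CurrentYear is replaced by the current year during DOI generation -->
    <map>
        <field>publicationYear</field>
        <metadata>PublicationYear</metadata>
        <default>#CurrentYear</default>
    </map>
    <!-- repeatable list element; attributes are copied identically -->
    <listMap alternateIdentifierType="Goobi identifier">
        <field>alternateIdentifier</field>
        <metadata>CatalogIDDigital</metadata>
    </listMap>
</mapping>
```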
For each <map>
, the <field>
specifies the name of the DOI element, and the <metadata>
and <altMetadata>
entries specify from which metadata of the structure elements the value is to be taken in turn. If there is no such entry in the structure elements, then the <default>
value is taken. The value "unkn"
for "unknown" is recommended by Datacite for missing data.
The elements <listMap>
make it possible to create list elements within the generated Datacite structure, so that repeating values can be defined. Attributes can also be specified, which are adopted identically in name and value for the list element to be created (e.g. alternateIdentifierType="Goobi identifier"
).
For mandatory fields a <default>
must be specified; for optional fields this is not necessary, but can be done if desired.
The default entry #CurrentYear
is a special case: it is replaced by the current year during DOI generation.
If, for selected structural elements, a reference is to be made to the work in which this element was published, several elements can be listed as publicationTypeWithRelatedItem
. For these, the block of elements <publicationData>
can also be evaluated. This could be used for scientific articles, for example.
Datacite documentation: https://support.datacite.org/docs/getting-started
Metadata schema overview: https://schema.datacite.org/
Metadata schema for version 4.4 with sample files: https://schema.datacite.org/meta/kernel-4.4/
Admin area for Datacite customers: https://doi.datacite.org/
Admin area in the test system for Datacite customers: https://doi.test.datacite.org/
Example of a Datacite XML file from Goobi:
This step plugin allows the upload of different files within tasks in the web interface.
Identifier
intranda_step_fileUpload
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:57
This plugin is used to upload files within the user interface of an accepted task in Goobi workflow.
To install the plugin the following two files must be installed:
To configure how the plugin should behave, different values can be adjusted in the configuration file. The configuration file is usually located here:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the intranda_step_fileUpload
plugin from the list of installed plugins.
After the plugin has been completely installed and set up, it is available for the users of the corresponding tasks. After entering a task, it is now possible to upload files in the right area of the user interface.
Files can either be uploaded to this area by drag & drop or alternatively selected by clicking the button and thus uploaded.
If you want to check which files are already present in the folder after the upload, you can switch the display in the upper right area. This enables the user to list all files already existing in the folder, download individual files or delete them.
The configuration of the plugin is structured as follows:
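A minimal sketch of such a configuration (the root element and the exact regex value are assumptions; the parameters are explained below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- allowed file types: several image formats plus PDF -->
        <regex>.*\.(jpe?g|png|tiff?|pdf)$</regex>
        <!-- one <folder> entry per selectable upload target -->
        <folder>master</folder>
        <folder>media</folder>
        <folder>photos</folder>
        <folder>scans</folder>
    </config>
</config_plugin>
```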
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
project
This parameter determines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
is to apply. The name of the step is used here. This parameter can occur several times per <config>
block.
regex
This parameter can be used to specify which file types should be allowed for upload. In the above example, multiple image formats as well as pdf files are allowed.
folder
With this parameter you can define where the upload of the files should take place. This parameter can occur repeatedly and thus allows an upload to several directories, between which the user can then choose. Possible values for this are e.g. master
, media
or also individual folders such as photos
and scans
.
This step plugin is used to generate missing ALTO IDs.
Identifier
intranda_step_generate_alto_ids
Repository
Licence
GPL 2.0 or newer
Last change
07.09.2024 14:15:03
This documentation explains the plugin for generating missing ALTO IDs. This is required for the ALTO editor to work properly. Some external OCR tools don't provide these ALTO IDs. This plugin can then be used to generate them afterwards.
To be able to use the plugin, the following files must be installed:
Once the plugin has been installed, it can be selected within the workflow for the respective work steps and thus executed automatically. A workflow could look like the following example:
To use the plugin, it must be selected in a workflow step:
When the plugin is started, all ALTO files are checked for missing IDs. If missing IDs are found, a backup of all OCR results including the ALTO files is created first. The missing ALTO IDs are then added to all files.
This plugin requires no configuration.
Step plugin for the dynamic customisation of data entry masks for metadata
Identifier
intranda_step_flex_editor
Repository
Licence
GPL 2.0 or newer
Last change
04.09.2024 09:22:13
This plugin enables dynamic customization of the user interface, allowing specific metadata management requirements to be efficiently implemented.
This plugin is delivered as a tar archive. To install it, the archive plugin_intranda_step_flex-editor.tar
must be extracted into the Goobi directory:
This plugin also comes with a configuration file named plugin_intranda_step_flex-editor.xml
. It must be placed in the following path:
To use the plugin, it must be selected in a workflow step:
The Flex Editor for Goobi Workflow allows flexible customization of the metadata input interface. Through an XML configuration file, you can define how metadata fields are organized and displayed in columns and boxes. Various field types, such as text fields, checkboxes, and dropdowns, provide different input options.
The plugin is configured using the file plugin_intranda_step_flex-editor.xml
as shown here:
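A minimal sketch of such a configuration (the attribute names on <box> and <field> are assumptions; the structure and field types are described below):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <!-- each <column> becomes a column in the interface -->
        <column>
            <!-- each <box> groups several metadata fields -->
            <box name="Description">
                <field type="INPUT" metadata="TitleDocMain" />
                <field type="TEXTAREA" metadata="Abstract" />
                <field type="BOOLEAN" metadata="Restricted" />
                <!-- DROPDOWN additionally needs the vocabulary to use -->
                <field type="DROPDOWN" metadata="Location" vocabulary="Locations" />
            </box>
        </column>
    </config>
</config_plugin>
```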
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
In addition to these general parameters, the following parameters are available for further configuration:
The configuration file describes the structure of the user interface as seen in Goobi. The configuration consists of multiple <column>
elements, each representing a column in the interface. Within the <column>
elements, there are <box>
elements that group multiple metadata fields into a box within the interface. Inside the <box>
elements are <field>
elements, representing a metadata field in the process. The <field>
elements can have different types, giving them specific functionality in the user interface:
INPUT
A single-line input field used for capturing simple text inputs. A metadata type must also be specified.
TEXTAREA
A multi-line input field. Specifying a metadata type is also required.
BOOLEAN
A checkbox used for yes/no decisions or binary options. A metadata type must also be specified.
DROPDOWN
A dropdown menu with values sourced from the predefined vocabulary. In addition to the metadata type, the name of the vocabulary to be used must be specified.
MODAL_PROVENANCE
Creates a metadata group that includes multiple fields. These fields can also be sourced from vocabularies. The field is repeatable and can use multiple vocabularies.
This step plugin allows you to export the metadata and content of a Goobi process to a configurable path
This plugin allows a flexible export of data of a process into a defined target directory. This plugin can be configured very granularly to include selected data in the export. In addition, a transformation of the internal and the export METS file via XSLT is also possible here and thus allows a wide range of usage scenarios.
To install the plugin, the following file must be installed:
To configure how the plugin should behave, various values can be adjusted in the configuration file. The configuration file is usually located here:
To use the plugin, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin 'intranda_step_exportPackage' from the list of installed plugins.
Since this plugin is usually to be executed automatically, the step in the workflow should be configured as automatic.
Once the plugin is fully installed and set up, it is usually executed automatically within the workflow, so there is no manual interaction with the user. Instead, the workflow calls the plugin in the background and performs the configured export to the target directory. The specified contents are all copied into a subdirectory of the defined export path.
Depending on the configuration, an XSLT transformation of the internal or the export METS file can be carried out in addition to the export of the data, in order to bring it into a desired format. The result of this transformation is then likewise saved in the folder of the exported process, under the name of the transformation file.
The configuration of the plugin is structured as follows:
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to carry out different actions within different workflows. The other parameters within this configuration file have the following meanings:
This Step Plugin allows the extraction of metadata from image files in order to store them within the METS files.
With the help of this plugin, metadata can be extracted from image files and stored within Goobi's METS files. In the background, the Linux program ExifTool is used; the image metadata it reads is transferred according to the individual configuration.
To install the plugin, the following file must be installed:
To configure how the plugin should behave, various parameters can be adjusted within the configuration file. The configuration file is usually located under the following path:
To put the plugin into operation, it must be activated in a task in the workflow. This is done by selecting the plugin intranda_step_imageMetadataExtraction
from the list of installed plugins. Since the plugin relies on a METS/MODS file, the step should come after the metadata editing.
Once the plugin has been fully installed and set up, it is usually run automatically within the workflow, so there is no manual interaction with the user. Instead, the workflow calls the plugin in the background and automatically performs the extraction of the image metadata. This is done by opening the first image file from the media directory of the Goobi process, reading its metadata and storing it on the top logical level of the METS file as the configured metadata.
The configuration of the plugin is structured as follows:
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
The definition of fields is done with the following parameters:
This step plugin for Goobi workflow performs configurable validation of files
This documentation describes the installation, configuration and use of the Step Plugin for validation with configurable checking profiles.
The plugin consists of the following file:
This file must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
The plugin is usually executed fully automatically within the workflow. It starts the configured checking process and then outputs whether the required checking level has been reached. If one of the checked documents does not reach the required level, the plugin fails.
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a workflow step, it should be configured as shown in the screenshot below.
The configuration of the plugin is done via the configuration file plugin_intranda_step_file_validation.xml
and can be adjusted during operation. The following is an example configuration file:
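A structural sketch of such a configuration; the attributes of <namespace>, <tool>, <check> and <setValue> are omitted here, and the element nesting is assumed from the descriptions that follow:

```xml
<config_plugin>
    <global>
        <namespaces>
            <!-- one <namespace> per XML namespace used in the checks -->
            <namespace />
        </namespaces>
        <tools>
            <!-- one <tool> per external tool/script producing XML output -->
            <tool />
        </tools>
        <!-- a profile is referenced from <config> via its name -->
        <profile name="pdf">
            <!-- levels are numbered by their order: this is level 0 -->
            <level>
                <check />
                <setValue />
            </level>
            <!-- level 1 -->
            <level>
                <check />
            </level>
        </profile>
    </global>
    <config>
        <project>*</project>
        <step>*</step>
        <profileName>pdf</profileName>
    </config>
</config_plugin>
```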
The config_plugin
element can have two child element types: config
and global
. First, the functionality of the config
element is described here.
The global
element can have 3 child element types: profile
, namespaces
and tools
.
The namespace
element can have several children of the type namespace
. A namespace
here describes an XML namespace and has the following attributes:
The tools
element can have several children of the type tool
. The tool
element can be used to describe the parameters needed to execute a tool/script from the plugin.
The profile
element can have several children of the type level
. It also has the attribute name
with whose value it can be referenced in the 'config' element profileName
. A profile has several elements of type level
. Each level can contain several check
and setValue
elements. The levels are numbered internally according to their order. The first level
element is level 0
, the second level 1
and so on.
A check
makes it possible to check a value in one of the generated XML reports. A regular expression is used to check the value; if no regular expression is specified, only the existence of the specified XML element is checked. If a check fails, the level is considered failed, unless the failed check belongs to a group: in that case the level only counts as failed if all other checks in the group also fail.
The attributes of the check
element look like this:
A setValue
element makes it possible to read a value from one of the generated reports and store it in the process properties or in the metadata of the topmost structural element. The attributes of the setValue
element look like this:
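To make the interplay of levels, checks and setValue elements concrete, a hedged sketch of a profile is shown below; the tool names, error codes, xpath selectors and the property name are illustrative assumptions, not part of the shipped configuration:

```xml
<profile name="pdf">
    <!-- level 0: the document must at least be identified as a PDF -->
    <level>
        <check name="isPDF" tool="file" code="not a PDF file"
               xpathSelector="//format" regex="^PDF document.*"/>
    </level>
    <!-- level 1: read the PDF version and keep it as a process property -->
    <level>
        <check name="hasVersion" tool="pdfinfo" code="no PDF version found"
               xpathSelector="//version"/>
        <setValue name="readPDFVersion" dependsOn="hasVersion" tool="pdfinfo"
                  code="version could not be read" xpathSelector="//version"
                  processProperty="pdfVersion"/>
    </level>
</profile>
```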
A basic requirement of this plugin is that the tools used generate XML output. However, it often happens that the desired tool does not generate XML output. In this case we advise you to transform the output to XML with a GAWK script. The output of the file
command serves as an example here:
Instead of calling the tool directly, one would now create a shell script with the following content and store it in the cmd
attribute of the tool:
If we only need the second parameter from the output of the file
command, the (g)awk script could look like this:
The result would then be the following XML output:
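As a hedged sketch of this pattern (the function name and the XML element names are assumptions, not the shipped filegawk.sh or fileFormat.awk), the wrapper and the awk transformation could be combined like this:

```shell
# Sketch: run `file` on the document passed as argument and wrap the
# format description (the second, colon-separated field of the output)
# in a minimal XML report on stdout, so it can be queried via xpath.
file_to_xml() {
  file "$1" | awk -F': *' '
  BEGIN { print "<file>" }
        { print "  <format>" $2 "</format>" }
  END   { print "</file>" }'
}
```

Calling `file_to_xml sample.pdf` would then emit a `<file>` report whose `<format>` element can be selected with an xpathSelector such as `//format`.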
Complete example of plugin configuration within the file plugin_intranda_step_file_validation.xml
:
Example for PDF validation call using pdfinfogawk.sh
:
Example file namedKeys.awk
:
Example of validation using file command via filegawk.sh
:
Example file fileFormat.awk
:
This step plugin retrieves monument information from a vocabulary database in order to update it in the METS file. It was developed for the Federal Monuments Office in Austria.
This plugin allows the data transfer of multiple metadata from a vocabulary into METS files. It was developed specifically for the Federal Monuments Office in Austria, so the metadata of the vocabulary are very individual and hard-coded. They originally come from the so-called HERIS database, which was imported within Goobi workflow as its own vocabulary.
The plugin consists in total of the following files to be installed:
This file must be installed in the following directory:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_herisimport
from the list of installed plugins.
Since this plugin should usually be executed automatically, the workflow step should be configured as automatic
.
Once the plugin has been fully installed and set up, it is usually run automatically within the workflow, so there is no manual interaction with the user. Instead, calling the plugin through the workflow in the background does the following:
The plugin searches the METS file for a metadata with the name HerisID
and subsequently imports a list of various metadata from the Heris vocabulary. The mapping of the metadata includes the following list:
There is no independent configuration of the plugin, as the metadata to be imported has been hard-coded.
Step plugin for the automatic creation of Handle IDs within METS files
The plugin generates a Handle on the Handle server of the GWDG for all logical and physical elements of a METS file. These Handles are then stored in the respective element itself as metadata _urn
.
If automatic DOI assignment is installed, a new DOI is generated and stored for each top-level logical element.
To use the plugin, the following files must be installed:
The file goobi-plugin-step-handle-mets.jar
contains the program logic and must be installed in the following directory, readable by the Tomcat user:
The file plugin_intranda_step_handle_mets.xml
must also be readable by the Tomcat user and installed in the following directory:
Once the plugin is installed and configured, it can be used within a Goobi workflow step. To do this, add the plugin plugin_intranda_step_handle_mets
within the desired task. Additionally, ensure that the Metadata and Automatic Task checkboxes are selected.
To utilize automatic DOI assignment, an additional file must be installed at the following path, readable by the Tomcat user:
This file is used to configure the plugin and is located in the mappings
folder.
The plugin operates within a correctly configured workflow as follows:
When the plugin is invoked within the workflow, it opens the METS file.
A Handle is generated for each logical and physical element of the METS file (in the form /goobi-Institution-objectId
, where objectId
is the object identifier, possibly supplemented with -1
, -2
, etc., if the Handle already exists).
The generated Handle is then written into the respective structural element as metadata of type _urn
.
When creating Handles for the top-level logical structural element of a METS file, additional metadata is stored alongside the generated Handle ID and its associated URL. An example of this information is as follows:
These details are used in the case of additional DOI registration to create a DOI with the same ID, for example 21.T119876543/goobi-go-1296243265-17
.
The plugin configuration is done in the file plugin_intranda_step_handle_mets.xml
as shown below:
The <config>
block can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The other parameters within this configuration file have the following meanings:
In addition to these general parameters, the following parameters are available for further configuration:
For DOI assignment, the file plugin_intranda_step_handle_mets.xml
must include the following additional configurations:
The DOIMappingFile
parameter defines the path to the DOI-Mapping.xml
file.
In the DOI-Mapping.xml
file, each <map>
entry describes a mapping between a Dublin Core element and one or more metadata fields from the METS file. The file is structured as follows:
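A single mapping entry could look like the following sketch (the metadata names and the default value are assumptions based on common Goobi rulesets):

```xml
<map>
    <doiElt>title</doiElt>
    <localElt>TitleDocMain</localElt>
    <altLocalElt>TitleDocMainShort</altLocalElt>
    <default>No title available</default>
</map>
```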
This step plugin allows you to generate configurable identifiers and save them within a metadata in the METS file.
The plugin allows the automatic generation of identifiers and the saving within a metadata in the METS file of the corresponding processes.
To install the plugin, the following file must be installed:
To configure how the plugin should behave, various values can be adjusted in the configuration file. The configuration file is usually located here:
To use the plugin, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_generateIdentifier
from the list of installed plugins.
Since this plugin is usually to be executed automatically, the step in the workflow should be configured as automatic.
Once the plugin is fully installed and set up, it is usually executed automatically within the workflow, so there is no manual interaction with the user. Instead, the workflow calls the plugin in the background and starts the generation of an identifier depending on the selected configuration.
The configuration of the plugin is structured as follows:
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to carry out different actions within different workflows. The other parameters within this configuration file have the following meanings:
Goobi Step Plugin for annotating automatically existing "location" NER tags in ALTO files with Geonames URLs.
This step plugin for Goobi workflow automatically annotates existing "location" NER tags in ALTO files with GeoNames URLs. The first hit of the search query is always taken. It is therefore recommended to check and correct the results again.
The plugin consists of the following files:
The file goobi_plugin_step_geonamesautoannotator-base.jar
must be installed in the correct directory so that it is available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a workflow step, it should be configured as shown in the screenshot below.
The plugin searches the GeoNames database for all location
NER tags. If one or more search hits are returned, the first search hit in the list is transferred to the ALTO.
The configuration of the plugin is done via the configuration file plugin_intranda_step_geonamesautoannotator.xml
and can be adjusted during operation. The following is an example configuration file:
It is recommended to purchase a higher quota from GeoNames for the operation of the plugin. If this has been done, the geonamesApiUrl
must be changed to http://ws.geonames.net
.
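A minimal configuration block for the plugin could look like this sketch (the account name and the project/step values are placeholders):

```xml
<config_plugin>
    <config>
        <project>*</project>
        <step>*</step>
        <geonamesAccount>myAccount</geonamesAccount>
        <geonamesApiUrl>http://api.geonames.org</geonamesApiUrl>
    </config>
</config_plugin>
```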
Goobi Step Plugin for annotating and correcting previously created GeoNames identifiers in ALTO OCR results.
This Step Plugin for Goobi workflow allows the annotation with - respectively correction of - previously automatically created GeoNames identifiers in ALTO OCR results. For this purpose, the NER results annotated within ALTO with the type 'location' are displayed in a table. If automatically generated GeoNames identifiers are already present in the ALTO file, they are visualised on a map.
The plugin consists of the following files:
These jar files must be installed in the correct directories so that they are available at the following path after installation:
In addition, there is a configuration file that must be located in the following place:
To put the plugin into operation, it must be activated for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the plugin intranda_step_geonamescorrection
from the list of installed plugins.
After entering the plug-in, all named entities of the type location
found are displayed in a table on the left. If GeoNames URLs with NER tags are already stored in the OCR results, they are visualised on a map on the right half of the screen.
By clicking on one of the markers on the map, the corresponding entry in the table is highlighted, and by clicking on an entry in the table, the map zooms in on that location. A click outside the table and the map zooms out again and all markers are displayed.
Entries can be deleted by clicking on the delete icon and edited by clicking on the edit icon. This opens a new input mask on the left.
The search results for the term are displayed in a new table. The user can also change the search term in the input field at the very top and start a new search by clicking on Search
.
To adopt a GeoNames identifier, the corresponding icon (one tick) must be clicked. The button with two ticks, on the other hand, adopts this identifier for all identical terms in the entire work.
By clicking on Save
or Save and exit
at the bottom right, the adapted GeoNames identifiers are written into the ALTO files. Save and exit
also exits the plugin.
The configuration of the plugin is done via the configuration file plugin_intranda_step_geonamescorrection.xml
and can be adjusted during operation. The following is an example configuration file:
It is recommended to purchase a higher quota from GeoNames for the operation of the plugin. If this has been done, the geonamesApiUrl
must be changed to http://ws.geonames.net
.
The parameter viewer
defines the prefix that each DOI link receives. A DOI "10.80831/goobi-1", for example, receives the hyperlink here ""
Identifier
intranda_step_exportPackage
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:59:15
project
This parameter determines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
is to apply. The name of the step is used here. This parameter can occur several times per <config>
block.
target
This parameter defines the main path where the export of the process as a subfolder with the process name should be exported to.
useSubFolderPerProcess
This parameter determines whether a subfolder is to be created for each process.
createZipPerProcess
This parameter can be used to determine whether a zip file is to be created for each process.
imagefolder
Several directories can be specified for the images or digital copies. This can include, for example, the master images and the derivatives. If the METS file is to contain checksums for the individual images, the attribute filegroup
can be used here to specify for which <mets:fileGrp>
the checksums of the files from this folder are to be used.
ocr
This parameter specifies whether the OCR results are to be exported as well.
source
If the contents of the source
folder should be included, this can be specified here.
import
If the contents of the import
folder should be included, this can be defined here.
export
If the contents of the export
folder are to be included, this can also be specified here.
itm
If the contents of the TaskManager directory itm
are to be exported as well, this is defined here.
validation
With this parameter you can specify that the contents of the validation
directory should also be exported.
uuid
If UUIDs (v4) are to be used for linking between <mets:structMap>
, <mets:fptr>
and <mets:fileGrp>
, <mets:file>
, this can be specified here.
checksum
When this option is enabled, the exported data is compared with previously generated checksums to verify a successful export. If file groups were also configured for the image folders
, the checksums are also entered into the corresponding file groups.
checksumValidationCommand
Contains the command line tool used to perform the verification.
transformMetaFile
This parameter defines whether the Goobi workflow internal METS file should be copied to the target directory.
transformMetaFileXsl
This parameter can be used to specify whether the internal METS file should be processed using the XSLT transformation file defined here.
transformMetaFileResultFileName
If the internal METS file is to be transformed using XSLT, you can specify here what the name of the file to be generated should be.
transformMetsFile
This parameter defines whether the export METS file from Goobi workflow should be copied to the target directory.
transformMetsFileXsl
This parameter can be used to specify whether the export METS file should be processed using the XSLT transformation file defined here.
transformMetsFileResultFileName
If the export METS file is to be transformed using XSLT, you can specify here what the name of the file to be generated should be.
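Taken together, a config block for the export could look like this sketch (the paths, folder names, file group and XSLT file are assumptions, not defaults):

```xml
<config>
    <project>*</project>
    <step>*</step>
    <target>/opt/digiverso/export/</target>
    <useSubFolderPerProcess>true</useSubFolderPerProcess>
    <createZipPerProcess>false</createZipPerProcess>
    <imagefolder filegroup="PRESENTATION">media</imagefolder>
    <imagefolder>master</imagefolder>
    <ocr>true</ocr>
    <source>false</source>
    <import>false</import>
    <export>false</export>
    <itm>false</itm>
    <validation>false</validation>
    <uuid>false</uuid>
    <checksum>true</checksum>
    <checksumValidationCommand>/usr/bin/sha1sum</checksumValidationCommand>
    <transformMetaFile>false</transformMetaFile>
    <transformMetsFile>true</transformMetsFile>
    <transformMetsFileXsl>/opt/digiverso/goobi/xslt/mets.xsl</transformMetsFileXsl>
    <transformMetsFileResultFileName>mets.xml</transformMetsFileResultFileName>
</config>
```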
Identifier
intranda_step_imageMetadataExtraction
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:06
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
should apply. The name of the work step is used here. This parameter can occur several times per <config>
block.
command
Within this parameter, the path to the programme 'ExifTool' is specified. This is a programme installed on the server that can read the metadata from image files.
field
For each desired metadata to be read out per image, a field
can be given in each case, consisting of the attributes line
and metadata
.
line
This parameter defines the name of the metadatum within the output of ExifTool. Enter here the name under which the metadatum appears within the image file.
metadata
This parameter determines under which metadata type the content of the read metadatum is stored in the METS file. The internal name of the metadata type as defined in the corresponding ruleset is used. Note that the metadata is always stored at the level of the topmost logical structural element (e.g. a monograph) and not at subordinate logical or physical elements.
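Put together, a configuration for the metadata extraction could look like this sketch (the ExifTool path, the line names and the metadata types are assumptions):

```xml
<config>
    <project>*</project>
    <step>*</step>
    <command>/usr/bin/exiftool</command>
    <field line="Image Description" metadata="Description"/>
    <field line="Artist" metadata="Creator"/>
</config>
```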
Identifier
intranda_step_file_validation
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:49
project
This parameter determines for which project the current block <config>
should apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which work steps the block <config>
should apply. The name of the work step is used here. This parameter can occur several times per <config>
block.
institution
Within the scope of dashboard delivery, this parameter controls for which institution the block is to apply. The name of the institution is used here. This parameter can occur several times per block.
inputFolder
Here it must be specified where the documents to be checked are located. Goobi variables such as {processpath}
can be used when specifying.
outputFolder
Here it must be specified where the reports generated by the tools (tools
) are to be stored. Goobi variables such as {processpath}
can be used when specifying.
fileFilter
A regular expression can be formulated here to limit which files are to be checked based on the file name (usually the file extension).
profileName
Here you can specify the checking profile to be used for this institution or this project
/step
combination.
targetLevel
Here it must be specified which level of the verification process must be achieved by the document.
name
Allows the name of the namespace to be specified. In the tool
, check
and setValue
elements, the namespace
can then be addressed by this name.
uri
The URI of the XML namespace must be specified here.
name
Allows the name of the tool to be specified. In the check
and setValue
elements, the tool
can then be referenced by this name.
uri
The URI of the XML namespace must be specified here.
cmd
Here the command must be specified with which the tool (e.g. jhove
) can be called. In the cmd
attribute the plug-in specific variables {pv.outputFile}
(path to the output file) and {pv.inputFile}
(path to the document) can be used.
stdout
Here you can specify whether the tool writes its output to stdout (true
) or to a configuration file (false
).
xmlNamespace
Here a namespace
element can be referenced by its name.
name
The name of the check must be specified here, e.g. isPDF
. With the help of the name, the check can then be referenced by other check
/setValue
elements. The check name is also used in the generated report.
group
This attribute is optional. Checks in the same group are OR-linked, i.e. the level is only considered failed once all checks in this group have failed.
dependsOn
This attribute is optional. If it is specified, the check listed in dependsOn
must be successfully executed for this check to be executed.
tool
Here you must specify which tool generates the underlying XML report.
code
An error message must be specified here.
xpathSelector
Here the xpath selector must be specified, which selects the corresponding XML node in the XML document.
regex
This attribute is optional. If it is specified, it is checked whether the selected value matches the regular expression. If no regular expression is specified, only the existence of the XML element is checked.
xmlNamespace
This attribute is optional. This attribute can be used to specify a namespace
that differs from the namespace of the tool
. This may be necessary, for example, if different namespaces are used in a report.
name
The name of the setValue
element must be specified here, e.g. readPDFVersion
. The name is also used in the generated report.
dependsOn
This attribute is mandatory. A setValue
element always depends on a check. The check listed in dependsOn
must be successfully executed for this setValue
element to be evaluated.
tool
Here you must specify which tool generates the underlying XML report.
code
An error message must be specified here.
xpathSelector
Here the xpath selector must be specified, which selects the corresponding XML node in the XML document.
xmlNamespace
This attribute is optional. It can be used to specify a namespace that differs from the namespace of the tool. This may be necessary, for example, if different namespaces are used in a report.
processProperty
This attribute is optional. Here you can specify in which process property the read-in value is to be saved.
mets
This attribute is optional. Here you can specify in which metadatum of the uppermost structure element the read-in value is to be saved. For this, it must be ensured that the specified values match the rule set.
Identifier
intranda_step_herisimport
Repository
Licence
GPL 2.0 or newer
Last change
07.09.2024 14:17:33
Alte Objekt-ID
DMDBID
Gehört zu alter Objekt-ID
ParentElement
Katalogtitel
TitleDocMain
Typ
HerisType
Hauptkategorie grob
MainCategory1
Hauptkategorie mittel
MainCategory2
Hauptkategorie fein
MainCategory3
Gemeinden politisch (lt. Katastralgemeinden)
PoliticalCommunity
Katastralgemeinde
CadastralCommune
Bezirk
PoliticalDistrict
Bundesland
FederalState
Grundstücksnummern
PropertyNumber
Bauzeit von
ConstructionTimeFrom
Bauzeit bis
ConstructionTimeTo
Publiziert
Published
Straße
Street
Hausnummer
StreetNumber
PLZ
ZIPCode
Zusatztext aus Adresse
AdditionalAddressText
Weitere Adressen
OtherAddress
Gehört zu HERIS-ID
ParentElement
Ort
Community
Staat
Country
Identifier
intranda_step_handle_mets
Repository
Licence
GPL 2.0 or newer
Last change
26.08.2024 09:19:33
project
This parameter defines which project the current block <config>
should apply to. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls which work steps the <config>
block should apply to. The name of the work step is used here. This parameter can occur several times per <config>
block.
PEMFile
Path to the Private Key .PEM file provided by GWDG.
HandleInstitutionAbbr
Abbreviation for the institution.
HandleIdPrefix
Prefix for the Handles (e.g., for the application or project).
HandleBase
Identifier for the institution.
UserHandle
Identifier for the user of the Handle registration.
URLPrefix
URL where the documents can be found with their Handle ID after publication.
<doiElt>
Dublin Core element for which this mapping is defined.
<localElt>
Name of the metadata in the METS file whose value should be used for <doiElt>
.
<altLocalElt>
Alternative names for the metadata, searched if no entry is found with <localElt>
.
<default>
Specifies the value to be used if neither <localElt>
nor <altLocalElt>
provide suitable entries.
<title>
, <author>
, <publisher>
, <pubdate>
, <inst>
These are currently the only five permitted fields for the registration metadata, and all of them are required.
Identifier
intranda_step_generateIdentifier
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:41
project
This parameter determines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
is to apply. The name of the step is used here. This parameter can occur several times per <config>
block.
field
This parameter can be used to specify in which metadata field the generated identifier is to be written.
type
This parameter allows you to choose between different types for generating the identifier. Available are random numbers (random
), time stamps (timestamp
) and UUIDs (uuid
).
length
If a random number was selected as the type, the number of digits can be specified here.
overwrite
If this parameter is set to true
, a new identifier will always be created when the plugin is run again, thus overwriting any existing identifier again. Otherwise (false
) a new identifier would only be created if the configured field (field
) is still empty or does not exist.
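A configuration block combining these parameters could look like this sketch (the metadata field name is an assumption):

```xml
<config>
    <project>*</project>
    <step>*</step>
    <field>CatalogIDDigital</field>
    <type>random</type>
    <length>9</length>
    <overwrite>false</overwrite>
</config>
```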
Identifier
intranda_step_geonamesautoannotator
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:33
project
This parameter determines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
is to apply. The name of the step is used here. This parameter can occur several times per <config>
block.
geonamesAccount
This parameter defines the account name for GeoNames access.
geonamesApiUrl
The URL for accessing the GeoNames API is set here.
Identifier
intranda_step_geonamescorrection
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:58:21
project
This parameter determines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which workflow steps the block <config>
is to apply. The name of the step is used here. This parameter can occur several times per <config>
block.
geonamesAccount
This parameter defines the account name for GeoNames access.
geonamesApiUrl
The URL for accessing the GeoNames API is set here.
With the plugin for the selection of images, images can be visually selected.
Identifier
intranda_step_image_selection
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:57:45
This plugin is used for the visual selection of images. It allows you to select, deselect and sort the selected images using drag & drop.
To use the plugin, these two files must be copied to the following locations:
The configuration of the plugin takes place within its configuration file plugin_intranda_step_image_selection.xml
. It is expected to be located under the following path:
To put the plugin into operation, it must be configured for one or more desired tasks in the workflow. This is done as shown in the following screenshot by selecting the intranda_step_image_selection
plugin from the list of installed plugins.
The plugin displays some images from the configured folder in the left-hand area. If you scroll down within the area, more images are loaded if there are more images available. If the mouse pointer is positioned over an image, an enlarged view of the image is displayed, which can be used to check the details of the image.
Images can be selected from the left-hand area using drag & drop. If the relative position for dropping is detected correctly, the newly selected image is inserted there, otherwise it is appended at the end. The selected images can be reorganised using drag & drop.
If the configured maximum number of selected images has been reached, or if an attempt is made to select the same image more than once, the selection is rejected.
Selected images can be deselected again using drag & drop. Simply drag the image from the right-hand box and drop it in the left-hand box.
The relative position of a selected image can be swapped with its neighbour by clicking on the top or bottom half of the image. There are two exceptions without swapping: if you click on the top half of the first selected image, it will be appended to the end of the list; if you click on the bottom half of the last image, it will be moved to the beginning of the list. To move a selected image to the top of the list, you can also right-click on this image.
Please note that the save button must be clicked in order to save the information of the selected images within the process properties.
The configuration of the plugin is structured as follows:
The parameters within this configuration file have the following meanings:
defaultNumberToLoad
With this parameter you can define how many images shall be loaded in the beginning. Default 20
.
defaultNumberToAdd
With this parameter you can define how many more images shall be loaded when scrolled to the bottom. Default 10
.
folder
Specify here the configured name of the folder from which the images are to be displayed. Possible values are master
, main
, jpeg
, source
etc. as long as they are correctly configured.
max
Here you can define the maximum number of thumbnails that could be selected.
min
Here you can define the minimum number of thumbnails that should be selected in order to save as a process property.
allowTaskFinishButtons
With this parameter you can define whether or not to enable the button to finish the task directly.
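A configuration combining these parameters could look like this sketch (the folder name and the limits are examples, not defaults):

```xml
<config>
    <project>*</project>
    <step>*</step>
    <defaultNumberToLoad>20</defaultNumberToLoad>
    <defaultNumberToAdd>10</defaultNumberToAdd>
    <folder>master</folder>
    <min>1</min>
    <max>5</max>
    <allowTaskFinishButtons>false</allowTaskFinishButtons>
</config>
```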
This step plugin scales images to configurable maximum sizes and renders a watermark into the scaled images.
Identifier
intranda_step_image_resize_and_watermark
Repository
Licence
GPL 2.0 or newer
Last change
25.07.2024 11:57:55
This plugin allows you to scale images to a maximum size and then render watermarks into the previously scaled images. The maximum size and the watermark to be rendered can be configured flexibly.
To install the plugin, the following file must be installed:
A configuration file is also required to run the plugin correctly:
Furthermore, a successful installation of the following two packages is also required on the system:
Both packages are included in common package managers and can be easily installed from them.
To use the plug-in, it must be activated for one or more desired tasks in the workflow. This is done by selecting the plugin intranda_step_image_resize_and_watermark
from the list of installed plugins.
After running the plugin the images have the expected size and have the configured watermark.
The configuration of the plugin allows you to define the maximum size of the images and the watermark (images and text watermarks are supported). Also the positioning of the watermark can be defined individually. Several configurations are possible for this purpose, which are differentiated by the project, the name for the work step within the workflow, the digital collection as well as a media type (special metadata within the METS file of the respective process). When the plugin is executed, the first configuration that matches the currently processed task is used.
Please note that the correct paths for GraphicMagick and ImageMagick must also be specified at the top of the configuration.
An example configuration for the file plugin_intranda_step_image_resize_and_watermark.xml
looks like this:
The block <config>
can occur repeatedly for different projects or workflow steps in order to be able to carry out different actions within different workflows. The other parameters within this configuration file have the following meanings:
gmPath
Path to the GraphicsMagick installation
convertPath
Path to the ImageMagick installation
project
This parameter defines the project for which the current block <config>
is to apply. The name of the project is used here. This parameter can occur several times per <config>
block.
step
This parameter controls for which work steps the block <config>
is to apply. The name of the work step is used here. This parameter can occur several times per <config>
block.
sourceDir
Path to the directory to be used as the source directory.
destDir
Path to the directory in which the scaled and watermarked images are to be saved.
mediaType
Restriction to processes whose metadata of type Type
corresponds to the configured value. Alternatively, *
can be used to make no restriction.
collection
Restriction to processes that belong to a selected digital collection.
resizeTo
Maximum size of the image on the longest side. Specified in pixels.
watermark/image
Path to an image to be used within the watermark.
watermark/shadeSize
Define here which size specification should be used as shade.
watermark/text
Text to be used within the watermark.
watermark/font
Specify here which font should be used for the text. This font must be installed on the system.
watermark/boxSize
Define here what dimensions the box should have within which the text is to be rendered. This thus determines the size of the displayed font.
watermark/location
Determines where within the image the watermark should be rendered. Possible specifications are north
, northeast
, east
, southeast
, south
, southwest
, west
, northwest
watermark/xDistance
Lateral distance of the watermark
watermark/yDistance
Distance of the watermark up or down
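Putting the parameters together, a configuration could look like this sketch (the paths, Goobi variables, sizes, font and watermark text are assumptions):

```xml
<config_plugin>
    <gmPath>/usr/bin/gm</gmPath>
    <convertPath>/usr/bin/convert</convertPath>
    <config>
        <project>*</project>
        <step>*</step>
        <sourceDir>{origpath}</sourceDir>
        <destDir>{tifpath}</destDir>
        <mediaType>*</mediaType>
        <collection>*</collection>
        <resizeTo>2000</resizeTo>
        <watermark>
            <text>Example Library</text>
            <font>DejaVu-Sans</font>
            <boxSize>1500x500</boxSize>
            <location>southeast</location>
            <xDistance>100</xDistance>
            <yDistance>100</yDistance>
        </watermark>
    </config>
</config_plugin>
```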