Renaming files

This step plugin allows you to automatically adjust file names within Goobi processes.

Renaming files

Introduction

This plugin is used to conditionally rename files within the different folders of an operation of Goobi workflow. The naming is dependent on a configuration file, which may be structured differently for different workflows.

Overview

Details

Identifier

intranda_step_rename-files

Source code

Licence

GPL 2.0 or newer

Compatibility

Goobi workflow 2020.02

Dokumentation date

14.02.2023

Installation

To install the plugin, the following file must be installed:

/opt/digiverso/goobi/plugins/step/plugin_intranda_step_rename-files.jar

To configure how the plugin should behave, different values can be adjusted in the configuration file. The configuration file is usually located here:

/opt/digiverso/goobi/config/plugin_intranda_step_rename-files.xml

As an example, the content of this configuration file looks like this:

<config_plugin>
    <config>
        <project>Manuscript_Project</project>
        <step>*</step>
        <folder>*</folder>
        <startValue>2</startValue>
        <!-- Replacement removed anything in the process title before and including the '_' -->
        <namepart type="variable">
            {processtitle}
            <replace regex="^.*?_" replacement=""/>
            <condition value="{meta._imageFilePrefix}" matches="^$"/>
        </namepart>
        <namepart type="static">_</namepart>
        <namepart type="counter">00000</namepart>
    </config>

    <config>
        <project>Archive_Project</project>
        <step>*</step>
        <folder>*</folder>
        <namepart type="originalfilename" />
        <namepart type="static">_ARCHIVE</namepart>
    </config>

    <config>
        <project>*</project>
        <step>*</step>
        <folder>*</folder>
        <startValue>0</startValue>
        <namepart type="variable">{processtitle}</namepart>
        <namepart type="static">_</namepart>
        <!-- Conditional name parts (mimic old barcode feature) -->
        <namepart type="counter">
            0000
            <condition value="{originalfilename}" matches="^(?!.*barcode).*$" />
        </namepart>
        <namepart type="static">
            0000
            <condition value="{originalfilename}" matches="^.*barcode.*$" />
        </namepart>
    </config>
</config_plugin>

General configuration of the plugin

The configuration of the plugin is done within the already mentioned configuration file. There you can configure various parameters. The block <config> can occur repeatedly for different projects or work steps in order to be able to perform different actions within different workflows. The elements <namepart> are decisive for the generation of the file names.

ValueDescription

project

This parameter determines the project for which the current block <config> is to apply. The name of the project is used here. This parameter can occur several times per <config> block.

step

This parameter controls for which work steps the block <config> should apply. The name of the workflow step is used here. This parameter can occur several times per <config> block.

folder

This parameter allows the user to control which directories are to be considered for renaming. If the value * is specified here, the parameter is missing or the value is not configured, the default settings are used.

startValue

This value controls with which start value the incrementing counter should start.

namepart

This parameter, which can also be used several times, controls the generation of the file names. It can contain static elements (static), use the original file name (originalfilename), use variables from Goobi (variable) and generate a counter. The originalfilename is the original filename of the file before this plugin is executed for the first time. The number of digits defined is crucial for generating the counter (counter). For example, the value 00000 would generate five-digit numbers with any leading zeros. The components of the file name defined in this way are concatenated together for naming purposes and then supplemented by the actual file extension to name the file.

In addition, <namepart> elements can contain multiple <condition> and <replace> elements.

Condition elements have the form

<condition value="{VARIABLE}" matches="REGEX" />

and can match any VARIABLE against the specified regular expression REGEX. If multiple conditions are specified, all must match in order to activate the corresponding <namepart>.

Replace elements have the form

<replace regex="REGEX" replacement="REPLACEMENT"/>

and can replace anything the regular expression REGEX evaluates to in the <namepart> value with the REPLACEMENT (which can also be empty). If multiple replacements are specified, all are processed in the order they are defined to form the final <namepart> value.

Mode of operation

The plugin is usually executed fully automatically within the workflow. It first determines whether there is a block within the configuration file that has been configured for the current workflow with regard to project name and work step. If this is the case, the individual elements <namepart> are evaluated, assigned the appropriate values for the counter and variables from Goobi workflow and then linked together. The file names created in this way are now applied to all the relevant directories in the Goobi process and are supplemented with the correct file name extensions (e.g. .tif).

To ensure correct renaming, that is based on the original file name, the plugin saves the original filename for every file in a process property called plugin_intranda_step_rename_files. This property ensures, that multiple executions of this plugin, with possible changes in the configuration, will still resolve the originalfilename namepart correctly.

Details of the Goobi workflow variables that can be used in this plugin can be found in this documentation.

The plugin considers by default the files within the following subdirectories for naming:

  • master

  • media

  • jpeg

  • alto

  • pdf

  • txt

  • xml

How to use the plugin

This plugin is integrated into the workflow in such a way that it is executed automatically. Manual interaction with the plugin is not necessary. For use within a step of the workflow it should be configured as shown in the following screenshot.

Migration

This section describes required steps to migrate from older plugin versions.

Barcode

Previously, the plugin contained the following feature:

If a file is found within the file that contains barcode within the file name, it will also be named according to the naming scheme. However, the value 0 is set as counter here.

This feature was removed from the current version of the plugin. Due to the introduction of <condition>s in <namepart> elements, this feature can be configured more flexible now, if required.

The barcode feature was always applied to nameparts of the type counter and replaced the counter value with 0, if the filename contained the word barcode. This behavior can now be explicitly forced. We show an example configuration for a counter namepart with four digits:

        <namepart type="counter">
            0000
            <condition value="{originalfilename}" matches="^(?!.*barcode).*$" />
        </namepart>
        <namepart type="static">
            0000
            <condition value="{originalfilename}" matches="^.*barcode.*$" />
        </namepart>

This snippet contains two <namepart> configurations that can not be both active at the same time. The first <namepart> is a counter and activates, if the {originalfilename} variable (which resolves to the original file name) does not contain the word barcode. The second <namepart> configuration is a static 0000 that is only active, if the {originalfilename} contains the word barcode.

It is exactly the same behavior as before. Now, the user can decide on his own to use this feature or to tweak the configuration if desired.

Last updated