Goobi workflow Plugins

Configuration of the LayoutWizzard

The LayoutWizzard is configured centrally in a separate configuration file. This file can be placed anywhere in the file system, because its path can be specified in each program component of the LayoutWizzard. The usual path to this central configuration file is:

/opt/digiverso/LayoutWizzard/layoutwizzard_config.xml

An example of such a configuration looks like this:

<?xml version="1.0" encoding="UTF-8" ?>
<!-- intranda Layout Wizzard v1.1 configuration file -->

<config>
	<useOpenCV>false</useOpenCV>
	<!-- Higher values decrease the likelihood of images being labeled as outliers 
		("suspicious angle/pagesize") -->
	<outliers>
		<type>NOT_PROCESSED</type>
		<errorMultiplier>3.0</errorMultiplier>
		<weightExponent>2.0</weightExponent>
	</outliers>
	<!-- <contentServerUrl>http://G2GURL/goobi/cs/cs</contentServerUrl> -->
	<contentServerUrl>/cs/cs</contentServerUrl>
	<outputFolderSuffix>media</outputFolderSuffix>
	<analysisImagesBasePath>/opt/digiverso/git/layout-wizzard/LayoutWizzard/samples/debug
	</analysisImagesBasePath>
	<previews>
		<previewsPerPage>100</previewsPerPage>
		<maxPreviewsCached>100</maxPreviewsCached>
		<previewHeight>600</previewHeight>
		<largePreviewWidth>5000</largePreviewWidth>
	</previews>
	<processingThreads>4</processingThreads>
	<analysisTimeout>
		<duration>20</duration>
		<unit>SECONDS</unit>
	</analysisTimeout>
	<saving>
		<defaultCompression quality="85">JPEG
		</defaultCompression>
		<overwriteExistingImages>true</overwriteExistingImages>
		<ignoreImageMetadataErrors>false</ignoreImageMetadataErrors>
	</saving>

	<analysis id="bound_book">
		<info>
			<label>Bound book</label>
		</info>
		<pageMode>ALTERNATING_START_RIGHT</pageMode>
		<analysisStep name="PAGESKEW" use="true" order="1">
			<saveAnalysisImages visibility="INVISIBLE" path="deskew">false
			</saveAnalysisImages>
			<deskewerMode visibility="VISIBLE">ALL_EDGES</deskewerMode>
			<lineFinderMode visibility="INVISIBLE">OUTERCONTOURS
			</lineFinderMode>
			<lineGroupingMode visibility="INVISIBLE">GROUP_BY_DISTANCE
			</lineGroupingMode>
			<analysisImageSize>300</analysisImageSize>
			<rimAreaToIgnoreLines>0.05</rimAreaToIgnoreLines>
			<lowerCannyThreshold>2</lowerCannyThreshold>
			<cannyRatio>2</cannyRatio>
			<!-- <houghLineThreshold>10</houghLineThreshold> -->
			<minHoughLineLength>10</minHoughLineLength>
			<maxHoughLineGapSize>4</maxHoughLineGapSize>
			<featureSizeThreshold>5</featureSizeThreshold>
			<maxLineAngleDeviation>5</maxLineAngleDeviation>
			<maxLineDistance>10</maxLineDistance>
		</analysisStep>

		<analysisStep name="CONTENTAREA" use="true" order="2">
			<analysisImageSize>150</analysisImageSize>
			<saveAnalysisImages visibility="INVISIBLE" path="edgeDetection">false
			</saveAnalysisImages>
			<bitonalThreshold>220</bitonalThreshold>
			<featureSizeThreshold>2.0</featureSizeThreshold>
			<contentPadding visibility="VISIBLE">-10</contentPadding>
			<bitonalInvert visibility="HIDDEN">false</bitonalInvert>
			<rimAreaToIgnoreLines>0.0</rimAreaToIgnoreLines>
		</analysisStep>

		<analysisStep name="BOOKSPINE" use="true" order="3">
			<saveAnalysisImages visibility="INVISIBLE" path="spineDetection">false
			</saveAnalysisImages>
			<lineFinderMode visibility="HIDDEN">SUZUKICONTOURS
			</lineFinderMode>
			<lineGroupingMode visibility="HIDDEN">GROUP_BY_DISTANCE
			</lineGroupingMode>
			<croppingAggressiveness visibility="VISIBLE">CAUTIOUS
			</croppingAggressiveness>

			<analysisImageSize>400</analysisImageSize>
			<rimAreaToIgnoreLines>2</rimAreaToIgnoreLines>
			<lowerCannyThreshold>1</lowerCannyThreshold>
			<cannyRatio>3</cannyRatio>
			<minHoughLineLength>10</minHoughLineLength>
			<maxHoughLineGapSize>2</maxHoughLineGapSize>
			<featureSizeThreshold>0.1</featureSizeThreshold>
			<maxLineAngleDeviation>5</maxLineAngleDeviation>
			<maxLineDistance>5</maxLineDistance>
			<spineOffset visibility="VISIBLE">0</spineOffset>
		</analysisStep>
	</analysis>
	
	<analysis id="default">
	   [...]
	</analysis>
	
</config>

The configuration consists of some general settings and several <analysis> blocks. The <analysis> blocks mainly control the settings for the automatic analysis. Different projects or tasks can use different settings by passing the id of the <analysis> block to the automatic analysis.

General settings always affect all operations and are not overwritten by operation-specific settings.

General settings

The following list of general configuration paths is not exhaustive. However, it contains all settings that must be adapted individually for an installation.

Path
Description

previews/previewsPerPage

Number of images per page in the preview of the Goobi LayoutWizzard plugin.

previews/previewHeight

Height in pixels of the thumbnails displayed in the preview. Smaller images allow faster display, but have a lower resolution.

previews/largePreviewWidth

Width in pixels of the image displayed in the single-page view of the Goobi LayoutWizzard plugin. Smaller images allow faster display, but have a lower resolution.

processingThreads

The maximum number of analysis or saving processes running simultaneously. This limit applies to Goobi and the TaskManager separately. Within a single operation the images are processed sequentially. However, simultaneous processing may occur if several LayoutWizzard jobs are running in parallel in the TaskManager.

analysisTimeout/duration

This value specifies the maximum time allowed for analyzing or saving a single image; after that, processing of the image is aborted. An analysis that was aborted because of a timeout is recorded, and the analysis continues with the following images. The missing analysis data can be added later during manual review. An aborted save, however, always ends the TaskManager job with an error. Useful timeout values lie between 4 seconds and about one minute, depending on the performance and reliability of the system and the size and complexity of the images to be analyzed.

analysisTimeout/unit

This value defines the time unit in which analysisTimeout/duration is specified. Possible values are MICROSECONDS, MILLISECONDS, SECONDS and MINUTES.

saving/defaultCompression

This value determines the compression that is used by default when saving the derivatives. Valid values are NONE and JPEG. The quality attribute specifies the quality for JPEG compression and must be between 0 and 100.

saving/overwriteExistingImages

This value can be used to determine whether existing image derivatives should be overwritten during saving.

saving/ignoreImageMetadataErrors

This value specifies whether the derivatives should still be saved if not all image metadata can be transferred. This can happen, for example, if an image contains metadata that the Java image library does not recognize. It is advisable to leave this value set to false unless the setting is explicitly required.
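
As an illustration of the settings above, the following fragment sketches a setup with a 30-second timeout expressed in milliseconds and JPEG derivatives that are not overwritten. The element names are taken from the example configuration; the concrete values are assumptions chosen for illustration, not recommendations.

```xml
<config>
	<!-- at most two analysis or saving processes at a time (illustrative value) -->
	<processingThreads>2</processingThreads>
	<!-- 30000 milliseconds: the same limit as duration 30 with unit SECONDS -->
	<analysisTimeout>
		<duration>30000</duration>
		<unit>MILLISECONDS</unit>
	</analysisTimeout>
	<saving>
		<!-- JPEG compression; quality must lie between 0 and 100 -->
		<defaultCompression quality="90">JPEG</defaultCompression>
		<!-- do not overwrite derivatives that already exist -->
		<overwriteExistingImages>false</overwriteExistingImages>
		<!-- recommended setting unless explicitly required otherwise -->
		<ignoreImageMetadataErrors>false</ignoreImageMetadataErrors>
	</saving>
</config>
```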

Analysis configuration

Each <analysis> block has an id attribute that controls which block is used for a particular analysis. The last block must have id="default". Settings from this block are used whenever no analysis id is passed to an analysis call, or when a setting is not configured in the block actually in use. All other blocks therefore only need to contain the settings that differ from the default configuration.
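
A minimal sketch of this fallback, assuming a hypothetical block with the id loose_sheets that only deactivates the book spine detection. The id, the label and the choice of overridden setting are assumptions for illustration. How fine-grained the fallback is (per step or per parameter) is not stated here, so the sketch repeats the full attribute set of the step it changes.

```xml
<config>
	<!-- hypothetical block: contains only what differs from the default -->
	<analysis id="loose_sheets">
		<info>
			<label>Loose sheets</label>
		</info>
		<!-- no spine to detect, so this step is switched off -->
		<analysisStep name="BOOKSPINE" use="false" order="3" />
	</analysis>

	<!-- the default block comes last and supplies every setting not configured above -->
	<analysis id="default">
		<pageMode>ALTERNATING_START_RIGHT</pageMode>
		<!-- remaining default settings omitted in this sketch -->
	</analysis>
</config>
```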

The following settings can exist in each <analysis> block:

Path
Description

info/label

This is the name of the analysis setting in the plugin interface.

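pageMode

This value defines the default page mode to be used. The values valid for this are described in the folder and file options.
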
externalCommands/@use

This setting determines whether the images for the analysis and for saving the derivatives are generated by an external program. Under certain circumstances this speeds up image generation considerably, but it can also be more error-prone, because the generation then takes place outside of Java.

externalCommands/convert

This value defines the console command that calls the external program for generating images. The parameters for each call are appended to this command in the format used by ImageMagick, so the program must accept ImageMagick-compatible parameters.

analysisStep

This value contains all internal parameters of the respective automatic analysis step.

analysisStep/@name

This defines the internal name of the analysis step. It must be one of the following values:

  • PAGESKEW: Align the page

  • CONTENTAREA: Crop the page

  • BOOKSPINE: Detect the book spine

analysisStep/@use

This value can be used to determine whether an analysis step should be used. The value false deactivates the analysis step.

analysisStep/@order

This value determines the position of the analysis step within the overall analysis sequence.
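
Read as XML, the paths in this table could translate into a block like the following. The nesting of externalCommands is inferred from the path notation (@use as an attribute, convert as a child element), and /usr/bin/convert is an assumed install location of ImageMagick; neither is confirmed by the example configuration above.

```xml
<analysis id="default">
	<info>
		<label>Default</label>
	</info>
	<pageMode>ALTERNATING_START_RIGHT</pageMode>
	<!-- assumed structure: image generation delegated to an ImageMagick-compatible program -->
	<externalCommands use="true">
		<convert>/usr/bin/convert</convert>
	</externalCommands>
	<!-- order gives each step's position in the sequence -->
	<analysisStep name="PAGESKEW" use="true" order="1">
		<deskewerMode visibility="VISIBLE">ALL_EDGES</deskewerMode>
	</analysisStep>
	<analysisStep name="CONTENTAREA" use="true" order="2">
		<contentPadding visibility="VISIBLE">-10</contentPadding>
	</analysisStep>
	<!-- use="false" deactivates the step -->
	<analysisStep name="BOOKSPINE" use="false" order="3" />
</analysis>
```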

The settings in the <analysisStep> blocks are specific parameters of the analysis algorithms and are not described further here. Users can, however, in principle adjust any of these parameters in the interface. If values found this way prove good enough to be adopted permanently, the corresponding parameter in the configuration file can be set to the new value. To find it, locate the <analysisStep> for the respective analysis step in the configuration file and change the element with the internal parameter name there. The internal parameter name is shown in the user interface as a tooltip when the mouse pointer hovers over the label of the changed parameter.
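
For example, suppose a user finds in the interface that a content padding of -20 gives better results when cropping the page, and the tooltip over that parameter's label shows the internal name contentPadding. The value -20 is a made-up illustration. The change would then go into the CONTENTAREA step of the <analysis> block in use:

```xml
<analysisStep name="CONTENTAREA" use="true" order="2">
	<!-- was -10; the new value was found by trial in the interface -->
	<contentPadding visibility="VISIBLE">-20</contentPadding>
	<!-- the other parameters of this step stay unchanged and are omitted here -->
</analysisStep>
```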

Additionally, all analysis parameter blocks can have the visibility attribute, which controls the visibility of the parameter in the user interface. If this attribute is missing, the default value HIDDEN is used.

Visibility
Description

VISIBLE

The parameter is always visible in the interface when the corresponding step is selected.

HIDDEN

The parameter is only visible in the user interface when the analysis step block in the user interface is in extended mode.

INVISIBLE

The parameter is not displayed at all in the interface.
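
The following fragment, assembled from parameters of the example configuration above, shows the three visibility values side by side together with a parameter that has no visibility attribute:

```xml
<analysisStep name="CONTENTAREA" use="true" order="2">
	<!-- always shown when the step is selected -->
	<contentPadding visibility="VISIBLE">-10</contentPadding>
	<!-- shown only when the step is in extended mode -->
	<bitonalInvert visibility="HIDDEN">false</bitonalInvert>
	<!-- no visibility attribute: treated as HIDDEN -->
	<bitonalThreshold>220</bitonalThreshold>
	<!-- never shown in the interface -->
	<saveAnalysisImages visibility="INVISIBLE" path="edgeDetection">false</saveAnalysisImages>
</analysisStep>
```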

