2.1 Main configuration

The main configuration looks as follows by default:

<init>
    <sleep>1000</sleep>
    <minStorageSpace>2048</minStorageSpace>
    <solrUrl>http://localhost:8983/solr/collection1</solrUrl>
    <!-- <oldSolrUrl>http://localhost:8080/solr/collection1</oldSolrUrl> -->
    <viewerUrl>http://localhost:8080/viewer/</viewerUrl>
    <viewerAuthorizationToken>CHANGEME</viewerAuthorizationToken>
    <deleteContentFilesOnFailure>true</deleteContentFilesOnFailure>
    <addVolumeCollectionsToAnchor>false</addVolumeCollectionsToAnchor>
    <representativeImage>
        <useFirstPageAsDefault>true</useFirstPageAsDefault>
    </representativeImage>
    <namespaces>
        <!--
        <list>
            <xyz>http://www.example.org/xyz/</xyz>
        </list>
        -->
    </namespaces>    
    <pageCountStart>1</pageCountStart>
    <addLabelToChildren>true</addLabelToChildren>
    <labelCleanup>true</labelCleanup>
    <authorityData enabled="true">
        <addFieldsToDefault>
            <field>NORM_IDENTIFIER</field>
            <field>NORM_NAME</field>
            <field>NORM_ALTNAME</field>
        </addFieldsToDefault>
    </authorityData>
    <aggregateRecords>true</aggregateRecords>
    <fulltextForceUTF8>true</fulltextForceUTF8>
    <mets>
        <preferredImageFileGroup>BOOKVIEWER</preferredImageFileGroup>
        <preferredImageFileGroup>ZOOMIFY</preferredImageFileGroup>
        <physicalElementTypes>
            <type>object</type>
            <type>audio</type>
            <type>video</type>
        </physicalElementTypes>
    </mets>
    <lido>
        <imageXPath>lido:resourceRepresentation[@lido:type='image_master']/lido:linkResource</imageXPath>
        <imageXPath>lido:resourceRepresentation[@lido:type='http://terminology.lido-schema.org/resourceRepresentation_type/provided_representation']/lido:linkResource</imageXPath>
        <imageXPath>lido:resourceRepresentation[@lido:type='http://terminology.lido-schema.org/lido00464']/lido:linkResource</imageXPath>
        <imageXPath>lido:resourceRepresentation[@lido:type='image_overview']/lido:linkResource</imageXPath>
        <imageXPath>lido:resourceID</imageXPath>
    </lido>
    <email>
        <recipients>admin@example.org</recipients>
        <smtpServer>localhost</smtpServer>
        <smtpUser></smtpUser>
        <smtpPassword></smtpPassword>
        <smtpSenderAddress>do-not-reply@goobi-viewer.example.org</smtpSenderAddress>
        <smtpSenderName>Goobi viewer Indexer</smtpSenderName>
        <smtpSecurity>NONE</smtpSecurity>
    </email>
    <viewerNotifications>
        <prerenderPdfs enabled="false" force="false" variant="small"/>
    </viewerNotifications>
</init>

The parameters are explained in detail in the following table:

Setting

Description

sleep

Waiting time of the Goobi viewer Indexer (in milliseconds) between monitoring cycles of the hotfolder. The default value is 3000.

minStorageSpace

Minimum free hard disk space (in MB) on the drive where the hotfolder is located for indexing.

If the free space falls below this value, the Goobi viewer Indexer terminates automatically and must be restarted manually (after freeing up disk space). The default value is 2048.

solrUrl

URL of the Apache Solr HTTP server that contains the index. All communication between the Goobi viewer Indexer and Solr takes place via HTTP requests.

oldSolrUrl

If certain fields are to be transferred from an old Solr index to the new one during a complete reindexing, this element must contain the URL of the old Solr index. The transferred fields are DATECREATED, DATEUPDATED and THUMBNAILREPRESENT. For anchor records the IDDOC is added. If the records are in a DATAREPOSITORY, the information about the affiliation is also ported.

This switch was introduced with version 4.8.0 to allow the migration from Solr 4 to Solr 8. In a regular installation this element is not needed.
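As a minimal sketch of such a migration setup, the commented-out element from the default configuration can be activated alongside solrUrl. The two URLs are simply the ones from the default configuration above; the actual hosts, ports and core names are assumptions that depend on the installation.

```xml
<init>
    <!-- New index that receives the reindexed records -->
    <solrUrl>http://localhost:8983/solr/collection1</solrUrl>
    <!-- Old index from which the fields listed above are transferred;
         not needed in a regular installation -->
    <oldSolrUrl>http://localhost:8080/solr/collection1</oldSolrUrl>
</init>
```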

viewerUrl

URL of the Goobi viewer. The Goobi viewer Indexer uses it, for example, to automatically remove old images from the cache when a record is reindexed.

viewerAuthorizationToken

A token is required so that the Goobi viewer can be informed of any changes when records are moved between DataRepositories. The token is stored in config_viewer.xml. See also chapter 1.33.3.

deleteContentFilesOnFailure

Indexing of an object can fail, for example because of an invalid XML file. In that case, the file and all associated folders (media, full texts, word coordinates, etc.) are removed from the hotfolder. Depending on the object, the associated folders can contain large amounts of data, so that copying them into the hotfolder again takes a long time. In such cases, set this element to false to leave the folders in the hotfolder after a failure, so that they can be reused with a corrected XML file. Leaving these folders in the hotfolder has no effect on the indexing of other objects.
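A minimal sketch for installations with very large records, assuming (as the element name suggests) that the value false switches the deletion off:

```xml
<init>
    <!-- false: keep media, full-text and other associated folders
         in the hotfolder when indexing of a record fails -->
    <deleteContentFilesOnFailure>false</deleteContentFilesOnFailure>
</init>
```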

addVolumeCollectionsToAnchor

When indexing multi-volume records, volumes may belong to different collections. If this element is set to true, collection names from all volumes contained in the index are also copied to the complete record (collections to which only the complete record belongs are also retained). It should be noted here that any typographical errors in collection names can no longer be automatically removed from the complete record if this option is activated. The default value is false.

representativeImage/useFirstPageAsDefault

If true, the first page of a document is set as the representative image if no other page is specified in the source document. If this is set to false, and no page is explicitly set as representative, no representative image will be set. The default value is true.

namespaces

If additional XML namespaces are embedded in METS documents, they must be declared to the Goobi viewer Indexer so that the corresponding XPath expressions can be evaluated. Each namespace is defined as a child element of <list>: the element name is the namespace name, and the text value is the namespace URI. If no additional namespaces are defined, the <list> element must be omitted entirely rather than left empty.
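For illustration, this is the commented-out example from the default configuration in its active form. The name xyz and its URI are placeholders, not a real namespace:

```xml
<init>
    <namespaces>
        <list>
            <!-- element name = namespace name, text value = namespace URI -->
            <xyz>http://www.example.org/xyz/</xyz>
        </list>
    </namespaces>
</init>
```

With this declaration in place, XPath expressions that use the xyz prefix could be evaluated against METS documents that embed this namespace.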

pageCountStart

The Goobi viewer expects the page count to start at 1. Older versions, on the other hand, start counting at 0. To avoid a constant offset in page navigation, compatibility with older Goobi viewer versions can be established by setting the value to 0. The default value is 1.

addLabelToChildren

If this switch is set to true, the values of the LABEL fields of structure elements are written to the DEFAULT field of subordinate structure elements. The default value is false.

labelCleanup

If this switch is set to true, non-sort character sequences "<ns></ns>", "<<>>" and "¬" are removed from the value. The default value is false.

authorityData/@enabled

If the value is set to false, indexing of authority data is completely disabled. The default value is true.

authorityData/addFieldsToDefault/field

Certain authority data fields (for example, alternative spellings of a name) can be added to the DEFAULT search field to make them directly searchable. Add one configuration element for each desired field (for example, <field>NORM_ALTNAME</field>).
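As a short sketch, the following block keeps authority data indexing enabled and copies two fields into DEFAULT. The field names are taken from the default configuration above; which other fields are available depends on the authority data source.

```xml
<init>
    <authorityData enabled="true">
        <addFieldsToDefault>
            <!-- one <field> element per authority data field copied into DEFAULT -->
            <field>NORM_NAME</field>
            <field>NORM_ALTNAME</field>
        </addFieldsToDefault>
    </authorityData>
</init>
```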

aggregateRecords

If set to true, the additional fields required for the aggregated search (such as aggregated full texts and metadata) are written to the index. The default value is false.

fulltextForceUTF8

If set to true, full texts are automatically converted to UTF-8 if another charset is detected. The default value is true.

mets/preferredImageFileGroup

If file groups are configured here and a group with this name exists, it is used for indexing image file paths.

mets/physicalElementTypes/type

In addition to "page" (always permitted), additional values can be defined here for mets:structMap[@TYPE="PHYSICAL"]/mets:div[@TYPE="physSequence"]/mets:div/@TYPE, which are to be processed as page documents.
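To make the XPath above concrete, here is a hedged sketch of a physical structMap in a METS file. The IDs and ORDER values are invented for illustration; only the TYPE values matter here.

```xml
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
    <mets:structMap TYPE="PHYSICAL">
        <mets:div ID="PHYS_0000" TYPE="physSequence">
            <!-- "page" is always processed as a page document -->
            <mets:div ID="PHYS_0001" ORDER="1" TYPE="page"/>
            <!-- processed as page documents only if "object" and "video"
                 are listed under mets/physicalElementTypes -->
            <mets:div ID="PHYS_0002" ORDER="2" TYPE="object"/>
            <mets:div ID="PHYS_0003" ORDER="3" TYPE="video"/>
        </mets:div>
    </mets:structMap>
</mets:mets>
```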

lido/imageXPath

To account for the heterogeneity of image paths in LIDO documents, the XPath expressions (relative to lido:resourceSet) that are searched for image links are configurable. The list is processed from top to bottom, and the first expression that returns hits is used (the rest are ignored).
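The following LIDO fragment sketches how this order plays out with the default expression list. The file name and URL are invented for illustration.

```xml
<lido:resourceSet xmlns:lido="http://www.lido-schema.org">
    <!-- would only be matched by the last default expression, lido:resourceID -->
    <lido:resourceID lido:type="local">image_0001.jpg</lido:resourceID>
    <!-- matched by the fourth default expression (image_overview) -->
    <lido:resourceRepresentation lido:type="image_overview">
        <lido:linkResource>https://example.org/images/image_0001_overview.jpg</lido:linkResource>
    </lido:resourceRepresentation>
</lido:resourceSet>
```

The first three default expressions return no hits for this fragment, so the image_overview expression is the first one that matches. Its lido:linkResource value is used and lido:resourceID is ignored.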

email/...

If errors occur during indexing, the Goobi viewer Indexer can notify the email addresses defined in the recipients element. Configuration is carried out in the same way as described in chapter 1.5.2.

viewerNotifications/prerenderPdfs/@enabled

If active, the Goobi viewer is instructed to pre-generate PDF files when indexing a record that has images in the hotfolder. The default value is false.

viewerNotifications/prerenderPdfs/@force

If active, the Goobi viewer is instructed to pre-generate PDF files even if there are no images in the hotfolder. This can be used, for example, during reindexing to generate the files once. The default value is false.

viewerNotifications/prerenderPdfs/@variant

Specifies the pdfConfig variant from the content server configuration file to be used for generating the PDF files. The default value is small.
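As a sketch of how the three attributes combine, the following configuration would request PDF pre-generation for every indexed record, for example during a one-time reindexing. The variant name small is the one from the default configuration; other variant names depend on the content server configuration.

```xml
<init>
    <viewerNotifications>
        <!-- enabled: pre-generate PDFs for records with images in the hotfolder
             force:   also pre-generate when no images are in the hotfolder
             variant: name of the pdfConfig variant to use -->
        <prerenderPdfs enabled="true" force="true" variant="small"/>
    </viewerNotifications>
</init>
```

After such a one-time run, force would typically be set back to false so that only records with new images trigger PDF generation.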
