1.9.3 Individual PDF title page

PDF documents created in the Goobi viewer can have additional upstream pages containing optional metadata on the complete record and on individual structure elements. The metadata is read exclusively from the METS file of the process, and it is also possible to display linked image content in the METS file. The layout and static contents of the metadata pages are specified by XML documents, so-called templates, which can be adapted to individual requirements.

Types of metadata pages

For PDF generation, three different types of metadata pages are distinguished, which can be individually activated and configured:

Type

Description

Title page

This page is always the first page of a PDF document and may contain metadata about the complete record and the top structure element contained in the PDF.

Chapter page

These pages are prefixed to each structure element contained in the PDF except for the topmost and may contain metadata about the complete record and the respective structure element.

Single page

This page precedes the PDF document of a single page. It can only contain metadata on the complete record.

Basic Configuration

The use of the metadata pages is configured in the configuration file of the intranda ContentServer used for PDF creation, which is usually the following file for the Goobi viewer:/opt/digiverso/viewer/config/config_contentServer.xml.

Each type of metadata page has its own configuration element:

Type

Configuration element

Title page

<pdfTitlePage>

Chapter page

<pdfChapterTitlePages>

Single page

<singlePdfTitlePage>

All configuration elements have the same structure and contain the following attributes:

Attribute

Description

@use

Contains the value true if this type of metadata page is to be used, otherwise false.

@templateFolder

Contains the file URL to the folder containing the XML templates and the Apache fop configuration file.

@defaultTemplate

The file name of the template file, without the file extension to be used for metadata page creation.

<config>
    <pdfTitlePage use="true"
        templateFolder="file:///opt/digiverso/viewer/config/PdfTitlePage/"
        defaultTemplate="goobiviewer" />
    <pdfChapterTitlePages use="false"
        templateFolder="file:///opt/digiverso/viewer/config/PdfTitlePage/"
        defaultTemplate="goobiviewer-section" />
    <singlePdfTitlePage use="true"
        templateFolder="file:///opt/digiverso/viewer/config/PdfTitlePage/"
        defaultTemplate="goobiviewer-single" />
</config>

The templateFolders must contain at least for each active metadata page the XML file specified in defaultTemplate with file extension .fo, as well as the file fop.xconf, which controls the conversion to PDF using Apache fop. Details about Apache fop can be found here.

Configuration of Templates

Each xml file can contain elements from two xml namespaces:

  • fo = http://www.w3.org/1999/XSL/Format

  • cs = http://www.goobi.org/contentServer

The elements from the fo-namespace control the layout and the conversion to PDF using Apache fop.

The elements from the cs namespace control the integration of metadata and record-specific image content into the PDF. All these elements are completely replaced by text content or elements from the fo-namespace before the PDF transformation takes place. In the following, the possible elements and their meaning are presented in detail:

  • <cs:meta> Contains an xPath expression as text, which is resolved against an element of the METS file. If the xPath expression denotes an XML element or attribute that contains text, the entire <cs:meta>-element is replaced by this text. The element of the METS file where the xPath expression is resolved and some other parameters are controlled by the attributes of <cs:meta> . These are:

    • from: Describes the location, i.e. the XML element of the METS file, against which the xPath expression is resolved. The possible values are described in the corresponding chapter below.

    • repeat: If this attribute contains the value true, the <cs:meta>-element is replaced by the concatenation of all contents described by the xPath expression.

    • valueSeparator: If multiple contents are found by using repeat, they are separated by the text contained in valueSeparator. If no valueSeparator attribute is defined, the contents are separated by a single space.

  • <cs:alt> This element is only allowed as a child element of <cs:meta> and contains an xPath expression that replaces the expression in <cs:meta> if it does not return results. This element contains the only attribute from with the same meaning as in <cs:meta>.

  • <cs:block> This element must always contain other <cs:block>, <cs:meta>, <cs:graphic> or <cs:link> elements, but not necessarily as direct child elements. The <cs:block> element will be replaced by its full content if at least one of the contained elements contains <cs:meta>, <cs:graphic> or <cs:link> content - that is, resolved xPath expressions. Otherwise, the whole <cs:block> element and its content will be removed. <cs:block> elements are used to display or remove entire XML blocks depending on whether they contain relevant content. These elements have the following possible attributes:

    • repeat: If this attribute contains the value true, the entire content contained in <cs:block> is repeated as often as the xPath expression of at least one contained <cs:meta>, <cs:graphic> or <cs:link> element returns further hits. In the first repetition of the block all second hits of the contained xPath expressions are resolved - if present, and in the second repetition all third hits and so on.

    • separator: Has the same function as in <cs:meta>, except that the text is inserted between two XML blocks created by repeat.

    • renderIfExists: This element contains an additional condition to display the content of <cs:block>. If the location contained in renderIfExists does not exist, the entire block will be removed. Otherwise the normal display rules apply. For more information on locations, see the corresponding chapter below.

  • <cs:graphic> This element creates an element <fo:external-graphic> to display image content linked in the METS file. The attributes from and fileGroup of <cs:graphic> result in an image URL from the METS file. If the text content of <cs:graphic> is empty, this URL is used as URL for the <fo:external-graphic> element. Alternatively, a URL can be used in the text of <cs:graphic>. Parts of the image URL from the METS file can be inserted into this URL to form the actual URL. The URL can contain placeholders of the form {i} - where i must be an integer - which are replaced by the i-th path element from the image URL of the METS file. Path elements are the parts of the URL separated by /. If i is negative, the path element is counted from the end of the URL. {-1} is the last path element of the URL. Thus, for example, a IIIF URL for a 600 pixel wide JPEG image can be formed from the file URL of the METS file as http://viewer/iiif/image/{-2}/{-1}/full/600,/0/default.jpg. It is generally advisable to use URLs to such reduced and compressed versions of images, as large images can considerably inflate the storage requirements of the PDF. <cs:graphic> has the following attributes:

    • from: Describes the location, i.e. the XML element of the METS file, against which the xPath expression is resolved. The possible values are described in the corresponding chapter below. Only locations that describe a file are allowed in this element.

    • fileGroup: The name of the file group of the METS file from which the file described in from is to be taken.

  • <cs:link> Creates an element <fo:basic-link> to display links in the PDF. Currently these links are only displayed as text and do not work as actual links. The text content of <cs:link> is used as the <cs:ref>text of the link, and the content of the child element <cs:ref> as the URL after all cs elements contained in have been resolved.

  • <cs:ref> Creates the content for an enclosing <cs:link> element, details see there.

Places

The following locations, i.e. metadata elements of the METS file, can be used in the attribute from:

Structure elements

These locations refer to the <mets:div> elements in <mets:structMap>. They can be used in <ce:meta> and <cs:link>. The locations ANCHOR, TOP and DIV, which refer to elements of the logical structMap, also include the path mets:dmdSec/mets:mdWrap/mets:xmlData of the <mets:div> element belonging to the <mets:dmdSec> element. In this case, the xPath expressions start with mods:mods/ The following structure items are defined as locations:

Place

Description

ANCHOR

Anchor record of a physical record from the logical structure

TOP

Oberstes sich auf das physische Werk beziehende Strukturelement aus der logischen Struktur

DIV

Highest structure element from the logical structure that refers to the physical record.

DIV_START_PAGE

First physical structure element that is assigned to the logical structure element DIV.

DIV_END_PAGE

Last physical structure element that is assigned to the logical structure element DIV.

TOP_START_PAGE

First physical structure element that is assigned to the logical structure element TOP.

TOP_END_PAGE

Last physical structure element that is assigned to the logical structure element TOP.

Files

These locations refer to <mets:file> elements in the <mets:fileSec> of the METS file. They can only be used in <cs:graphic> elements and require the additional fileGroup attribute to differentiate the file groups. The following file locations are possible:

Place

Description

BANNER

The first <mets:file> element marked with@USE="banner"

DIV_START_FILE

<mets:file> element that is assigned to the structure element DIV_START_PAGE

DIV_END_FILE

<mets:file> element that is assigned to the structure element DIV_END_PAGE

TOP_START_FILE

<mets:file> element that is assigned to the structure element TOP_START_PAGE

TOP_END_FILE

<mets:file> element that is assigned to the structure element TOP_END_PAGE

Last updated