4.16. Creation of a PDF document for an entire work
This plugin creates a PDF document for a Goobi process and saves it in the file system. It takes the table of contents from the structure of the process's METS file and the page contents from the image files referenced in that METS file.
Depending on the configuration, existing single-page PDFs can also be used to generate the overall PDF, if available. This makes PDF creation much faster, because no conversion is necessary, and full texts can be carried over from the single-page PDFs.
Alternatively, full texts can be written into the PDF from ALTO documents. This also works when the PDF is created from single-page PDFs, but it is suppressed if these already contain full texts.
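The per-page choice described above can be sketched as a small decision function. This is only an illustration of the stated rules, not the plugin's actual code; the names PagePlan and plan_page are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PagePlan:
    """Hypothetical summary of how one page of the overall PDF is built."""
    content_source: str    # "single_page_pdf" or "image"
    embed_alto_text: bool  # whether ALTO full text is written into the page


def plan_page(has_single_page_pdf: bool,
              pdf_has_full_text: bool,
              has_alto: bool) -> PagePlan:
    """Apply the rules from the text to a single page.

    - A single-page PDF is preferred over the image if one is available.
    - ALTO full text is added if available, but suppressed when the
      single-page PDF already carries its own full text.
    """
    if has_single_page_pdf:
        return PagePlan("single_page_pdf", has_alto and not pdf_has_full_text)
    return PagePlan("image", has_alto)


if __name__ == "__main__":
    # A single-page PDF with its own full text: ALTO is not written again.
    print(plan_page(True, True, True))
    # No single-page PDF: the image is converted and ALTO text is added.
    print(plan_page(False, False, True))
```

Whether single-page PDFs and ALTO files are considered at all depends on the plugin configuration; the sketch assumes both are enabled.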
Starting the plugin
The creation of a PDF is triggered via the TaskClient by the call:
Parameters
The command parameters are explained in the following table:
Single-page PDFs and ALTO files are loaded from the ocr subfolder of the source folder, that is, the folder in which the meta.xml file is located. Within ocr, they are read from the subfolders <processtitle>_pdf and <processtitle>_alto respectively. Apart from the extension, the files must have the same names as the image files in the METS file.
If the destination folder to which the PDF file is written is identical to the folder of the single-page PDFs, the latter is renamed to <processtitle>_pdf_abbyy.
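The folder and naming conventions above can be sketched in a few lines of Python. This is a hedged illustration of the convention only; the function names are hypothetical, and the file extensions used in the example (.pdf, .xml) are assumptions, since the text does not name them.

```python
from pathlib import Path
from typing import Dict, Optional


def ocr_input_dirs(source_dir: Path, process_title: str) -> Dict[str, Path]:
    """Return the folders for single-page PDFs and ALTO files.

    source_dir is the folder that contains meta.xml.
    """
    ocr = source_dir / "ocr"
    return {
        "pdf": ocr / f"{process_title}_pdf",
        "alto": ocr / f"{process_title}_alto",
    }


def matching_file(image_name: str, directory: Path, extension: str) -> Path:
    """Build the expected file name: same as the image, different extension."""
    return directory / (Path(image_name).stem + extension)


def renamed_pdf_dir(pdf_dir: Path, target_dir: Path) -> Optional[Path]:
    """Return the new name of the single-page PDF folder if it coincides
    with the destination folder, otherwise None (no rename needed)."""
    if pdf_dir == target_dir:
        return pdf_dir.with_name(pdf_dir.name + "_abbyy")
    return None


if __name__ == "__main__":
    # Example values are made up for illustration.
    dirs = ocr_input_dirs(Path("/data/123"), "sample")
    print(matching_file("00000001.tif", dirs["pdf"], ".pdf"))
    print(matching_file("00000001.tif", dirs["alto"], ".xml"))
    print(renamed_pdf_dir(dirs["pdf"], dirs["pdf"]))
```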
Configuration of the plugin
The plugin can be used without its own configuration file. In this case, single-page PDFs and ALTO full-text files are used for the creation, if available. All other parameters of PDF creation are taken from the intrandaContentServer configuration file provided by Goobi.
If a different behaviour is required, the following configuration file must be created:
This file has the following content:
The parameter usePdfDirectory switches the use of single-page PDFs on or off; the parameter useAltoDirectory does the same for the use of ALTO files to integrate full texts.
The contentServerConfigPath parameter can be used to specify the path to an alternative ContentServer configuration if PDF creation by this plugin should behave differently from PDF creation in Goobi.
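The effect of the three parameters, including the behaviour when no configuration file exists, can be modelled as a small settings object. This is a sketch under stated assumptions: the class PdfCreationSettings and its Python field names are hypothetical, and the defaults are inferred from the description of the plugin running without its own configuration file. It does not show the format of the configuration file itself.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PdfCreationSettings:
    """Hypothetical model of the three documented parameters."""
    # Without a configuration file, single-page PDFs are used if available.
    use_pdf_directory: bool = True
    # Without a configuration file, ALTO files are used if available.
    use_alto_directory: bool = True
    # None stands for the ContentServer configuration provided by Goobi.
    content_server_config_path: Optional[str] = None


# Behaviour without a plugin configuration file.
DEFAULTS = PdfCreationSettings()

# Example override: always build pages from the images, keep ALTO full text.
images_with_alto = replace(DEFAULTS, use_pdf_directory=False)

if __name__ == "__main__":
    print(DEFAULTS)
    print(images_with_alto)
```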
Stopping the plugin
When the plugin is stopped, a PDF creation that is currently running is stopped before the plugin terminates. Any jobs remaining in the queue are processed after the plugin is restarted.
The PDF creation is performed by the TaskManager plugin, so if the TaskManager itself is terminated, the creation is aborted immediately. The job is then neither in the queue nor marked as finished or aborted. In this case, the job must be aborted and restarted manually.