4.15. PDF generation for the Goobi viewer
This TaskManager plugin creates PDF files based on METS/MODS documents and writes them to the file system. PDF file creation jobs are created through the TaskClient and can be monitored through a REST interface. When a job is completed, a status message is sent to a configured URL.
This plugin is designed so that interrupted jobs are re-processed and completed if the plugin has been switched off in the meantime, or the entire TaskManager is switched off. This also applies to an unexpected termination of the server.
To communicate with the TaskManager, a Goobi viewer with a version number of 3.1 or later is required. The viewer must have access to a task client, which may already be included in the scope of delivery of the Goobi viewer installation.
Starting the plugin
Unlike the other TaskManager plugins, this plugin is not requested via the usual TaskClient call, but programmatically directly from the Java code. The call in the source code is done as follows:
Parameter
Possible Goobi variable
Meaning
taskManagerUrl
http://localhost/itm/service
URL to intranda TaskManager interface
metsFilePath
/opt/digiverso/viewer/indexed_mets/12345.xml
The absolute path in the filesystem for the METS file
targetPath
/opt/digiverso/viewer/download_pdf/emos5dm.pdf
The absolute path to the download directory
priority
10
The priority in the TaskManager queue
logId
LOG_0003
The METS-ID of the structural element to be used. This value can be zero or empty; in this case, the top structure element of the plant is used
title
12345_LOG_0003
The METS-ID of the structural element to be used. This value can be zero or empty; in this case, the top structure element of the plant is used
dataRepository
/opt/digiverso/viewer/
The (absolute) file path to the viewer repository with the required content.
downloadIdentifier
The identifier of the PDF job
Configuration of the plugin
In the plugin configuration folder of the intranda TaskManager the configuration file viewerpdfconfig.xml
with the following content must be present:
The individual configuration elements have the following meaning:
Parameter
Meaning
jobsInProcessing.max
The maximum number of PDF generations that can be processed simultaneously.
contentServerconfig
The configuration file to be used for the content server. This should either be the content server configuration file of the viewer itself, or a copy of it.
ViewerServletUrl
The URL to which status updates should be sent when a PDF generation is terminated or fails. This is usually the URL of the harvester interface of the viewer.
You may need to save a copy of the Goobi viewer content server configuration to the location specified in the plug-in configuration.
Status queries via REST-API
The status of a PDF job can be queried via a REST interface. The URLs specified below must be preceded by the URL of the REST interface of the TaskManager. This usually corresponds to the URL of the TaskManager, supplemented by /rest. The following queries are supported:
Details of the order: /viewerpdf/info/{id}
This returns general information about the PDF job from the TaskManager database as a JSON object. The object has the following properties:
Parameter
Meaning
pdfID
The identifier of the PDF job.
title
The title of the PDF.
pages
The number of pages of the PDF.
size
The total size in bytes of all images required for the PDF, or of all required single-page PDFs, if they exist and can be used for PDF generation.
startDate
The date on which the order was received.
endDate
The date on which the order was completed, if applicable.
Position in the PDF queue: /viewerpdf/numJobsUntil/{id}
This returns as plain text the number of PDF jobs queued before the job with the identifier {id}
.
Number of pages before the job: /viewerpdf/numPagesBefore/{id}
This returns as plain text the number of pages of all PDF jobs queued before the job with the identifier {id}
.
Orders before the order: /viewerpdf/jobsUntil/{id}
This returns a JSON object with a list of all PDF jobs that are queued before the job with the identifier {id}. The list consists of objects with the same properties as those created when requesting job details.
For all requests the title of the requested PDF file can be used instead of the identifier. In this case the last job with this title will be used.
Feedback on completion
When a PDF job is completed, regardless of successful or unsuccessful execution, a Get Request with the following parameters is sent to the viewerServletUrl configured in the TaskManager plugin
Parameter
Meaning
identifier
The identifier of the order.
status
The status of the order as a string. This is either READY if the PDF was created successfully, or ERROR if the job was terminated due to an error.
message
This parameter is only added in the case of status=ERROR, and contains the error message with which the job was aborted.
If a job is manually canceled in TaskManager, no completion message is sent.
Stopping the plugin
If the plugin is stopped, the current PDF generation will be aborted after creating the PDF page currently being processed. The current PDF is then considered not yet generated. The next time the plug-in is started, the current PDF is generated again. This also applies to an unexpected stop of the TaskManager or the entire Tomcat server.
The PDF is created within the TaskManager plugin. Stopping the TaskManager will also stop the PDF generation immediately. The interrupted PDF is also re-generated after a restart of the TaskManager.
Last updated