4.15. PDF generation for the Goobi viewer

This TaskManager plugin creates PDF files based on METS/MODS documents and writes them to the file system. PDF file creation jobs are created through the TaskClient and can be monitored through a REST interface. When a job is completed, a status message is sent to a configured URL.

The plugin is designed so that interrupted jobs are processed again and completed after a restart, whether only the plugin or the entire TaskManager was switched off in the meantime. This also applies if the server terminates unexpectedly.

To communicate with the TaskManager, Goobi viewer version 3.1 or later is required. The viewer must have access to a task client, which may already be included with the Goobi viewer installation.

Starting the plugin

Unlike the other TaskManager plugins, this plugin is not invoked through the usual TaskClient call but programmatically, directly from Java code. The call looks like this:

OcrClient.createPost(taskManagerUrl, 
    metsFilePath, targetPath, "", "", priority, logId, 
    title, dataRepository, "VIEWERPDF", downloadIdentifier, 
    "noServerTypeInTaskClient", "", "", "", "", false);

Configuration of the plugin

The configuration file viewerpdfconfig.xml must be present in the plugin configuration folder of the intranda TaskManager, with the following content:

<config_plugin>
    <jobsInProcessing max="1"></jobsInProcessing>
    <contentServerconfig>
        /opt/digiverso/itm/plugins/config/config_contentServer.xml
    </contentServerconfig>
    <viewerServletUrl>http://localhost:8082/viewer/harvest</viewerServletUrl>
</config_plugin>

The individual configuration elements have the following meaning:

You may need to save a copy of the Goobi viewer content server configuration to the location specified in the plugin configuration (the contentServerconfig element).
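To illustrate how the three elements of viewerpdfconfig.xml fit together, the following is a minimal sketch that reads them with the standard Java XML parser. It is not the plugin's own loading code; the class name ViewerPdfConfigReader and its field names are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the TaskManager: reads the three
// elements shown in viewerpdfconfig.xml from an XML string.
class ViewerPdfConfigReader {

    final int maxJobsInProcessing;        // <jobsInProcessing max="...">
    final String contentServerConfigPath; // <contentServerconfig>
    final String viewerServletUrl;        // <viewerServletUrl>

    private ViewerPdfConfigReader(int max, String path, String url) {
        this.maxJobsInProcessing = max;
        this.contentServerConfigPath = path;
        this.viewerServletUrl = url;
    }

    static ViewerPdfConfigReader parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            Element jobs = (Element) root.getElementsByTagName("jobsInProcessing").item(0);
            int max = Integer.parseInt(jobs.getAttribute("max"));
            return new ViewerPdfConfigReader(
                    max,
                    text(root, "contentServerconfig"),
                    text(root, "viewerServletUrl"));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read viewerpdfconfig.xml content", e);
        }
    }

    // Returns the trimmed text of the first element with the given tag name.
    // Trimming matters because the path in <contentServerconfig> is written
    // on its own line and is therefore surrounded by whitespace.
    private static String text(Element root, String tag) {
        return root.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}
```

Calling ViewerPdfConfigReader.parse(...) with the content of the file shown above yields the path and URL without the surrounding line breaks.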

Status queries via REST-API

The status of a PDF job can be queried via a REST interface. The URLs specified below must be preceded by the URL of the REST interface of the TaskManager. This usually corresponds to the URL of the TaskManager, supplemented by /rest. The following queries are supported:

Details of the job: /viewerpdf/info/{id}

This returns general information about the PDF job from the TaskManager database as a JSON object. The object has the following properties:

Position in the PDF queue: /viewerpdf/numJobsUntil/{id}

This returns, as plain text, the number of PDF jobs queued ahead of the job with the identifier {id}.

Number of pages before the job: /viewerpdf/numPagesBefore/{id}

This returns, as plain text, the total number of pages of all PDF jobs queued ahead of the job with the identifier {id}.

Jobs before the job: /viewerpdf/jobsUntil/{id}

This returns a JSON object with a list of all PDF jobs that are queued before the job with the identifier {id}. The list consists of objects with the same properties as those created when requesting job details.

For all requests, the title of the requested PDF file can be used instead of the identifier. In this case, the most recent job with this title is used.
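The queries above can be sketched as a small Java client built on the JDK's java.net.http.HttpClient (Java 11 or later). This is only an illustration under assumptions: the class name ViewerPdfStatusClient, the example base URL in the usage note, the percent-encoding of titles, and the expectation of an HTTP 200 response are not taken from the TaskManager documentation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical client for the status queries; not part of the TaskManager.
class ViewerPdfStatusClient {

    private final String restBase;
    private final HttpClient http = HttpClient.newHttpClient();

    // restBase is the URL of the TaskManager REST interface,
    // usually the TaskManager URL followed by /rest.
    ViewerPdfStatusClient(String restBase) {
        this.restBase = restBase.endsWith("/")
                ? restBase.substring(0, restBase.length() - 1)
                : restBase;
    }

    // Builds e.g. {restBase}/viewerpdf/info/{id}. The last segment may be a
    // job identifier or a title; encoding it is an assumption of this sketch.
    String url(String query, String idOrTitle) {
        String segment = URLEncoder.encode(idOrTitle, StandardCharsets.UTF_8)
                .replace("+", "%20"); // form encoding uses '+', paths need %20
        return restBase + "/viewerpdf/" + query + "/" + segment;
    }

    // The two counting queries answer in plain text, so the body is a number.
    static int parseCount(String body) {
        return Integer.parseInt(body.trim());
    }

    // Sends the GET request and returns the raw response body.
    String get(String query, String idOrTitle) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url(query, idOrTitle)))
                .GET()
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    int numJobsUntil(String idOrTitle) throws IOException, InterruptedException {
        return parseCount(get("numJobsUntil", idOrTitle));
    }

    int numPagesBefore(String idOrTitle) throws IOException, InterruptedException {
        return parseCount(get("numPagesBefore", idOrTitle));
    }
}
```

With a client created for a placeholder base URL such as http://localhost:8080/itm/rest, numJobsUntil("42") would request /viewerpdf/numJobsUntil/42 and return the queue position as an int. The JSON responses of info and jobsUntil are available as raw strings through get(...) and can be passed to any JSON library.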

Feedback on completion

When a PDF job is completed, whether successfully or not, a GET request with the following parameters is sent to the viewerServletUrl configured in the TaskManager plugin:

If a job is manually canceled in TaskManager, no completion message is sent.
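On the receiving side, the completion message arrives as the query string of a GET request. As a rough sketch, the following helper decodes such a query string into a map. The class name CompletionCallbackParams is hypothetical, and the sketch deliberately makes no assumption about which parameter names the TaskManager actually sends.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes the query string of a completion request.
class CompletionCallbackParams {

    // Splits "a=1&b=2" into decoded key/value pairs, keeping their order.
    // A key without '=' is stored with an empty value.
    static Map<String, String> parse(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(
                    URLDecoder.decode(key, StandardCharsets.UTF_8),
                    URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }
}
```

In a servlet, the raw query string would typically come from HttpServletRequest.getQueryString(); most servlet containers also expose the decoded values directly through getParameter.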

Stopping the plugin

If the plugin is stopped, the current PDF generation is aborted once the PDF page currently being processed has been created. The current PDF is then considered not yet generated, and it is generated again the next time the plugin is started. This also applies to an unexpected stop of the TaskManager or of the entire Tomcat server.

The PDF is created within the TaskManager plugin. Stopping the TaskManager will also stop the PDF generation immediately. The interrupted PDF is also re-generated after a restart of the TaskManager.
