4.7. Internet Archive download

Working with the intranda TaskManager also allows Goobi to efficiently harvest the Internet Archive. For this to happen, two criteria have to be met. The WebDavCommunicator plugin must be present in TaskManager’s plugin folder. The default path is shown below:


The plugin itself is also copied to the plugin folder. The default path is:


Starting the plugin

Der Aufruf des Internet-Archive-Harvestings wird innerhalb von Goobi in einem Workflowschritt folgendermaßen konfiguriert:

/usr/bin/java -jar /opt/digiverso/itm/bin/TaskClient.jar 
    -itm http:~/~/localhost/itm/service 
    -s http:~/~/archive.org/download/${meta.CatalogIDDigital} 
    -d {imagepath}/source/ 
    -n template 
    -e -i {stepid} 
    –T {processtitle} 
    -gid {processid} 


The command parameters are explained in the following table:


Possible Goobi variable




URL to intranda TaskManager interface

-e, --returnError


If this parameter is met, the TaskClient will end with an error code to prevent the workflow from continuing automatically


0 – 10

Priority to execute this job



Goobi process ID



The workflow step ID that launches the call

-T, --title


The Goobi process title for which the call is launched

-t, --jobtype


The job type

-n, --templatename


Name of the previously generated configuration file

-s, --source


Path to the URL for the record in the Internet Archive

-d, --destination


Path to the process main directory

Operation of the plugin

To begin with, the plugin receives a URL for a volume in the Internet Archive together with a target directory. It then downloads the files shown below from the URL into the target directory:


If a download fails, that job’s priority will be reduced, and TaskManager will attempt to download the next job. Failed attempts are reported to Goobi, and a warning message is then recorded in the process log.

Initially, every job in TaskManager is assigned priority level 10. The higher the priority, the sooner a job will be processed. If a job’s priority is reduced to zero, four more download attempts will be made. If all of these fail, the job will be regarded as unsuccessful and will be sent back to Goobi with an error message.

Last updated