4.7. Internet Archive download
Working with the intranda TaskManager also allows Goobi to harvest the Internet Archive efficiently. Two requirements must be met for this. First, the WebDavCommunicator plugin must be present in TaskManager’s plugin folder. The default path is shown below:
Second, the plugin itself must also be copied to the plugin folder. The default path is:
Starting the plugin
The call that starts the Internet Archive harvesting is configured in a Goobi workflow step as follows:
Parameters
The command parameters are explained in the following table:
| Parameter | Possible Goobi variable | Meaning |
| --- | --- | --- |
| -itm | http://localhost/itm/service | URL of the intranda TaskManager interface |
| -e, --returnError | - | If this parameter is set, the TaskClient will end with an error code to prevent the workflow from continuing automatically |
| -p | 0 – 10 | Priority with which this job is executed |
| -gid | {processid} | Goobi process ID |
| -i | {stepid} | ID of the workflow step that launches the call |
| -T, --title | {processtitle} | Title of the Goobi process for which the call is launched |
| -t, --jobtype | IADOWNLOAD | The job type |
| -n, --templatename | template | Name of the previously generated configuration file |
| -s, --source | http://archive.org/download/${meta.CatalogIDDigital} | URL of the record in the Internet Archive |
| -d, --destination | {processpath} | Path to the process main directory |
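As a rough illustration of how the parameters in the table fit together, the following Python sketch assembles them into an argument list. The helper name `build_ia_download_args` is hypothetical and not part of Goobi or TaskManager. The sketch assumes that `-e` is a flag without a value (the table lists no value for it) and that the order of the arguments does not matter. The program that receives these arguments is deliberately left out, since it is not named in this section.

```python
def build_ia_download_args(itm_url, process_id, step_id, process_title,
                           template_name, source_url, destination,
                           priority=None, return_error=True):
    """Return the argument list for an IADOWNLOAD job.

    Hypothetical helper: only the flags come from the parameter table.
    """
    args = [
        "-itm", itm_url,          # URL of the TaskManager interface
        "-gid", str(process_id),  # Goobi process ID
        "-i", str(step_id),       # ID of the calling workflow step
        "-T", process_title,      # Goobi process title
        "-t", "IADOWNLOAD",       # job type for the Internet Archive download
        "-n", template_name,      # name of the configuration file
        "-s", source_url,         # URL of the record in the Internet Archive
        "-d", destination,        # process main directory
    ]
    if priority is not None:
        # The table gives 0 to 10 as the valid range.
        if not 0 <= priority <= 10:
            raise ValueError("priority must be between 0 and 10")
        args += ["-p", str(priority)]
    if return_error:
        # Assumption: -e is a bare flag that takes no value.
        args.append("-e")
    return args


# The Goobi placeholders are passed through unchanged here;
# Goobi replaces them when the workflow step runs.
example_args = build_ia_download_args(
    "http://localhost/itm/service",
    "{processid}",
    "{stepid}",
    "{processtitle}",
    "template",
    "http://archive.org/download/${meta.CatalogIDDigital}",
    "{processpath}",
    priority=5,
)
```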
Operation of the plugin
To begin with, the plugin receives a URL for a volume in the Internet Archive together with a target directory. It then downloads the files shown below from the URL into the target directory:
If a download fails, that job’s priority will be reduced, and TaskManager will attempt to download the next job. Failed attempts are reported to Goobi, and a warning message is then recorded in the process log.
Initially, every job in TaskManager is assigned priority level 10. The higher the priority, the sooner a job will be processed. If a job’s priority is reduced to zero, four more download attempts will be made. If all of these fail, the job will be regarded as unsuccessful and will be sent back to Goobi with an error message.
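The retry behaviour described above can be sketched as a small simulation. This is not TaskManager's actual implementation. The names `Job`, `run_queue` and `PRIORITY_STEP` are hypothetical, and the sketch assumes that each failed attempt lowers the priority by exactly one level, which the text does not specify. The starting priority of 10 and the four additional attempts at priority zero are taken from the description.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

INITIAL_PRIORITY = 10        # stated in the text
EXTRA_ATTEMPTS_AT_ZERO = 4   # stated in the text
PRIORITY_STEP = 1            # assumption: size of each reduction is not documented


@dataclass
class Job:
    title: str
    priority: int = INITIAL_PRIORITY
    attempts_at_zero: int = 0
    log: list = field(default_factory=list)


def run_queue(jobs, download):
    """Process jobs highest priority first; return (succeeded, failed) titles."""
    order = count()  # tie-breaker so jobs with equal priority stay first in, first out
    heap = [(-job.priority, next(order), job) for job in jobs]
    heapq.heapify(heap)
    succeeded, failed = [], []
    while heap:
        _, _, job = heapq.heappop(heap)
        if download(job):
            succeeded.append(job.title)
            continue
        # Each failed attempt is recorded, mirroring the warning in the process log.
        job.log.append(f"warning: download failed at priority {job.priority}")
        if job.priority > 0:
            job.priority = max(0, job.priority - PRIORITY_STEP)
        else:
            job.attempts_at_zero += 1
            if job.attempts_at_zero >= EXTRA_ATTEMPTS_AT_ZERO:
                failed.append(job.title)  # sent back with an error
                continue
        # Requeue the job so that other jobs get their turn first.
        heapq.heappush(heap, (-job.priority, next(order), job))
    return succeeded, failed


bad, good = Job("bad"), Job("good")
ok, ko = run_queue([bad, good], lambda job: job.title == "good")
```

Under these assumptions, a job that never succeeds is attempted 14 times in total: ten attempts that lower the priority from 10 to 0, followed by the four final attempts at priority zero. A different reduction step would change the first figure but not the last.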