Batch Processing from an OAI-PMH Feed

Selecting an OAI-PMH Data Provider

Currently, the OAI-PMH batch processor accepts any OAI-PMH feed that offers 'oai_dc' as one of its metadata prefixes.
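If you want to check a provider yourself before starting, the OAI-PMH ListMetadataFormats verb lists the prefixes a feed offers. The sketch below only parses such a response; the function name and the sample response are illustrative assumptions, not part of the batch processor.

```python
import xml.etree.ElementTree as ET

# Namespace used by all OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Illustrative ListMetadataFormats response (trimmed; not from a real provider).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>marc21</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
"""


def supports_oai_dc(list_formats_xml):
    """Return True if the response lists 'oai_dc' as a metadata prefix."""
    root = ET.fromstring(list_formats_xml)
    prefixes = [
        element.text.strip()
        for element in root.iter(OAI_NS + "metadataPrefix")
        if element.text
    ]
    return "oai_dc" in prefixes


print(supports_oai_dc(SAMPLE_RESPONSE))
```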

Select the fourth link ( Harvest metadata via OAI-PMH ) from the batch process menu to display the form for entering the URL of the OAI-PMH repository. Enter the base URL only; do not include any of the OAI-PMH verbs or query strings. Be sure to include the prefix, such as 'http://', at the start of the URL.
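The base URL rules above can be sketched as a small check. This is a minimal illustration, assuming that only 'http' and 'https' prefixes are acceptable; the function name and messages are hypothetical and do not reflect how the form itself validates input.

```python
from urllib.parse import urlsplit


def base_url_problems(url):
    """Return a list of reasons why `url` is not a bare OAI-PMH base URL."""
    problems = []
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https"):
        # Assumption: only http and https feeds are expected.
        problems.append("missing 'http://' or 'https://' prefix")
    elif not parts.netloc:
        problems.append("missing host name")
    if parts.query or parts.fragment:
        # Verbs such as ?verb=Identify belong to requests, not the base URL.
        problems.append("contains a query string; enter the base URL only")
    return problems


print(base_url_problems("http://example.org/oai"))
print(base_url_problems("http://example.org/oai?verb=Identify"))
```

An empty list means the URL looks like a usable base URL; it does not confirm that a repository actually answers there.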

Figure 1: OAI-PMH Repository URL Form

Once you enter a valid URL for the data provider, you can select the TEST URL button to check the link. The system will send a basic Identify request to the data provider and validate the URL. If the URL is correct, you should see a form such as the one below:
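For reference, an Identify request is the base URL followed by '?verb=Identify', and a valid response contains an Identify element. The sketch below builds that request URL and reads a few fields from a response; the function names are hypothetical, and this is not necessarily how the TEST URL button is implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by all OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def identify_url(base_url):
    """Build the Identify request URL for a bare base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


def parse_identify(response_xml):
    """Return key repository fields, or None if this is not an Identify response."""
    root = ET.fromstring(response_xml)
    identify = root.find(OAI_NS + "Identify")
    if identify is None:
        return None
    return {
        "repositoryName": identify.findtext(OAI_NS + "repositoryName"),
        "baseURL": identify.findtext(OAI_NS + "baseURL"),
        "protocolVersion": identify.findtext(OAI_NS + "protocolVersion"),
    }


print(identify_url("http://example.org/oai"))
```

To try it against a live provider, you could fetch the URL with any HTTP client and pass the response text to parse_identify.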

Figure 2: Successful test of a URL

Selecting the CONTINUE button on the URL form will allow you to enter more details regarding the import on the OAI-PMH Import Form.

Refining your Instructions and Metadata

The form below allows you to enter more specifics for the batch processing:

Figure 3: OAI-PMH Import Form

This form allows you to select the set for which you would like to create METS files. In addition, you can enter a number of constants, which will be applied to all the resulting METS files. Finally, you must choose the destination folder for all the METS files created.

Defaults that you have set in the metadata preferences are NOT applied to the resulting METS files, as it is assumed these records are from external libraries. Your choices of bibliographic format and add-ons are used, but no system-wide defaults are applied. Any defaults to be applied must be entered on this form.

To continue, you will also need to enter a value for the First BibID constant. A BibID is a ten-character alphanumeric identifier used in SobekCM libraries; it begins with at least two letters and ends with at least four digits. Whatever you enter will be the beginning of the final BibIDs, which also act as part of the ObjectID for the METS files.

If you enter a string fewer than 10 characters long, it will be used as the prefix. For example, entering 'ca008' will cause the first BibID assigned to be 'CA00800000', the second 'CA00800001', the third 'CA00800002', etc.

Entering a full BibID, such as 'MANIOC0123', will split the value into two parts and compute each subsequent BibID from the one provided. For example, entering 'MANIOC0123' will cause the first BibID assigned to be 'MANIOC0123', the second 'MANIOC0124', the third 'MANIOC0125', etc.
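The two numbering rules above can be sketched as follows. This is an illustration only: it assumes a full BibID is split at its trailing run of digits, and the function name is hypothetical rather than taken from the application.

```python
import re
from itertools import islice


def bibid_sequence(first_bibid):
    """Yield successive ten-character BibIDs starting from the entered value."""
    first = first_bibid.upper()
    if len(first) > 10:
        raise ValueError("a BibID cannot be longer than ten characters")
    if len(first) < 10:
        # Short value: treat it as a prefix and count up from zero.
        prefix, start, width = first, 0, 10 - len(first)
    else:
        # Full BibID: split off the trailing digits (assumed split rule).
        match = re.fullmatch(r"(.*?)(\d+)", first)
        if match is None:
            raise ValueError("a full BibID must end in digits")
        prefix, digits = match.groups()
        start, width = int(digits), len(digits)
    number = start
    while number < 10 ** width:
        yield f"{prefix}{number:0{width}d}"
        number += 1


print(list(islice(bibid_sequence("ca008"), 3)))
# ['CA00800000', 'CA00800001', 'CA00800002']
print(list(islice(bibid_sequence("MANIOC0123"), 3)))
# ['MANIOC0123', 'MANIOC0124', 'MANIOC0125']
```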

Repository Details

You can view the details of the repository from the main menu ( View > Repository Details ) on the main OAI-PMH Import Form.

Figure 4: OAI-PMH Repository Details

Performing Batch Process

Once you have entered all the necessary data, select the EXECUTE button to begin the process. A progress bar shows how far along the process is as the METS files are created in the destination folder you provided. You may see the progress bar reset several times: if the provider uses resumption tokens, the processor may have to request records several times to get the full set. This happens automatically, but the progress bar resets with each request.
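The repeated requests come from the OAI-PMH flow control: a provider may return a partial list plus a resumptionToken, and the harvester repeats the request with that token until the token comes back empty or absent. A minimal sketch of that loop is shown below; the function name and the injected `fetch` callable (any function that takes a URL and returns the response text) are assumptions for illustration, not the processor's actual code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace used by all OAI-PMH 2.0 responses.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(base_url, fetch, set_spec=None):
    """Collect record identifiers, following resumption tokens to the end."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec
    identifiers = []
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        for header in root.iter(OAI_NS + "header"):
            identifiers.append(header.findtext(OAI_NS + "identifier"))
        token = root.findtext(".//" + OAI_NS + "resumptionToken")
        if not token or not token.strip():
            break  # no token, or an empty one, marks the final page
        # A resumption token replaces every argument except the verb.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
    return identifiers
```

Each pass through the loop corresponds to one reset of the progress bar described above.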

If you are preparing these METS files for inclusion in a SobekCM library, you can simply move all the created METS files into the bulk loader's drop box, and they should all load within a couple of minutes.

Identifier Mapping

To allow you to update records ingested through this method, a mapping is retained between the OAI Identifier for a digital resource and the resulting METS ObjectID. This mapping is stored as an XML file for each repository processed. Depending on file permissions, the mapping file will be either under the application folder in an OAI subfolder or in your My Documents folder. These mappings are found automatically, and you will be prompted to reuse them on subsequent runs of the process.
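To illustrate the idea of such a mapping, the sketch below stores OAI identifiers and ObjectIDs in a small XML document and reads them back. The element and attribute names ('mappings', 'map', 'oai', 'objectid') and the sample values are hypothetical; the actual layout of the application's mapping file is not described here.

```python
import xml.etree.ElementTree as ET


def mapping_to_xml(mapping):
    """Serialize {OAI identifier: METS ObjectID} pairs to an XML string."""
    root = ET.Element("mappings")  # hypothetical element name
    for oai_identifier, object_id in mapping.items():
        ET.SubElement(root, "map", {"oai": oai_identifier, "objectid": object_id})
    return ET.tostring(root, encoding="unicode")


def mapping_from_xml(xml_text):
    """Read the pairs back so a later run can reuse the same ObjectIDs."""
    root = ET.fromstring(xml_text)
    return {item.get("oai"): item.get("objectid") for item in root.iter("map")}


# Illustrative values only.
saved = mapping_to_xml({"oai:example.org:1": "CA00800000_00001"})
print(mapping_from_xml(saved))
```

On a later run, looking up a record's OAI identifier in the loaded mapping would give the ObjectID to update instead of assigning a new one.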