R4D Gateway

OAI-PMH Implementation for R4D

08 November 2011

The XML Gateway for R4D is an implementation of the OAI-PMH protocol for documents and outputs that are available from the R4D research portal repository. It allows the document metadata to be accessed via a web-based API in a standardized XML format, which is typically used to provide links and cross-referencing services to the repository's research documents.

The OAI-PMH protocol is a simple, lightweight protocol that allows repositories to expose the metadata about the documents they hold. The protocol is built upon HTTP and XML, and uses standardized formats such as unqualified Dublin Core for processing the metadata.

The protocol distinguishes between meta harvesters and data providers:
1. A data provider is a repository that serves up its metadata through the OAI-PMH protocol, returning XML responses to the requests made.
2. A meta harvester is an application that accesses one or more data providers to extract their metadata, so that this information can be used for searching, document linking and other services.

The R4D implementation is a data provider service. It lets harvesters access its metadata on request, which they can then use to provide their own services.

The R4D implementation of this gateway operates in the following manner:

1. A web interface listens for OAI-PMH protocol requests (the gateway is available at the virtual directory /R4D/Gateway). It is responsible for validating the requests made and returning the XML response to the client.
2. If the request is valid, the records that match the request are retrieved from the database and their metadata is returned in the requested format.
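
The validation step can be illustrated with a minimal sketch. It is not the R4D gateway's actual code: the names ARGUMENT_RULES and validate_request are hypothetical, and the argument rules and error codes (badVerb, badArgument) are taken from the OAI-PMH 2.0 specification rather than from the R4D source.

```python
# Hypothetical sketch: the rules follow the OAI-PMH 2.0 specification,
# not the actual R4D gateway source code.
ARGUMENT_RULES = {
    # verb: (required arguments, optional arguments, accepts resumptionToken)
    "Identify": (set(), set(), False),
    "ListMetadataFormats": (set(), {"identifier"}, False),
    "ListSets": (set(), set(), True),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "GetRecord": ({"identifier", "metadataPrefix"}, set(), False),
}


def validate_request(params):
    """Return None if the request is valid, else an OAI-PMH error code."""
    verb = params.get("verb")
    if verb not in ARGUMENT_RULES:
        return "badVerb"
    required, optional, allows_token = ARGUMENT_RULES[verb]
    args = set(params) - {"verb"}
    # A resumptionToken is exclusive: no other argument may accompany it.
    if "resumptionToken" in args:
        if allows_token and args == {"resumptionToken"}:
            return None
        return "badArgument"
    # All required arguments must be present and no unknown ones allowed.
    if not required <= args or not args <= required | optional:
        return "badArgument"
    return None
```

A gateway built this way would only query the database when validate_request returns None, and would otherwise return an XML error response carrying the code.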

The R4D Gateway currently supports the following metadata formats:

1. OAI Dublin Core (OAI_DC) - the Open Archives Initiative's implementation of unqualified Dublin Core.
2. Resource Description Framework (RDF) - a basic RDF metadata implementation that uses the Dublin Core elements.

There are 6 request types that can be accessed via the OAI-PMH protocol:

Identify
ListMetadataFormats
ListSets
ListIdentifiers
ListRecords
GetRecord

The last three (ListIdentifiers, ListRecords and GetRecord) retrieve records from the repository. GetRecord returns a single record, whereas ListIdentifiers and ListRecords return bulk record sets.

The R4D Gateway currently pages results at 100 records per request. Because of the number of documents held in the repository, requests can be resumed using a resumption token, as specified by the OAI-PMH protocol.
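
From a harvester's point of view, paging might be handled as in the sketch below. It assumes the gateway returns standard OAI-PMH 2.0 XML (elements in the http://www.openarchives.org/OAI/2.0/ namespace) and signals the last page with an absent or empty resumptionToken element, as the specification describes. The function names http_fetch and harvest_identifiers are illustrative only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://www.dfid.gov.uk/r4d/Gateway/"
# Namespace of OAI-PMH 2.0 responses (assumed to be used by the gateway).
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def http_fetch(params):
    """Send one request to the gateway and return the raw XML response."""
    with urlopen(BASE_URL + "?" + urlencode(params)) as response:
        return response.read()


def harvest_identifiers(fetch=http_fetch, metadata_prefix="oai_dc"):
    """Yield every record identifier, following resumption tokens page by page."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI_NS + "header"):
            yield header.findtext(OAI_NS + "identifier")
        token = root.find(".//" + OAI_NS + "resumptionToken")
        # An absent or empty token means the final page has been reached.
        if token is None or not (token.text or "").strip():
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token.text.strip()}
```

Iterating over harvest_identifiers() would then walk the whole repository 100 records at a time. The fetch parameter is there so that the paging logic can be exercised against canned responses without contacting the live service.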

The base url for R4D OAI-PMH

The base url for R4D OAI-PMH is http://www.dfid.gov.uk/r4d/Gateway/?verb=
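
Request URLs can be assembled from this base URL programmatically. The following is a small sketch using only the Python standard library; the helper name build_request_url is hypothetical, and the trailing-underscore convention for the from argument is just one way around the Python keyword.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.dfid.gov.uk/r4d/Gateway/"


def build_request_url(verb, **arguments):
    """Return a gateway URL for the given verb and OAI-PMH arguments."""
    query = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            # "from" is a Python keyword, so callers pass it as from_.
            query[name.rstrip("_")] = value
    # urlencode percent-encodes reserved characters in the values.
    return BASE_URL + "?" + urlencode(query)


print(build_request_url("ListRecords", from_="2008-10-21", metadataPrefix="oai_dc"))
# prints http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&from=2008-10-21&metadataPrefix=oai_dc
```

Note that urlencode percent-encodes characters such as ":" and "/" in identifier values, so generated URLs for GetRecord may look different from hand-written ones while carrying the same arguments.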

Here are some example uses:

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListSets
Lists the sets/collections in the repository

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&metadataPrefix=oai_dc
List all records in repository, using oai_dc metadata prefix

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListMetadataFormats
List of metadata formats supported by the repository

http://www.dfid.gov.uk/r4d/Gateway/?verb=Identify
Identify the repository

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListIdentifiers&metadataPrefix=oai_dc
List all identifiers of items in the repository, using OAI Dublin Core

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListMetadataFormats&identifier=oai:dfid.gov.uk/r4d/oaipmh:Outputs_5001
List metadata formats by the item specified

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&from=2008-10-21&metadataPrefix=oai_dc
List records added or modified since the date specified (checks Date Added/Date Modified in DB)

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&resumptionToken
Lists next 100 records from last query’s Resumption token (please add the resumption token from your result set at the end of the url)

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&from=2008-10-21&until=2009-10-21&metadataPrefix=oai_dc
List records added or modified between the dates specified (checks Date Added/Date Modified in DB)

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&metadataPrefix=oai_dc&set=42E8C114-5E17-4265-8F63-9FC9C5462005 
List records in selected set (Abstracts here) (from 1st url in this table) in OAI Dublin Core

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&metadataPrefix=rdf&set=42E8C114-5E17-4265-8F63-9FC9C5462005
List records in selected set (Abstracts here) (from 1st url in this table) in RDF

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&metadataPrefix=oai_dc&set=42E8C114-5E17-4265-8F63-9FC9C5462005
List records in selected set (Livestock Production here) (from 1st url in this table) in OAI Dublin Core

http://www.dfid.gov.uk/r4d/Gateway/?verb=ListRecords&metadataPrefix=rdf&set=42E8C114-5E17-4265-8F63-9FC9C5462005
List records in selected set (Livestock Production here) (from 1st url in this table) in RDF

http://www.dfid.gov.uk/r4d/Gateway/?verb=GetRecord&Identifier=oai:www.dfid.co.uk:Output_178260&metadataPrefix=oai_dc
Get Record by Identifier in OAI Dublin Core
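
Once a record has been retrieved in OAI Dublin Core, its fields can be pulled out of the XML. The sketch below assumes the response uses the standard Dublin Core element namespace (http://purl.org/dc/elements/1.1/); the function name extract_dublin_core is illustrative and the exact elements returned by the R4D Gateway are not confirmed here.

```python
import xml.etree.ElementTree as ET

# Standard namespace for unqualified Dublin Core elements.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def extract_dublin_core(xml_bytes):
    """Collect Dublin Core elements from an OAI-PMH response into a dict of lists."""
    fields = {}
    for element in ET.fromstring(xml_bytes).iter():
        if element.tag.startswith(DC_NS) and element.text:
            name = element.tag[len(DC_NS):]
            # Elements such as creator may repeat, so values are kept in lists.
            fields.setdefault(name, []).append(element.text.strip())
    return fields
```

The same function also works on a ListRecords response, although it would then merge the fields of all records on the page, so per-record processing would need to iterate over the individual record elements first.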