How the Oxford University Research Archive (ORA) uses the CORE API

Jason Partridge – Open Access Service Manager at the Bodleian Libraries, University of Oxford.

One of the fundamental functions of CORE is to support Open Access, and one of the most effective ways to achieve this is through automated data gathering using the CORE API. CORE harvests and aggregates information about research papers from institutional and subject repositories and from open access and hybrid journals, and makes the content available via an API (Application Programming Interface). The CORE API offers a wealth of metadata and full text content from its many data providers.

For ORA (Oxford University Research Archive), the use of the CORE API offers an opportunity to enhance workflows, and to streamline the process of reviewing and curating articles for inclusion in the repository. The open-source CORE API enables institutions to have a unified view of the many sources it indexes. The API provides access to metadata fields for each document, including title, authors, date of publication, DOI, and other publication information. Users can search for documents by DOI or title.
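As an illustration of searching by DOI or title, the sketch below only builds a search URL and does not send a request. The endpoint address, the `q` and `limit` parameters and the `doi:"..."` / `title:"..."` query syntax are assumptions made for this example and should be checked against the current CORE API documentation; `build_search_url` is a hypothetical helper, not part of ORA's code.

```python
from urllib.parse import urlencode

# Assumed endpoint for this sketch; confirm it in the CORE API documentation.
CORE_SEARCH_URL = "https://api.core.ac.uk/v3/search/works"


def build_search_url(doi=None, title=None, limit=5):
    """Build a search URL that looks a work up by DOI, falling back to title."""
    if doi:
        query = f'doi:"{doi}"'
    elif title:
        # Drop embedded double quotes so the phrase query stays well formed.
        cleaned = title.replace('"', " ")
        query = f'title:"{cleaned}"'
    else:
        raise ValueError("a DOI or a title is required")
    # urlencode percent-encodes the query so it is safe to place in a URL.
    return f"{CORE_SEARCH_URL}?{urlencode({'q': query, 'limit': limit})}"


if __name__ == "__main__":
    print(build_search_url(doi="10.5287/ora-nb1bawday"))
```

Preferring the DOI over the title reflects that a DOI identifies a single work, whereas a title search may return several candidates that need further checking. A real request would usually also need to carry an API key.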

Institutions need to know when research is published or updated with publication information. Funders such as Wellcome and UKRI have open access policies that require researchers to ensure their content is shared at the point of publication. The University of Oxford also operates a Rights Retention policy supporting self-archiving and open access to research that may otherwise be subject to an embargo. For a repository to release research at the right moment, it needs to know when publication has taken place.

As happens in most other repositories in the UK, ORA repository staff check content manually when an article is deposited. If the full text is to be released on publication, review staff mark the record for checking after a set period has passed; sometimes two or three checks are required before the paper is published. The ORA team have set up a process by which incomplete or new records are automatically updated from the CORE API data, providing DOIs and publication dates where available. Once an object has been updated with information from CORE, it is flagged to the review team to show that the update came from CORE.

An “updated by CORE” tag is added to the ORA record in the review interface
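The update-and-flag step can be pictured with a minimal sketch like the one below. It is not ORA's implementation: the record layout, the field names (`doi`, `publication_date`, `publishedDate`, `tags`) and the rule of only filling empty fields are all assumptions chosen for illustration.

```python
# Hypothetical mapping from a local record field to a CORE result field.
FIELD_MAP = (("doi", "doi"), ("publication_date", "publishedDate"))
CORE_TAG = "updated by CORE"


def apply_core_update(record, core_work):
    """Fill empty DOI and publication date fields from a CORE result.

    Returns (updated_record, changed_fields). The input record is not
    modified, and values that are already present are never overwritten.
    """
    updated = dict(record)
    changed = []
    for local_field, core_field in FIELD_MAP:
        value = core_work.get(core_field)
        if value and not updated.get(local_field):
            updated[local_field] = value
            changed.append(local_field)
    if changed:
        # Flag the record so reviewers can see where the values came from.
        tags = list(updated.get("tags", []))
        if CORE_TAG not in tags:
            tags.append(CORE_TAG)
        updated["tags"] = tags
    return updated, changed


if __name__ == "__main__":
    record = {"title": "Example article", "doi": "", "publication_date": None}
    core_work = {"doi": "10.1234/example", "publishedDate": "2023-05-01"}
    print(apply_core_update(record, core_work))
```

Returning the list of changed fields alongside the record makes it straightforward to show reviewers exactly which values were added automatically.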

Obtaining these programmatic updates helps ORA to ensure accurate and timely release of files from embargo, supporting the University in complying with funder requirements and Rights Retention policies. The CORE API offers flexibility and customisation, allowing developers to tailor the publication information to their specific needs and requirements.

Automated updates to metadata in ORA provide a more efficient workflow for the ORA Review staff, allowing information to be added programmatically where possible and alleviating some of the effort and time taken by staff to check for updates manually. Of course, any system that shares metadata from different sources depends for its success on the quality of that metadata. ORA uses simple decision criteria to determine which metadata to add, but the long-term solution is for all institutions to ensure their metadata is of consistent quality and accuracy. For now, the purpose of this automated mechanism is to provide an indication of publication, and to bring ORA objects into the ‘check back’ process more quickly than would otherwise be the case.

For more detailed information, please read the full article by Jason Partridge at http://dx.doi.org/10.5287/ora-nb1bawday