7 tips for successful harvesting

The CORE (COnnecting REpositories) project aims to aggregate open access research outputs from open repositories and open journals, and make them available for dissemination via its search engine. The project indexes metadata records and harvests the full text of the outputs, provided that they are stored in PDF format and are openly available. Currently CORE hosts around 24 million open access articles from 5,488 open access journals and 679 repositories.

As in any type of partnership, the harvesting process is a two-way relationship, where the content provider and the aggregator need to be able to communicate and have a mutual understanding. For successful harvesting, it is recommended that content providers apply the following best practices (some of the following recommendations relate generally to harvesting, while some are CORE specific): read more...

CORE Repositories Dashboard: An infrastructure to increase collaboration of Aggregators with Open Repositories

In an effort to improve the quality and transparency of the harvesting process for open access content, and to create a two-way collaboration between the CORE project and the providers of this content, CORE is introducing the Repositories Dashboard. The Dashboard aims to give repository providers an online interface that offers valuable information about:

  • the content harvested from the repository enabling its management, such as by requesting metadata updates or managing take-down requests,
  • the times and frequency of content harvesting, including all detected technical issues and suggestions for improving the efficiency of harvesting and the quality of metadata, including compliance with existing metadata guidelines,
  • statistics regarding the repository content, such as the distribution of content according to subject fields and types of research outputs, and the comparison of these with the national average.

In the CORE Dashboard there is a designated page for every institution, where repository managers will be able to add all the information that corresponds to their own repository, such as the institution’s logo, the repository name and email address. read more...

CORE releases a new API version

We are very proud to announce that CORE has now released CORE API 2.0. The new API offers new opportunities for developers to make use of the CORE open access aggregator in their applications.

The main new features are:

  • Support for looking up articles by a global identifier (DOI, OAI, arXiv, etc.) instead of just the CORE ID.
  • Access to new resource types, repositories and journals, and organisation of API methods according to the resource type.
  • Access to the original metadata exactly as it was harvested from the repository of origin.
  • Retrieval of the changes to the metadata as it was harvested by CORE.
  • Retrieval of citations extracted from the full text by CORE.
  • Support for batch requests for searching, recommending, accessing full texts, harvesting history, etc.

The goals of the new API also include improving scalability, cleaning up and unifying the API responses and making it easier for developers to start working with it.

The API is implemented and documented using Swagger, which has the advantage that anybody can start playing with the API directly from our online client. The documentation of the API v2.0 is available and the API is currently in beta. Those interested in registering for a new API key can do so by completing the online form. read more...
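
As a rough illustration of what looking up articles by a global identifier involves on the client side, the sketch below guesses which kind of identifier a string is before a lookup is made. It is not part of the CORE API: the function name `classify_identifier`, the labels it returns, and the matching patterns are all assumptions made for this example, and the API itself may recognise identifiers differently.

```python
import re

# Hypothetical patterns, chosen for illustration only.
# Order matters: the first pattern that matches wins.
_PATTERNS = [
    # DOIs start with the "10." directory prefix, optionally written "doi:10...."
    ("doi", re.compile(r"^(?:doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)),
    # OAI identifiers follow the scheme oai:<repository-domain>:<local-id>
    ("oai", re.compile(r"^oai:[^:\s]+:\S+$", re.IGNORECASE)),
    # New-style arXiv identifiers, e.g. arXiv:1207.0580 or 1207.0580v2
    ("arxiv", re.compile(r"^(?:arxiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)),
    # Assumption: a bare integer is treated as a CORE ID
    ("core", re.compile(r"^\d+$")),
]


def classify_identifier(raw):
    """Return 'doi', 'oai', 'arxiv', 'core' or 'unknown' for an identifier string."""
    value = raw.strip()
    for kind, pattern in _PATTERNS:
        if pattern.match(value):
            return kind
    return "unknown"


if __name__ == "__main__":
    for example in ["10.1000/xyz123", "oai:oro.open.ac.uk:36256", "arXiv:1207.0580", "36256"]:
        print(example, "->", classify_identifier(example))
```

A client could use such a check to decide which lookup to attempt for a given string; the actual request format is described in the API documentation and is deliberately not reproduced here.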

CORE among the top 10 search engines for research that go beyond Google

Using search engines effectively is now a key skill for researchers, but could more be done to equip young researchers with the tools they need? Here, Dr Neil Jacobs and Rachel Bruce from JISC’s digital infrastructure team shared their top ten resources for researchers from across the web. CORE was placed among the top 10 search engines that go beyond Google.

More information is available on the JISC website.

Related content recommendation for EPrints

We have released the first version of a content recommendation package for EPrints, available via the EPrints Bazaar ( http://bazaar.eprints.org/ ). The functionality is offered through CORE and can be seen, for example, in Open Research Online EPrints ( http://oro.open.ac.uk/36256/ ) or on the European Library portal ( http://www.theeuropeanlibrary.org/tel4/record/2000004374192?query=data+mining ). I was wondering whether any EPrints repository managers would be interested in getting in touch to test this in their repository. As the package is available via the EPrints Bazaar, the installation requires just a few clicks. We would be grateful for any suggestions for improvements and also for information regarding how this could be effectively provided to DSpace and Fedora repositories. read more...

Final blog post

The main idea of this blog post is to provide a summary of the CORE outputs produced over the last 9 months and report the lessons learned.

Outputs

The outputs can be divided into (a) technical, (b) content and service and (c) dissemination outputs.

(a) Technical outputs

According to our project management software, to this day, we have resolved 214 issues. Each issue corresponds to a new function or a fixed bug. In this section we will describe the new features and improvements we have developed. The technology on which the system is built has been described in our previous blog post. read more...

Technical Approach

In the last six months, CORE has made a huge step forward in terms of the technology solution. According to our project management software, to this day, we have resolved 214 issues. Each issue corresponds to a new function or a fixed bug.

The idea of this blog post is to provide an overview of the technologies and standards CORE is using and to report on the experience we had with them during the development of CORE in the last months. We will provide more information about the new features and enhancements in the following blog posts. read more...