CORE released a new Dataset

We are pleased to announce that we have released a new version of our dataset, which contains data aggregated by CORE in a downloadable file.

It is intended for (possibly computationally intensive) data analysis. Here you can read the dataset description and the download page. If you need fresh data, and your requirements are not computationally intensive, you can also use our API.

CORE Recommender

* This post was authored by Nancy Pontika, Lucas Anastasiou and Petr Knoth.

The CORE team is thrilled to announce the release of a new version of our recommender, a plugin that can be installed in repositories and journal systems to suggest similar articles. This is a great opportunity to improve the functionality of repositories by unleashing the power of recommendation over a huge collection of open-access documents available in CORE, currently 37 million metadata records and more than 4 million full texts.


‘Measuring’ and managing mandates

An investigation by Research Support staff at Brunel University London considers the role CORE might play in supporting funder compliance and the wider transition to open scholarship…

By David Walters (Open Access officer at Brunel) and Dr Christopher Daley (Research Publications Officer at Brunel)

In 2001, the Budapest Open Access Initiative (BOAI) brilliantly and simply encapsulated the aspirational qualities of ‘openness’ that funders, scholars, institutions, services and publishers have since driven forward. This simplicity has been lost in the detail of implementing funder mandates over copyright restrictions, resulting in significant administrative overheads to support staff whose primary role is to smoothly progress a cultural change. Although the momentum is undeniable, the transition to open scholarship is now fraught with complexity.


CORE wins Best Poster Award at the Open Repositories Conference #OR2016

Last week, the CORE team attended the 11th Annual Conference on Open Repositories, an international conference addressed mainly to subject and institutional repository managers, focusing on open access, open data and open science tools, projects and services.


At the conference the team had six submissions:

  1. A workshop presentation on “How can repositories support the text-mining of their content and why?” where Nancy Pontika explained why repository managers should be supportive of text-mining practices and Petr Knoth described the technical requirements that can enable the text mining of repositories. The CORE team also organised the workshop, as part of its involvement with OpenMinTeD, an EU-funded project on text and data mining. The workshop has been described in two blog posts, one hosted at the OpenMinTeD blog (which includes all workshop presentations), and another composed by Rebecca Sutton Koeser, a workshop participant.
  2. A full presentation on “Exploring Semantometrics: full text-based research evaluation for open repositories” by Petr Knoth. The presentation explored semantometrics, a new class of research evaluation metrics, which builds on the premise that full text is needed to assess the value of a publication. (Presentation available here.)
  3. A 24×7 presentation on the “Implementation of the RIOXX metadata guidelines in the UK’s repositories through a harvesting service”, where Matteo Cancellieri and Nancy Pontika described how the RIOXX metadata guidelines are now an embedded feature in the CORE Repositories Dashboard. (Presentation slides here.)
  4. & 5. Two demo presentations during the Developer Track sessions. The first one was on “Mining Open Access Publications in CORE”, where Matteo Cancellieri demonstrated the new CORE API, and the second was entitled “Oxford vs Cambridge Contest: Collecting Open Research Evaluation Metrics for University Ranking”, where Petr Knoth used the traditional Oxford University vs Cambridge University contest to show how to freely gather and compare the research performance of universities. (The code for both demo presentations is on Github.)
  6. A poster on the “Integration of the IRUS-UK Statistics in the CORE Repositories Dashboard”, by Samuel Pearce and Nancy Pontika, which showed the process of embedding the existing IRUS-UK statistics service into the CORE Repositories Dashboard. We were also delighted that our poster won the best poster award (yay!). We would like to thank all the conference participants who stopped by our poster, got the CORE freebies and voted for us! (You can access the poster here.)

Because this conference has a clear focus on repository services, and the CORE service uses or is used by these services, we were also extensively mentioned in other presentations. For example: Richard Jones in his presentation on Lantern mentioned that the project is using the CORE API; Paul Walk described how CORE is using the RIOXX metadata application profile; the Repositories of the Future panel, organised by COAR, stressed the importance of the role of aggregators in the repository environment, specifically naming CORE; and the “Ideas Challenge”, a thought-provoking brainstorming group exercise consisting of programmers and repository managers that focused on how to make the lives of academics easier, proposed CORE as a runner-up for the development of a cross-repository journal and topic browse interface. Finally, CORE was also presented in the Jisc poster on “Jisc’s Open Access Services”.


Waving goodbye to API v1

Back in March, we announced the beta release of the CORE API v2. This added new features such as searching via DOI and retrieving citations for full texts.

This new API should be more reliable and produce higher-quality metadata output than the old version.

Over the next few months, we aim to finalise the API v2 and finally close access to v1. The scheduled date of the v1 switch-off is Monday, 25th April 2016.

We hope that most users have already had an opportunity to test v2 of the API but if not, we suggest that you check out the documentation here.


How about them stats?

Every month Samuel Pearce, one of the CORE developers, collects the CORE statistics – perhaps a boring task, but useful for us to know where we stand as a service. A very brief report of the cumulative statistics for all the years that CORE has operated as a project, 2011 – 2015, is as follows.
Users can retrieve from CORE

  • 25,363,829 metadata records and
  • 2,954,141 open access full-text records, 

from 689 repositories (institutional and subject) and 5,488 open access journals. In addition, 122 users have access to the CORE API.
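For readers who want to play with these numbers themselves, here is a minimal Python sketch that derives two rough ratios from the figures quoted above. The variable names and the choice of ratios are ours, for illustration only; in particular, the per-provider average lumps repositories and journals together, so treat it as a back-of-the-envelope figure rather than an official CORE statistic.

```python
# Figures quoted in this post (cumulative, 2011 - 2015).
metadata_records = 25_363_829
full_text_records = 2_954_141
repositories = 689
journals = 5_488


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of metadata records that also have an open access full text.
full_text_share = share(full_text_records, metadata_records)

# Rough average: repositories and journals are counted together here,
# which is a simplifying assumption of this sketch.
providers = repositories + journals
records_per_provider = metadata_records / providers

print(f"Full text is available for {full_text_share:.1f}% of metadata records")
print(f"Roughly {records_per_provider:,.0f} metadata records per content provider")
```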

In the playful Christmas spirit we attempted this time to have some fun with the statistics.


7 tips for successful harvesting

The CORE (COnnecting REpositories) project aims to aggregate open access research outputs from open repositories and open journals, and make them available for dissemination via its search engine. The project indexes metadata records and harvests the full text of the outputs, provided that they are stored in PDF format and are openly available. Currently CORE hosts around 24 million open access articles from 5,488 open access journals and 679 repositories.

As in any type of partnership, the harvesting process is a two-way relationship, where the content provider and the aggregator need to be able to communicate and have a mutual understanding. For successful harvesting, it is recommended that content providers apply the following best practices (some of the following recommendations relate generally to harvesting, while some are CORE specific):


CORE Repositories Dashboard: An infrastructure to increase collaboration of Aggregators with Open Repositories

In an effort to improve the quality and transparency of the harvesting process of open access content and create a two-way collaboration between the CORE project and the providers of this content, CORE is introducing the Repositories Dashboard. The aim of the Dashboard is to provide an online interface that offers content providers valuable information about:

  • the content harvested from the repository enabling its management, such as by requesting metadata updates or managing take-down requests,
  • the times and frequency of content harvesting, including all detected technical issues and suggestions for improving the efficiency of harvesting and the quality of metadata, including compliance with existing metadata guidelines,
  • statistics regarding the repository content, such as the distribution of content according to subject fields and types of research outputs, and the comparison of these with the national average.

In the CORE Dashboard there is a designated page for every institution, where repository managers will be able to add all the information that corresponds to their own repository, such as the institution’s logo, the repository name and email address.


CORE releases a new API version

We are very proud to announce that CORE has now released CORE API 2.0. The new API offers new opportunities for developers to make use of the CORE open access aggregator in their applications.

The main new features are:

  • Support for looking up articles by a global identifier (DOI, OAI, arXiv, etc.) instead of just CORE ID.
  • Access to new resource types, repositories and journals, and organisation of API methods according to the resource type.
  • Access to the original metadata exactly as it was harvested from the repository of origin.
  • Retrieval of the changes to the metadata as it was harvested by CORE.
  • Retrieval of citations extracted from the full text by CORE.
  • Support for batch requests for searching, recommending, accessing full texts, harvesting history, etc.

The goals of the new API also include improving scalability, cleaning up and unifying the API responses and making it easier for developers to start working with it.

The API is implemented and documented using Swagger, which has the advantage that anybody can start playing with the API directly from our online client. The documentation of the API v2.0 is available and the API is currently in beta. Those interested in registering for a new API key can do so by completing the online form.
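To give a feel for the identifier lookup listed among the new features, here is a minimal Python sketch of how a client might recognise which kind of identifier it holds and build a request URL for it. This is not the actual CORE API: the base URL, the path layout and the `apiKey` parameter name are placeholders we made up for illustration, and the identifier patterns are simplified. Please check the Swagger documentation for the real method names.

```python
import re
from urllib.parse import quote

# Placeholder base URL: an assumption for illustration, not the real endpoint.
BASE_URL = "https://api.example.org/v2"

# Simplified patterns; real-world DOIs and arXiv IDs have more variants.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(arxiv:)?\d{4}\.\d{4,5}(v\d+)?$", re.IGNORECASE)


def identifier_type(identifier: str) -> str:
    """Guess whether an identifier is a CORE ID, OAI, DOI or arXiv ID."""
    value = identifier.strip()
    if value.isdigit():
        return "core"
    if value.lower().startswith("oai:"):
        return "oai"
    if DOI_RE.match(value):
        return "doi"
    if ARXIV_RE.match(value):
        return "arxiv"
    raise ValueError(f"Unrecognised identifier: {identifier!r}")


def lookup_url(identifier: str, api_key: str) -> str:
    """Build a lookup URL (hypothetical path layout) for one identifier."""
    kind = identifier_type(identifier)
    # Percent-encode the identifier so that "/" in a DOI stays in one path segment.
    encoded = quote(identifier.strip(), safe="")
    return f"{BASE_URL}/articles/{kind}/{encoded}?apiKey={quote(api_key, safe='')}"


if __name__ == "__main__":
    for example in ["10.1000/182", "arXiv:1501.00001", "oai:eprints.example.org:123", "42"]:
        print(identifier_type(example), lookup_url(example, "YOUR_API_KEY"))
```

Dispatching on the identifier type in one place keeps the rest of a client unaware of whether it was handed a DOI or a CORE ID, which is the convenience the new lookup feature is meant to offer.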