I am currently working with Martin Klein, Matteo Cancellieri and Herbert Van de Sompel on a project funded by the European Open Science Cloud Pilot that aims to test and benchmark ResourceSync against OAI-PMH in a range of scenarios. The objective is to perform a quantitative evaluation that could then be used as evidence to convince data providers to adopt ResourceSync. During this work, we encountered a problem related to the scalability of ResourceSync and developed a solution to it in the form of an On Demand Resource Dump. The aim of this blog post is to explain the problem, how we arrived at the solution and how the solution works.
A couple of days ago we released a new version of our website, and if you visit our main page it now looks slightly different.
One of our aims was to showcase the CORE testimonials more clearly, i.e. what others think of the project and how the community uses our products, mainly our API and Datasets. In an effort to give credit to the universities and companies that are using our services, such as our Recommender and API, we are now displaying their logos on our main page. The final new item is our research partners; CORE could not offer some of its services without co-operating with other projects, such as IRUS-UK, RIOXX and more.
* This post was authored by Matteo Cancellieri, Petr Knoth and Nancy Pontika.
Last month, CORE attended the JISC ORCID hackday events in Birmingham and London. (ORCID is a non-profit organisation that aims to solve the author disambiguation problem by offering unique author identifiers.) Following the discussions sparked at the two events, we decided to test the CORE data against ORCID's API, and we discovered some information that we think is of interest to the scholarly community.
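One handy property when matching records against ORCID is that every ORCID iD carries a built-in check digit (ISO 7064 MOD 11-2), so malformed iDs can be filtered out before any API call. Below is a minimal sketch; the function names are ours, and the public API base URL reflects ORCID's current documentation, so treat it as an assumption to verify:

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check digit for the first 15 digits of an ORCID iD."""
    total = 0
    for d in base_digits:
        total = (total + int(d)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_valid_orcid(orcid):
    """Validate an ORCID iD given in the usual 0000-0002-1825-0097 form."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    return orcid_check_digit(digits[:15]) == digits[15]

def record_url(orcid):
    """Public ORCID API record endpoint for an iD (base URL per ORCID's docs; verify)."""
    return "https://pub.orcid.org/v3.0/%s/record" % orcid

# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_valid_orcid("0000-0002-1825-0097"))
```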
* Post updated on June 20th and June 23rd with links to presentations.
At this year’s Open Repositories 2016, an international conference aimed at the scholarly communications community with a focus on repositories, open access, open data and open science, CORE had 6 items accepted: 1 paper, 1 workshop, 1 Repository Rave presentation, 1 poster and 2 showcases in the Developer Track and Ideas Challenge. The titles and summaries of our accepted proposals are:
Paper: Exploring Semantometrics: full text-based research evaluation for open repositories / Knoth, Petr; Herrmannova, Drahomira
Over recent years, there has been growing interest in developing new scientometric measures that go beyond the traditional citation-based bibliometric measures. This interest is motivated on one side by the wider availability, or even emergence, of new information evidencing research performance, such as article downloads, views and Twitter mentions, and on the other side by the continued frustrations and problems surrounding the application of citation-based metrics to evaluate research performance in practice. Semantometrics are a new class of research evaluation metrics which build on the premise that full text is needed to assess the value of a publication. This talk will present the results of an investigation into the properties of the semantometric contribution measure (Knoth & Herrmannova, 2014). We will provide a comparative evaluation of the contribution measure with traditional bibliometric measures based on citation counting. Our analysis also focuses on the potential application of semantometric measures in large databases of research papers.
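The contribution measure itself is defined in Knoth & Herrmannova (2014); as a loose illustration of the underlying intuition (a valuable publication bridges semantically distant work), one can average a full-text distance between the set of papers a publication cites and the set of papers that cite it. The distance function and normalisation below are our simplifications for illustration, not the published formula:

```python
from collections import Counter
from itertools import product
from math import sqrt

def cosine_distance(text_a, text_b):
    """1 - cosine similarity over simple word counts (a stand-in for a full-text distance)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def contribution(cited_texts, citing_texts):
    """Average pairwise distance between the cited set A and the citing set B."""
    pairs = list(product(cited_texts, citing_texts))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)
```

Identical full texts on both sides yield a contribution near 0, while completely disjoint vocabularies yield 1, matching the intuition that bridging distant areas scores highest.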
Back in March, we announced the beta release of the CORE API v2. This added new features such as searching via DOI and retrieving citations for full texts.
This new API should be more reliable and have a higher quality metadata output compared to the old version.
Over the next few months, we aim to finalise the API v2 and finally close access to v1. The scheduled date of the v1 switch off is Monday, 25th April 2016.
We hope that most users have already had an opportunity to test v2 of the API but if not, we suggest that you check out the documentation here.
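As a quick illustration of the DOI search feature, a fielded query can be sent to the v2 articles search endpoint. The endpoint path, query syntax and response shape below follow the v2 documentation as we recall it, so treat them as assumptions and check the docs before relying on them:

```python
import json
from urllib.parse import quote

API_BASE = "https://core.ac.uk/api-v2"  # base path; verify against the v2 docs

def doi_search_url(doi, api_key):
    """Build an articles search URL for a fielded DOI query (sketch, not a guaranteed contract)."""
    query = 'doi:"%s"' % doi
    return "%s/articles/search/%s?apiKey=%s" % (API_BASE, quote(query, safe=""), api_key)

def first_title(response_text):
    """Pull the first article title out of a v2-style JSON response (status + data array)."""
    payload = json.loads(response_text)
    data = payload.get("data") or []
    return data[0].get("title") if data else None

# A made-up response in the general shape the v2 API returns.
SAMPLE = '{"status": "OK", "data": [{"title": "An Example Paper", "doi": "10.1000/example"}]}'
print(first_title(SAMPLE))
```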
The CORE (COnnecting REpositories) project aims to aggregate open access research outputs from open repositories and open journals, and make them available for dissemination via its search engine. The project indexes metadata records and harvests the full text of the outputs, provided they are stored in PDF format and are openly available. Currently CORE hosts around 24 million open access articles from 5,488 open access journals and 679 repositories.
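The metadata harvesting happens over OAI-PMH: a harvester issues a `ListRecords` request (e.g. `http://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc`, a hypothetical endpoint) and walks the Dublin Core records in the response. A minimal sketch of the parsing step, using a trimmed-down sample response:

```python
import xml.etree.ElementTree as ET

# Namespaces used by OAI-PMH responses with Dublin Core metadata.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def parse_list_records(xml_text):
    """Extract (identifier, title) pairs from an OAI-PMH ListRecords response."""
    records = []
    for record in ET.fromstring(xml_text).iter(OAI + "record"):
        identifier = record.find(OAI + "header").findtext(OAI + "identifier")
        title = record.find(".//" + DC + "title")
        records.append((identifier, title.text if title is not None else None))
    return records

# A trimmed-down sample response, as a repository might return it.
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Open Access Article</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(parse_list_records(SAMPLE))
```

A real harvester would additionally follow the `resumptionToken` elements that OAI-PMH uses to page through large repositories.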
As in any partnership, the harvesting process is a two-way relationship, where the content provider and the aggregator need to be able to communicate and have a mutual understanding. For successful harvesting it is recommended that content providers apply the following best practices (some of the recommendations relate to harvesting in general, while others are CORE specific):
In an effort to improve the quality and transparency of harvesting open access content, and to create a two-way collaboration between the CORE project and the providers of this content, CORE is introducing the Repositories Dashboard. The aim of the Dashboard is to provide an online interface for repository providers and offer, through this interface, valuable information to content providers about:
- the content harvested from the repository enabling its management, such as by requesting metadata updates or managing take-down requests,
- the times and frequency of content harvesting, including all detected technical issues and suggestions for improving the efficiency of harvesting and the quality of metadata, including compliance with existing metadata guidelines,
- statistics regarding the repository content, such as the distribution of content according to subject fields and types of research outputs, and the comparison of these with the national average.
In the CORE Dashboard there is a designated page for every institution, where repository managers will be able to add all the information that corresponds to their own repository, such as the institution’s logo, the repository name and email address.
The University of London Computer Centre is now using the CORE Plugin for cross-repository recommendation of similar documents.
KMi and the European Library/Europeana jointly organised the 1st International Workshop on Mining Scientific Publications, associated with JCDL 2012 – the most prestigious conference in the world of digital libraries. The workshop was attended by major players in the field, including the National Library of Medicine, Library of Congress, CiteSeerX, Elsevier and the British Library. Although Barack in the end didn’t come, the workshop was very successful, the only problem being the lack of chairs in the room. We (the workshop organisers – Petr Knoth, KMi; Zdenek Zdrahal, KMi; and Andreas Juffinger, The European Library/Europeana) were encouraged by the community’s positive response to the issues researchers face when mining research publications to improve the way research is carried out and evaluated.