There are many reasons why a repository may end up with multiple copies of an article. A common scenario of near-duplicate content is holding both the author’s original manuscript and the final post-review copy. Another is when multiple co-authors deposit the same manuscript without being aware of each other. Detecting (near-)duplicates and distinguishing them from different versions of the same article is both challenging and time-consuming. We have seen that a typical repository will have hundreds of duplicate and near-duplicate records, which indicates the scale of this issue.
CORE + GROBID: Structured Text from 34 Million Scientific Documents (and counting)
We very recently surveyed our CORE members to ask what was most important to them and we received wide-ranging feedback. The CORE dashboard provides a range of tools for our data providers and their repository managers and users. Much of the feedback we received was regarding providing additional or enhanced tools for managing repository content via the dashboard. For example, metadata validation and enrichment tools were regarded as highly important.
Interestingly, however, what was most important was making repository content machine-readable. This is closely linked to identifying funding information and rights-retention strategies. Ensuring content is machine-readable allows for the extraction of far richer information from full-text documents than that available in the metadata alone. In the U.S., the recent OSTP memo on ‘Ensuring Free, Immediate, and Equitable Access to Federally Funded Research’ includes machine-readability as a required component of the archiving and deposition of federally funded research.
Asking CORE members what matters to them…
We recently held the inaugural meeting of the CORE Board of Supporters, where we were joined by 32 representatives from the organisations that have committed to supporting the ongoing sustainability of CORE by joining our membership programme.
These amazing institutions are critical to the survival of CORE and we’re incredibly grateful for the support they provide us.
Current CORE members
We work with our members as part of our commitment to The Principles of Open Scholarly Infrastructure (POSI). By listening to our members, we can understand precisely what is most important to them. Prior to this kickoff meeting, we therefore sent a wide-ranging survey to gauge what really matters to our members’ repositories, their users and the staff that manage them.
CORE published in Nature Scientific Data
We’re proud and excited to announce that the paper authored by our team, entitled ‘CORE: A Global Aggregation Service for Open Access Papers’, was accepted for publication and is now available as an open access article via Nature.com.
This paper is the culmination of work by the whole CORE team, with contributions from team members both past and present. It discusses how CORE has grown from a research project initiated by Dr. Petr Knoth in 2010 to the service it is today, serving over 30 million unique users each month. The paper also elaborates on the continuously growing CORE dataset and details the systematic challenges associated with gathering research papers from thousands of data providers worldwide at an unprecedented scale and the novel solutions developed to address these challenges.
Update on Delivering the CORE Membership Programme
We’re keen to update you with the latest developments as we continue to welcome more CORE Members and keep improving the tools and support for members while delivering on our mission to index all open access research worldwide. In March, we welcomed another six new institutions that have joined CORE as Supporting and Sustaining members: University of Exeter, Cardiff University, Manchester Metropolitan University, University of Hull, University of Nottingham and University of Strathclyde. A huge thank you goes out to all of these amazing folks!
CORE-GPT: Combining Open Access research and AI for credible, trustworthy question answering
Update 6th July 2023 – Our paper entitled “CORE-GPT: Combining Open Access research and large language models for credible, trustworthy question answering” has been accepted to TPDL 2023 and will be published in the LNCS series by Springer.
The public release of ChatGPT in November last year captured the public’s imagination and turned this technology into front-page news overnight. Only this week we saw the release of its much more powerful sibling, GPT-4. In just a few short weeks there have already been some frankly startling demonstrations of the capabilities of these models, from writing poetry to code completion, amongst many others.
CORE welcomes 10 new members
As part of our ongoing sustainability plan, in December 2022 we launched the CORE Membership programme for data providers. CORE is a not-for-profit service dedicated to the open access mission and one of the signatories of the Principles of Open Scholarly Infrastructure (POSI). Following the recently announced changes to our status, and to remain free for public use, CORE is adopting a membership model to help sustain its operations.
We are therefore delighted to announce today that, in the very short time since the membership programme launched, we have already welcomed ten institutions that have made a public and financial commitment to supporting Open Access infrastructure by becoming Supporting or Sustaining members of CORE.
CORE to become an independent Open Access service from August 2023
Jisc and The Open University have had a long-standing relationship delivering CORE (core.ac.uk) for over 10 years. During this time, the service has grown from a project to an important and widely used Open Research infrastructure.
The current Jisc – OU contract for delivering CORE to the open scholarly community is expiring in July 2023. From this time onwards CORE will be operated by The Open University and will no longer receive direct funding from Jisc. The Open University is grateful to Jisc for its support of CORE over the last ten years.
CORE runner-up at Open University Research Excellence Awards 2022 
The annual OU Research Excellence Awards highlight the diverse research undertaken at the OU and recognise the impact that this research has for the economy, the environment, and society as a whole. More than 250 Open University (OU) staff, students, funders and partners came together in London on the 22nd September for this year’s awards.
We are extremely pleased that the CORE team was announced as runner-up in the highly contested ‘Best External Collaboration and Knowledge Exchange Award’. The award recognises its knowledge exchange activities with a diverse range of external partners, including innovators, AI technology companies, digital library providers, plagiarism detection providers, academic social networks, funders and others, as well as its partnership of over a decade with Jisc, the digital solutions provider for UK education and research. CORE received a small financial award to support research activities from the OU in recognition of this achievement.
CORE Membership – launching soon!
CORE (core.ac.uk), a not-for-profit service delivered by The Open University in partnership with Jisc, has been serving the scholarly community since 2011 and in that time has experienced phenomenal growth in every way. CORE collates Open Access research from over 10,500 data providers across the world and is now the largest collection of open access research literature. Over 30 million users each month access CORE, either via search or one of our services. We have also worked hard to develop services for our data providers and support them with tools to help better manage the content in their repositories, including improving discoverability, registering unique persistent identifiers, enriching content with data such as missing DOIs and helping monitor that their content remains compliant with Open Access policies and mandates.