Technical Standards

The CORE system currently relies on the following technologies (this blog post will be updated to keep the information current):

– OCLC OAIHarvester2 – a set of Java classes for OAI-PMH metadata harvesting
– J2EE and Spring libraries – for the development of the application's web-based interface
– Apache Lucene – for the indexing of the metadata and full-text documents
– Apache Tika – for the extraction of text from PDF documents
– Sesame – as a triple store for exposing the extracted triples
– MySQL – as a backend for Sesame and the Harvester application
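
To make the harvesting step more concrete, here is a minimal sketch of the OAI-PMH requests a harvester issues. It is written in Python for brevity and is not CORE's code (CORE uses the Java OAIHarvester2 classes listed above). The function names and the example.org endpoint are hypothetical; the `ListRecords` verb, the `oai_dc` metadata prefix and the `resumptionToken` mechanism come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    When continuing a partial list, resumptionToken is an exclusive
    argument: it is sent with the verb only, without metadataPrefix.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def next_resumption_token(response_xml):
    """Return the resumption token in a response, or None if the list is complete."""
    root = ET.fromstring(response_xml)
    element = root.find(".//" + OAI_NS + "resumptionToken")
    if element is None or not (element.text or "").strip():
        return None
    return element.text.strip()
```

A harvester would fetch the first URL, read the token from the response, and repeat with that token until `next_resumption_token` returns `None`.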

The standards and protocols used in CORE include OAI-PMH for harvesting, and RDF and OWL for representing the generated data.
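
As an illustration of what representing data in RDF looks like, the sketch below serialises a single triple in the N-Triples syntax, which a triple store can load. It is only an example under stated assumptions: the function name, the example.org document IRI and the choice of the Dublin Core `title` property are hypothetical, since this post does not say which vocabularies CORE uses.

```python
def to_ntriple(subject_iri, predicate_iri, literal):
    """Serialise one triple with a plain string literal as an N-Triples line.

    Hypothetical helper: it only handles IRIs for subject and predicate
    and a plain literal for the object.
    """
    # Escape the characters that N-Triples requires to be escaped in literals.
    escaped = (
        literal.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'<{subject_iri}> <{predicate_iri}> "{escaped}" .'


# Assumed example: a document title expressed with the Dublin Core terms vocabulary.
line = to_ntriple(
    "http://example.org/document/1",
    "http://purl.org/dc/terms/title",
    "An example title",
)
print(line)
```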