The COnnecting REpositories (CORE) project aims to facilitate access to, and navigation across, relevant scientific papers stored in Open Access repositories. The project will make available a new open metadata repository in the Linked Data format, describing the semantic relatedness between resources stored across a selection of UK repositories, including the Open University's Open Research Online (ORO). A resource discovery web-service and a demonstrator client will be provided so that UK repositories can easily direct their users to relevant Open Access content stored in other repositories. The usefulness of this service will be demonstrated on the ORO repository, which will provide navigation links to related content in other repositories. CORE will also focus on developing good practice for reuse and uptake of the service, in collaboration with OpenDOAR. The resulting metadata repository will be released under a Creative Commons license.
Open Access scientific content is today distributed across more than 1,700 quality-checked Open Access repositories, of which about 168 are based in the UK. Despite some efforts, commercial academic search engines, such as Google Scholar or Vascoda, do not differentiate between Open Access and subscription-based content. Therefore, users' information needs are satisfied only if the links resulting from scientific searches lead to the full-text versions of articles, access to which is covered by a subscription paid by the researcher’s library or institute.
Users searching only for Open Access content typically have to submit queries to a number of relevant Open Access repositories or use systems that harvest metadata from multiple sources. Even when relevant content is found, institutional repositories are unable to provide information about, or navigate users to, semantically related Open Access content stored in other repositories. Because Open Access content is distributed, navigation across repositories is difficult for users, which gives commercial publishers with very large scholarly databases, such as IEEE, Elsevier or Springer, a competitive advantage.
The CORE project aims to facilitate access and navigation to relevant scientific papers distributed across Open Access institutional repositories.
CORE will:
Release a new open metadata collection in the Linked Data format describing the semantic relations between resources stored across a selection of UK institutional repositories. The project will assign dereferenceable URIs to all resources in the collection and will make them publicly available.
Develop a web-service reusable by other Open Access repositories and a demonstrator tool for the Open Research Online (ORO) repository.
Develop good practice for the uptake of the provided repository and service in collaboration with the Directory of Open Access Repositories (OpenDOAR) and UKOLN.
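As a rough illustration of the first item in the list above, the sketch below mints one URI per resource and serializes relatedness links as N-Triples, a simple Linked Data format. The base URI `http://example.org/core/resource/`, the helper names `resource_uri` and `to_ntriples`, and the use of the Dublin Core `dcterms:relation` property are assumptions made for illustration only; the project description does not specify a URI scheme or a vocabulary.

```python
from urllib.parse import quote

# Hypothetical base URI; the real CORE URI scheme is not given in the text.
BASE_URI = "http://example.org/core/resource/"

# Dublin Core "relation" property, chosen here only as a generic example.
DCTERMS_RELATION = "http://purl.org/dc/terms/relation"


def resource_uri(repository_id, record_id):
    """Build a URI for one resource from its repository and record identifiers."""
    return f"{BASE_URI}{quote(repository_id, safe='')}/{quote(record_id, safe='')}"


def to_ntriples(relations):
    """Serialize (subject URI, object URI) pairs as N-Triples, one triple per line."""
    return "\n".join(
        f"<{subject}> <{DCTERMS_RELATION}> <{obj}> ." for subject, obj in relations
    )


if __name__ == "__main__":
    paper_a = resource_uri("oro", "12345")
    paper_b = resource_uri("another-repository", "67890")
    print(to_ntriples([(paper_a, paper_b)]))
```

For the URIs to be dereferenceable, a server would also have to answer requests for each URI with a description of the resource; that part is outside this sketch.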
The CORE project will contribute to the realization of the Resource Discovery Taskforce (RDTF) Vision. The project will produce a new open metadata repository containing information about relationships between distributed resources. This will be achieved by generating new metadata and by processing both the full-text and the metadata already provided by existing aggregators and repositories. Please see Section 2.4 for a quantitative estimate of the metadata that will be generated by the project. In particular, the project envisages the use of metadata harvested by the previously funded JISC project RepUK or UK Institutional Repository Search commissioned by JISC in partnership with MIMAS, UKOLN and SHERPA.
CORE is currently available as a demo only; the full application will be released at the end of July 2011.
The target users of the CORE open metadata repository will be institutions that administer repositories. The proposed solution will make it possible for them to offer better browsing and navigation capabilities to their visitors. Institutional repositories will be able to link easily and dynamically to related content stored in distributed repositories by using services built on top of the CORE open metadata repository.
The CORE project will create and expose new open metadata reusable by other UK institutional repositories. The new metadata repository will be a complementary service to currently existing aggregations and metadata harvesting systems relying on the OAI-PMH protocol. It will provide access to relationship metadata in the Linked Data format and via a web-service that will be implemented as a part of the project. This will allow more effective ways of resource discovery and will improve the access to resources for research and learning.
The Open University supports releasing the resource relationships open metadata in order to improve the service and user experience of the visitors of the Open Research Online (ORO) repository. ORO is regularly visited by 30,000 to 50,000 unique users every month. It provides access to about 15,000 peer-reviewed Open Access research papers. The CORE demonstrator tool will improve the browsing and navigation capabilities of ORO.
The metadata about ORO resources are available using the OAI-PMH protocol and also as Linked Data through the work of the JISC funded LUCERO project. The objectives of LUCERO and CORE differ as follows: LUCERO aims to expose metadata about Open University resources as Linked Data, whereas CORE aims to provide metadata about relationships between resources stored in repositories owned and maintained by different institutions. While the target audience for LUCERO is services interested in reusing data published by the Open University, the target audience for CORE is any Open Access repository interested in improving the browsing and navigation capabilities available to its users.
Open Access repositories will be encouraged to take part in and benefit from the project. They will only be required to submit their OAI-PMH base URL, which will be used for metadata harvesting. Once metadata and content have been harvested and URIs assigned in line with the government recommendations, repositories will be able to take advantage of the “related resource” web-service. The project will also develop a demonstrator widget which repositories will be able to add easily to their web pages (in a similar way to, for example, adding a Google map). The previously funded JISC project UK IRS provides some similar functionality, but does not allow repositories to embed navigation tools directly into their systems, nor does it allow access to the metadata through a web-service or by querying Linked Data.
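To show why an OAI-PMH base URL is enough to start harvesting, here is a minimal sketch of how a harvester might build `ListRecords` requests and read the Dublin Core records in a response. The function names and the example base URL are hypothetical, and this is not the CORE harvester itself. The verb, the `oai_dc` metadata prefix, the resumption token and the XML namespaces come from the OAI-PMH 2.0 protocol. The sketch leaves out the HTTP request, error responses and deleted records.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords request URL for a repository's OAI-PMH base URL."""
    if resumption_token:
        # A resumption token is an exclusive argument: no metadataPrefix with it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, next_token) extracted from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS),
            "title": record.findtext(".//dc:title", default="", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    # An absent or empty token means the list is complete.
    return records, (token.strip() or None)
```

A harvester would fetch `list_records_url(base_url)`, parse the response, and repeat with the returned token until it is `None`.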
The project will create a new open metadata repository about resource relationships by processing the metadata and full-texts of the existing Open Access resources. The CORE project will collaborate with OpenDOAR and UKOLN in developing and sharing good practice in Open Access metadata discovery.
The technology for the automatic discovery of semantically related content has been developed and successfully applied as a part of the EU supported project Eurogene. That project has created a large multilingual multimedia repository for learning resources in the domain of human genetics, submitted by a network of more than 30 European Universities. The Eurogene team at KMi has strong research experience and a record of high profile research publications in this field, and possesses ready-to-use technology for full-text processing and for linking research articles and learning materials. In CORE, related content will be discovered by using automatic term extraction and shallow parsing techniques to enrich the metadata, and then applying link discovery techniques.
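The general idea of extracting terms from text and then discovering links between documents can be sketched with a much simpler stand-in than the Eurogene technology, whose details are not given here. The example below uses plain tokenization instead of shallow parsing, weights terms with TF-IDF, and links documents whose cosine similarity exceeds a threshold. The stopword list, the function names and the threshold of 0.2 are all illustrative assumptions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on", "is", "are", "with"}


def extract_terms(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Map each document id to a sparse {term: weight} TF-IDF vector."""
    counts = {doc_id: Counter(extract_terms(text)) for doc_id, text in docs.items()}
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    n_docs = len(docs)
    return {
        doc_id: {
            # Smoothed inverse document frequency, always positive.
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, tf in term_counts.items()
        }
        for doc_id, term_counts in counts.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def discover_links(docs, threshold=0.2):
    """Return (id_a, id_b, score) for document pairs at or above the threshold."""
    vectors = tfidf_vectors(docs)
    ids = sorted(vectors)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score >= threshold:
                links.append((a, b, score))
    return sorted(links, key=lambda link: -link[2])
```

Comparing every pair of documents is quadratic in the collection size, so a harvest spanning many repositories would need an index or an approximate nearest-neighbour method instead of this exhaustive loop.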
The newly generated metadata will be made available in the Linked Data format under a Creative Commons or PDDL license and will be regularly updated. The frequency and policy for updates will be developed as a part of the good practice by the project team in collaboration with the Advisory Board. A web-service will be made available to support other institutional repositories in using the provided metadata. Repositories will also be able to download the metadata or to link to them.
Zdenek Zdrahal (Project Director) is a Senior Research Fellow at the Knowledge Media Institute. His research interests include knowledge modelling and management, reasoning, knowledge-based systems in learning, engineering design, and Web technology. He has been PI in a number of national and European projects, including TINY-IN, SILVER (EPSRC), Clockwork, Cipher, Eurogene, Tech-IT-Easy (EU funded).
Owen Stephens (Project Manager) is the Project Manager for the CORE project. He joined the Open University in 2009. He is currently Project Manager for the JISC funded LUCERO project and was previously Project Manager for the JISC funded TELSTAR (Technology enhanced learning supporting students to achieve academic rigour) project delivered at the Open University. Owen also works as an independent consultant to the library sector. He has been on the management team of the library services of two leading UK Universities, and has been responsible for a number of innovative projects at both institutional and national levels. Owen was Project Director for the EThOSNet project to launch a national e-theses service based at the British Library, and is the founder of the ‘Mashed Libraries’ events in the UK.
Petr Knoth is a researcher in KMi focusing on various topics in natural language processing and information retrieval. His particular interests lie in methods that can automatically link related parts of documents in large digital libraries and semantically type the relationships based on discourse characteristics. He has been involved in four European Commission funded projects (KiWi, Eurogene, Tech-IT-EASY and DECIPHER) and has a number of publications at international conferences based on this work.
Annika Wolff has worked on several KMi projects. She was the main researcher on the MGT, Tiny-in and SILVER projects and has a number of published papers resulting from this work. Her research interests include knowledge modeling and narrative hypermedia with a particular interest in using narrative to support inquiry-based learning from multimedia resources.
Bill Hubbard (OpenDOAR, SHERPA) is the Head of the Centre for Research Communications at the University of Nottingham, which houses work on a portfolio of open access projects, including the SHERPA consortium, RoMEO, JULIET, OpenDOAR, RSP and activity in European projects like OpenAIRE, NECOBELAC and DART-Europe. Bill speaks widely on open access and related issues – repository network development, institutional integration, cultural change, IPR and policy development and is currently the JISC Research Communications Strategist.
Non Scantlebury (OU Library), BA (Hons) PGCE DipLib, is Head of Research and Innovation at The Open University Library Services. Non has contributed to many internally and externally funded projects based at the OU, including the JISC funded DeVIL, PROWE and TELSTAR projects. She is currently on the project team of the LUCERO project and her interests are in digital libraries, resource discovery and services to elearners, teachers and researchers.
Paul Walk (UKOLN) has worked predominately in Higher Education in the UK since the early nineties. Paul has been involved in many community and standards activities, notably with JA-SIG, the International DOI Foundation and IMS Enterprise, and was one of the founders of the XCRI specification for course descriptions which has been adopted widely across Europe. Paul joined UKOLN at the University of Bath in 2006 and became Deputy Director in 2010. He has guided the development of UKOLN as a JISC Innovation Support Centre and the founding of a sustainable community of developers in HE through the JISC-funded DevCSI project. Paul is an advisor to the Resource Discovery Taskforce on technical matters.

