What we have produced:

Software tools

Publication:

Knoth, P., Robotka, V. and Zdrahal, Z. (2011) Connecting Repositories in the Open Access Domain using Text Mining and Semantic Data, International Conference on Theory and Practice of Digital Libraries 2011 (TPDL 2011), Berlin, Germany

Poster presentation:

Knoth, P. and Zdrahal, Z. (2011) CORE: Connecting Repositories in the Open Access Domain, CERN workshop on Innovations in Scholarly Communication (OAI7), Geneva, Switzerland

YouTube video presentation:

http://www.youtube.com/watch?v=_YuOJnjCEAA&feature=player_embedded

Linked Data in Libraries event (London) presentation:

http://www.slideshare.net/petrknoth/core-presentation-8593721?from=ss_embed

Next steps:

  • Find ways to develop CORE further so that it can include larger amounts of content, i.e. aggregate content from more repositories.
  • Integrate CORE with the currently emerging research data management and repository systems to allow publications to be linked with data.
  • Disseminate the service further to increase its adoption.

    Evidence of Reuse:

    • Data and services currently being reused by the Open Research Online Repository.
    • Positive feedback received from participants of the OAI7 workshop, namely Astrid van Wesenbeeck (SPARC Europe).
    • Positive feedback about CORE received by email from Graham Steel in response to the upload of the CORE video to YouTube.
    • Our team discovered a set of OAI-PMH base URLs that were out of date in the OpenDOAR repository and reported them to OpenDOAR. Bill Hubbard of OpenDOAR appreciated this collaboration.

    Skills:

    The project has helped us to further develop skills needed to technically handle large amounts of data. It also increased our understanding of the current state-of-the-art technologies for access and retrieval of Open Access content. These skills will help us to further develop CORE in the future.

    Most significant lessons:

    • Although OAI-PMH harvesting is considered messy by the digital library community, harvesting and processing full-text content is an order of magnitude more difficult.
    • Do not use Java tools for thumbnail generation; use ImageMagick instead.
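
    The thumbnail lesson can be illustrated with a minimal sketch of calling ImageMagick as an external process. This is not the CORE implementation. It assumes the input is a PDF whose first page becomes the thumbnail, that the legacy `convert` executable is on the PATH (ImageMagick 7 installs it as `magick`), and that ImageMagick can read PDFs, which typically requires Ghostscript. The function names and the default size of 200 pixels are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def thumbnail_command(source: Path, target: Path, max_px: int = 200) -> list:
    """Build the ImageMagick argument list for a first-page thumbnail.

    All names and defaults here are illustrative assumptions.
    """
    return [
        "convert",                           # assumed executable name
        f"{source}[0]",                      # "[0]" selects the first page
        "-thumbnail", f"{max_px}x{max_px}",  # fit within a max_px square
        "-background", "white",
        "-flatten",                          # composite onto the white background
        str(target),
    ]


def make_thumbnail(source: Path, target: Path, max_px: int = 200) -> bool:
    """Run ImageMagick; return False if the executable is not installed."""
    if shutil.which("convert") is None:
        return False
    # check=True raises CalledProcessError if ImageMagick reports a failure.
    subprocess.run(thumbnail_command(source, target, max_px), check=True)
    return True
```

    Keeping the command construction separate from its execution means the arguments can be inspected or logged without ImageMagick being installed, for example `thumbnail_command(Path("paper.pdf"), Path("paper.png"))`.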