Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories Dr. Nancy Pontika & Dr. Petr Knoth COnnecting Repositories (CORE) Open University, UK LIBER 2015, 24 – 26 June, London

Upload: nancy-pontika

Post on 03-Aug-2015


Page 1: Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories

Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories

Dr. Nancy Pontika & Dr. Petr Knoth

COnnecting Repositories (CORE)

Open University, UK

LIBER 2015, 24 – 26 June, London

Page 2

Mission of CORE

Aggregate all open access content distributed across different systems worldwide, enrich this content and provide access to it through a set of services …

[Source: http://core.ac.uk/about#mission]

Page 3

Need for a UK aggregator

Bringing the UK’s open access research outputs together:
• Feasibility study commissioned by Jisc, published June 2014
• Referred to as “Open Mirror”

[Source: https://repository.jisc.ac.uk/5570/1/JISC_REPORT_open_mirror_090514_FINAL_WEB.pdf]

Page 4

Three levels of support

Programmable Data Access
- Services: CORE API, CORE Data Dumps
- Users: Researchers, Developers, Companies

Transaction Information Access
- Services: CORE Portal, CORE Mobile, CORE Plugin
- Users: Researchers, Students, Lifelong learners

Analytical Information Access
- Services: CORE Policy, CORE Compliance Analytics, CORE Dashboard
- Users: Funders, Governments, Data Providers

[Source: http://www.dlib.org/dlib/november12/knoth/11knoth.html]

Page 5

CORE Statistics
• Content: 20M+ records, 600+ repositories, 1.8M+ full-texts
• The UK national aggregator - Jisc
• Full-text aggregator (not just metadata)
• Placed among Top 10 search engines for research that go beyond Google [Jisc, 2013]
• Listed among Top 100 Thesis and Dissertation Resources
• Part of Jisc’s Repositories Shared Services Project (RSSP)

Page 6

Aggregation process
• Metadata download, extraction and cleaning
• Full-text harvesting
• Text extraction
• Language detection
• Extraction of citation references from text
• Identification of related content
• Detection of duplicate items
• Parsing of author names
• Indexing
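
Two of the steps listed above, parsing of author names and detection of duplicate items, can be illustrated with a short sketch. This is not CORE's implementation: the functions `parse_author`, `title_key` and `find_duplicates`, the record layout (`id`, `title`) and the choice of exact matching on a normalised title are all assumptions made for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def parse_author(raw: str) -> tuple[str, str]:
    """Split an author string into (family, given) names.

    Handles "Family, Given" and "Given Family" only; suffixes and
    multi-word family names are out of scope for this sketch.
    """
    raw = " ".join(raw.split())  # collapse runs of whitespace
    if "," in raw:
        family, given = (part.strip() for part in raw.split(",", 1))
    else:
        given, _, family = raw.rpartition(" ")
    return family, given


def title_key(title: str) -> str:
    """Normalise a title so trivially different copies compare equal."""
    text = unicodedata.normalize("NFKD", title).casefold()
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def find_duplicates(records: list[dict]) -> list[list[str]]:
    """Group record ids whose normalised titles are identical (assumed rule)."""
    groups = defaultdict(list)
    for record in records:
        groups[title_key(record["title"])].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]
```

Exact matching on a normalised key is the simplest possible rule; a production aggregator would likely also compare authors, years or full-text fingerprints before merging records.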

Page 7

CORE Applications
• CORE Portal
  – Search engine providing open access content
• CORE Mobile
  – Android and iOS apps
• CORE Plugin
  – For repositories and journals
• CORE API
  – Programmable access to millions of resources
• CORE Dashboard
  – Tool for repository managers

Page 8

CORE Dashboard : purpose

• Harvested Records

• Metadata

• Harvesting Process

• Standards

• Repository Managers

• Funders

• Repositories

• Journals

Data Providers

Collaboration

Quality

Transparency

Page 9

Institution main page

Page 10

Edit repository information

Page 11

Invitations

Page 12

Content

Page 13

Manage record visibility status

Take down

Page 14

Manage record visibility status

Take down

Page 15

Manage record visibility status

Take down

Take up

Page 16

Manage record visibility status

Take down

Take up

Page 17

Update metadata records

• Asynchronous process
• Item is queued in the CORE system
• Record is updated within 12 hours
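
The queued, asynchronous update described above can be sketched as follows. Only the 12-hour window comes from the slide; the `UpdateQueue` and `UpdateRequest` names, the FIFO ordering and the in-memory storage are assumptions, not a description of how the CORE system is built.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UPDATE_WINDOW = timedelta(hours=12)  # from the slide: updated within 12 hours


@dataclass
class UpdateRequest:
    record_id: str
    requested_at: datetime

    @property
    def due_by(self) -> datetime:
        """Latest time at which the update should have been applied."""
        return self.requested_at + UPDATE_WINDOW


class UpdateQueue:
    """Hypothetical FIFO queue of pending metadata updates."""

    def __init__(self) -> None:
        self._pending = deque()  # holds UpdateRequest objects, oldest first

    def request_update(self, record_id: str, now: datetime) -> UpdateRequest:
        """Queue the record and return at once; nothing is re-harvested here."""
        request = UpdateRequest(record_id, now)
        self._pending.append(request)
        return request

    def process_next(self, now: datetime) -> Optional[tuple[str, bool]]:
        """Pop the oldest request and report whether it met the window."""
        if not self._pending:
            return None
        request = self._pending.popleft()
        return request.record_id, now <= request.due_by
```

The caller of `request_update` gets control back immediately, which is what makes the process asynchronous; a separate worker would call `process_next` on its own schedule.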

Page 18

Statistics

Page 19

Issues : 3 types

When harvesting your repository/document we encountered an error that we couldn't resolve. These errors need to be fixed in order to harvest your repository/document.

We encountered an error but we were still able to harvest the repository/document. We strongly recommend that these issues are resolved as they may lead to incompatibility problems in the future.

This may not be a problem but it may be a clue for misconfiguration or future incompatibilities.
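
The three descriptions above differ in two respects: whether harvesting succeeded, and whether an error was encountered along the way. A minimal sketch of that triage follows. The labels `ERROR`, `WARNING` and `NOTICE` and the function names are hypothetical, since the slide gives only the descriptions.

```python
from enum import Enum


class Severity(Enum):
    # Hypothetical labels; the slide shows only the three descriptions.
    ERROR = "could not harvest; must be fixed"
    WARNING = "harvested despite an error; fix strongly recommended"
    NOTICE = "possible misconfiguration or future incompatibility"


def classify(harvested: bool, error_encountered: bool) -> Severity:
    """Map the outcome of a harvesting attempt to one of the three types."""
    if not harvested:
        return Severity.ERROR
    if error_encountered:
        return Severity.WARNING
    return Severity.NOTICE


def summarise(issues: list[Severity]) -> dict[str, int]:
    """Count issues per type, e.g. for a dashboard overview."""
    counts = {severity.name: 0 for severity in Severity}
    for issue in issues:
        counts[issue.name] += 1
    return counts
```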

Page 20

Issues : good news

Page 21

Issues : good news

Page 22

Issues : bad news…

Page 23

Issues: Robots.txt

Page 24

Issues: Robots.txt
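
One way a robots.txt issue can arise is that a repository's crawler rules block the paths a harvester needs. The sketch below uses Python's standard `urllib.robotparser` to show the effect. The rules, the `ExampleHarvester` user agent and the `repo.example.org` URLs are invented for illustration and do not reflect any real repository or the user agent CORE uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: full texts are blocked for everyone
# except one explicitly allowed harvester.
ROBOTS_TXT = """\
User-agent: *
Disallow: /download/

User-agent: ExampleHarvester
Allow: /
"""


def can_harvest(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, `can_harvest(ROBOTS_TXT, "OtherBot", "https://repo.example.org/download/1.pdf")` is refused, while the same URL is allowed for `ExampleHarvester`. A repository that wants its full texts aggregated would need a rule of the second kind for the harvester in question.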

Page 25

Issues: Document Issues

Page 26

Issues: Malformed PDF url
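
A repository manager could screen PDF links for obvious defects before they reach a harvester. The following is a minimal sketch using the standard `urllib.parse` module. The function name, the particular checks and the example URLs are assumptions; they are not the rules the CORE Dashboard applies.

```python
from urllib.parse import urlsplit


def pdf_url_problems(url: str) -> list[str]:
    """List structural problems found in a full-text URL (empty if none)."""
    problems = []
    if url != url.strip() or " " in url:
        problems.append("contains whitespace")
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some inputs outright, e.g. unbalanced brackets
        return problems + ["cannot be parsed"]
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.netloc:
        problems.append("missing host")
    if not parts.path or parts.path == "/":
        problems.append("missing document path")
    return problems
```

For example, `pdf_url_problems("http:/repo.example.org/1.pdf")` reports a missing host, because the single slash after the scheme leaves the host name inside the path.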

Page 27

Dashboard benefits
- Increased and simplified collaboration between aggregators and content providers
- Improved control of the content provider over the harvested content
- Reduction of scepticism and fear of sharing content with other systems
- Improvement of the harvesting process
- Broader discoverability of open access content and thus more reuse of it, where permitted

Page 28

Would you like to take a look?

Dashboard still in BETA but we welcome volunteer testers

Email me at [email protected]

Page 29

Many thanks to…

CORE developers:
• Matteo Cancellieri
• Samuel Pearce
• Drahomira Herrmannova
• Lucas Anastasiou

Volunteer testers:
• Chris Biggs, Metadata & Repository Specialist, Open University
• Nick Sheppard, Repository Developer, Leeds Beckett University

Page 30

Thank you

Questions

CORE Contacts:
Nancy Pontika [email protected]
Petr Knoth [email protected]
Website: http://core.ac.uk
Twitter: @oacore