implementing durham etheses - sebastian palucha (pecha kucha)

Upload: repository-fringe

Post on 26-Jan-2015

DESCRIPTION

Pecha Kucha slides on Durham University's experience of implementing their Etheses system, presented by Sebastian Palucha, on Friday 2nd August 2013 at Repository Fringe 2013.

TRANSCRIPT

Page 1: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Implementing Durham E-Theses

Presented by Sebastian Palucha, #rfringe13

CC BY jitze http://www.flickr.com/photos/jitze1942/3521700792

Page 2: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Durham E-Theses

Initial project spring/summer 2009

• First deposit: September 2009
• ~300 research theses per year
• Simple deposit, single PDF
• EThOS interoperability
• EPrints 3.1.3 (born 2009)

CC BY didbygraham http://www.flickr.com/photos/didbygraham/5646920685/

Page 3: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Key milestones

• Registered: EThOS, DRIVER, OCLC Digital Gateway (spr. 2010)
• EThOS harvest in operation (sum. 2010)
• Google Analytics stats (Dec. 2010)
• EThOS digitised theses loaded (sum. 2011)
• Google Custom Search (aut. 2011)
• Collaboration with the BL to improve EThOS services (aut. 2011 – spr. 2012)
• Local digitisation project, 10k theses (spr. 2012 – )
• Creative Commons Licences introduced (aut. 2012)
• MySQL migrated to UTF-8 (spr. 2013)
• EU/ICO Cookie Law support (sum. 2013)

CC BY AlishaV http://www.flickr.com/photos/alishav/3156574283

Page 4: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Branding: uniform user experience

• Issues: browsers, branding changes
• Durham University CMS CSS
• EPrints 3 CSS

Page 5: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Simple single-PDF deposit

• Details > Upload > Deposit
• LDAP integration + user field population
• Embargo implemented in first screen

CC BY Pink Sherbet Photography http://www.flickr.com/photos/pinksherbet/236299644

Page 6: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Cover pages

• Highly customized LaTeX code
• Issues with UTF-8 in both LaTeX and the plugin
• Issues with dynamic if/else

Page 7: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Google Analytics: full text downloads

• Two steps:
  1. PDF download link (core code)
  2. Special GA profile
• URL structure includes department codes: ?DDD32
• Internal code modification
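As a rough illustration of the department-code idea above, the following Python sketch appends a bare query string such as `?DDD32` to a PDF download link so that an analytics profile could group downloads by department. The host name, path and function names are assumptions made for this example; the actual change was made inside the EPrints core code and is not shown in the slides.

```python
from urllib.parse import urlsplit, urlunsplit


def tag_download_url(pdf_url: str, dept_code: str) -> str:
    """Return pdf_url with the department code as a bare query string."""
    parts = urlsplit(pdf_url)
    # Replace any existing query with the department code, drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, dept_code, ""))


def department_of(tagged_url: str) -> str:
    """Read the department code back from a tagged download URL."""
    return urlsplit(tagged_url).query


# Hypothetical thesis URL, used only for illustration.
tagged = tag_download_url("http://etheses.example.ac.uk/123/1/thesis.pdf", "DDD32")
print(tagged)
print(department_of(tagged))
```

Keeping the code in the query string leaves the file path untouched, so the same PDF is served with or without the tag.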

Page 8: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

EThOS interoperability through OAI-PMH harvest

• Issues with the out-of-the-box plugin; changes to the XML schema needed
• uketdterms:qualificationlevel not defined in the EPrints data model
• Embargo date not included. The plugin assumes embargo at record level, whereas EPrints sets it at document level!
• Added department names
• Occasional issues with UTF-8 encoding
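The record-level versus document-level embargo mismatch can be sketched in Python as follows. This is only one possible mapping, assuming the harvester should see the latest embargo date that is still active on any attached document; the `Document` class and field names are hypothetical and are not taken from EPrints or the EThOS plugin.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Document:
    filename: str
    embargo_until: Optional[date] = None  # None means no embargo


def record_level_embargo(documents: List[Document], today: date) -> Optional[date]:
    """Collapse per-document embargoes into a single record-level date.

    Assumption: the record counts as embargoed until the latest
    still-active document embargo expires.
    """
    active = [
        d.embargo_until
        for d in documents
        if d.embargo_until is not None and d.embargo_until > today
    ]
    return max(active) if active else None


docs = [
    Document("thesis.pdf", date(2014, 1, 1)),
    Document("appendix.pdf"),
]
print(record_level_embargo(docs, today=date(2013, 8, 2)))
```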

Page 9: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

EThOS download WS

• Script for mass download: https://github.com/paluchas/ethos-bl

groovy EthosDownloadClient.groovy -i 238830 -m download

Page 10: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

EThOS: avoiding duplication

• We store EThOS persistent IDs
• We modified the /cgi/oai2 script to conditionally exclude EThOS records
• Modified records can be exposed to the EThOS harvest in future
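The conditional exclusion described above can be sketched in Python. The real change was made to the EPrints /cgi/oai2 script, which is not shown in the slides; the field names `ethos_id` and `expose_to_ethos` below are hypothetical and only illustrate the filtering logic.

```python
from typing import Dict, List


def records_for_ethos_harvest(records: List[Dict]) -> List[Dict]:
    """Drop records that originated from EThOS unless explicitly re-exposed.

    Assumptions: a stored EThOS persistent ID marks a record that EThOS
    already holds, and a boolean flag re-exposes a locally modified record.
    """
    return [
        r
        for r in records
        if not r.get("ethos_id") or r.get("expose_to_ethos", False)
    ]


sample = [
    {"id": 1, "title": "Born-digital thesis"},
    {"id": 2, "title": "Digitised by EThOS", "ethos_id": "238830"},
    {"id": 3, "title": "Corrected locally", "ethos_id": "238831", "expose_to_ethos": True},
]
print([r["id"] for r in records_for_ethos_harvest(sample)])
```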

Page 11: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

UTF-8 issues

• Unknown copy/paste issues
• Seen in: OAI-PMH, cover pages (LaTeX), abstract pages
• Solution: code modification and migration of the whole MySQL database to UTF-8; fortunately the fault was double encoding

CC BY familymwr http://www.flickr.com/photos/familymwr/5548057120
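Double encoding is usually reversible, which is probably why it counts as fortunate here. The Python sketch below shows the general repair under the assumption that UTF-8 bytes were misread as Latin-1 and then stored as UTF-8 again. It is not the migration procedure used at Durham, and MySQL's `latin1` character set is closer to Windows-1252, so a real migration may need a different intermediate codec.

```python
def repair_double_encoding(text: str) -> str:
    """Undo one round of UTF-8-read-as-Latin-1 double encoding.

    Text that does not look double encoded is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in Latin-1, or the bytes are not valid UTF-8:
        # treat the text as already correct.
        return text


damaged = "caf\u00c3\u00a9"  # 'café' after double encoding
print(repair_double_encoding(damaged))
```

Pure ASCII passes through unchanged, and correctly encoded accented text fails the UTF-8 decode step and is left alone.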

Page 12: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Creative Commons Licences

• Approached by a student with a specific query about a particular CC licence to be used
• A lot of redefinition in code

Page 13: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

CC outreach

Page 14: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Better search, DRO integration

• Google Custom Search with modified search results

Page 15: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Retrospective digitisation project

• 10k paper theses being digitised by a local company
• Mass upload with metadata in an XML file and digitised material in PDF files (web and archive versions). A lot of metadata and quality issues
• Interesting samples of other materials: big prints, DVDs, CDs, cassette tapes, microfilms, small datasets and research software.
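A pre-upload check of the kind that helps with such metadata and quality issues might look like the Python sketch below. The XML layout (`batch`, `thesis`, `title`, `author`, `year`, `file`) and the file names are invented for this example; the supplier's actual format is not described in the slides.

```python
import xml.etree.ElementTree as ET
from typing import List, Set

REQUIRED_FIELDS = ("title", "author", "year")  # hypothetical required fields


def check_batch(xml_text: str, pdf_names: Set[str]) -> List[str]:
    """Return a list of problems found in one digitisation batch."""
    problems = []
    root = ET.fromstring(xml_text)
    for number, thesis in enumerate(root.findall("thesis"), start=1):
        # Every record needs non-empty core metadata.
        for field in REQUIRED_FIELDS:
            element = thesis.find(field)
            if element is None or not (element.text or "").strip():
                problems.append(f"record {number}: missing {field}")
        # Every referenced PDF (web or archive version) must be delivered.
        for file_element in thesis.findall("file"):
            name = (file_element.text or "").strip()
            if name not in pdf_names:
                problems.append(f"record {number}: file not found: {name}")
    return problems


batch = """<batch>
  <thesis><title>T</title><author>A</author><year>1975</year>
    <file>t1_web.pdf</file><file>t1_archive.pdf</file></thesis>
  <thesis><title></title><author>B</author><file>t2_web.pdf</file></thesis>
</batch>"""
for problem in check_batch(batch, {"t1_web.pdf", "t1_archive.pdf"}):
    print(problem)
```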

Page 16: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

EU/ICO Cookies Law

CC BY USAG-Humphreys http://www.flickr.com/photos/31687107@N07/6206906748

Page 17: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Repository versus real life

• Users would like to deposit files other than PDFs.
• Requests for "dark" storage
• Encrypted PDFs
• Take-down requests and web-cached content: how far should we liaise with the external world?
• Some students are not aware of the consequences of web deposits: 3rd-party copyright, sensitive data not embargoed, etc.
• Disciplinary differences; not only humanities vs. sciences.
• External users requesting contact with authors or supervisors

Page 18: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Sustainability

• Operational: virtualization, operating system support, database
• Customization: bespoke changes and technology deficit
• Support: hard to coordinate across the University departments

CC BY Rennett Stowe http://www.flickr.com/photos/tomsaint/4515448425

Page 19: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Future plans

• Review process: be paper free, include pass list, extend workflow to exam board
• Actively encourage students to use CC licences by demonstrating their benefits
• Encourage deposit of key data sets and explore data visualization
• Migrate to a new repository framework
• Integration with Durham University RIS
• Google Analytics live stats, integration with IRUS-UK

CC BY Boston Public Library http://www.flickr.com/photos/boston_public_library/8902381985/

Page 20: Implementing Durham Etheses - Sebastian Palucha (Pecha Kucha)

Repository of the future

CC BY Keoni Cabral http://www.flickr.com/photos/52193570@N04/7069578953