Jenny O'Neill 'Librarian as Databrarian'


Post on 15-Jul-2015


TRANSCRIPT

Slide 1

Jenny O'Neill
DRI Data Curator, Trinity College Dublin

joneill@tchpc.tcd.ie

Librarian as Databrarian

ASL2015

Digital Repository of Ireland
Data Curator
1641 Depositions

Slide 2

Mission

DRI is a trusted digital repository for Humanities and Social Sciences Data

DRI links and preserves the rich data held by Irish institutions, providing a central access point and multimedia tools

First, I would like to give a brief introduction to the DRI. The DRI is a trusted digital repository for Humanities and Social Sciences data.

The DRI is funded under the HEA PRTLI 5 program and is built by a research consortium of six academic partners working together to deliver the repository, policies, guidelines and training. These are the RIA, TCD, NUIM, DIT, NUIG and NCAD. The DRI now consists of 36 people across 6 sites and has partners in the cultural and academic sectors, as well as in industry.

Slide 3

Cheeky plug!

Slide 4

Data Curator

This position will involve assembling and curating diverse data sets from the humanities and social sciences for ingestion into the Digital Repository of Ireland

But also
Metadata Taskforce
Workflows
Organisational Liaison
Linked data factsheet
Metadata guidelines for MARC
Feature champion for Ingestion

I started working in the Trinity Centre for High Performance Computing in June of this year. Don't ask me what High Performance Computing is; despite working there for four months, I still have no idea. The sentence above is taken from the ad for my job. When I went back to the ad the day before I started, I realised this was the only line describing what I would actually be doing.

There are lots of other areas of DRI to get involved in. My advice to any new librarians is to get stuck in: if you see a gap that your skills can fill, jump in. (Mention each of these briefly but don't go into detail.)

Slide 5

1641 Depositions

The 1641 Depositions are witness testimonies from all social backgrounds, concerning their experiences of the 1641 Irish rebellion.

Digitised images of the Depositions (TIFF)
Transcriptions encoded in TEI (XML)

My first dataset: images and text, 9,000+ of each, already digitised and transcribed. The project is finished. The images are sitting on a server in TCHPC along with the TEI/XML files. Some metadata is in the TEI header, and a MySQL database holds the rest. My job is to create Qualified Dublin Core records for each digital object (18,000+ objects).
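Since some of the metadata lives in the TEI header, a small script can pull it out for inspection. The following is only a sketch: the sample document is invented and heavily shortened, the function name `header_title` is hypothetical, and it assumes the files use the TEI P5 namespace, which may not be true of the real transcriptions.

```python
import xml.etree.ElementTree as ET

# Invented, heavily shortened TEI file for illustration only.
# The real 1641 Depositions transcriptions are not reproduced here.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Deposition of an example deponent</title>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text><body><p>Transcribed text goes here.</p></body></text>
</TEI>"""

# Assumption: the files declare the TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def header_title(tei_xml):
    """Return the title found in the TEI header, or None if there is none."""
    root = ET.fromstring(tei_xml)
    node = root.find("./tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS)
    if node is None or not node.text:
        return None
    return node.text.strip()


print(header_title(SAMPLE_TEI))
```

The same pattern extends to any other header element once you know where the project stored it.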

Slide 6

Step 1 Explore the database

https://class.stanford.edu/courses/Home/Databases/Engineering/about

Slide 7

Step 1
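Exploring an unfamiliar database usually starts with three questions: which tables exist, which columns they have, and how many rows they hold. The sketch below shows that routine against an in-memory SQLite database so it runs anywhere. It is a stand-in, not the project's MySQL schema: the table name `deposition` and the sample rows are invented, and only the column names are borrowed from the talk. In MySQL itself the equivalent statements are `SHOW TABLES` and `DESCRIBE table_name`.

```python
import sqlite3

# Stand-in database: an in-memory SQLite table, NOT the project's MySQL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deposition ("
    "deposition_id INTEGER PRIMARY KEY, title TEXT, day TEXT, month TEXT, year TEXT)"
)
conn.executemany(
    "INSERT INTO deposition (title, day, month, year) VALUES (?, ?, ?, ?)",
    [
        ("Example deposition A", "3", "1", "1642"),
        ("Example deposition B", " 14", "11", "1641"),
    ],
)


def list_tables(connection):
    """Return the names of all tables in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def list_columns(connection, table):
    """Return the column names of one table, in declaration order."""
    # PRAGMA statements cannot take bound parameters, so only pass
    # table names that came from list_tables().
    rows = connection.execute(f'PRAGMA table_info("{table}")')
    return [row[1] for row in rows]


def count_rows(connection, table):
    """Return the number of rows in one table."""
    return connection.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


for table in list_tables(conn):
    print(table, list_columns(conn, table), count_rows(conn, table))
```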

Slide 8

Step 1

Slide 9

Step 1

Slide 10

Step 1

Slide 11

Step 2 Choose a metadata standard

Simple Dublin Core
Qualified Dublin Core
EAD
MODS
MARC21 encoded as MARCXML

Slide 12

Step 2

http://www.dri.ie/publications#guidelines

Slide 13

Step 3 Map to QDC

Mandatory in DRI: Title, Creator, Created, Description, Rights

Optional in DRI: Identifier, Modified

Recommended in DRI: Contributor, Language, Source, Spatial, Temporal, Type, Relation
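One way to keep the mandatory list honest while mapping is to build each record as a simple dictionary and check it before going any further. This is a sketch, not DRI's own tooling. The function names are hypothetical, the creator is simplified to surname and forename, and `created` is assumed to be an already cleaned date string. Description and rights are passed in by hand because the talk's column mapping does not name a source for them.

```python
# The five elements listed as mandatory in DRI.
MANDATORY = ("title", "creator", "created", "description", "rights")


def row_to_qdc(row, description, rights):
    """Build a flat dict of QDC elements from one database row (a dict)."""
    # Simplification: only surname and forename are combined here.
    name_parts = [row.get("surname"), row.get("forename")]
    return {
        "title": row.get("title"),
        "creator": ", ".join(part for part in name_parts if part),
        "created": row.get("created"),  # assumed to be cleaned already
        "description": description,
        "rights": rights,
        "identifier": row.get("deposition_id"),
    }


def missing_mandatory(record):
    """Return the mandatory element names that are empty or absent."""
    return [name for name in MANDATORY if not record.get(name)]


# Invented sample row for illustration.
row = {
    "deposition_id": 1,
    "title": "Example deposition",
    "surname": "Doe",
    "forename": "Jane",
    "created": "1642-01-03",
}
record = row_to_qdc(row, description="", rights="Example rights statement")
print(missing_mandatory(record))  # prints ['description']
```

Running the check over every row before generating any XML gives a short list of records to fix, rather than a failed ingest later.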

Slide 14

Step 3

Title: title
Creator: surname, forename, patronymics, age, person_type_desc
Created: day, month, year
Source: manuscript_number, folio_start, folio_end, page
Subject: gender_desc, nationality_desc, religion_desc
Identifier: deposition_id

Slide 15

Step 4 Clean the metadata

http://openrefine.org/
http://www.dri.ie/publications#guidelines

Slide 16

Step 4

ISO 8601 = YYYY-MM-DD

Slide 17

Step 4

cells["year"].value + "-" + cells["month"].value + "-" + cells["day"].value

First I removed any white space. Next I added leading zeros to the cells that needed two digits. Then I merged the three columns so that the date is now in the correct format.

Slide 18

Step 4
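The same three date-cleaning steps (trim white space, pad to two digits, merge year, month and day) can be sketched outside OpenRefine. This is an illustration, not the workflow used in the project. The function name `to_iso_date` is hypothetical, and the final validity check is an extra that the talk does not describe.

```python
from datetime import date


def to_iso_date(year, month, day):
    """Join separate year, month and day values into YYYY-MM-DD."""
    # Step 1: remove any surrounding white space.
    y, m, d = (str(value).strip() for value in (year, month, day))
    # Step 2: add leading zeros where month or day has a single digit.
    m, d = m.zfill(2), d.zfill(2)
    # Step 3: merge the three values into one ISO 8601 string.
    iso = f"{y}-{m}-{d}"
    # Extra check (not in the talk): raises ValueError for impossible dates.
    date.fromisoformat(iso)
    return iso


print(to_iso_date("1642", " 1", "3 "))  # prints 1642-01-03
```

Rows with a missing or impossible date raise `ValueError`, which makes them easy to collect and review by hand.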

Slide 19

Next steps

Create relationships between objects
Create QDC XML
https://www.utsc.utoronto.ca/digitalscholarship/content/blogs/converting-spreadsheets-modsxml-using-open-refine
Ingest into DRI
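For the "Create QDC XML" step, one option besides an OpenRefine export template is a short script. The sketch below writes one record using the standard Dublin Core namespaces. The root element name `qualifieddc`, the function name and the sample values are assumptions for illustration. The DRI guidelines define what the repository actually expects, so check the output against them before ingesting anything.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)

# Element name -> namespace it belongs to.
ELEMENTS = {
    "title": DC,
    "creator": DC,
    "created": DCTERMS,
    "source": DC,
    "subject": DC,
    "identifier": DC,
}


def record_to_xml(record):
    """Serialise one record (a dict of strings or lists of strings) as XML."""
    root = ET.Element("qualifieddc")  # hypothetical root element name
    for name, namespace in ELEMENTS.items():
        values = record.get(name, [])
        if isinstance(values, str):
            values = [values]
        # Repeatable elements (such as subject) get one child per value.
        for value in values:
            child = ET.SubElement(root, f"{{{namespace}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample record for illustration.
example = {
    "title": "Example deposition",
    "creator": "Doe, Jane",
    "created": "1642-01-03",
    "subject": ["Example subject A", "Example subject B"],
    "identifier": "1",
}
print(record_to_xml(example))
```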

THANK YOU!
