Digital Medieval Data Curation CLIR Postdoctoral Fellowship Seminar Bryn Mawr, 2013 Benjamin Albritton, Stanford University Libraries [email protected] @bla222

Upload: blalbritton

Post on 18-Jul-2015


Page 1: Digital Medieval Data Curation

Digital Medieval Data Curation

CLIR Postdoctoral Fellowship Seminar, Bryn Mawr, 2013

Benjamin Albritton, Stanford University Libraries

[email protected] @bla222

Page 2: Digital Medieval Data Curation

Current State: A World of Silos

• Roman de la Rose

• Parker on the Web

• e-codices

• And so on…

Page 3: Digital Medieval Data Curation

Data Interoperability

• Break down silos

• Separate data from applications

• Share data models and programming interfaces

• Enable interactions at the tool and repository level

Page 4: Digital Medieval Data Curation

Designing Modular Repositories and Tools

[Diagram. Labels: Image Data (Canonical); Non-image data (Canonical); Image Viewer; Discovery; Annotation; Transcription; Image Analysis; Tool X?; Repository; User Interface; 3rd-Party Tools]

Page 5: Digital Medieval Data Curation

Designing Modular Repositories and Tools

[Diagram. Labels: Image Data (Canonical); Non-image data (Canonical); Image Viewer; Discovery; Annotation; Transcription; Image Analysis; Tool X?; Repository; User Interface; 3rd-Party Tools]

Page 6: Digital Medieval Data Curation

Designing Modular Repositories and Tools

[Diagram. Labels: Image Data (Canonical); Non-image data (Canonical); Image Viewer; Discovery; Annotation; Transcription; Image Analysis; Tool X?]

Page 7: Digital Medieval Data Curation

Iterative Interactions

Page 8: Digital Medieval Data Curation

Multiple Data Sources

• Existing structured data (catalogs)

• User-added

– Comments

– Transcriptions

– Etc.

• Digital images

• Machine processing

Page 9: Digital Medieval Data Curation

Motivating Questions

What does this mean for medieval data?

• How do we rethink medieval object data in a shared, distributed, global space?

• How do we enable collaboration and encourage engagement?

• How do we deal with tools that are producing new data on digital surrogates that are implicitly about a real world object?

Page 10: Digital Medieval Data Curation

Transcribing from Digital Surrogates

La Terre de Secille

Page 11: Digital Medieval Data Curation

Naïve Approach: Attach Transcription to Image

One problem example: Multiple Representations

CCC 26 f. iiiR

Page 12: Digital Medieval Data Curation

Naïve Approach: Attach Transcription to Image

One problem example: Multiple Representations

CCC 26 f. iiiR; Fold A Open

Page 13: Digital Medieval Data Curation

Naïve Approach: Attach Transcription to Image

One problem example: Multiple Representations

CCC 26 f. iiiR; Fold A Open; Fold A and B Open

Page 14: Digital Medieval Data Curation

Naïve Approach: Attach Transcription to Image

One problem example: Multiple Representations

CCC 26 f. iiiR; Fold A Open; Fold A and B Open; f. iiiV

Page 15: Digital Medieval Data Curation

The Shared Canvas

• Represents a real world thing we want to “talk” about

• Has a unique name

– http://dms-data.stanford.edu/Parker/CCC026/canvas-12

Page 16: Digital Medieval Data Curation

Data Model: SharedCanvas

http://www.shared-canvas.org

Page 17: Digital Medieval Data Curation

Data is “about” a real thing

Page 18: Digital Medieval Data Curation

Canvas Paradigm

• A Canvas is an empty space in which to build up a display

• Makes explicit that the image is a surrogate
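A minimal sketch of the canvas idea, assuming a canvas is simply a named rectangle in abstract units onto which any number of image surrogates can be mapped. The names `Canvas`, `ImageSurrogate`, and `image_to_canvas`, the example.org image URIs, and all dimensions are hypothetical and are not taken from the SharedCanvas specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Canvas:
    """An empty, named coordinate space standing in for the physical page."""
    uri: str
    width: int   # abstract canvas units, not pixels
    height: int


@dataclass(frozen=True)
class ImageSurrogate:
    """One digital image of the page; many surrogates may share a canvas."""
    uri: str
    pixel_width: int
    pixel_height: int


def image_to_canvas(canvas, image, x, y):
    """Map a pixel position in a full-page image to canvas coordinates."""
    return (x * canvas.width / image.pixel_width,
            y * canvas.height / image.pixel_height)


# Canvas name from the slides; the 1000 x 1500 size is an invented example.
canvas = Canvas("http://dms-data.stanford.edu/Parker/CCC026/canvas-12", 1000, 1500)

# Two hypothetical surrogates of the same page at different resolutions.
low_res = ImageSurrogate("http://example.org/img/low.jpg", 500, 750)
high_res = ImageSurrogate("http://example.org/img/high.jpg", 4000, 6000)

# The same point on the page lands on the same canvas coordinates,
# whichever surrogate it was measured on.
point_low = image_to_canvas(canvas, low_res, 250, 375)
point_high = image_to_canvas(canvas, high_res, 2000, 3000)
```

Because data is addressed to the canvas rather than to a particular image file, swapping one surrogate for another does not invalidate it.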

Page 19: Digital Medieval Data Curation

Open Annotation Model

• Annotation (a document)

• Body (the ‘comment’ of the annotation)

• Target (the resource the Body is ‘about’)
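The three parts listed above can be sketched as a small record type. This is an illustration only: the `Annotation` class and the example.org URIs are hypothetical, and the flat dictionary layout is an assumption made for readability. Only the `oa:` property names are borrowed from the Open Annotation vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A document that says: this body is 'about' that target."""
    uri: str
    body: str     # URI of the comment (the Body)
    target: str   # URI of the resource the Body is about (the Target)

    def to_dict(self):
        # Property names follow the Open Annotation vocabulary (oa:);
        # the flat dict layout itself is only illustrative.
        return {
            "@id": self.uri,
            "@type": "oa:Annotation",
            "oa:hasBody": self.body,
            "oa:hasTarget": self.target,
        }


note = Annotation(
    uri="http://example.org/anno/1",            # hypothetical
    body="http://example.org/comments/1.txt",   # hypothetical
    target="http://dms-data.stanford.edu/Parker/CCC026/canvas-12",
)
record = note.to_dict()
```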

Page 20: Digital Medieval Data Curation

Model: Annotations to Paint Canvas

• The Canvas represents the empty page

• Annotation links Image with Canvas

Page 21: Digital Medieval Data Curation

Model: Annotations to Paint Canvas

• Annotation links Text with Canvas
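A rough sketch of how a viewer might "paint" a canvas: gather every annotation whose target is the canvas, whether its body is an image or a piece of text. It assumes that a region of the canvas is addressed with an `#xywh=` media fragment on the target URI. The function names, the example.org URIs, and the sample text are hypothetical.

```python
from urllib.parse import urldefrag

CANVAS = "http://dms-data.stanford.edu/Parker/CCC026/canvas-12"

# (body, target) pairs; every body here is a hypothetical placeholder.
annotations = [
    ("http://example.org/images/f3r.jpg", CANVAS),
    ("http://example.org/images/f3r-fold-a-open.jpg", CANVAS),
    ("text: first transcribed line", CANVAS + "#xywh=100,200,600,40"),
    ("http://example.org/images/other.jpg", "http://example.org/canvas-99"),
]


def parse_region(fragment):
    """Turn 'xywh=x,y,w,h' into a tuple of ints; None means the whole canvas."""
    if not fragment.startswith("xywh="):
        return None
    return tuple(int(n) for n in fragment[len("xywh="):].split(","))


def paint(canvas_uri, annos):
    """Collect everything that should be displayed on one canvas."""
    layers = []
    for body, target in annos:
        base, fragment = urldefrag(target)
        if base == canvas_uri:
            layers.append((body, parse_region(fragment)))
    return layers


layers = paint(CANVAS, annotations)
```

Two images of the same folio (for example, with and without a fold opened) both target the one canvas, so a transcription attached to the canvas stays valid for either image.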

Page 22: Digital Medieval Data Curation

Model: Annotations to Paint Canvas

Page 23: Digital Medieval Data Curation

Model: Missing Pages

Page 24: Digital Medieval Data Curation

Medieval Data Use-Cases: A Sampler

• Structured data from existing sources

• Transcription and glyphs

• Structured data from new sources

Page 25: Digital Medieval Data Curation

Structured Data from Existing Sources

A Catalog of the Manuscripts of Salisbury Cathedral Library

Page 26: Digital Medieval Data Curation

Drives Discovery

Page 27: Digital Medieval Data Curation

Transcription: T-PEN (Saint Louis University)

http://t-pen.org

• Transcription tool

• Provides image parsing

– Columns

BNF fr. 9221 – column parsing

Page 28: Digital Medieval Data Curation

T-PEN (Saint Louis University)

http://t-pen.org

• Transcription tool

• Provides image parsing

– Columns

– Lines

BNF fr. 9221 – line parsing
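The slides do not say how T-PEN finds lines. As an illustration of what line parsing involves, here is a sketch of one common technique, a horizontal projection profile, which treats any run of rows containing ink as a text line. This is an assumption for illustration, not T-PEN's algorithm; `find_lines`, `min_ink`, and the tiny bitmap are invented.

```python
def find_lines(bitmap, min_ink=1):
    """Return (top, bottom) row ranges, bottom exclusive, that contain ink.

    bitmap is a list of rows; each row is a list of 0 (blank) or 1 (ink).
    """
    lines = []
    start = None
    for y, row in enumerate(bitmap):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # the line ended on the previous row
            start = None
    if start is not None:                  # ink runs to the bottom edge
        lines.append((start, len(bitmap)))
    return lines


# A toy 5-row "page": two inked bands separated by a blank row.
page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
]
detected = find_lines(page)
```

The same idea applied to columns of pixels instead of rows gives a crude column detector; real manuscript pages usually need more robust methods.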

Page 29: Digital Medieval Data Curation

T-PEN (Saint Louis University)

http://t-pen.org

BNF fr. 9221 – transcription view

Page 30: Digital Medieval Data Curation

Drives Full-Text Search

http://t-pen.org/TPEN
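To show why line-level transcription makes full-text search possible, here is a minimal inverted-index sketch. It assumes each transcribed line is stored with its manuscript, folio, and line number. The sample lines, shelfmarks, and function names are invented, and nothing here reflects how T-PEN implements search.

```python
from collections import defaultdict

# Hypothetical transcription lines keyed by (manuscript, folio, line number).
transcriptions = {
    ("MS A", "1r", 1): "la terre de secille",
    ("MS A", "1r", 2): "et la terre de france",
    ("MS B", "4v", 7): "roi de secille",
}


def build_index(lines):
    """Map each word to the set of locations where it occurs."""
    index = defaultdict(set)
    for location, text in lines.items():
        for word in text.lower().split():
            index[word].add(location)
    return index


def search(index, query):
    """Return locations whose line contains every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return hits


index = build_index(transcriptions)
results = search(index, "terre secille")
```

Because each hit carries a folio and line number, a result can link straight back to the region of the page image it was transcribed from.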

Page 32: Digital Medieval Data Curation

T-PEN’s PaleoTool

BNF fr. 1586 – glyph parsing

Page 33: Digital Medieval Data Curation

Results for “matching” glyphs

Page 34: Digital Medieval Data Curation

Glyphs with multiple letters

Page 35: Digital Medieval Data Curation

Comparing results across manuscripts

BNF fr. 1586; CCCC 324

Page 36: Digital Medieval Data Curation

User-created Structured Data

Beinecke MS 310, f. 1r

• Each row = 1 day (January 1, here)

• Lists the feast of the Circumcision

• Optionally provides additional information
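The row structure described above could be captured as one record per calendar day. This is a sketch under assumptions: the `KalendarEntry` class and its field names are hypothetical and do not come from the project's actual schema.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class KalendarEntry:
    """One row of a manuscript calendar: a single day and its feast."""
    manuscript: str
    folio: str
    month: int
    day: int
    feast: str
    note: Optional[str] = None   # the optional additional information


# The row shown on the slide: January 1, feast of the Circumcision.
entry = KalendarEntry("Beinecke MS 310", "1r", 1, 1, "Circumcision")

# A plain dict is easy to serialize and hand to a front-end.
as_record = asdict(entry)
```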

Page 37: Digital Medieval Data Curation

Distributed Resources / Distributed Environments

Page 38: Digital Medieval Data Curation

Data capture in T-PEN

http://t-pen.org – Saint Louis University

Page 39: Digital Medieval Data Curation

Front-end: Exhibit

http://guillaumedemachaut.com/kalendar/sharedkalendar.html

Simple (really simple) Exhibit based on kalendar transcriptions

(Exhibit: http://www.simile-widgets.org/exhibit/)

Page 40: Digital Medieval Data Curation

For each record:

Page 41: Digital Medieval Data Curation

Enabling rapid comparison

Two mss. include the entry “Thimotheus apostel”
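Once calendar rows are structured records, a comparison of this kind reduces to grouping by entry text. A minimal sketch, assuming records of the form (manuscript, month, day, entry): the manuscript names, dates, and the non-matching entries are invented placeholders, and matching is by exact text after lowercasing, which would miss spelling variants.

```python
from collections import defaultdict

# (manuscript, month, day, entry) records; all values are invented placeholders
# except the entry text quoted on the slide.
records = [
    ("MS A", 1, 24, "Thimotheus apostel"),
    ("MS B", 1, 24, "Thimotheus apostel"),
    ("MS C", 1, 24, "Another entry"),
    ("MS A", 1, 1, "Circumcision"),
]


def shared_entries(rows, min_mss=2):
    """Return entries found in at least min_mss different manuscripts."""
    by_entry = defaultdict(set)
    for manuscript, _month, _day, text in rows:
        by_entry[text.strip().lower()].add(manuscript)
    return {text: mss for text, mss in by_entry.items() if len(mss) >= min_mss}


shared = shared_entries(records)
```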

Page 42: Digital Medieval Data Curation

Distributed Resources / Distributed Environments

Page 43: Digital Medieval Data Curation

SharedCanvas Demo Implementation

http://www.shared-canvas.org/impl/demodh

Page 44: Digital Medieval Data Curation

SharedCanvas Demo Implementation

http://www.shared-canvas.org/impl/demodh

Page 45: Digital Medieval Data Curation

SharedCanvas Demo Implementation

http://www.shared-canvas.org/impl/demodh

Page 46: Digital Medieval Data Curation

A Sea of Manuscript Data

• Thousands of manuscripts currently available interoperably, with more coming rapidly

• Discovery data is a mixed bag

• Tools provide data back into the system that can be re-used

• New data drives new discovery, new interfaces, and new visualization challenges

• Management and manipulation of that “wild” data is a serious challenge