The ENRICH project and TEI customisation

TEI @ Oxford, 21.07.2009 Dias 1 Nordisk Forskningsinstitut The ENRICH project and TEI customisation M. J. Driscoll Den Arnamagnæanske Samling Nordisk Forskningsinstitut

Page 1: The ENRICH project  and TEI customisation

TEI @ Oxford, 21.07.2009 Dias 1

Nordisk Forskningsinstitut

The ENRICH project and TEI customisation

M. J. Driscoll
Den Arnamagnæanske Samling
Nordisk Forskningsinstitut

Page 2: The ENRICH project  and TEI customisation

It began with a simple idea

What if you could look at all the catalogues of all the manuscript repositories everywhere in Europe at the same time?

Page 3: The ENRICH project  and TEI customisation

Studley Priory, 1996

In November 1996 a meeting was held at Studley Priory, near Oxford, organised by Peter Robinson of De Montfort University (UK). Representatives from a number of manuscript-holding institutions in Great Britain, France, Germany, the Netherlands, Denmark, the Czech Republic, Italy (the Vatican) and the USA met to discuss whether there was a need for an international standard for manuscript description and, if so, what form that standard should take.

Page 4: The ENRICH project  and TEI customisation

MASTER, 1999-2001

MASTER (Manuscript Access Through STandards for Manuscript Records) was an international project, funded by the EU, the goal of which was to establish a standard for the electronic cataloguing of manuscripts using SGML/XML.

Principal project members were:
• The Centre for Technology and the Arts at De Montfort University, Leicester (UK)
• Oxford University’s Humanities Computing Unit (UK)
• Koninklijke Bibliotheek, Den Haag (NL)
• L'Institut de recherche et d'histoire des textes, Paris (FR)
• Národní knihovna České republiky, Praha (CZ)
• The Arnamagnæan Institute, Copenhagen (DK)
• along with a number of associate partners from various places in Europe.

Page 5: The ENRICH project  and TEI customisation

The Text Encoding Initiative

The TEI is an international and interdisciplinary standards project established in 1987 to develop, maintain and promulgate hardware- and software-independent methods for encoding humanities data in electronic form.

The current version of the TEI Guidelines, TEI P5, released in November 2007, contains a major new chapter on manuscript description, based largely on the work of the MASTER project.

Page 6: The ENRICH project  and TEI customisation

Manuscriptorium

Manuscriptorium is a digital library of manuscripts and early printed books developed and maintained by the Czech National Library in Prague.

It is ‘a system for collecting and making accessible on the internet information on historical book resources, linked to a virtual library of digitised documents’.

The underlying metadata in Manuscriptorium is in ‘Master+’, which is standard Master with additional structural metadata.

http://www.manuscriptorium.com

Page 7: The ENRICH project  and TEI customisation

Towards a European digital library of manuscripts

The ENRICH project (2007-2009) is funded under the eContent+ programme. Its main aim is to create seamless access to distributed information on manuscripts and early printed books in Europe, based on the Manuscriptorium platform but implementing TEI P5.

http://enrich.manuscriptorium.com/

Page 8: The ENRICH project  and TEI customisation

ENRICH project partners

Národní knihovna České republiky, Praha (CZ)
AIP Beroun, s.r.o., Beroun (CZ)
Oxford University Computing Services (UK)
Centro per la comunicazione e l’integrazione dei media, Università degli Studi di Firenze (IT)
Matematikos ir informatikos institutas, Vilnius (LT)
SYSTRAN s.a., Paris (FR)
Biblioteca Nacional de España, Madrid (ES)
Biblioteca Nazionale Centrale di Firenze (IT)
Vilniaus universiteto biblioteka (LT)
Biblioteka Uniwersytecka we Wrocławiu (PL)
Stofnun Árna Magnússonar í íslenskum fræðum, Reykjavík (IS)
Universität zu Köln (DE)
Monasterium Projekt, Diözese St. Pölten (AT)
Landsbókasafn Íslands – Háskólabókasafn, Reykjavík (IS)
Budapesti Műszaki és Gazdaságtudományi Egyetem (HU)
Poznańskie Centrum Superkomputerowo-Sieciowe (PL)
Nordisk Forskningsinstitut, Københavns Universitet (DK)

Page 9: The ENRICH project  and TEI customisation

The ENRICH project and TEI P5

One of the project’s central work packages, WP3, deals with the ‘standardisation of shared metadata’. Its goal is ‘to ensure interoperability of the metadata used to describe all the shared resources by analysing the various standards used by different partners and ensuring their mapping to a single common format, which will be expressed in a way conformant with current standards.’

Page 10: The ENRICH project  and TEI customisation

ENRICH and TEI P5

• Differences between TEI P5 and Master+ were reviewed
• All differences were resolved, either by constraining Manuscriptorium practice, or by adapting P5 proposals
• Actual praxis in a wide sample (1000+) of existing manuscript description records in many formats
• Identified common core of practice, much smaller than potential of existing TEI schema

Page 11: The ENRICH project  and TEI customisation

What was done, II

• TEI P5 is designed to support a huge range of document types and encoding practices

• For ENRICH, a much more constrained subset was defined, reflecting actual practice

    • constraining value lists
    • making certain attributes obligatory
    • reducing structural choices
    • reducing scope for redundancy
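
As a rough illustration of what "constraining value lists" and "making certain attributes obligatory" mean for a record, the Python sketch below checks one attribute on one element. The choice of the `material` attribute on `<supportDesc>` and the four permitted values are assumptions made for the example; they are not quoted from the ENRICH schema, and in practice such checks would be carried out by validating against the generated schema rather than by hand-written code.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical closed value list, chosen for illustration only.
ALLOWED_MATERIAL = {"perg", "chart", "mixed", "unknown"}


def check_support_desc(xml_text):
    """Return a list of problems found on <supportDesc> elements."""
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter(f"{{{TEI_NS}}}supportDesc"):
        value = elem.get("material")
        if value is None:
            # The attribute is treated as obligatory in this sketch.
            problems.append("supportDesc: attribute 'material' is missing")
        elif value not in ALLOWED_MATERIAL:
            # The attribute is present but outside the closed list.
            problems.append(f"supportDesc: material='{value}' is not allowed")
    return problems


GOOD = (
    f'<physDesc xmlns="{TEI_NS}"><objectDesc>'
    '<supportDesc material="perg"/>'
    "</objectDesc></physDesc>"
)
BAD = (
    f'<physDesc xmlns="{TEI_NS}"><objectDesc>'
    '<supportDesc material="vellum"/><supportDesc/>'
    "</objectDesc></physDesc>"
)

if __name__ == "__main__":
    print(check_support_desc(GOOD))
    print(check_support_desc(BAD))
```

A record that passes a constrained check of this kind is still a valid record under the looser rule, which is why a subset can remain TEI-conformant.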

Page 12: The ENRICH project  and TEI customisation

What was done, III

• The ENRICH schema is formally defined using the TEI ODD system

• This XML vocabulary allows us to generate automatically:
    • full multilingual documentation
    • formal schemata in DTD, RelaxNG or W3C Schema

• Its TEI-conformance makes it accessible to many other projects

Page 13: The ENRICH project  and TEI customisation

Scope of the ENRICH schema

• The ENRICH schema provides a formal way of recording information about a manuscript resource, expressed in XML

• Such records can be managed and stored independently of the resources they describe

• It also provides a formal way of encoding in XML:
    • A detailed transcription of the resource
    • Information about images (etc) of the resource
    • Information about real-world entities associated with the resource, i.e. people, places and events

Page 14: The ENRICH project  and TEI customisation

Some challenges

• Synchronising ENRICH requirements with TEI P5
    • We worked closely with the TEI Council, which was revising the manuscript module at the same time
• Reaching consensus among partners

• We worked closely with AIP to ensure that Manuscriptorium was able to support the full complexity of TEI P5

• We were able to use the TEI I18N features to produce reference documentation in French, Italian, Spanish as well as English (other languages will follow)

Page 15: The ENRICH project  and TEI customisation

Outreach and training

• We have tested the ideas behind the ENRICH schema in many different training contexts

• We have produced a suite of training materials covering
    • Basic ideas of XML markup
    • TEI modules for metadata, basic document structure, manuscript description and transcription, persons and places, facsimiles, non-standard writing systems...

Page 16: The ENRICH project  and TEI customisation

Conversion tools

• We have developed a suite of XSLT stylesheets and associated workflows to convert between existing metadata formats and ENRICH

• So far we have worked with
    • MASTER (+)
    • EAD
    • MARC

• We are currently developing the ‘ENRICH Garage’ concept...
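
To give a feel for what one step of such a conversion involves, here is a minimal Python sketch that lifts a title from a MARCXML record into a TEI `<msItem>`. It is a stand-in for illustration only: the project's own tools are XSLT stylesheets, the function name `marc_title_to_ms_item` is hypothetical, and the mapping of field 245 subfield a to `<title>` is an assumption rather than the project's documented crosswalk. The sample record contains placeholder data.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Placeholder MARCXML record, not taken from any real catalogue.
SAMPLE_MARC = f"""<record xmlns="{MARC_NS}">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Placeholder title</subfield>
  </datafield>
</record>"""


def first_subfield(record, tag, code):
    """Return the text of the first matching subfield, or None."""
    for field in record.iter(f"{{{MARC_NS}}}datafield"):
        if field.get("tag") != tag:
            continue
        for sub in field.iter(f"{{{MARC_NS}}}subfield"):
            if sub.get("code") == code:
                return (sub.text or "").strip()
    return None


def marc_title_to_ms_item(marcxml_text):
    """Build a TEI <msItem> holding the MARC title (assumed mapping)."""
    record = ET.fromstring(marcxml_text)
    ms_item = ET.Element(f"{{{TEI_NS}}}msItem")
    title = first_subfield(record, "245", "a")
    if title:
        ET.SubElement(ms_item, f"{{{TEI_NS}}}title").text = title
    return ms_item


if __name__ == "__main__":
    item = marc_title_to_ms_item(SAMPLE_MARC)
    print(ET.tostring(item, encoding="unicode"))
```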

Page 17: The ENRICH project  and TEI customisation

How does ENRICH help?

ENRICH provides a system which facilitates:

• lossless conversion of existing data
• creation of completely new data
• integration of existing data from many different sources

It is based on open formats and open technologies

Page 18: The ENRICH project  and TEI customisation

The ENRICH conceptual model

• Each description deals with a particular object (not a class of objects)
• Each description is organised using the same possible set of components
• Major components:
    • an identification for the object
    • descriptions of the “intellectual content” of the object
    • a description of the object’s physical makeup
    • a description of the object’s history
• The description has a special place in the TEI model of the digital edition
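
The four major components listed above correspond naturally to four children of the TEI P5 `<msDesc>` element: `<msIdentifier>`, `<msContents>`, `<physDesc>` and `<history>`. The Python sketch below builds a bare skeleton in that order. The element names are those of the TEI manuscript description module; the function name `build_ms_desc`, its parameters and all sample values are hypothetical placeholders, and a real ENRICH record would carry considerably more detail.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialise without a prefix


def tei(tag):
    """Qualify a local element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_ms_desc(settlement, repository, idno, summary, physical, origin):
    """Build a minimal <msDesc> with the four major components."""
    ms_desc = ET.Element(tei("msDesc"))

    # 1. An identification for the object.
    ident = ET.SubElement(ms_desc, tei("msIdentifier"))
    ET.SubElement(ident, tei("settlement")).text = settlement
    ET.SubElement(ident, tei("repository")).text = repository
    ET.SubElement(ident, tei("idno")).text = idno

    # 2. A description of the intellectual content.
    contents = ET.SubElement(ms_desc, tei("msContents"))
    ET.SubElement(contents, tei("summary")).text = summary

    # 3. A description of the physical makeup.
    phys = ET.SubElement(ms_desc, tei("physDesc"))
    ET.SubElement(phys, tei("p")).text = physical

    # 4. A description of the object's history.
    history = ET.SubElement(ms_desc, tei("history"))
    ET.SubElement(history, tei("origin")).text = origin

    return ms_desc


if __name__ == "__main__":
    desc = build_ms_desc(
        "Example City", "Example Library", "MS Example 1",
        "Placeholder summary of contents.",
        "Placeholder physical description.",
        "Placeholder statement of origin.",
    )
    print(ET.tostring(desc, encoding="unicode"))
```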

Page 19: The ENRICH project  and TEI customisation

Result: The complete digital surrogate

• A collection of digital images representing the appearance of the manuscript
• An associated TEI Header containing a manuscript description
• An encoded transcription, optionally incorporating arbitrarily complex layers of scholarly interpretation and analysis
• An associated body of factual information about e.g. events, persons and places

XML provides the tools to represent all these and to link their components seamlessly together
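
One common way of linking transcription and images in TEI is to give each `<surface>` in the `<facsimile>` an `xml:id` and point to it from a page break with the `facs` attribute. The Python sketch below checks that every such pointer resolves. The sample document is an invented placeholder (its second pointer is deliberately broken), the function name `unresolved_facs` is hypothetical, and whether ENRICH records link images in exactly this way is an assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Placeholder document: header, two page images, two page breaks.
SAMPLE = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader/>
  <facsimile>
    <surface xml:id="f1r"><graphic url="f1r.jpg"/></surface>
    <surface xml:id="f1v"><graphic url="f1v.jpg"/></surface>
  </facsimile>
  <text><body>
    <pb facs="#f1r"/><p>Placeholder transcription.</p>
    <pb facs="#f2r"/><p>Placeholder transcription.</p>
  </body></text>
</TEI>"""


def unresolved_facs(xml_text):
    """List facs pointers on <pb> that match no <surface> xml:id."""
    root = ET.fromstring(xml_text)
    surfaces = {s.get(XML_ID) for s in root.iter(f"{{{TEI_NS}}}surface")}
    missing = []
    for pb in root.iter(f"{{{TEI_NS}}}pb"):
        target = pb.get("facs", "")
        # Only local pointers of the form "#id" are checked here.
        if target.startswith("#") and target[1:] not in surfaces:
            missing.append(target)
    return missing


if __name__ == "__main__":
    print(unresolved_facs(SAMPLE))
```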

Page 20: The ENRICH project  and TEI customisation

ENRICH, for all your metadata needs

Why not join us today?
Our operators are standing by to take your call.

Just dial 1-800-I-ENRICH.