Interoperability Aspects in Europeana
Antoine Isaac
Workshop on Research Metadata in Context, 7-8 September 2010, Nijmegen
europeana.eu: the mission
• Making European cultural heritage better (web-)accessible
• Federating (online) cultural collections across countries and domains
• Hundreds of institutions, millions of objects
europeana.eu in practice
• We rely on aggregating from our providers:
  • Metadata
  • References to digital objects
• We have a portal
  • End-user “showcase”
• We will strive to become a metadata distributor
  • Allowing partners to get enriched (contextualized) data for their objects
  • Allowing third parties to deploy object access functions similar to Europeana’s, in their own services
Current status – provider data
• Very heterogeneous: different communities, different institutions, different interests and means
• Descriptions of original objects and digital objects use hundreds of vocabularies, e.g.:
  • Libraries: “MARC-style” records
  • Museums: very diverse, richest ones with event-based descriptions (CIDOC-CRM)
  • Archives: “EAD-style” hierarchical finding aids
  • Cross-field “container” formats: METS
Current status – provider data
• Granularity varies
• Quality varies
  • Free keyword indexing
  • Explicit or implicit use of controlled vocabularies
    • Ad hoc vs. more standard (DDC, AAT, etc.)
• Persistent identifier usage not widespread
  • (National) libraries are doing better
Current Europeana metadata stream
• Europeana Semantic Elements (ESE) for ingestion of descriptive metadata and pointers to digital objects
  • Dublin Core fields + Europeana-specific ones
  • Providers do the mapping from their data to ESE
• Ingestion process: OAI-PMH, still often via files
• (Fielded) full-text search using SOLR/Lucene
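
To make the OAI-PMH ingestion step concrete, here is a minimal sketch of how a harvester could build a ListRecords request. The verb and argument names (metadataPrefix, set, from, resumptionToken) come from the OAI-PMH protocol; the base URL, the "ese" metadata prefix and the set name are assumptions made up for illustration, not Europeana's actual configuration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None,
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent together with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
        if from_date is not None:
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical provider endpoint, metadata prefix and set name.
first_page = list_records_url("http://provider.example.org/oai", "ese",
                              set_spec="paintings")
next_page = list_records_url("http://provider.example.org/oai", "ese",
                             resumption_token="abc123")
print(first_page)
print(next_page)
```

A real harvester would fetch each URL, parse the XML response and keep following resumption tokens until none is returned.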
Limitations of ESE
• Simple “flat” format
  • Losing richer (structured) data
  • OK for full-text indexing and search
  • Not OK for all the rest (display, access to data, richer search)
• Variations of DC field usage across collections
  • dc:coverage
  • dc:rights
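
The loss of structure can be illustrated with a small sketch. The event-based record below, the object it describes and the function name are all hypothetical; the mapping to Dublin Core fields is one possible choice, not the mapping any provider actually uses.

```python
# Hypothetical event-based provider record (fictional object and actors).
rich_record = {
    "title": "Harbour view",
    "events": [
        {"type": "creation", "actor": "A. Painter",
         "place": "Harbour Town", "date": "1660"},
        {"type": "acquisition", "actor": "City Museum",
         "place": "Capital City", "date": "1822"},
    ],
}


def flatten_to_flat_dc(record):
    """Map an event-based record onto flat, repeatable Dublin Core fields."""
    flat = {"dc:title": [record["title"]],
            "dc:creator": [], "dc:date": [], "dc:coverage": []}
    for event in record["events"]:
        if event["type"] == "creation":
            flat["dc:creator"].append(event["actor"])
        # Dates and places of every event end up in the same two fields.
        flat["dc:date"].append(event["date"])
        flat["dc:coverage"].append(event["place"])
    return flat


flat = flatten_to_flat_dc(rich_record)
print(flat["dc:date"])      # two dates, no hint which one is the creation date
print(flat["dc:coverage"])  # two places, no hint which event they belong to
```

The flat record is still fine for full-text indexing, but a display or a date-range search can no longer tell the creation date from the acquisition date.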
Digression: talk about rights?
• Lots of objects, with rights not cleared yet
  • Collection-level approaches are difficult to implement
  • Rights for metadata differ from rights for “real” objects
• Result: users in Europeana don’t know the rights status of the object they can access
  • They have to go to the provider’s site for each object
  • Deterring reference and re-use
• Recent developments: trying to
  • Encourage provision of rights at object level
  • Use “controlled vocabularies” for rights (CC)
  • Promote public domain (esp. for metadata)
EDM requirements & principles
1. Distinction between “provided object” (painting, book, program) and digital representation
2. Distinction between object and metadata record describing an object
3. Allow for multiple records for same object, containing potentially contradictory statements about an object
4. Support for objects that are composed of other objects
5. Standard metadata format that can be specialized
6. Standard vocabulary format that can be specialized
7. EDM should be based on existing standards
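
Requirements 1 to 4 can be sketched as a small data structure. The class and field names below are hypothetical and chosen for illustration; they are not the EDM classes themselves, and the example values are fictional.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """One provider's description of an object (requirement 2)."""
    provider: str
    statements: dict


@dataclass
class ProvidedObject:
    """The object itself, kept apart from its digital representations (requirement 1)."""
    identifier: str
    digital_representations: list = field(default_factory=list)
    records: list = field(default_factory=list)  # several records allowed (requirement 3)
    parts: list = field(default_factory=list)    # objects made of other objects (requirement 4)

    def values_for(self, prop):
        """Collect each provider's value for a property, keeping provenance."""
        return {r.provider: r.statements[prop]
                for r in self.records if prop in r.statements}


painting = ProvidedObject(
    identifier="object-1",
    digital_representations=["http://provider-a.example.org/img/1.jpg"],
)
painting.records.append(
    MetadataRecord("Provider A", {"dc:title": "Harbour view", "dc:date": "1660"}))
painting.records.append(
    MetadataRecord("Provider B", {"dc:date": "1665"}))
album = ProvidedObject(identifier="album-1", parts=[painting])

# Contradictory dates are kept side by side instead of being merged.
print(painting.values_for("dc:date"))
```

Keeping the records separate means a contradiction between providers stays visible and attributable, rather than being resolved silently at ingestion time.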
EDM basics
Re-using available vocabularies
• OAI ORE for organization of metadata about an object
• Dublin Core for core metadata representation
• SKOS for vocabulary representation
EDM basics
• A semantic web-inspired model
  • E.g., DC would not be used with text fields alone; reference to controlled vocabularies (via URIs) will be encouraged
• Keeping original descriptive metadata
  • Achieving interoperability through mapping (cf. Peter’s “profile matching”?)
• Flexibility (ingesting richer original metadata) is a main requirement
  • Even though we might not ourselves use all of the data to its full potential, e.g. for search
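
The difference between a text-only DC field and a reference to a controlled vocabulary can be sketched with plain triples. The Dublin Core and SKOS property URIs are the standard ones; the object and thesaurus URIs, the labels and the function name are assumptions made up for this example.

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# (subject, predicate, object); literals are (text, language) tuples,
# plain strings in object position are URIs. All example.org URIs are hypothetical.
triples = [
    ("http://example.org/object/1", DC_SUBJECT, ("ships", "en")),
    ("http://example.org/object/2", DC_SUBJECT, "http://example.org/thesaurus/ship"),
    ("http://example.org/thesaurus/ship", SKOS_PREF_LABEL, ("ship", "en")),
    ("http://example.org/thesaurus/ship", SKOS_PREF_LABEL, ("navire", "fr")),
]


def subject_labels(object_uri, lang):
    """Return the subject labels of an object in the requested language."""
    labels = []
    for s, p, o in triples:
        if s != object_uri or p != DC_SUBJECT:
            continue
        if isinstance(o, tuple):
            # Text-only value: usable only in the language it was typed in.
            if o[1] == lang:
                labels.append(o[0])
        else:
            # URI value: look up the concept's preferred labels.
            labels.extend(lit[0] for s2, p2, lit in triples
                          if s2 == o and p2 == SKOS_PREF_LABEL and lit[1] == lang)
    return labels


print(subject_labels("http://example.org/object/1", "fr"))
print(subject_labels("http://example.org/object/2", "fr"))
```

The object indexed with a URI gains a French label from the vocabulary, while the one indexed with free text does not, which is one reason URI references are encouraged.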
Around the data model
• Opportunity (and need) to get and produce richer metadata
  • De-duplication
  • Semantic enrichment with contextual resources (thesauri, authority lists) within and outside Europeana
  • Alignment of contextual resources
• Linked Data: serving data on the web, pointing to others’ data
  • Fits Europeana’s mission very well
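
Semantic enrichment could, in its simplest form, look like the sketch below: free-text keywords are matched against thesaurus labels and the matching concept URIs are added next to the original values. The thesaurus, the URIs, the "subject_concepts" field and the exact-match strategy are all assumptions for illustration; the slides do not describe how Europeana performs enrichment.

```python
# Hypothetical thesaurus: lower-cased label -> concept URI.
thesaurus = {
    "ship": "http://example.org/thesaurus/ship",
    "ships": "http://example.org/thesaurus/ship",
    "harbour": "http://example.org/thesaurus/harbour",
}


def enrich(record):
    """Add concept URIs for recognised keywords; original metadata is kept as is."""
    enriched = dict(record)
    concepts = []
    for keyword in record.get("dc:subject", []):
        uri = thesaurus.get(keyword.strip().lower())
        if uri is not None and uri not in concepts:
            concepts.append(uri)
    # Stored in a separate (hypothetical) field so enrichments stay distinguishable.
    enriched["subject_concepts"] = concepts
    return enriched


record = {"dc:title": ["Harbour view"],
          "dc:subject": ["Ships", " harbour ", "ship", "sunset"]}
result = enrich(record)
print(result["subject_concepts"])
```

Unmatched keywords such as "sunset" simply stay as text, and the two spellings "Ships" and "ship" collapse onto one concept.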
Around the data model
Rationalizing the data ingestion, archival and dissemination process (OAIS) makes more explicit what Europeana needs to do to behave more like a real metadata archive
• Not only feeding a Lucene/SOLR instance
• Cope with enrichments, versions, pointers to external resources
• Registries of vocabularies (metadata structures and controlled value vocabularies) and links between them
Encouraging community initiatives
Best practices for representing and providing metadata can be seen as a complement to the general EDM.
Building interoperability cores at community level
• Museums:
  • ATHENA project (LIDO format)
• Audio/visual:
  • PrestoPrime, European Film Gateway
• Archives:
  • APEnet (using EAD)
Thank you!
References for ESE and EDM:
http://version1.europeana.eu/web/guest/technical-requirements/
http://version1.europeana.eu/web/europeana-project/technicaldocuments/